Creating Desired State Configuration Resource - Interesting challenges

fausto-nascimento · April 19, 2015, 7:22pm

I am looking at improving the DSC Hyper-V resource, but am encountering some interesting challenges. Bear in mind these challenges will exist for most DSC resources and not just Hyper-V or switches, but I need to give a clear example.

Currently, a network switch is represented in the DSC Hyper-V resource as a combination of its name and type, and that combination is expected to be unique.

However, Hyper-V allows for multiple switches to have the same name, even the same name and type! This creates some peculiar scenarios:

Imagine I create a switch of type internal called test. I then associate 10 VMs to that switch and everything is good with the world. DSC is tracking configuration drifts, so if something in my switch configuration changes it will set it back to ‘the right way’.

Now let’s say someone accidentally modifies my test switch from internal to external. When DSC next runs, it will realize it’s not in a desired state, and it will attempt to fix it. But since what uniquely defines a switch in the resource is a combination of the name and type, it will simply say it can’t find a switch called ‘test’ of type internal and… create one!

So now I have two test switches: one internal - with no VMs attached to it and doing absolutely nothing, but being tracked by DSC and DSC happily saying ‘everything is OK’, and the original test switch which is now external by accident and has my 10 VMs attached - which are now broken.
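The scenario above can be reproduced with plain objects standing in for switches. This is only a minimal sketch, assuming a resource that looks a switch up by name and type together: `Find-SwitchByNameAndType` and the `AttachedVMs` property are hypothetical names made up for illustration, and nothing here calls the real Hyper-V cmdlets.

```powershell
# Hypothetical helper: mimics a resource that keys a switch on Name + SwitchType.
function Find-SwitchByNameAndType {
    param(
        [object[]]$Switches,
        [string]$Name,
        [string]$SwitchType
    )
    # Emit only the switches where both the name and the type match.
    $Switches | Where-Object { $_.Name -eq $Name -and $_.SwitchType -eq $SwitchType }
}

# Stand-in for the original switch after someone changed it to External by accident.
$allSwitches = @(
    [pscustomobject]@{ Name = 'test'; SwitchType = 'External'; AttachedVMs = 10 }
)

# The configuration still asks for an Internal switch called 'test'.
# @() forces an array so .Count is reliable for zero or one match.
$found = @(Find-SwitchByNameAndType -Switches $allSwitches -Name 'test' -SwitchType 'Internal')

if ($found.Count -eq 0) {
    # This is the point where the resource would create a second 'test' switch.
    $allSwitches += [pscustomobject]@{ Name = 'test'; SwitchType = 'Internal'; AttachedVMs = 0 }
}

# Two switches now share the name; the drifted one (with the VMs) is left untouched.
$sameName = @($allSwitches | Where-Object { $_.Name -eq 'test' })
```

After this runs, `$sameName` holds both switches: the new Internal one that the resource would report as compliant, and the drifted External one that still has the VMs attached.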

It sounds like this has an easy solution: just track the switch purely based on name, not name and type. Sure, that’s a solution, but bearing in mind that Hyper-V itself allows multiple switches of the same name to co-exist (and even the same name and type), isn’t this setting the bar quite low by having the DSC resource enforce unique names?

Even more importantly, bearing in mind that DSC does not prevent users from manually doing what they want, how should the resource behave if it’s tracking a switch named ‘test’ (regardless of switch type) and a new switch of name ‘test’ is manually created?

Attempting to bring both in line with the configuration that was obviously meant for just one of those switches is very likely to cause problems - for example, if DSC were to set them both as ‘external’ that wouldn’t even work, since both would be attempting to use the same physical NIC.

And then there’s yet another problem… Even if we track the switch based purely on name, and we assume that there won’t be multiple switches with the same name in an environment (bold assumption - two words that should never go together), what happens if I have a single switch called test with 10 VMs attached and someone decides to change that switch’s name to helloWorld?

DSC is looking for a switch with name test, so when it runs it won’t find one… and will create a brand new one, and monitor that one for drifts. Sure, so far only the name was changed, so the VMs are still working. But what if someone accidentally changes the helloWorld switch’s type to something else? Boom, 10 VMs broken.

DSC tracking configuration drifts means to me that it should follow the specified object no matter what. If any property changes (including the name, since it’s a property that can be changed), it should still follow the object and set it back to how it should be.

An ‘easy’ way to fix this would be to, when DSC runs for the first time against a switch (either to create it or to verify that it is as it should be) get the switch’s GUID and store it somewhere, let’s say the registry for the sake of argument. Now DSC knows that switch named test has the GUID 123. So when DSC next runs, the first thing it does is go to this registry key and see if there’s a value called test (name of the switch) and get the GUID from there. Then, instead of searching for a switch named test, it searches for a switch where GUID = 123. This way even if the name of the switch changes, it will still be able to find it and bring it back in-line with what the configuration should be (including changing its name back).

Accepting the GUID as a parameter rather than the name is not really an option, because DSC needs to also create the switch if it does not exist, and I can’t specify the GUID I would like a switch to have.

This fixes some problems:

-It allows for switch names to change and still bring switches back to a desired state (but needs to already know the GUID of said switch)
-It allows for multiple switches with the same name to co-exist (but only works correctly if the name is unique the very first time DSC runs so it can collect the GUID of that switch)
-If a switch with that GUID is not found (someone manually deleted it for example) it will create a new switch, set it up as it should be and replace the old registry stored GUID with the new one.

But it also introduces some questions, doesn’t fix all problems (and might generate new problems I haven’t considered yet):

-Where to store this information? WMI? Registry? File?
-What if Ensure is set to Absent and a GUID is known? Does it only delete the switch with that GUID, allowing other switches with the same name to stay behind? What if the GUID is not known? Does it delete all switches with the given name?
-Should it only track switches based on name, or back to a combination of name and type? Does it make a difference?

I’m sure there are more questions it raises, but these are the ones I can think of at the moment.

Does anyone have similar thoughts around this or ideas on how to work around these issues? As I said above this is not limited to Hyper-V or switches and most other DSC resources will suffer from the same problems, something that probably needs to be considered and ‘fixed’.

david_obrien · April 19, 2015, 11:31pm

Hi,

I don’t have a solution to your overall issue, but a comment on “what if someone does X manually / outside of DSC?”.

Well, this shouldn’t happen. DSC is not a solution, DSC is part of a mindset I would say. If a system is “DSC managed”, then all changes go through DSC. Chef / Puppet are great tools that will make your life a bit easier here. Even without them, if you decided to use DSC, then do everything via DSC (on those systems).
Someone needs to create a new switch? As long as it is managed through DSC, should be ok and easy to track what’s happening.

Like I said, not really a solution, but maybe something that needs to change on the “processes” side of things.

Cheers
David

fausto-nascimento · April 20, 2015, 1:07am

Hi David,

Thanks for the reply. Hope you don’t mind that I both agree and disagree with your comment.

For me (and I think I’m not alone here) one of the main advantages of DSC is the ability to track configuration drifts and rectify them. You remove that, and how does DSC differ from any other PowerShell script (or any other language for that matter) you run once and that’s it?

And while I completely agree that DSC managed systems should have little to no human interaction, we’re not even remotely close to being able to accomplish this yet. Here are a few examples:

Want to set IovEnabled to true on a Hyper-V switch? Can’t currently do it via DSC.
Want to set VLANs on a Hyper-V switch? Can’t currently do it in DSC.
Want to set a note on a Hyper-V switch? Can’t currently do it in DSC.

And it’s not just the Hyper-V module either:

Want to set the e-mail address of a user in AD? Disable the user? Enable it? DSC can’t currently do it.

I could keep going for days, but the point is that currently DSC is not enough to configure everything, and human interaction is still required. And humans make mistakes.

What’s to prevent me from accidentally changing the type of a Hyper-V switch when editing it to set a note? We’ve all misclicked and clicked on things we didn’t mean to. Sometimes (most times) we notice it, but not always.

That’s where DSC comes in, to fix those issues (theoretically). To have a rigid set of rules of what a particular object should look like and ‘make it so’.

Besides, even if you completely disallow all manual changes (which currently you can’t), what about applications that might attempt to make changes to objects being maintained by DSC? You can’t ‘disallow’ those from making changes.

A way to ‘fix’ this would be to say that if an object is managed by DSC then only DSC is allowed to change it, but then there would never be a drift, since only DSC is allowed to change it. But then wouldn’t DSC just become a glorified run once script?

Regardless of these hypothetical questions the fact is that DSC’s current mindset and implementation is that if the configuration drifts it should be fixed. But in many cases that’s not how the modules are set to behave…

Cheers,
Fausto

fausto-nascimento · April 20, 2015, 1:25am

By the way, when I say ‘Can’t currently do it in DSC’ I actually mean… currently released DSC modules are not configured to allow this.

donj · April 20, 2015, 10:58pm

[blockquote]Now let’s say someone accidentally modifies my test switch from internal to external. When DSC next runs, it will realize it’s not in a desired state, and it will attempt to fix it. But since what uniquely defines a switch in the resource is a combination of the name and type, it will simply say it can’t find a switch called ‘test’ of type internal and… create one!
[/blockquote]

I think this gets more into some questionable design decisions on the Hyper-V side. DSC isn’t going to be able to, in all cases, make poor design choices not suck. At the end of the day, DSC needs some way to uniquely refer to a configurable element; if the underlying technology doesn’t provide a means, then it’s going to get squirrely. Not much you can do about that.

[blockquote]
Even more importantly, bearing in mind that DSC does not prevent users from manually doing what they want, how should the resource behave if it’s tracking a switch named ‘test’ (regardless of switch type) and a new switch of name ‘test’ is manually created?
[/blockquote]

Well, maybe don’t overstate the case. “Users” can’t run around modifying virtual switches; “administrators” can. DSC isn’t meant to prevent a determined administrator from screwing up the universe. But sure, you’re suggesting someone could accidentally screw up, and DSC wouldn’t necessarily be able to fix it. Agreed.

[blockquote]
An ‘easy’ way to fix this would be to, when DSC runs for the first time against a switch (either to create it or to verify that it is as it should be) get the switch’s GUID and store it somewhere, let’s say the registry for the sake of argument. Now DSC knows that switch named test has the GUID 123. So when DSC next runs, the first thing it does is go to this registry key and see if there’s a value called test (name of the switch) and get the GUID from there. Then, instead of searching for a switch named test, it searches for a switch where GUID = 123. This way even if the name of the switch changes, it will still be able to find it and bring it back in-line with what the configuration should be (including changing its name back).
[/blockquote]

Sure, you could create additional management data and use that. I think you start to get into a delicate situation pretty easily - people could screw with your GUID as easily as screwing with the name. People could create switches that wouldn’t have a GUID, so you’d have to have a regular scan to look for un-managed switches, and probably some system-level setting about whether you wanted to just stomp those or allow them to continue existing.

[blockquote]
Accepting the GUID as a parameter rather than the name is not really an option, because DSC needs to also create the switch if it does not exist, and I can’t specify the GUID I would like a switch to have.
[/blockquote]

Well, it’s an option. You could certainly provide a GUID that you wanted it to use. There’s no reason the DSC resource has to create the GUID.

[blockquote]
But it also introduces some questions, doesn’t fix all problems (and might generate new problems I haven’t considered yet):

-Where to store this information? WMI? Registry? File?
-What if Ensure is set to Absent and a GUID is known? Does it only delete the switch with that GUID, allowing other switches with the same name to stay behind? What if the GUID is not known? Does it delete all switches with the given name?
-Should it only track switches based on name, or back to a combination of name and type? Does it make a difference?
[/blockquote]

Well, if you don’t somehow store it as a property of the switch, then you’ll always have the problem of maintaining the GUID-to-name mapping. But that’d mean a difficult programming exercise, since WMI wouldn’t be natively providing that information to you. But yeah, there’s a lot of logic to think through. You’re almost making an argument that the name (which is an existing property) should be a GUID, and that DSC should “kill” any switches that don’t have a GUID-based name. I’m not sure “Ensure=Absent” has a strong semantic meaning here, for example. I think you’re more looking at a systemwide setting of “allow unknown switches or not,” since you can’t “ensure absent” on something you don’t know will exist (either a GUID or a name).

fausto-nascimento · April 20, 2015, 11:49pm

Hi Don,

Thank you for replying.

I think you misunderstood what I suggested as an alternative.

Among other things, switches have two interesting properties:

Id (guid) {Get;},
Name (string) {Get;Set;}

So when I referred to a GUID, I meant the one that’s stored on the Id property, which is generated automatically when the switch is created and cannot be modified. The Id field is the unique identifier of a switch object - the name is not.

I could set the DSC resource to receive [GUID]$Id instead of [String]$Name, but that creates a problem: it would certainly be able to manage existing switches (and set them back to desired state even in a case where the name has changed), but would not be able to create it if one did not exist with the specified GUID (as we can’t specify which GUID to use).

This also raises other problems: if I have a single configuration I want to apply to 10 Hyper-V servers, seeing as the GUIDs are unique per VM and per switch… I now have to have 10 different DSC configurations. If I use the Name property as the identifier, I can have just the one. Also, how user friendly is it to say ‘if you want to configure a switch you must first retrieve the unique Id of the switch and paste it into the DSC configuration’?

What I meant about the name to GUID mapping was:

Imagine I specify the following configuration on a brand new Hyper-V server with no switches configured at all:

xVMSwitch WAN
{
    Name = 'WAN'
    Ensure = 'Present'
    SwitchType = 'External'
    NetAdapterName = 'Ethernet'
    IovEnabled = $true
}

DSC runs for the first time and Test-TargetResource realizes the switch is not there, so Set-TargetResource creates a switch called ‘WAN’ with the other attributes specified. As part of the Set-TargetResource, it stores the Id of the switch it just created somewhere (let’s say registry) and internally maps that Id to a switch of name ‘WAN’. So DSC now knows what the unique identifier for the switch with name WAN is.

The next time Test-TargetResource runs, it will read the registry key where it stores the mapping between the switch name and the Id to see if there’s an Id in there for the name provided. In this case there is. So instead of doing Get-VMSwitch -Name ‘WAN’, it does … Get-VMSwitch | Where-Object {$_.Id -eq ‘whateverWasWrittenInRegistry’} and then performs its checks. This way, even if the Name of the switch changed and it is no longer in line with the DSC configuration provided, since it’s looking for the switch based on its unique identifier rather than a property that’s {get;set;}, it will be able to set it back to how it was meant to be: ‘make it so’.

If the switch is manually deleted, the next time DSC runs it will check its mapping table, see that switch ‘WAN’ is supposed to have Id ‘whatever’ and perform a search for a switch with Id = ‘whatever’. Since it was deleted, it doesn’t find one, so it creates one and overwrites the mapping that it previously had for [Name] ‘WAN’ = [Id] ‘whatever’ to [Name] ‘WAN’ = [Id] ‘newIdentifier’ and everything is good with the world again.
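The lookup described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Resolve-TrackedSwitch`, its parameters and the action names are all hypothetical, `$Switches` stands in for whatever `Get-VMSwitch` would return, and `$IdMap` is a hashtable standing in for the stored name-to-Id mapping (registry or otherwise). Reporting a duplicate name on the first run as ‘Ambiguous’ is also an assumption of the sketch, since that case is still an open question.

```powershell
# Hypothetical helper: resolve a switch through a stored Name -> Id mapping.
function Resolve-TrackedSwitch {
    param(
        [object[]]$Switches,   # stand-in for the output of Get-VMSwitch
        [hashtable]$IdMap,     # stand-in for the stored Name -> Id mapping
        [string]$Name
    )

    if ($IdMap.ContainsKey($Name)) {
        # A mapping exists: look the switch up by Id only, ignoring its current name.
        $byId = @($Switches | Where-Object { $_.Id -eq $IdMap[$Name] })
        if ($byId.Count -eq 1) {
            return [pscustomobject]@{ Action = 'Verify'; Target = $byId[0] }
        }
        # The mapped Id no longer exists (switch deleted): recreate it and remap.
        return [pscustomobject]@{ Action = 'CreateAndRemap'; Target = $null }
    }

    # First run for this name: fall back to a name lookup to seed the mapping.
    $byName = @($Switches | Where-Object { $_.Name -eq $Name })
    if ($byName.Count -eq 0) {
        return [pscustomobject]@{ Action = 'CreateAndMap'; Target = $null }
    }
    if ($byName.Count -eq 1) {
        return [pscustomobject]@{ Action = 'MapExisting'; Target = $byName[0] }
    }
    # More than one switch already has this name, so the Id cannot be chosen safely.
    return [pscustomobject]@{ Action = 'Ambiguous'; Target = $null }
}

# Usage: the switch was renamed by hand, but its Id is unchanged.
$wanId   = [guid]::NewGuid()
$idMap   = @{ 'WAN' = $wanId }
$current = @([pscustomobject]@{ Name = 'helloWorld'; Id = $wanId })

$result = Resolve-TrackedSwitch -Switches $current -IdMap $idMap -Name 'WAN'
# $result.Action is 'Verify' and $result.Target.Name is 'helloWorld',
# so Set-TargetResource would know which switch to rename back to 'WAN'.
```

Keeping the lookup separate from the real cmdlets and the real store makes the decision logic easy to test on its own, whichever storage location ends up being used.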

Using this method would not allow people to screw it up, since the Id field is not visible and cannot be set manually or changed after a switch is created. The exception is if they manually delete the mapping table that the DSC resource relies on, but then we’d be getting into a really pedantic realm: if you delete the %Windir%\System32 folder, does the OS boot?

This is the best method I could think of, but as I said it raises other questions. The main two are:

-Where to store this information? WMI? Registry? File?
-What if Ensure is set to Absent and a GUID is known? Does it only delete the switch with that GUID, allowing other switches with the same name to stay behind? What if the GUID is not known? Does it delete all switches with the given name?

More importantly though, this is not limited to Hyper-V or switches.

Take the xADUser DSC resource. Currently it looks for users based on (among other things) the DN. This means if I move a user to a different OU… it will say it’s unable to find it, and create another one, rather than setting that one back to a desired state by moving it back to where it came from.

Take the Website DSC resource. If I rename my IIS website, what’s DSC going to try to do? Create a new one with the name given as it couldn’t find one with that name. Only it will fail because my renamed website is still bound to port 80 and it can’t bind the newly created one also to port 80.

Currently there is no DSC resource for SCCM, but if there was? I imagine managing applications/packages/collections/deployments would be done based on the Name of the application/package/collection/deployment, which is never the unique identifier. If the name changes? I’ll have another application/package/collection/deployment being created.

This affects any resource where the object identifier to DSC is either not unique, or can be changed (or both).

Regards,
Fausto

craig · April 21, 2015, 9:26am

I feel your pain. FIM has the same challenge. I’ve been writing DSC resources for FIM and have the same problem. In FIM when you create an object it gets a new GUID stamped on it (the GUID cannot be specified). That GUID is the only thing guaranteed to be unique. I could extend the FIM schema to include a new attribute, but I decided against it because I want people to be able to use the DSC resource on any deployment, and some people will not be willing to extend their schema for this purpose.

It’d be nice if DSC had a way to track the DSC item ‘key’ properties after the object was created.

My workaround today is to search for the object, then:
if it does not exist, create it
if it does exist, update it
if multiple copies of it exist, delete them all and create a new one (destructive, I know, but mostly OK in FIM)
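The three cases above boil down to a decision on how many objects match the key. A minimal sketch of that decision, assuming the search has already been done: `Get-ReconcileAction` and the action names are hypothetical, and nothing here talks to FIM.

```powershell
# Hypothetical helper: pick an action from the number of objects matching the key.
function Get-ReconcileAction {
    param([object[]]$Found)

    # Drop nulls so an empty search result is counted as zero matches.
    $count = @($Found | Where-Object { $null -ne $_ }).Count

    if ($count -eq 0) { return 'Create' }   # nothing there yet
    if ($count -eq 1) { return 'Update' }   # exactly one match: bring it in line
    return 'DeleteAllAndCreate'             # duplicates: destructive reset
}

$none = Get-ReconcileAction -Found @()
$one  = Get-ReconcileAction -Found @('objectA')
$many = Get-ReconcileAction -Found @('objectA', 'objectB')
```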

fausto-nascimento · April 21, 2015, 6:47pm

Hi Craig,

Thank you for the reply. It is interesting to see that I am not the only one already experiencing this particular issue, though I still think it will become a bigger issue as DSC gets used more and as more resources are released.

Seeing as DSC does not currently have a native mechanism for dealing with this, I am working on a proof of concept to work around this issue. I’ll post my findings here when I’m done, but I would love to keep getting feedback from the community.

Regards,
Fausto
