I am looking at improving the DSC Hyper-V resource, but am encountering some interesting challenges. Bear in mind these challenges will exist for most DSC resources and not just Hyper-V or switches, but I need to give a clear example.
Currently, the DSC Hyper-V resource identifies a network switch by the combination of its name and type, and expects that combination to be unique.
However, Hyper-V allows for multiple switches to have the same name, even the same name and type! This creates some peculiar scenarios:
Imagine I create a switch of type internal called test. I then attach 10 VMs to that switch and everything is good with the world. DSC is tracking configuration drift, so if something in my switch configuration changes it will set it back to ‘the right way’.
Now let’s say someone accidentally modifies my test switch from internal to external. When DSC next runs, it will realize it’s not in a desired state, and it will attempt to fix it. But since what uniquely defines a switch in the resource is a combination of the name and type, it will simply say it can’t find a switch called ‘test’ of type internal and… create one!
So now I have two test switches. One is internal, has no VMs attached and does absolutely nothing, yet it is the one DSC tracks, so DSC happily reports ‘everything is OK’. The other is the original test switch, which is now external by accident and still has my 10 VMs attached, all of which are now broken.
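To make the failure concrete, here is a minimal Python sketch of a resource that matches on name and type. It is only a model of the behaviour described above: the `Switch` class, the `host` list and `reconcile_by_name_and_type` are hypothetical stand-ins, not the actual resource code or the Hyper-V API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Switch:
    # Hypothetical stand-in for a Hyper-V virtual switch.
    name: str
    switch_type: str
    vm_count: int = 0
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


def reconcile_by_name_and_type(host, name, switch_type):
    """Return the switch matching (name, type), creating it if none matches."""
    for switch in host:
        if switch.name == name and switch.switch_type == switch_type:
            return switch
    created = Switch(name, switch_type)
    host.append(created)
    return created


host = []  # the switches that currently exist on the (simulated) host

# First run: the switch is created, then 10 VMs are attached to it.
original = reconcile_by_name_and_type(host, "test", "Internal")
original.vm_count = 10

# Someone accidentally changes the type outside of DSC.
original.switch_type = "External"

# Next run: no (test, Internal) switch is found, so a second one is created.
tracked = reconcile_by_name_and_type(host, "test", "Internal")
```

After the second run, `host` holds two switches named test, and the one the resource now considers compliant (`tracked`) is the empty one, while `original` keeps the wrong type and the 10 VMs.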
It sounds like this has an easy solution: just track the switch purely based on name, not name and type. Sure, that’s a solution, but bearing in mind that Hyper-V itself allows multiple switches of the same name to co-exist (and even the same name and type), isn’t having the DSC resource enforce unique names setting the bar quite low?
Even more importantly, bearing in mind that DSC does not prevent users from manually doing what they want, how should the resource behave if it’s tracking a switch named ‘test’ (regardless of switch type) and a new switch of name ‘test’ is manually created?
Attempting to bring both in line with the configuration that was obviously meant for just one of those switches is very likely to cause problems. For example, if DSC were to set both to ‘external’, that wouldn’t even work, since both would be attempting to use the same physical NIC.
And then there’s yet another problem… Even if we track the switch based purely on name, and we assume that there won’t be multiple switches with the same name in an environment (bold assumption - two words that should never go together), what happens if I have a single switch called test with 10 VMs attached and someone decides to change that switch’s name to helloWorld?
DSC is looking for a switch named test, so when it runs it won’t find one… and will create a brand new one, and monitor that one for drift. Sure, so far only the name was changed, so the VMs are still working. But what if someone accidentally changes the helloWorld switch’s type to something else? Boom, 10 VMs broken, and DSC never notices.
To me, DSC tracking configuration drift means that it should follow the specified object no matter what. If any property changes (including the name, since that is a property that can be changed), it should still follow the same object and set it back to how it should be.
An ‘easy’ way to fix this would be for DSC to get the switch’s GUID the first time it runs against a switch (either to create it or to verify that it is as it should be) and store it somewhere, let’s say the registry for the sake of argument. Now DSC knows that the switch named test has the GUID 123. When DSC next runs, the first thing it does is go to this registry key, see if there’s a value called test (the name of the switch) and get the GUID from there. Then, instead of searching for a switch named test, it searches for a switch where GUID = 123. This way, even if the name of the switch changes, it will still be able to find it and bring it back in line with what the configuration should be (including changing its name back).
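A rough Python sketch of that lookup order, under stated assumptions: the `store` dict stands in for the registry key, `Switch` and `reconcile_by_id` are hypothetical names, and refusing to guess when several switches share the name on the first run is my own assumption rather than settled behaviour.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Switch:
    # Hypothetical stand-in for a Hyper-V virtual switch.
    name: str
    switch_type: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


def reconcile_by_id(host, store, name, switch_type):
    """Find the tracked switch by stored GUID and restore its desired state."""
    known_id = store.get(name)
    switch = None
    if known_id is not None:
        # A GUID is on record: search by GUID only, never by name.
        switch = next((s for s in host if s.id == known_id), None)
    else:
        # First run: fall back to the name, which must be unambiguous.
        candidates = [s for s in host if s.name == name]
        if len(candidates) > 1:
            raise RuntimeError(
                f"{len(candidates)} switches are named {name!r}; cannot pick one"
            )
        if candidates:
            switch = candidates[0]
    if switch is None:
        # Nothing to adopt (or the tracked switch was deleted): create one.
        switch = Switch(name, switch_type)
        host.append(switch)
    store[name] = switch.id          # record (or replace) the GUID
    switch.name = name               # undo a rename
    switch.switch_type = switch_type  # undo a type change
    return switch


host, store = [], {}
first = reconcile_by_id(host, store, "test", "Internal")

# Manual changes made outside of DSC.
first.name = "helloWorld"
first.switch_type = "External"
host.append(Switch("test", "Private"))  # unrelated switch reusing the name

second = reconcile_by_id(host, store, "test", "Internal")
```

In this model the second run finds the renamed switch through its stored GUID, restores its name and type, and leaves the unrelated switch called test alone.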
Accepting the GUID as a parameter rather than the name is not really an option, because DSC needs to also create the switch if it does not exist, and I can’t specify the GUID I would like a switch to have.
This fixes some problems:
-It allows for switch names to change and still bring switches back to a desired state (but needs to already know the GUID of said switch)
-It allows for multiple switches with the same name to co-exist (but only works correctly if the name is unique the very first time DSC runs so it can collect the GUID of that switch)
-If a switch with that GUID is not found (someone manually deleted it for example) it will create a new switch, set it up as it should be and replace the old registry stored GUID with the new one.
But it also introduces some questions, doesn’t fix all problems (and might generate new problems I haven’t considered yet):
-Where to store this information? WMI? Registry? File?
-What if Ensure is set to Absent and a GUID is known? Does it only delete the switch with that GUID, allowing other switches with the same name to stay behind? What if the GUID is not known? Does it delete all switches with the given name?
-Should it only track switches based on name, or back to a combination of name and type? Does it make a difference?
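For the Ensure = Absent question, one conservative option (not necessarily the right one) can be sketched in Python as follows. The assumptions are mine: delete only the switch whose GUID is on record, and when no GUID is known, delete by name only if exactly one switch matches. `Switch`, `store` and `remove_switch` are hypothetical names.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Switch:
    # Hypothetical stand-in for a Hyper-V virtual switch.
    name: str
    switch_type: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


def remove_switch(host, store, name):
    """Remove the tracked switch and return the GUIDs that were deleted."""
    known_id = store.pop(name, None)
    if known_id is not None:
        # GUID known: delete that switch only; same-named switches stay.
        targets = [s for s in host if s.id == known_id]
    else:
        # GUID unknown: delete by name, but refuse to guess between several.
        targets = [s for s in host if s.name == name]
        if len(targets) > 1:
            raise RuntimeError(
                f"{len(targets)} switches are named {name!r}; refusing to delete"
            )
    for switch in targets:
        host.remove(switch)
    return [s.id for s in targets]


tracked = Switch("test", "Internal")
other = Switch("test", "Internal")  # same name and type, created manually
host = [tracked, other]
store = {"test": tracked.id}

removed = remove_switch(host, store, "test")
```

Here only the tracked switch is removed and the manually created one with the same name is left behind. Whether that is what someone writing Ensure = Absent actually expects is exactly the open question.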
I’m sure there are more questions it raises, but these are the ones I can think of at the moment.
Does anyone have similar thoughts around this or ideas on how to work around these issues? As I said above, this is not limited to Hyper-V or switches. Most other DSC resources will suffer from the same problems, so this is probably something that needs to be considered and ‘fixed’.