Integrating PowerShell DSC with ITSM/Change management

Hello All,

We are in the process of implementing DSC in our organization. We are able to create and deploy configurations successfully in our non-prod test environment.

Before we implement DSC in the PROD environment, our management needs it integrated with ITSM/Change management so that everything has a Change Ticket (we are using ServiceNow). We can take care of this during the creation and deployment of DSC configurations.

However, the actual problem arises once a DSC configuration is deployed and in action. How do we integrate ITSM/Change management and a logging mechanism at that point?

 

Let me give an example. Suppose we have SERVER1, for which we have created a configuration to make sure that the ‘TapiSrv’ service is always in the ‘Stopped’ state. Now, due to some requirement, user X has created a change ticket to start this service and has started it successfully as per that ticket. When the LCM next applies the DSC configuration, it restores the service to its original state, i.e. ‘Stopped’. The user doesn’t have a clue why this happened, and we don’t have any change ticket raised before the LCM restored the service. The change happened without a Change Ticket or any logging mechanism.

Can we integrate some code to be executed before the LCM restores/reverts the changes made to the service, so that we can do two things: create a Change Ticket programmatically and create a DB entry before actually restoring the configuration?

We can take care of writing the code to create a change ticket and make the DB entry, but how can we trigger that code before the LCM restores the configuration?

This will also help us generate a report of how many times a server drifted from its configuration and the LCM restored it.

Not sure whether it can be done before the LCM acts, but you would definitely be able to do this by checking the event logs, though that will be after the LCM has acted.
Maybe @gaelcolas will be able to give you the best answer.

Ah, tricky one, because ITSM implementations are usually shaped around “traditional” ways of working, which are not necessarily compatible with infrastructure-as-code principles.

What’s important in ITSM is what is “intentionally changed” by people. The DNS cache entry for a URL may change without you knowing about it, and ITSM does not care, because the intent has not changed at a human level.

To record the change of intent in infrastructure as code we use source code management such as git.
Each change that goes in the system should be done by a Pull Request, and you should not even need a corresponding ticket (but you can if it’s not done by the same person, or if you have other reasons to).
That means the user who creates the ticket should not start the service. They (or a technician) should just send a pull request to set that service as “running” instead of stopped. That user should probably not even have the privilege to start that service in the first place.

So the LCM resetting the config to what it should be is not the issue, because the intent should be recorded in the code (git).
Using tickets and ServiceNow on top of that adds more “business” features (workflow, tracking, user interface…).
When the LCM picks up that change, it will have the updated configuration.
I’m sure you can easily catch changes to the git repo that don’t have an existing/approved ticket in ServiceNow.
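
A rough sketch of what such a check could look like, for example as a step in the pipeline that validates a pull request. Everything here is an assumption for illustration: the `CHG` plus seven digits ticket format, the function names, and the `is_approved` callback, which stands in for whatever lookup you would actually do against your ticketing system (no real ServiceNow call is shown).

```python
import re

# Assumed ticket number format: "CHG" followed by 7 digits.
# Adjust the pattern to whatever your change tickets actually look like.
CHANGE_ID = re.compile(r"\bCHG\d{7}\b")


def find_unticketed_commits(commits, is_approved):
    """Return (sha, reason) for each commit without an approved change ticket.

    commits     -- iterable of (sha, commit_message) pairs
    is_approved -- hypothetical callback: ticket id -> bool; in practice this
                   would query the ticketing system
    """
    problems = []
    for sha, message in commits:
        ticket_ids = CHANGE_ID.findall(message)
        if not ticket_ids:
            problems.append((sha, "no change ticket referenced"))
        elif not any(is_approved(t) for t in ticket_ids):
            problems.append((sha, "ticket not approved: " + ", ".join(ticket_ids)))
    return problems


if __name__ == "__main__":
    # Stand-in for a real lookup: a fixed set of approved tickets.
    approved = {"CHG0030001"}
    sample = [
        ("a1b2c3", "CHG0030001 set TapiSrv to Running on SERVER1"),
        ("d4e5f6", "quick fix, no ticket"),
    ]
    for sha, reason in find_unticketed_commits(sample, approved.__contains__):
        print(sha, reason)
```

Failing the pull request when this list is not empty keeps the control in front of the change rather than after it.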

How you manage such change deployment is another story, but your main branch of your repository will always reflect the current intention.
Your “monitoring job” from there is to make sure the delta between the intent and what is currently applied stays minimal (if you change the intent during the week but only apply changes at the weekend, your delta will not be zero). The bigger the delta, the more risk you’re taking when applying the change.
Making smaller changes more often reduces the risks.

This process of moving the controls earlier in the change process is called “shift left”.

Thank you everyone for your inputs on this topic.

“That means the user who creates the ticket should not start the service. They (or a technician) should just send a pull request to set that service as ‘running’ instead of stopped. That user should probably not even have the privilege to start that service in the first place.”
This was just an example that I gave. However, we still want to track or log these instances (when something drifted and was brought back to its original state by the LCM) in a database table, so that we can later generate a report showing the history of what happened, when, and what was auto-corrected by DSC. For that, we need to get hold of an event that lets us trigger this code before the LCM actually acts.

I am just trying to find out whether this is technically possible, maybe by adding some code to the PowerShell that defines the Configuration (just a thought). I don’t yet know if it can be done.

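Technically possible? Yes, I think so.
Easy? No.
How? By enabling logs and tracing and checking whether a Set was run rather than skipped (the LCM skips Set when Test reports the resource is already in the desired state, so a Set that ran means drift was corrected).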

More specifically, see the “Troubleshooting DSC” page on Microsoft Docs.
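
For the reporting side only, a minimal sketch of recording corrections and counting them per server. It assumes you have already pulled out of the DSC logs which resource on which server had its Set run; that extraction is not shown here. The table name, column names and function names are all made up for the example, and SQLite is used only to keep it self-contained.

```python
import sqlite3
from datetime import datetime, timezone


def open_drift_log(path=":memory:"):
    """Open (or create) the hypothetical drift log database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS drift_events ("
        " id INTEGER PRIMARY KEY,"
        " server TEXT NOT NULL,"
        " resource TEXT NOT NULL,"
        " corrected_at TEXT NOT NULL)"
    )
    return conn


def record_drift(conn, server, resource, corrected_at=None):
    """Store one auto-corrected drift; timestamp defaults to now (UTC)."""
    when = corrected_at or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO drift_events (server, resource, corrected_at)"
        " VALUES (?, ?, ?)",
        (server, resource, when),
    )
    conn.commit()


def drift_counts(conn):
    """Return {server: number of recorded corrections}."""
    rows = conn.execute(
        "SELECT server, COUNT(*) FROM drift_events"
        " GROUP BY server ORDER BY server"
    )
    return dict(rows.fetchall())


if __name__ == "__main__":
    conn = open_drift_log()
    # Illustrative values only; real entries would come from the DSC logs.
    record_drift(conn, "SERVER1", "[Service]TapiSrv")
    record_drift(conn, "SERVER1", "[Service]TapiSrv")
    print(drift_counts(conn))
```

Note that this records the correction after the fact, which is what the logs allow; it does not run before the LCM acts.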

Be careful about retention and performance when enabling all those logs.
I don’t think that should be necessary if you have the right management model, because no change should be possible outside of the pipeline (except in a break-glass situation, which should be recorded anyway, since someone has to request the break-glass credential).