I’m a keen to learn novice at this so I wonder if someone could help. I have a number of .xml files that I wish to extract certain variables from in order to identify the ones that need updating.
A previous colleague had written the formula below to extract equation values so I thought I’d edit this one to extract sub batch types and associated variables but it doesn’t work. Bizarrely the old one also doesn’t work anymore. What am I missing if I wish to extract subbatchtype name, labeltemplate and layouttemplate for each of the worksheets I have?
Even formatted ‘as code’, that code IMO is not easy to follow as written. It really would benefit from a small refactor/cleanup.
What that code is doing is grabbing the XML and turning into a PowerShell Object. The ‘nodes’ essentially become properties. You can access the ‘root’ property by using the ‘dot’, then typing it, in this case worksheet. If there’s a sub-node under worksheet, you access it by doing the dot once more, so, for example basicData. In some examples I’m not seeing the property in the file (for example, where is code or WorksheetVersion?)
As mentioned in two, each of those nodes is turned into a hierarchy of properties. To get the data, you’d need to get the XML file like you’re doing in your code then drill down. That said I don’t deal with XML enough to know how it handles those nodes with the same name so you might just need to look at the data manually in PowerShell to see what it looks like. For Example, if you have the XML content in a variable named X you should be able to get to the subBatchTypes node like this:
$x.worksheet.basicData.subBatchTypes
Then you can see what data is being displayed to see how you can access it.
Thanks dotnVo
The first piece relating to the basicdata “parameters” works fine. And they iterate for each of the 300 xml files I have as they are single line entries. The problem lies when I try to get it to ‘loop’ through a number of parameters such as subBatchTypes. It almost seems to me that I’m missing an additional command of the Foreach type to make it ‘loop’ through all the subBatchType entries before it moves on to the next .xml file.
Sorry for the odd terminology. Like I say I’m a novice with this.
In XML those are ‘nodes’ I believe, when translated into a PS object, they are properties, my previous posts has some suggestions to try and play with it. I personally would just import a single document as an XML doc, and play around with by accessing the properties, etc.
You might need to do some looping depending on what you or you might actually be able to access the ‘array’ of the sub objects by using the method I previously posted. I can’t tell you for sure, I don’t have your data set, and I haven’t worked a ton with XML and PowerShell. For me to tell you me concretely, I’d need a copy of one of the XML files so I can see what it looks like (not a screenshot, the actual file so I can pull it into PS). I’d be doing exactly what I suggested in my previous post, just toying around w/ the object to see what data is exposed and how it looks.
This worked pretty well, but does include all the properties, If you only wanted some of the properties of subbatch type i’d just pipe that ( | ) to Select-Object and type in the property names you want.
I know you’re writing more to a file than that, is it in a particular format or are you just trying to get the data into a file? I sorta re-wrote the code a bit you shared, because I noticed it didn’t seem to be correct, there were brackets and missing double quotes for the write-output:
HSS-02-QAS
12
Hydrocortisone in Sodium Chloride 0.9% w/v (Manual) – I.V./I.M.
X R
0 50
ml mg/ml
Is that what is expected, outside the additional data? If so I think that should work you just need to update the file path of all ur xml docs (my code uses my local test path), add in the bit about subbatch types with a select-object to grab only the properties you want, and update the Out-File path. Be sure to reference the same variable name (I used $Data in one example and $x in your slight code re-factor). Note I also changed to Out-File , as I rarely use > or >>. Lastly because I just was working with a single example I didn’t need the -Append on the Out-File but I went ahead and added it, in case you did not to append the data to the existing file instead of overwriting it.
This is brilliant. Thanks very much. In an ideal world having the outputs X, R etc with their associated values and units on one line would be fantastic eg
X 0 ml
R 50 mg/ml
but I bet this is difficult and time consuming to code.
I assume that the code as above literally outputs all the individual variables (name, value, unit) without having to do the first one and then the next one but my looping idea would be needed to do the X 0 ml thing until it ran out of variables and then move on to the next xml file.
In normal usage its just about extracting the data into a new file and then comparing the contents of this file with an older one to flag up what has changed. Its only something we do intermittently usually this time of the year to produce annual report but occasional if something goes wrong and we want to pin point a potential failure point.
With PS you can pretty much do anything you just have to toy with it some . That’s one of the great things about PowerShell, especially with work like this, as its really powerful for easy or ‘quick’ data parsing, IMO.
I think the best thing you can do further toy with the output is pull in one xml file, and explore it with powershell:
[xml]$Data = Get-Content -Path '.\23451.XML'
This will load a variable in your PS named $Data (again, update the path to your actual xml file). Then just type $x.worksheet it’ll output something like:
Then drill down further. $x.worksheet.equations and you’ll get:
Then continue going down that path… if you want fixed you add .fixed
That let’s you explore the data at each node and get a feel for what ‘PowerShell’ sees.
You talked about wanting the value/units on one line and you can do that, but you just need to manipulate the data further. at the ‘fixed’ property level, there’s X and R. Because of that you need to loop through it and output a one line string instead:
You can add separators as well if you want to do that. or maybe you don’t, but this is where playing around with the data will help you understand what’s happening under the hood, but how to manipulate it for your business needs.
I think between the last post and this post that probably gives you enough code to play around with.