Help with XML manipulation script.

by avelis26 at 2012-10-10 16:17:59

Here is what I’m trying to do:
Every day we have a report generated with XML and XSLT. This XML report contains all the different tests and whether they passed or failed. I want to write a script that will take the report and cut out all the successful tests, leaving me with a report that only contains failures. Please don’t suggest “why not just change the report generator”; I’ve already thought of this :P

Here is sample XML I’m trying to manipulate:


<?xml-stylesheet type="text/xsl" href="Result.xslt"?>
<opcr Version="14.0.5624.xxxx" Wiki="http://thisplace.aspx">
  <Settings>
    <LogFolder>\server\logfolder\environment</LogFolder>
    <RunEnvironments>Rendering, Search</RunEnvironments>
    <Build>\server\buildlocation</Build>
    <Environment>environment</Environment>
  </Settings>
  <Version Name="vCurrent">
    <Tests Environment="Rendering">
      <Test Name="CertificateVerification" Owner="person1" LogFile="Rendering\CertificateVerification\Result.xml">
        <Result Name="Certificate health check on thisplace.net:port" Owner="person1" Result="Pass" />
        <Result Name="Certificate health check on thisplace.net:port" Owner="person1" Result="Pass" />
        <Summary Result="Pass" TimeElapsed="1" PassRate="100.00" />
      </Test>
      <Test Name="ConnectionsVerification" Owner="person1" LogFile="Rendering\ConnectionsVerification\Result.xml">
        <Result Name="Farm#1 connection test 'Public thisplace.com API'" Owner="person1" Result="Pass" />
        <Result Name="Farm#1 connection test 'stuff'" Owner="person1" Result="Pass" />
        <Summary Result="Pass" TimeElapsed="15" PassRate="100.00" />
      </Test>
      <Test Name="ThisTestFails" Owner="person2" LogFile="Rendering\LiveIdSignInVerification\Result.xml">
        <Result Name="http://thisplace.net:port/ - someplace" Owner="person2" Result="Fail">System.Exception: Form not found on current page! at ErrorINFO</Result>
        <Summary Result="Fail" TimeElapsed="2" PassRate="0.00" />
      </Test>
      <Summary TimeElapsed="3" />
    </Tests>
  </Version>
  <Summary Result="Fail">
    <StartTime>9/10/2012 8:45:16 PM</StartTime>
    <EndTime>9/10/2012 8:46:23 PM</EndTime>
    <TimeElapsed>00:01:06.8156566</TimeElapsed>
    <PassRate>99.09</PassRate>
  </Summary>
</opcr>


I don’t need someone to write the full script for me. I have a good idea of how to do it … I just can’t seem to select only the tests that fail.
I’m using the following to load the XML into a variable:
$file = [xml](Get-Content '\server\folder\folder2\Result.xml')
Then I know I can do something like:
$file.opcr.GetElementsByTagName("Test")
And it shows me all the tests, but the results just say "Summary".

If anyone can get me pointed in the right direction I would be VERY appreciative.
by JeffH at 2012-10-10 18:47:23
Is this the complete XML file? If so you could try this:

$file.opcr.Summary.Result

Or something like this:

PS C:\> Select-Xml -Path c:\work\testresults.xml -XPath "/opcr/Summary" | select -ExpandProperty Node | where {$_.Result -eq 'fail'}
by avelis26 at 2012-10-11 12:45:01
Yes that is the entire XML … well minus some other tests but it is the complete XML structure.
I guess I need more help than I thought … I tried your command and it’s not quite giving me the output I want.
Basically all I want to do is input the XML file from above, strip out the settings and tests that passed, and output to an xml file.
Below is a sample of what I want the finished product to look like:
<?xml-stylesheet type="text/xsl" href="Result.xslt"?>
<opcr Version="14.0.5624.xxxx" Wiki="http://thisplace.aspx">
  <Version Name="vCurrent">
    <Tests Environment="Rendering">
      <Test Name="ThisTestFails" Owner="person2" LogFile="Rendering\LiveIdSignInVerification\Result.xml">
        <Result Name="http://thisplace.net:port/ - someplace" Owner="person2" Result="Fail">System.Exception: Form not found on current page! at ErrorINFO</Result>
        <Summary Result="Fail" TimeElapsed="2" PassRate="0.00" />
      </Test>
      <Summary TimeElapsed="3" />
    </Tests>
  </Version>
  <Summary Result="Fail">
    <StartTime>9/10/2012 8:45:16 PM</StartTime>
    <EndTime>9/10/2012 8:46:23 PM</EndTime>
    <TimeElapsed>00:01:06.8156566</TimeElapsed>
    <PassRate>99.09</PassRate>
  </Summary>
</opcr>

I’ve tried with the suggestion you gave me but it’s not giving me all the data…
When I run:
Select-Xml -Path $path -XPath $xpath | Select-Object -ExpandProperty Node
I get the following output:
Name : CertificateVerification
Owner : person1
LogFile : Rendering\CertificateVerification\Result.xml
Result : {Certificate health check on thisaddress.net:0000, Certificate health check on thisaddress.net:0000}
Summary : Summary


The summary part is where I’m hitting issues … shouldn’t it say “Summary : Result = Pass/Fail” ?

Thank you again in advance for your help !!!
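
A note on the "Summary : Summary" output: the Summary property holds the child element itself rather than a piece of text, so the default view seems to show only the element's name. The Result attribute sits one level further down. A minimal sketch, assuming a trimmed-down inline sample (the PassingTest name is made up for illustration and is not part of the real report):

```powershell
# Made-up inline sample that mirrors the report structure in this thread.
[xml]$xml = @'
<opcr>
  <Version Name="vCurrent">
    <Tests Environment="Rendering">
      <Test Name="PassingTest">
        <Summary Result="Pass" TimeElapsed="1" PassRate="100.00" />
      </Test>
      <Test Name="ThisTestFails">
        <Summary Result="Fail" TimeElapsed="2" PassRate="0.00" />
      </Test>
    </Tests>
  </Version>
</opcr>
'@

# Each Test exposes its Summary child as an XmlElement;
# the Result attribute is read from that element.
$failed = $xml.opcr.Version.Tests.Test | Where-Object { $_.Summary.Result -eq 'Fail' }

# Read the Name attribute of the matching Test element
$failedName = $failed.GetAttribute('Name')
$failedName
```

Filtering on $_.Summary.Result keeps the whole Test element in the pipeline, not just its Summary.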
by avelis26 at 2012-10-11 12:55:58
Ahhh… I think I made some progress…

This:

$xpath = 'opcr/Version/Tests/Test/Summary'
$path = '\server\folder\folder\folder\Result.xml'
Select-Xml -Path $path -XPath $xpath | Select-Object -ExpandProperty Node | Where-Object {$_.Result -like 'Fail'}

Gives me these results:

Result TimeElapsed PassRate
------ ----------- --------
Fail 2 0.00
Fail 0 50.00


Ok so now I have it only selecting the tests that fail… this is good … but I want the entire node to output to a new XML file … not just the result/timeelapsed/passrate
Any thoughts on the best way to do this ?
by JeffH at 2012-10-11 13:00:32
Well, the output object is a System.Xml.XmlElement, so you should be able to use it to construct a new XML file. But you want more than just the Summary node, don’t you?
by avelis26 at 2012-10-11 13:09:56
I want the entire “Test” element where the element node Test.Summary.Result ="Fail"
by JeffH at 2012-10-11 13:16:28
Hmmm…I’ll have to dig into this. We are reaching the limits of my XML experience but I’m always up for learning something new.
by avelis26 at 2012-10-11 13:20:50
If this was SQL I would be set … this would be easy … thanks Jeff, I really do appreciate it.
by JeffH at 2012-10-11 13:24:18
Getting there. First, here’s a better xpath filter example. Should only get the Fail entries.

select-xml -Path C:\work\testresults.xml -XPath "/opcr/Version/Tests/Test/Summary[@Result='Fail']"
by JeffH at 2012-10-11 13:27:45
closer…

$test = select-xml -Path C:\work\testresults.xml -XPath "/opcr/Version/Tests/Test/Summary[@Result='Fail']"
$test.node.ParentNode


Name : ThisTestFails
Owner : person2
LogFile : Rendering\LiveIdSignInVerification\Result.xml
Result : Result
Summary : Summary

Now all we need to do is take this node and build the XML document you want.
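
The same selection can probably be done in one step by moving the condition into a predicate on Test, so the query returns the Test elements directly instead of going through ParentNode. A sketch against a made-up inline sample (the PassingTest name is hypothetical):

```powershell
# Made-up inline sample that mirrors the report structure in this thread.
[xml]$xml = @'
<opcr>
  <Version Name="vCurrent">
    <Tests Environment="Rendering">
      <Test Name="PassingTest">
        <Summary Result="Pass" TimeElapsed="1" PassRate="100.00" />
      </Test>
      <Test Name="ThisTestFails">
        <Summary Result="Fail" TimeElapsed="2" PassRate="0.00" />
      </Test>
    </Tests>
  </Version>
</opcr>
'@

# The predicate [Summary/@Result='Fail'] keeps only Test elements
# whose Summary child carries Result="Fail".
$failedTests = @($xml.SelectNodes("/opcr/Version/Tests/Test[Summary/@Result='Fail']"))

foreach ($t in $failedTests) {
    # OuterXml is the full markup of the element, children included
    $t.OuterXml
}
```

OuterXml gives the complete element as a string, which is closer to what is wanted than the property view shown on screen.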
by trondhindenes at 2012-10-11 13:28:40
I would use .NET’s XML methods, like RemoveChild.
http://msdn.microsoft.com/en-us/library … child.aspx
by avelis26 at 2012-10-11 13:33:15
Thank you… you are on the right path for sure. Now we just need it to write out the entire element containing the node with result = fail.

I’ve been trying to use ConvertTo-Xml, but it throws errors, and Out-File just gives me what is on the screen … I know I’m missing something …
by JeffH at 2012-10-11 13:40:51
I am wondering if we can just remove the test nodes that pass. Would that suffice?
by avelis26 at 2012-10-11 13:45:48
I’m getting a little confused … do you mean Test nodes or Test elements?
If I’m using the correct terminology… this entire ELEMENT can be removed yes… that is what I’m trying to do, remove all “Test” elements/nodes where result = pass
<Test Name="ConnectionsVerification" Owner="person1" LogFile="Rendering\ConnectionsVerification\Result.xml">
  <Result Name="Farm#1 connection test 'Public thisplace.com API'" Owner="person1" Result="Pass" />
  <Result Name="Farm#1 connection test 'stuff'" Owner="person1" Result="Pass" />
  <Summary Result="Pass" TimeElapsed="15" PassRate="100.00" />
</Test>
by JeffH at 2012-10-11 13:48:00
I think he wants the entire Test removed.

This code removes tests that pass.

$file = "C:\work\testresults.xml"
[xml]$xml = Get-Content $file

#select passing tests
$nodes = $xml.SelectNodes("/opcr/Version/Tests/Test/Summary[@Result='Pass']")
foreach ($node in $nodes) {
    $testnode = $node.ParentNode
    #remove the testnode from its parent
    $testnode.ParentNode.RemoveChild($testnode)
}

#save the result
$xml.Save("c:\work\failed.xml")


This is what the final xml file looks like:


<?xml-stylesheet type="text/xsl" href="Result.xslt"?>
<opcr Version="14.0.5624.xxxx" Wiki="http://thisplace.aspx">
  <Settings>
    <LogFolder>\server\logfolder\environment</LogFolder>
    <RunEnvironments>Rendering, Search</RunEnvironments>
    <Build>\server\buildlocation</Build>
    <Environment>environment</Environment>
  </Settings>
  <Version Name="vCurrent">
    <Tests Environment="Rendering">
      <Test Name="ThisTestFails" Owner="person2" LogFile="Rendering\LiveIdSignInVerification\Result.xml">
        <Result Name="http://thisplace.net:port/ - someplace" Owner="person2" Result="Fail">System.Exception: Form not found on current page! at ErrorINFO</Result>
        <Summary Result="Fail" TimeElapsed="2" PassRate="0.00" />
      </Test>
      <Summary TimeElapsed="3" />
    </Tests>
  </Version>
  <Summary Result="Fail">
    <StartTime>9/10/2012 8:45:16 PM</StartTime>
    <EndTime>9/10/2012 8:46:23 PM</EndTime>
    <TimeElapsed>00:01:06.8156566</TimeElapsed>
    <PassRate>99.09</PassRate>
  </Summary>
</opcr>

How close are we?
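
The Settings element is still present in this output, while the requested result left it out. A possible variant that drops it too is sketched below. It is an illustration on a made-up inline sample, not the script used above: the PassingTest name and the temp-folder output path are assumptions. It also copies the matches into an array before removing anything and uses [void] to keep RemoveChild from echoing each removed node.

```powershell
# Made-up inline sample; in practice the document would be loaded from the report file.
[xml]$xml = @'
<opcr>
  <Settings>
    <Environment>environment</Environment>
  </Settings>
  <Version Name="vCurrent">
    <Tests Environment="Rendering">
      <Test Name="PassingTest">
        <Summary Result="Pass" TimeElapsed="1" PassRate="100.00" />
      </Test>
      <Test Name="ThisTestFails">
        <Summary Result="Fail" TimeElapsed="2" PassRate="0.00" />
      </Test>
    </Tests>
  </Version>
</opcr>
'@

# Snapshot the matching Test elements into an array before changing the document
$passing = @($xml.SelectNodes("/opcr/Version/Tests/Test[Summary/@Result='Pass']"))
foreach ($test in $passing) {
    # [void] discards the removed node that RemoveChild returns
    [void]$test.ParentNode.RemoveChild($test)
}

# Drop the Settings element as well, if it is present
$settings = $xml.SelectSingleNode('/opcr/Settings')
if ($null -ne $settings) {
    [void]$settings.ParentNode.RemoveChild($settings)
}

# Hypothetical output location in the temp folder
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'failed.xml'
$xml.Save($outPath)
```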
by avelis26 at 2012-10-11 14:09:07
That’s PERFECT thank you!!!

Questions … I want to learn from this and not just solve the issue:
1. $node never seems to be initialized, why is it not throwing an error?
oh … that is all lol
by JeffH at 2012-10-11 14:21:47
I am using a Foreach construct: Foreach something in a group of somethings, do something. In this structure you can define the variable for the ‘something’. I used $node

foreach ($node in $nodes) {
    $testnode = $node.ParentNode
    #remove the testnode from its parent
    $testnode.ParentNode.RemoveChild($testnode)
}

I could have used $foo

foreach ($foo in $nodes) {
    $testnode = $foo.ParentNode
    #remove the testnode from its parent
    $testnode.ParentNode.RemoveChild($testnode)
}

The alternative is to pipe the nodes to ForEach-Object; then $_ would stand in for each item in $nodes:

$nodes | foreach-object {
    $testnode = $_.ParentNode
    #remove the testnode from its parent
    $testnode.ParentNode.RemoveChild($testnode)
}

Both techniques do the same thing. It is a matter of style; some people don’t like $_. It can also depend on performance, how many objects you need to process, and what you want to do with them. In your situation, foreach () works just fine, I think.
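
A small side-by-side sketch of the two forms, using a made-up list of strings instead of XML nodes so it runs anywhere (the variable names are only for illustration):

```powershell
# Made-up sample data
$items = 'alpha', 'beta', 'gamma'

# foreach statement: the loop variable ($item) is declared by the statement itself,
# so it does not need to be initialized beforehand.
$viaStatement = foreach ($item in $items) { $item.ToUpper() }

# ForEach-Object cmdlet: the current pipeline object is available as $_
$viaPipeline = $items | ForEach-Object { $_.ToUpper() }

# Both collections hold the same values in the same order
($viaStatement -join ',') -eq ($viaPipeline -join ',')
```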
by avelis26 at 2012-10-11 15:07:11
haha actually now that I think about that, I knew that … brain fart on the foreach syntax … thanks again… big help