Weird XML

I am trying to edit an XML file that I believe is not true XML.

I have used PowerShell in the past to edit XML files, delete nodes, etc., but this one has me a bit confused.

Here is a little of what the XML file consists of…

I want to delete any entry whose description contains 8.1.1.1.

If you could give me some direction I could run with, I would greatly appreciate it.

So, we don’t use the [code] tag; it’s an HTML PRE tag. But, XML in particular - because browsers see it as HTML - doesn’t render well in the forums. Make a Gist out of your XML, and then paste the Gist URL here, and we’ll display it properly. There’s a link to Gist right above the reply textbox, and it just uses your GitHub account.

Thanks for the heads up, Don. I fixed it.

So, that is “valid” XML, but it isn’t especially well designed. All PowerShell is going to let you do easily is get to the custom_item elements. From there, it’s just going to see the contents as one undifferentiated string. That probably means using the -replace operator to replace the text with an empty string. Honestly, I’m not even sure it’s worth treating this as XML, given what you want to do. It might be easier to use Get-Content to read it line by line, replace what you want, and then output it to a new file.
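
To illustrate the point about undifferentiated strings, here is a minimal sketch. The sample document is an assumption: it is reconstructed from the element names (check_type, custom_item) and the field layout that appear later in this thread, not taken from the original file.

```powershell
# Hypothetical sample, reconstructed from names used later in the thread.
[xml]$xml = @'
<check_type>
  <custom_item>
    system          : "Linux"
    type            : "Check"
    description     : "8.1.1.1 this is other text"
  </custom_item>
  <custom_item>
    system          : "Linux"
    type            : "Check"
    description     : "8.1.1.2 this is other text"
  </custom_item>
</check_type>
'@

# Each custom_item comes back as one multi-line string, not as separate
# system/type/description properties.
$items = @($xml.check_type.custom_item)
$items[0].GetType().Name

# So any filtering has to use string operators such as -match.
$items | Where-Object { $_ -match 'description\s*:\s*"8\.1\.1\.1\b' }
```

Because there is no description property to test, the pattern has to be matched against the whole text block.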

If I read it line by line, how would I delete the entire “Custom_Item” record that contains 8.1.1.1 in the description? That seems kind of hard to do.

Well, it’s not a “record,” it’s a line of text, right? So, while this isn’t the exact code you’ll need, it’s maybe the basic logic.

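Get-Content input.xml |
ForEach-Object {
  if ($_ -like '*Custom_Item*') {
    # Matching line: emit it with the search text replaced
    $_ -replace 'Custom_Item','Noncustom_Item'
  } else {
    # Non-matching line: pass it through unchanged
    $_
  }
} | Out-File output.xml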

So, read each line. For each line, see if the line contains the search text. If it does, replace the search text with something else (maybe you’d use a blank string?). Let the output fall through to a new file.

Sorry if I am not explaining this right, but if you look back at the XML, it has:

Custom_item (in xml form)
system : linux
description : 8.1.1.1
(close)Custom_item (in xml form)
Custom_item (in xml form)
system : linux
description : 8.1.1.2
(close)Custom_item (in xml form)
Custom_item (in xml form)
system : linux
description : 8.1.1.1
(close)Custom_item (in xml form)

After running the script, it should look like:

Custom_item (in xml form)
system : linux
description : 8.1.1.2
(close)Custom_item (in xml form)

The entire record was deleted. I don’t want to delete just “custom_item”, nor just “description”. If the description contains 8.1.1.1, the entire entry should be deleted, including custom_item, description, system, etc.

Oh. No, I wasn’t getting that. Yeah, that’s harder. Honestly, I’d basically have to write a parser for you that read in the file, looked for section beginnings, stashed the whole section in a variable, analyzed it, and output it if it wasn’t to be deleted. Not technically difficult, but it’s not something I’m going to be able to do for you right here. Sorry :(.

Yeah… seems really hard… :(

Well, “hard” depends on your experience with programming. I don’t think it’s technically hard at all; it’s just not something I can hack out for you in 1 minute.

I’ll look it up. I have decent experience; this is just nothing I’ve done before.

So think of it this way.

Read through the content line by line.

When you see a start-of-section line, you start adding all subsequent lines to a variable, until you see an end-of-section line. When you see the end-of-section line, evaluate the contents of the variable, decide if you’re going to output it or not, clear the variable, and keep going.

You can’t do it easily as a one-liner, because you’ve got to have some variables to track what you’re doing. Like, “currently accumulating lines and waiting for end-of-section tag.”
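
The steps above could be sketched roughly like this. It is only an illustration under stated assumptions: that each section opens on a line containing <custom_item> and closes on a line containing </custom_item>, and that the description line looks like the ones quoted in this thread. The function name and its parameters are made up for the example.

```powershell
# Sketch of the accumulate-and-decide loop. Function name, parameter names,
# and the default patterns are assumptions, not part of the original file.
function Remove-MatchingSection {
    param(
        [string[]]$Line,
        [string]$StartPattern  = '<custom_item>',
        [string]$EndPattern    = '</custom_item>',
        [string]$RemovePattern = 'description\s*:\s*"?8\.1\.1\.1\b'
    )

    $inSection = $false   # "currently accumulating lines" flag
    $buffer = New-Object System.Collections.Generic.List[string]

    foreach ($current in $Line) {
        if (-not $inSection -and $current -match $StartPattern) {
            $inSection = $true
        }

        if ($inSection) {
            $buffer.Add($current)
            if ($current -match $EndPattern) {
                # End of section: output it only if it should be kept
                $text = $buffer -join "`n"
                if ($text -notmatch $RemovePattern) {
                    $buffer.ToArray()
                }
                $buffer.Clear()
                $inSection = $false
            }
        }
        else {
            # Lines outside any section pass straight through
            $current
        }
    }

    # Flush an unterminated section rather than silently dropping it
    if ($buffer.Count -gt 0) { $buffer.ToArray() }
}
```

It might be used as Remove-MatchingSection -Line (Get-Content input.xml) | Out-File output.xml, writing to a new file so the original stays untouched.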

Here is a quick way to remove the entries:

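[xml]$xml = Get-Content d:\test\test.txt
# Copy the child list first so nodes can be removed while looping
foreach ($child in @($xml.check_type.ChildNodes)) {
    # Remove the whole custom_item element, not just its contents
    if ($child.'#text' -match 'description\s*:\s*"8\.1\.1\.1\b') {
        [void]$child.ParentNode.RemoveChild($child)
    }
}
$xml.check_type.ChildNodes.'#text'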

I appreciate your attempt to help, but that is not what I am looking for.

If the description contains 8.1.1.1,
it would need to delete the ENTIRE custom_item, including system, type, description…

Also, I have researched parsing, and what I found does not match what I would like to do either. Do you know of any tutorials?

With the XML you provided, the system, type, and description are a single text item contained in each custom_item, and the code does remove each one if the text ‘description : "8.1.1.1’ is found. What the sample code does not do is write the results back to the file; it only changes the variable in memory. Here is a more complete example.
I do not know of any specific tutorials to point you to for parsing, but Don laid out the steps you would have to take.

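[xml]$xml = Get-Content d:\test\test.txt
Write-Host "Before"
$xml.check_type.ChildNodes.'#text'
Write-Host "-----------------------------------------------"
# Copy the child list first so nodes can be removed while looping
foreach ($child in @($xml.check_type.ChildNodes)) {
    # Remove the whole custom_item element, not just its contents
    if ($child.'#text' -match 'description\s*:\s*"8\.1\.1\.1\b') {
        [void]$child.ParentNode.RemoveChild($child)
    }
}
Write-Host "After"
$xml.check_type.ChildNodes.'#text'
Write-Host "-----------------------------------------------"
# Save to a new file, then reload it to confirm the change was written
$xml.Save("d:\test\test2.txt")
[xml]$xml2 = Get-Content d:\test\test2.txt
Write-Host "From new file"
$xml2.check_type.ChildNodes.'#text'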




Before

    system          : "Linux"
    type            : "Check"
    description     : "8.1.1.1 this is other text"
  

    system          : "Linux"
    type            : "Check"
    description     : "8.1.1.1 more text"
  

    system          : "Linux"
    type            : "Check"
    description     : "8.1.1.2 this is other text"
  
-----------------------------------------------
After

    system          : "Linux"
    type            : "Check"
    description     : "8.1.1.2 this is other text"
  
-----------------------------------------------
From new file

    system          : "Linux"
    type            : "Check"
    description     : "8.1.1.2 this is other text"

I’m not too up to speed on what XML should look like, but as an alternative to the brute-force string manipulation, maybe you could have a script that cleans up the junk XML into clean, proper XML, and then deal with it that way. It might or might not be any easier; just an idea.
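
One way the clean-up idea might look, as a rough sketch only: split each custom_item text block into name/value lines and turn each pair into a real child element. It assumes a check_type root with custom_item children and one name : "value" pair per line, as in the output quoted earlier in the thread. The function name and the inline sample are hypothetical.

```powershell
# Sketch: convert each custom_item text blob into child elements, in place.
# Assumes one 'name : "value"' pair per line; the function name is made up.
function Convert-CustomItemToElement {
    param([System.Xml.XmlDocument]$Document)

    foreach ($item in @($Document.check_type.ChildNodes)) {
        if ($item.NodeType -ne [System.Xml.XmlNodeType]::Element) { continue }
        $lines = $item.InnerText -split "\r?\n"
        $item.RemoveAll()   # drop the old text blob
        foreach ($line in $lines) {
            if ($line -match '^\s*([A-Za-z_]\w*)\s*:\s*"?(.*?)"?\s*$') {
                $field = $Document.CreateElement($Matches[1])
                $field.InnerText = $Matches[2]
                [void]$item.AppendChild($field)
            }
        }
    }
}

# Hypothetical sample in the shape discussed in this thread.
[xml]$doc = @'
<check_type>
  <custom_item>
    system          : "Linux"
    description     : "8.1.1.1 this is other text"
  </custom_item>
  <custom_item>
    system          : "Linux"
    description     : "8.1.1.2 this is other text"
  </custom_item>
</check_type>
'@
Convert-CustomItemToElement -Document $doc

# With real child elements, the filter becomes an ordinary property test.
foreach ($item in @($doc.check_type.custom_item)) {
    if ($item.description -match '^8\.1\.1\.1\b') {
        [void]$doc.check_type.RemoveChild($item)
    }
}
$doc.OuterXml
```

Whether this is easier than the string approach depends on how regular the real file is; any line that does not fit the name : "value" shape is dropped by this sketch.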