Regex Woes

I have a script that pulls information from an open source IPAM tool. The script, using the native API, returns an xml string like so:

<HostList>
   <Host>172.20.1.49,FH-INET-TK-FW-P,FH,,Default gateway for segment,</Host>
   <Host>172.20.3.1,FW1FHNGXINT,FH,Network - Firewall,Cluster IP. Default Gateway for segment
,</Host>
   <Host>172.24.56.1,FW1OKNGXINT,OK,Network - Firewall,Default Gateway for the
segment,Reserved (Network)</Host>
   <Host>172.25.80.1,Available,OK,,CAUTION!! This is NOT the default gateway. The real default gateway is 172.2
5.85.1,See Comment</Host>
   <Host>172.31.200.254,FW1OKNGXINT,OK,,Default Gateway for the segment,</Host>
</HostList>

This is all returned in a single string $bn. I am attempting to use regex to pull the various items out, with the intent to eventually be able to find the default gateway for a given network segment. This is what I am using currently for the regex:

   $hosts = [regex]::new("(?<=Host>)(.*)(?=</Host>)")
   $hosts.Matches($bn) |
      ForEach-Object {
         Write-Host "HostName: $(($_.Groups[1].Value).Split(",")[1])"
      }

The problem is that I am only getting part of the data back. I get three of the 5 records, but the others don’t show up for some reason. I tried changing the regex from greedy (.*) to non-greedy (.*?), but that is truly the limit of my regex skills (it still makes my eyes bleed). Just curious if anyone sees anything I am doing wrong in the regex before I think of another way to get the data, like casting to XML.

Your example data does not look like XML. I assume that’s because the forum does not display XML. :wink: If you would like to show XML data here, you could put it on GitHub and post it here as a GitHub Gist.
If you have XML data, you should treat it as such and not try to deal with it as plain text. Here is a short overview of how to work with XML: Mastering everyday XML tasks in PowerShell.
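
Since the question says the response already sits in a single string $bn, a minimal sketch of treating it as XML without touching a file could look like the following. The sample string is copied from the question; the property names Site, Type, Comment and Status are guesses, because the thread never names the columns.

```powershell
# Stand-in for the API response in $bn (sample copied from the question).
$bn = @'
<HostList>
   <Host>172.20.1.49,FH-INET-TK-FW-P,FH,,Default gateway for segment,</Host>
   <Host>172.20.3.1,FW1FHNGXINT,FH,Network - Firewall,Cluster IP. Default Gateway for segment
,</Host>
   <Host>172.24.56.1,FW1OKNGXINT,OK,Network - Firewall,Default Gateway for the
segment,Reserved (Network)</Host>
   <Host>172.25.80.1,Available,OK,,CAUTION!! This is NOT the default gateway. The real default gateway is 172.2
5.85.1,See Comment</Host>
   <Host>172.31.200.254,FW1OKNGXINT,OK,,Default Gateway for the segment,</Host>
</HostList>
'@

# Cast the string directly to an XmlDocument; nothing is written to disk.
$xml = [xml]$bn

$records = foreach ($node in $xml.SelectNodes('/HostList/Host')) {
    # InnerText is the comma-separated record; line breaks inside a field are left as they are.
    $fields = $node.InnerText.Split(',') | ForEach-Object { $_.Trim() }
    [PSCustomObject]@{
        IP       = $fields[0]
        HostName = $fields[1]
        Site     = $fields[2]   # hypothetical column name
        Type     = $fields[3]   # hypothetical column name
        Comment  = $fields[4]   # hypothetical column name
        Status   = $fields[5]   # hypothetical column name
    }
}

$records | Format-Table IP, HostName
```

Because the XML parser handles the element boundaries, the records that wrap across lines come back like any other; only the comma split is left to plain string handling.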

This forum can’t display XML unless it’s linked to a gist. Turning XML into an object is easy once you know how, even though there isn’t an “Import-Xml” cmdlet.

PS C:\users\js> [xml]$xml = get-content AppAssoc.xml
PS C:\users\js> $xml

xml                            DefaultAssociations
---                            -------------------
version="1.0" encoding="UTF-8" DefaultAssociations

Do a direct import of the XML file and use the properties it returns.

Using the Import-Clixml cmdlet

Import-Clixml -Path 'D:\Temp\SomeFileName.xml'

# This should give you ...

Name                           Value                                                                                                                       
----                           -----  
SomePropertyName               SomeValue or array of values

# Then just use the variable from the property shown

$SomePropertyName.something

Unless the XML was made with Export-Clixml, Import-Clixml won’t work with it.

PS /Users/js> import-clixml example.xml                                               
import-clixml : Element 'Objs' with namespace name 'http://schemas.microsoft.com/powershell/2004/04' was not found. Line 6, position 2.
At line:1 char:1
+ import-clixml example.xml
+ ~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : NotSpecified: (:) [Import-Clixml], XmlException
+ FullyQualifiedErrorId : System.Xml.XmlException,Microsoft.PowerShell.Commands.ImportClixmlCommand

Select-Xml, however, can get you the object too. The first positional argument is an XPath expression, and the Node property of the result holds the object.

$xml = (select-xml / file.xml).node
$xml

xml                             Task
---                             ----
version="1.0" encoding="UTF-16" Task

It just has to be a properly formatted XML file, no matter what was used to create it.
I deal with XML every day from many different tools, APIs, etc. This is not dependent on the Export-Clixml cmdlet.

Sorry for the late reply. The text is not from a file; it comes back as a return from a curl web request to the product’s API. If at all possible, I would like to avoid having to save it to a file and then read which is why I was trying the regex.
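
For a response that only exists as a string, one option that avoids the temporary file is the -Content parameter of Select-Xml, which accepts the XML text itself. The sketch below uses a shortened two-record sample from the question as a stand-in for the real API response.

```powershell
# Shortened stand-in for the API response (two records from the question, one of them wrapped).
$bn = @'
<HostList>
   <Host>172.20.1.49,FH-INET-TK-FW-P,FH,,Default gateway for segment,</Host>
   <Host>172.20.3.1,FW1FHNGXINT,FH,Network - Firewall,Cluster IP. Default Gateway for segment
,</Host>
</HostList>
'@

# -Content takes the XML as a string, so no file path is needed.
# Each result exposes the matched element through its Node property.
$hostNames = Select-Xml -Content $bn -XPath '/HostList/Host' |
    ForEach-Object { $_.Node.InnerText.Split(',')[1] }

$hostNames | ForEach-Object { "HostName: $_" }
```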

If it’s valid XML, you can still treat it as such. If you would like to go the regex route anyway, you could start with something like this:

$BN = @'
172.20.1.49,FH-INET-TK-FW-P,FH,,Default gateway for segment,
172.20.3.1,FW1FHNGXINT,FH,Network - Firewall,Cluster IP. Default Gateway for segment
,
172.24.56.1,FW1OKNGXINT,OK,Network - Firewall,Default Gateway for the
segment,Reserved (Network)
172.25.80.1,Available,OK,,CAUTION!! This is NOT the default gateway. The real default gateway is 172.2
5.85.1,See Comment
172.31.200.254,FW1OKNGXINT,OK,,Default Gateway for the segment,
'@

# One record per line: IP address, two name fields, an optional type field,
# and a comment ending in "Default gateway for [the] segment" (-match ignores case).
$pattern = '((?:\d{1,3}\.){3}\d{1,3}),([\w-]+),([\w-]+),([\s\w-]*),(.*Default gateway for (?:the )*segment)'

foreach ($line in $BN -split "`n") {
    # -match returns $true and fills $Matches only when the line fits the pattern.
    if ($line -match $pattern) {
        [PSCustomObject]@{
            IP      = $Matches[1]
            String1 = $Matches[2]
            String2 = $Matches[3]
            String3 = $Matches[4]
            String4 = $Matches[5]
        }
    }
}


… the output from the sample data you provided would be this:

IP      : 172.20.1.49
String1 : FH-INET-TK-FW-P
String2 : FH
String3 :
String4 : Default gateway for segment

IP      : 172.20.3.1
String1 : FW1FHNGXINT
String2 : FH
String3 : Network - Firewall
String4 : Cluster IP. Default Gateway for segment

IP      : 172.31.200.254
String1 : FW1OKNGXINT
String2 : OK
String3 :
String4 : Default Gateway for the segment
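
The line-by-line split only returns the three records that fit on a single line, which is the same symptom as in the original question. In .NET regular expressions, `.` does not match a line break by default, so a record wrapped across lines never matches. A hedged sketch of a pattern that does cross line breaks, run against a shortened three-record sample from the question, might be:

```powershell
# Shortened stand-in for the API response (three records from the question, two of them wrapped).
$bn = @'
<HostList>
   <Host>172.20.1.49,FH-INET-TK-FW-P,FH,,Default gateway for segment,</Host>
   <Host>172.24.56.1,FW1OKNGXINT,OK,Network - Firewall,Default Gateway for the
segment,Reserved (Network)</Host>
   <Host>172.25.80.1,Available,OK,,CAUTION!! This is NOT the default gateway. The real default gateway is 172.2
5.85.1,See Comment</Host>
</HostList>
'@

# (?s) lets '.' match line breaks; '.*?' is lazy, so each match stops at the first </Host>.
$pattern = [regex]::new('(?s)<Host>(.*?)</Host>')

$hostNames = foreach ($m in $pattern.Matches($bn)) {
    # Group 1 is the text between the tags; the second comma-separated field is the host name.
    $m.Groups[1].Value.Split(',')[1]
}

$hostNames | ForEach-Object { "HostName: $_" }
```

Matching the full `<Host>` tag rather than using a `(?<=Host>)` lookbehind matters once `.` can cross lines, because `Host>` also ends every closing `</Host>` tag and would let a match start there.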

Have you tried Invoke-RestMethod? I think it converts XML or JSON responses into objects automatically.
Invoke-RestMethod (Microsoft.PowerShell.Utility) - PowerShell | Microsoft Docs