Extracting lines between two strings - multiple occurrences of strings

Hi,

I have a text file that contains text like the sample below: multiple occurrences of “<TestResult” and “</LineInfo>”, with lines between these tags.

<TestResult abc=10 abc=20 …
one
two
three
</LineInfo>

xyz
jkk
lkh

<TestResult abc=30 abc=40 …
four
five
six
</LineInfo>

kkk
ddd
bnm

<TestResult abc=30 abc=40 …
seven
eight
nine
</LineInfo>

Now I want to extract only the lines of this text file that fall between the strings “<TestResult” and “</LineInfo>”. I have written the script below, but it returns only the lines from the first match, not the lines from the other matches. Also, all the matched lines come out on a single line; the newlines between them are missing.

The current output I am getting is like below:

<TestResult abc=10 abc=20 … one two three </LineInfo>

The expected output is like below:

<TestResult abc=10 abc=20 …
one
two
three
</LineInfo>

<TestResult abc=30 abc=40 …
four
five
six
</LineInfo>

<TestResult abc=30 abc=40 …
seven
eight
nine
</LineInfo>

----------------------------- Script I Wrote --------------------------------------------

$importPath = "C:\content\inputfile.txt"
$pattern = "<UnitTestResult(.*?)</ErrorInfo>"

$string = Get-Content $importPath
$result = [regex]::match($string, $pattern).Groups[1].Value
$result | Out-File -FilePath "C:\content\outputfile.txt"


Please help

This is not a PowerShell-specific issue; it is a pure regex pattern-match syntax error.

As for this…

$pattern = ""

… this is an empty string, not a pattern to match

If you are after removing special characters, then you need to target those specifically.
Also, that is not the pattern, and you say your text is the…

"" and ""

That is two double quotes followed by a space on the left, and the reverse (a space followed by two double quotes) on the right. So your match must cover both.

# RegEx pattern to match:
# ""\s|\s""
($String = '"" and ""')
$String -replace '""\s|\s""'

# Results

"" and ""
and
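
Coming back to the original question (every block from <TestResult to </LineInfo>, not just the first one), here is a minimal sketch of one way it could be done. Two things in the posted script probably explain the symptoms. Get-Content without -Raw returns an array of lines, and PowerShell joins that array with spaces when it is passed to a method that expects a single string, which would account for the missing newlines. And [regex]::Match() returns only the first match, while [regex]::Matches() returns all of them. The sample text below is a shortened stand-in for the file in the question (the elided attributes are simply left out), the variable names $text, $blocks and $result are my own, and the tag names are the ones from the sample rather than the ones in the posted $pattern, so adjust them to the real file.

```powershell
# Hypothetical sample standing in for the input file. With a real file you
# would read it as ONE string instead (-Raw needs PowerShell 3.0 or later):
#   $text = Get-Content -Path 'C:\content\inputfile.txt' -Raw
$text = @'
<TestResult abc=10 abc=20
one
two
three
</LineInfo>

xyz
jkk

<TestResult abc=30 abc=40
four
five
six
</LineInfo>

kkk
'@

# (?s) lets '.' match newlines as well; '.*?' is lazy, so each match stops
# at the nearest closing tag instead of running on to the last one.
$pattern = '(?s)<TestResult.*?</LineInfo>'

# Matches() returns every occurrence; @() keeps the result an array even
# when there is only one match.
$blocks = @([regex]::Matches($text, $pattern) | ForEach-Object { $_.Value })

# Put a blank line between the blocks, as in the expected output.
$result = $blocks -join ([Environment]::NewLine * 2)
$result
```

To write the result to a file, piping it on as in the original script ($result | Out-File -FilePath "C:\content\outputfile.txt") should work. The pattern keeps the opening and closing tag lines because the expected output in the question includes them; if only the inner lines are wanted, a capture group around .*? and $_.Groups[1].Value would be the usual way.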