Which just returns false. I imagine it’s only returning one line and not reading until EOF. How can I get it to read everything between [table…[/table]?
-match is supposed to return True/False, but it also creates the $matches collection, which is what you’d look at to see what it matched. Whether it matches the first instance or continues to look for additional instances depends on whether your regular expression was written to do that. And honestly, for this purpose, you might find Select-String to be a bit more useful than -match.
But to go further, -match is only designed to tell you if it found a match or not. If you want to capture what it matched, you need to write a capturing (group) subexpression in your regex. That will populate $matches with what it captured. You can even give your capture group a name in your regex, and $matches will use that name, making it easier to reference what it found.
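Here's a minimal sketch of what a capture group and a named capture group look like with -match. The sample string and the group name "border" are made up for illustration; they aren't from your HTML.

```powershell
# Hypothetical sample text (an assumption, not the poster's real data)
$text = 'width="900" border="1"'

# Unnamed capture group: the whole match is $matches[0],
# the first parenthesized group is $matches[1]
if ($text -match 'width="(\d+)"') {
    $whole = $matches[0]
    $width = $matches[1]
}

# Named capture group: $matches uses the name as the key
if ($text -match 'border="(?<border>\d+)"') {
    $border = $matches['border']
}

$whole
$width
$border
```

Note the single-quoted patterns, so the double quotes inside don't need escaping.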
Well, a couple of things. -SimpleMatch isn’t a regular expression; it’s just a literal string match. And, by default, letting you know you have a match is all the cmdlet is supposed to do.
Also, if you delimit your pattern in single quotes, you can use double quotes within and not have to escape them ;).
You should also know a bit about how regular expressions and patterns work. They’re fairly literal - meaning if the attributes in that TABLE tag are in a different order, it won’t match them. I’m assuming you already thought of that, and that the HTML you’re using is consistent. But a -SimpleMatch isn’t intended to capture anything. As I wrote earlier, you need a capturing subexpression in a regex.
That means using -Pattern to specify your pattern. And, instead of ".*" to match the inside of the TABLE, you’re probably going to want to use something like (.+). Keep in mind that . only matches a single character; .+ means match one or more. The (parentheses) create a capturing subexpression. However, that example is a greedy subexpression. That means, if your HTML contains more than one TABLE, it’ll match from the beginning of the first one to the end of the last one, and everything in between. I’m not sure what your HTML looks like, or what your goal is, but you may need to modify it to be a non-greedy subexpression.
You probably want to use the -AllMatches switch, also.
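As a rough sketch of greedy vs. non-greedy with Select-String and -AllMatches: the one-line input below is invented for the example, and it uses the same [table] bracket notation as the rest of this thread rather than real angle brackets.

```powershell
# Hypothetical single-line input containing two tables (an assumption)
$html = '[table]first[/table] filler [table]second[/table]'

# Greedy: (.+) runs from the first [table] to the LAST [/table]
$greedy = ($html | Select-String -Pattern '\[table\](.+)\[/table\]' -AllMatches).Matches

# Non-greedy: (.+?) stops at the NEAREST [/table], so each table matches separately
$lazy = ($html | Select-String -Pattern '\[table\](.+?)\[/table\]' -AllMatches).Matches

# One match that swallows both tables and the filler between them
$greedy | ForEach-Object { $_.Groups[1].Value }

# Two matches, one per table
$lazy | ForEach-Object { $_.Groups[1].Value }
```

Groups[1] is the capturing subexpression; Groups[0] would be the whole match including the tags.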
What you’re trying to do is certainly straightforward, I think, but regular expressions aren’t as straightforward as I wish they were ;). It’d be worth some time to read up on capturing subexpressions and greedy vs. non-greedy subexpressions, so you can figure out what the right technique is to meet your goal.
But I can’t get it to return everything until it hits [/table].
But I’ve wasted more than enough of your time, and I’ll do some more research on my own. I’m sure experienced users are saying ‘HE TOLD YOU WHAT TO DO ALREADY!!’
The entire HTML will actually be the body of an email that was retrieved through PowerShell; it never makes it to a file. And I’m not sure if it always starts on 747, but the table header should be unique.
If $body is the entire HTML, and I do a
$body -match '.*' it only matches the first line. How would I make it match the entire string?
Hey @aaron-miller, this is the deal. Based on the description of your results, it appears that $body is of type System.String[] rather than System.String. Meaning it is an array of strings, not a single string. RegEx does not process against an array like it would a string. You have two options here.
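To see the difference for yourself, here's a small sketch with made-up data (the three-element array is just an illustration). Against an array, -match acts as a filter and hands back the matching elements instead of True/False.

```powershell
# Hypothetical data: an array of strings, i.e. System.String[]
$lines = 'one', 'two', 'three'
$typeName = $lines.GetType().FullName

# Against an array, -match returns the elements that match, not a Boolean
$filtered = $lines -match 'o'

# Against a single string, -match returns True/False and fills $matches
$single = $lines -join "`n"
$isMatch = $single -match 'tw.'
$found = $matches[0]

$typeName
$filtered
$isMatch
$found
```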
Note: Below is tested using provided sample input:
$body = @'
[html]
whatever
whatever
whatever
[table class="MsoNormalTable" border="1" cellspacing="0" cellpadding="0" width="900" style="width:675.0pt;border:solid black 1.0pt"]
random text 1
[/table]
whatever
[/html]
'@ -split "`n"
If you don’t care about the content being on separate lines, make the body a single string using -join
[table class="MsoNormalTable" border="1" cellspacing="0" cellpadding="0" width="900" style="width:675.0pt;border:solid black 1.0pt"]
random text 1
[/table]
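One way the -join approach could look, as a sketch rather than the exact code that was run: it uses a shortened version of the sample input, joins with a newline (my choice, so the line breaks survive), and relies on the (?s) option so that . also matches newlines. The variable names $joined and $table are just for illustration.

```powershell
# Shortened version of the sample input, split into an array of lines
$body = @'
[html]
whatever
[table class="MsoNormalTable" border="1"]
random text 1
[/table]
whatever
[/html]
'@ -split '\r?\n'

# Collapse the array back into ONE string so the regex sees all of it
$joined = $body -join "`n"

# (?s) lets . match newlines; .*? is non-greedy, so it stops at the first [/table]
if ($joined -match '(?s)\[table\b.*?\[/table\]') {
    $table = $matches[0]
}

$table
```

If the email can contain more than one table, [regex]::Matches($joined, $pattern) with the same pattern would return all of them instead of just the first.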