Manipulating Text or HTML code

by grantc at 2012-10-24 21:30:34

Hi there,

I’m having problems trying to capture some data and perform another function.

Basically I'm grabbing an HTML file with output that looks like this…

<TD ALIGN=CENTER>
<A HREF="/cgi-bin/bb-hostsvc.sh?HOSTSVC=p00010.conn">
<IMG SRC="/bb/gifs/green.gif" ALT="conn:green"
HEIGHT="16" WIDTH="16" BORDER=0></A></TD>
<TD ALIGN=CENTER>
<A HREF="/cgi-bin/bb-hostsvc.sh?HOSTSVC=p00010.cpu">
<IMG SRC="/bb/gifs/green.gif" ALT="cpu:green"
HEIGHT="16" WIDTH="16" BORDER=0></A></TD>
<TD ALIGN=CENTER>
<A HREF="/cgi-bin/bb-hostsvc.sh?HOSTSVC=p00010.disk">
<IMG SRC="/bb/gifs/yellow.gif" ALT="disk:yellow"
HEIGHT="16" WIDTH="16" BORDER=0></A></TD>
<TD ALIGN=CENTER>-</TD>
<TD ALIGN=CENTER>
<A HREF="/cgi-bin/bb-hostsvc.sh?HOSTSVC=p00010.msgs">
<IMG SRC="/bb/gifs/green.gif" ALT="msgs:green"
HEIGHT="16" WIDTH="16" BORDER=0></A></TD>
<TD ALIGN=CENTER>
<A HREF="/cgi-bin/bb-hostsvc.sh?HOSTSVC=p00010.procs">
<IMG SRC="/bb/gifs/green.gif" ALT="procs:green"
HEIGHT="16" WIDTH="16" BORDER=0></A></TD>
<TD ALIGN=CENTER>
<A HREF="/cgi-bin/bb-hostsvc.sh?HOSTSVC=p00010.svcs">
<IMG SRC="/bb/gifs/green.gif" ALT="svcs:green"
HEIGHT="16" WIDTH="16" BORDER=0></A></TD>


I’m basically trying to grab the "yellow.gif" line and also the line above with the "A HREF" code.

PS$> Select-String .\webpage.txt -pattern "yellow.gif"

This will obviously show me the yellow.gif lines, but I also want to grab the line above yellow.gif. I've tried to work with $_.LineNumber, but I don't believe Select-String will allow me to choose a $_.LineNumber and output it.

Hope that makes sense…

Can anyone assist please?
by DonJ at 2012-10-24 21:50:10
I’d suggest reading the file line by line into an array. That way, once you know what line has your text, you can easily grab the preceding one.
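
As a rough sketch of the array approach described above: the sample lines below are copied from the question and stand in for the file's contents, and the variable names ($lines, $found) are made up for illustration. With the real file you would presumably load the array with Get-Content instead.

```powershell
# Sample lines standing in for the contents of .\webpage.txt.
# Assumption: with the real file you would use
#   $lines = Get-Content .\webpage.txt
$lines = @(
    '<TD ALIGN=CENTER>',
    '<A HREF="/cgi-bin/bb-hostsvc.sh?HOSTSVC=p00010.disk">',
    '<IMG SRC="/bb/gifs/yellow.gif" ALT="disk:yellow"',
    'HEIGHT="16" WIDTH="16" BORDER=0></A></TD>'
)

# Walk the array by index so the preceding line is always reachable.
$found = @(
    for ($i = 0; $i -lt $lines.Count; $i++) {
        if ($lines[$i] -like '*yellow.gif*') {
            if ($i -gt 0) { $lines[$i - 1] }   # the A HREF line above the match
            $lines[$i]                         # the matching line itself
        }
    }
)

$found
```

Select-String also has a -Context parameter that may be enough on its own: `Select-String .\webpage.txt -Pattern 'yellow\.gif' -Context 1,0` should return each match together with the one line before it, and the preceding line is available on the match object as `$_.Context.PreContext`. Note that -Pattern is a regular expression, so the dot is escaped here; -SimpleMatch is the alternative.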

Alternately, if the HTML is well-formed, you can treat it as XML, which gives you much stronger manipulation capabilities. I did that in the HTML Reports ebook I just released (www.powershellbooks.com), where I used it to find TR tags within a table. There might be an idea there you could use.
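
To illustrate the XML idea, here is a minimal sketch. It assumes a well-formed version of the markup (quoted attributes, self-closed IMG tags), which the sample in the question is not, so the fragment below is a hand-cleaned, hypothetical version of two of its cells. The variable names are made up for illustration.

```powershell
# Hypothetical well-formed (XHTML-style) version of two cells from the question.
# The raw sample would not parse as XML: it has unquoted attributes and
# unclosed IMG tags.
$xml = [xml]@'
<tr>
<td align="center"><a href="/cgi-bin/bb-hostsvc.sh?HOSTSVC=p00010.cpu"><img src="/bb/gifs/green.gif" alt="cpu:green" /></a></td>
<td align="center"><a href="/cgi-bin/bb-hostsvc.sh?HOSTSVC=p00010.disk"><img src="/bb/gifs/yellow.gif" alt="disk:yellow" /></a></td>
</tr>
'@

# XPath: every <a> element that contains an <img> whose src mentions yellow.gif.
$yellowLinks = $xml.SelectNodes("//a[img[contains(@src, 'yellow.gif')]]")

# Collect the href of each matching link.
$hrefs = @($yellowLinks | ForEach-Object { $_.GetAttribute('href') })
$hrefs
```

The appeal of this route is that the link and its image are matched as one parent/child pair, so it does not matter how the tags happen to be split across lines.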

Finally, where does the HTML come from? In v3, Invoke-WebRequest can return a very rich, manipulation-friendly object. It can load a file or a URL.