Capture text from .csv

by GradyGoose at 2013-04-01 18:27:37

The text below is found in one cell of .csv, just as you see it. It is a large .csv but this is fairly similar to the other cells I am dealing with. I have not been able to find the right combination of regular expressions to properly and fully capture each of the users and groups below. Thank you in advance for your help.


The following users are members of the ‘Administrators’ group :

- XY520L07\Administrator (User)
- Company\Domain Admins (Group)
- Company\xyz_hbnc_Helpdesk Administration (Group)
- Company\xyz_hbnc_Network Analysts (Group)
- Company\sccminstall (User)
- Company\hyt_TEMPLocalAdmin (Alias)
by Nobody at 2013-04-01 19:36:55
So you have a CSV with a single column for input? Can you give an example of the output you’re after? I’m also curious about what’s generating your input CSV, there may be an easier way to capture that data in useable format.
by GradyGoose at 2013-04-02 04:08:59
No there are many columns, I have captured the two columns I need into a variable. The other column is the computer name. The input .csv is being generated by Nessus. Thanks.
by nohandle at 2013-04-02 04:30:42
GG: I am still not sure what you are looking for. You want to capture cell that contains all of the strings (group names, users, aliases) you posted?
by GradyGoose at 2013-04-02 06:28:14
The text I posted above is in a single cell. There are many single cells this is just a representation. I want to capture, individually "Administrator" "Domain Admins" "xyz_hbnc_Helpdisk Administration" etc from each cell.
Does this help you understand what I am after? Thanks,
by nohandle at 2013-04-02 06:34:01
[quote="GradyGoose"]The text I posted above is in a single cell. There are many single cells this is just a representation. I want to capture, individually "Administrator" "Domain Admins" "xyz_hbnc_Helpdisk Administration" etc from each cell.
Does this help you understand what I am after? Thanks,[/quote]
Not really :frowning:
Is this what describes what you need?:
Take text from every cell. Split the text into lines. for each line output the text between "" and " ("
by mjolinor at 2013-04-02 07:10:29
Putting multi-line text into a csv column is something that usually seems like a good idea at the time, but turns out to be ugly to deal with later.

$cell = @'
XY520L07\Administrator (User)
Company\Domain Admins (Group)
Company\xyz_hbnc_Helpdesk Administration (Group)
Company\xyz_hbnc_Network Analysts (Group)
Company\sccminstall (User)
Company\hyt_TEMPLocalAdmin (Alias)
'@

$cell -split "n&quot; |<br>foreach {<br> if &#40;$_ -match &#39;&#40;.+&#41;\\&#40;.+&#41;\s\&#40;&#40;User|Group|Alias&#41;\&#41;&#39;&#41;<br> {<br> new-object psobject -Property @{<br> Domain = $matches[1]<br> Name = $matches[2]<br> Type = $matches[3]<br> }<br> }<br> }<br><br><br><br>Domain Name Type <br>------ ---- ---- <br>XY520L07 Administrator User <br>Company Domain Admins Group <br>Company xyz_hbnc_Helpdesk Administration Group <br>Company xyz_hbnc_Network Analysts Group <br>Company sccminstall User <br>Company hyt_TEMPLocalAdmin Alias</code></blockquote>by MasterOfTheHat at 2013-04-02 07:20:06<blockquote>I think he's saying that he has a CSV with multiple columns. One of those columns contains data like the list above. From that one column of data, he wants to extract the user/group name, which would be between '\' and &quot; (&quot;. <br><br>Try this:<br><code># $colNbr would be the 0-indexed column number for the column containing the data in your list<br>$colNbr = 0<br># matches to everything between the &quot;\&quot; and the &quot;&#40;&quot;<br>$regex = [regex] &quot;\\&#40;.+&#41;\&#40;&quot;<br><br>Import-Csv file.csv | ForEach-Object { <br> $names = $regex.match&#40;$_[$colNbr]&#41;;<br> $names.groups[1].Value<br>}</code></blockquote>by GradyGoose at 2013-04-02 09:29:29<blockquote>Yes and no, unfortunately the challenge is that the entire cell is read in as a single line. Other than that yes I want to capture the group and user names to compare against a source list. Basically Nessus is outputting this file that shows who is in the local admin group and I need to get that in a form to compare to a &quot;source&quot; list and I am looking to powershell to automate this. Sorry if I am not explaining well, does this help? Thanks,</blockquote>by GradyGoose at 2013-04-02 09:30:32<blockquote>Thanks to both of you, I will look at the regx you have posted and see if this does it. It will be tonight before I can do that.</blockquote>by GradyGoose at 2013-04-02 19:15:36<blockquote>The regx suggested by mjolinor worked. I have not had a chance to try MasterOfTheHat's suggestion but I will. I have a couple more pieces to put together for the entire solution, but I want to try to do so myself before asking for more help. Either way I will post and let you know. Thanks!</blockquote>by GradyGoose at 2013-04-04 13:02:22<blockquote>I thought I had posted Tuesday night but do not see it. Mjolinor’s logic and regx worked, thanks! I have not tried MasterOfTheHat’s suggestion yet but I will!<br>I am close but not quite there. Below is the code I have so far which modifies Mjolinor’s code slightly and I will explain what I have left to accomplish.<br>In the .csv I am importing there are only two columns I care about, the “host”column (contains computer name) and the “pluginoutput” column which contains the text I previously posted that Mjolinor’s regx helped with. I believe what I need to do is to get these two into a hash with the host as the key and the users and groups (pluginoutput) as an array of values. Then take a “source” list of what users and groups should be there (an array) and compare that array to each array value in the Hash and when there is a difference write out the key (host). <br>I have not got to the compare part yet. Below I am trying to get the hash table built from the .csv. I am getting an error about the if statement on line 8 – <br>The term 'if' is not recognized as the name of a cmdlet...<br>Have not been able to find a solution for that. <br>I have done some scripting in Perl but I do not really understand object oriented programming. Mjolinor, I changed your code slightly because I understand better how to work with a hash table rather than the psobject hash. <br>Thanks in advance for any help!<br><br>$hash=@{}<br>$counter=0<br>$userlist=import-csv localadmin.csv <br><br>foreach($_ in $userlist)<br>{<br> $counter++<br> $_.pluginoutput -split &quot;n" | if ($_ -match (.+)\(.+)\s((User|Group|Alias))')

{
$hash.add($.host,$matches[$counter])
}
}
else{}
by mjolinor at 2013-04-04 13:44:49
[quote]
I believe what I need to do is to get these two into a hash with the host as the key and the users and groups (pluginoutput) as an array of values.
[/quote]

If that’s what your after I think this might work:
$hash=@{}
import-csv localadmin.csv |
foreach{
$hash[$
.host] = @($.pluginoutput -split "n&quot;&#41;<br> }</code></blockquote>by GradyGoose at 2013-04-04 18:03:16<blockquote>I sorry I don't totally understand, where does the regex match part go that grabs the part of the new lines we need?</blockquote>by mjolinor at 2013-04-04 18:15:59<blockquote><code>I sorry I don&#39;t totally understand, where does the regex match part go that grabs the part of the new lines we need?</code><br><br>If we're going to separate the values, it needs to happen after the split. <br><br>I guess I don't quite understand what the array you want to assign as the value of that host name key in the hash table is supposed to look like. Right now we have a multi-line string. We can separate that into an array of lines. Are you also wanting to separate each line into an array of separate values, and produce a 2D array?</blockquote>by GradyGoose at 2013-04-05 07:34:34<blockquote>I am attaching a spreadsheet. One worksheet is labled &quot;Imported csv&quot; this is a representation of the file I am importing (of course it has many rows). I need to capture column E &quot;host&quot; and column L &quot;pluginoutput&quot;. For column L I only want the part after the \ and before the ( . I would like to get this csv into a hash where the hostname is the key and the other stuff would be an array of values:<br><br>sh435.company. com = (administrator, Domain Admins, xyz_hbnc_Helpdesk Administrator, xyz_hbnc_Network Analysts, sccminstall, hyt_TEMPLocalAdmin)<br><br>Then I would take a &quot;source&quot; list (i have included that in the worksheet labled &quot;Source List&quot;, and compare that to the array values of each key. Anytime there is a difference I would like to identify that key (write that key out to a text file or whatever way to identify).<br><br>Does this help? Thank you for your help.</blockquote>by mjolinor at 2013-04-05 07:41:24<blockquote>That helps, but I question the omission of the domain part of the identity. You'll have a list of user\group names but you won't know if they're local or domain identities. Some will be implicit in the name, like Domain Admins but you can't trust that will be reliable for all the entries.</blockquote>by MasterOfTheHat at 2013-04-05 08:24:13<blockquote>If the only output you really need is the hostname and user/group for the user/groups that aren't on your source list, I don't think there is any reason for a hash table...<br><br>The basic flow would be:<br>[list]<br>[*]import the Source List into an array[/*]<br>[*]import the csv containing the data[/*]<br>[*]for each row in the data csv<br>[list]<br>[*]for each of the user/group names in the PluginOutput column<br>[list]<br>[*]see if the SourceList array contains the user/group, and write it the hostname and user/group out to a custom object array for output[/*][/list][/*][/list][/*]<br>[*]spit out the custom object array so that you can display to console, export to file, etc[/*][/list]<br><br>Make sense? <br>[list=1][*]You've got the import-csv stuff figured out. [/*][*]You'll need to pipe the data CSV object to a ForEach-Object loop. [/*][*]In that loop, save the hostname to a variable and then use the regex stuff that mjolinor and I have given you to pull the user/group names from the PluginOutput column. [/*][*]Pipe that array to another ForEach-Object loop that uses the -contains operator to see if the source list array contains each of user/group names piped in. [/*][*]If it does, go to the next user/group name. If it doesn't, push the hostname from the original csv and the user/group name from the pipeline to a new custom object and save that object as a new element of an array. [/*][*]Once you've processed all of the user/group names, it should fall back to CSV ForEach-Object loop. [/*][*]Once the last CSV object has been processed, just send the custom object array to the console as output[/*][/list]<br><br>Let us know which parts you need help with and let me know if I've missed something.</blockquote>by GradyGoose at 2013-04-05 10:48:27<blockquote>Thanks MasterOfTheHat, I think I have the high level logic (accent on think). Where I am running into an issue at this point is poweshell is throwing an error on the &quot;if&quot; when I am trying to &quot;match&quot;. In the below code (which is not exactly correct at this point I know), I am getting an error on the &quot;if&quot; statement when trying to match. For some reason it doesnt like the pipe to an if inside of the foreach?? Since the pluginoutput cell is all one line I have to split into lines and then match out the part I need. I think I have to do this whether I am outputing to an array or hash. I was using a hash because I thought that was easier to keep up with both the host and the offending user or group. Thanks for the help.<br><br><code>$hash=@{}<br>$counter=0<br>$userlist=import-csv localadmin&#46;csv <br><br>foreach($_ in $userlist)<br>{<br> $counter++<br> $_&#46;pluginoutput -split &quot;n" | if ($ -match ‘(.+)\(.+)\s((User|Group|Alias))’)

{
$hash.add($.host,$matches[$counter])
}
else{}
}
by mjolinor at 2013-04-05 11:28:11
Does this work for you?

$hash=@{}
$userlist=import-csv localadmin.csv
$regex = '(.+)\(.+\s)&#40;(User|Group|Alias)&#41;'

foreach($user in $userlist)
{
@($user.pluginoutput -split "n&quot;&#41; -match $regex |<br> foreach {<br> $_ -match $regex &gt; $nul<br> $hash[$_.host] += $matches[2] <br> }<br> <br> }</code></blockquote>by MasterOfTheHat at 2013-04-05 11:43:43<blockquote>Right. You can't pipe to an if statement. Look at how mjolinor did it back on page 1:<br><code>$cell -split &quot;n" |
foreach {
if ($
-match '(.+)\(.+)\s&#40;(User|Group|Alias)&#41;')
Pipe the split to a foreach and then do the if statement in that loop.

Edit: and he beat me to the Submit button… You could do it that way, too.
by mjolinor at 2013-04-05 11:53:51
[quote]
Edit: and he beat me to the Submit button… You could do it that way, too.
[/quote]

I might have been a little too quick. I’d copied his original code, and didn’t catch this:

foreach($_ in $userlist)
by GradyGoose at 2013-04-08 17:56:46
$hash=@{}
$userlist=import-csv localadmin.csv
$regex = ‘(.+)\(.+\s)((User|Group|Alias))’

foreach($user in $userlist)
{
@($user.pluginoutput -split "`n") -match $regex |
foreach
{
$_ -match $regex > $nul
$hash[$.host] += $matches[2]
}
}


Sorry for the delay responding, busy weekend and I have tried a few things, with no success. mjolinor, I tried the code you posted on Friday (reposted here) and get the following which I do not uderstand. If you type in something for the process it just goes tot he next number process and continues that way. Thanks.

cmdlet ForEach-Object at command pipeline position 1
Supply values for the following parameters:
Process[0]:
by mjolinor at 2013-04-08 18:04:11
That looks like a syntax typo on my part. Foreach will allow the opening brace on the next line, but foreach-object won’t. I wish they’d make that consistent…

Updated the orginal posted code.
by MasterOfTheHat at 2013-04-09 06:25:41
[quote="mjolinor"]That looks like a syntax typo on my part. Foreach will allow the opening brace on the next line, but foreach-object won’t. I wish they’d make that consistent…

Updated the orginal posted code.[/quote]
Really wish I could "like" a post…
by GradyGoose at 2013-04-21 16:16:47
Index operation failed; the array index evaluated to null.
At line:10 char:31
+ $hash[ <<<< $
.host] += $matches[2]
+ CategoryInfo : InvalidOperation: (:slight_smile: , RuntimeException
+ FullyQualifiedErrorId : NullArrayIndex


mjolinor, I was out of town for a while and then catching up. The corrected code you provided produced the following error. Is it the array in the first foreach that is the issue? Thanks.