General Coding Issue

by jdainty_91 at 2013-01-19 14:33:43

Hi All,

I would really appreciate your help solving an issue I have been having with extracting the data I want from an excel file. I am a total beginner when it comes to Powershell (and computer programming in general), so I would really value the experience you have to offer.

I will run what I am trying to do and explain where my problem arises. So I start off with data of the following form:

Column_1 Column_2 Column_3 Column_4 Column_5
1 1 30/11/2012 02:30 30/11/2012 02:27 PE
1 1 30/11/2012 02:30 30/11/2012 02:22 PE
1 1 30/11/2012 02:30 30/11/2012 02:16 PE
1 2 30/11/2012 02:30 30/11/2012 01:42 IP
1 2 30/11/2012 02:30 30/11/2012 01:42 PE
1 2 30/11/2012 02:30 30/11/2012 01:38 PE
1 3 30/11/2012 02:30 29/11/2012 23:57 PE
1 3 30/11/2012 02:30 29/11/2012 23:17 PE
1 3 30/11/2012 02:30 29/11/2012 17:26 PE
2 1 05/12/2012 19:45 05/12/2012 14:12 NI
2 1 05/12/2012 19:45 05/12/2012 09:33 PE
2 1 05/12/2012 19:45 05/12/2012 00:23 PE
2 2 05/12/2012 19:45 05/12/2012 19:35 PE
2 2 05/12/2012 19:45 05/12/2012 19:35 PE
2 2 05/12/2012 19:45 05/12/2012 11:40 PE
2 3 05/12/2012 19:45 05/12/2012 18:32 PE
2 3 05/12/2012 19:45 05/12/2012 17:01 IP
2 3 05/12/2012 19:45 05/12/2012 17:01 NI

I would like my output to satisfy a number of constraints:

1) There is a single row per unique combination of numbers in Column_1 and Column_2.
2) The time in Column_4 is the latest time before the time in Column_3.
3) All rows with “IP” in Column_5 are eliminated.

i.e I would like the following output:

Column_1 Column_2 Column_3 Column_4 Column_5
1 1 30/11/2012 02:30 30/11/2012 02:27 PE
1 2 30/11/2012 02:30 30/11/2012 01:42 PE
1 3 30/11/2012 02:30 29/11/2012 23:57 PE
2 1 05/12/2012 19:45 05/12/2012 14:12 NI
2 2 05/12/2012 19:45 05/12/2012 19:35 PE
2 3 05/12/2012 19:45 05/12/2012 18:32 PE

To do this, I have come up with the following code:

Import-CSV $args[0] | Where-Object { $.Column_5 -ne "IP" -and ($.Column_5 -eq "NI" -or $.Column_4 -eq "PE") } | <br>Sort-Object -property @{Expression=&quot;Column_1&quot;;Descending=$false},
@{Expression="Column_2";Descending=$false}, <br>@{Expression=&quot;Column_4&quot;;Descending=$true}
Sort-Object Column_1, Column_2 -unique `
| Export-CSV $args[0].Replace(".csv","SPs.csv") -NoTypeInformation

However, when I run the code, the output does not satisfy my second constraint, i.e. there is one row per unique combination of numbers in Column_1 and Column_2, but the time in Column_4 is not necessarily the latest time before the time in Column_3. The unique command seems to be randomly selecting rows rather than taken the first row where each combination of numbers appears. So, for example, I am getting an output like the following:

Column_1 Column_2 Column_3 Column_4 Column_5
1 1 30/11/2012 02:30 30/11/2012 02:22 PE
1 2 30/11/2012 02:30 30/11/2012 01:38 PE
1 3 30/11/2012 02:30 29/11/2012 23:57 PE
2 1 05/12/2012 19:45 05/12/2012 09:33 PE
2 2 05/12/2012 19:45 05/12/2012 19:35 PE
2 3 05/12/2012 19:45 05/12/2012 17:01 NI

Can anyone offer any suggestions as to where I am going wrong? As I said before, I am very inexperienced and keen to learn.

All input is greatly appreciated.

Thanks

Jordan
by nohandle at 2013-01-21 01:52:26
Hi,

1) There is a single row per unique combination of numbers in Column_1 and Column_2.
- create custom property that is column_1 + Column_2 to make it easy to automatically group-Object by it, something like this | select @{n=‘key’;e={$
.Column_1+$
.Column_2}} | Group-Object key
2) The time in Column_4 is the latest time before the time in Column_3.
-for each of the groups process the properties to eliminate every line that does not meet the condition (is older than the threshold time) sort-object the remaining items by appropriate column and select the first one.
3) All rows with “IP” in Column_5 are eliminated.
This should be done after the whole process? To implement it simply use where-object {$_.column_5 -eq "IP"} to filter them out and save the resulting collection.

Split the problem to smaller pieces, save the results of each problem in variable and progress to next one, doing it all in one pipeline may make it unnecessarily complicated since you are beginner. You can join it later if you need.