Extract Unique Values from 8th position of 16 million row file?

Hi Folks -

I have a 16 million row file and I need to extract the unique values from the 8th position of my tab-delimited file. Is this possible with PowerShell?

"Property Gross Royalties"	"Commercial"	"Actual"	"Final"	"FY14"	"Jan"	"Periodic"	"O-20003921"	"FDR"	"PA"	120
"Sides"	"Commercial"	"Actual"	"Final"	"FY14"	"Jan"	"Periodic"	"O-20003921"	"FDR"	"PA"	1
"Volume"	"Commercial"	"Actual"	"Final"	"FY14"	"Jan"	"Periodic"	"O-20003921"	"FDR"	"PA"	17000
"Property AGC"	"Commercial"	"Actual"	"Final"	"FY14"	"Jan"	"Periodic"	"O-20003921"	"FDR"	"PA"	2000

Short answer: Yes, it is. :wink:

:slight_smile: Can you share a potential solution? I’ve been trying different examples from various places with zero luck :frowning:

What does that mean? Please share your code and explain why or how it does not work as you expect.

Not sure how you are reading the file, but once you have your line defined in a variable, this might work. It worked for me based on the sample you gave.

($line.Split("`t"))[7]
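To make the one-liner above concrete, here is a minimal sketch that applies it to one sample row from the question. The `$line`, `$field`, and `$value` variable names are just for illustration, and the `Trim('"')` step is an optional extra that assumes you do not want the surrounding double quotes.

```powershell
# One sample row from the question, rebuilt with real tab characters between fields.
$line = '"Sides"', '"Commercial"', '"Actual"', '"Final"', '"FY14"', '"Jan"', '"Periodic"', '"O-20003921"', '"FDR"', '"PA"', '1' -join "`t"

# Index 7 is the 8th field because arrays are zero-based.
$field = $line.Split("`t")[7]

# Optional: strip the surrounding double quotes.
$value = $field.Trim('"')

$field   # "O-20003921"
$value   # O-20003921
```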

Yes, PowerShell can return the unique 8th element from each row.

@simms7400’s mention of files, rows, and unique values makes me think that the source file is a CSV, and the formatting and mixed nature of the sample data provided make the file appear thus:

"O-20003921" is the eighth element on each row, and it occurs four times in the sample data.

"O-20003921" therefore satisfies the extraction need accurately.

Note: The double quotation marks were purposefully kept as a part of the string because I’m being pedantic.

Code

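# Collect the 8th tab-separated field (index 7) from every row.
$alice = [System.Collections.ArrayList]::new()
foreach ($i in (Get-Content A:\MysteriousFileType.csv))
{
    [void]$alice.Add(($i -split "\t")[7])
}
echo "" # spacer
# Reduce the collected fields to their distinct values.
$alice | Select-Object -Unique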

Results

[image: results]

Remarks

Use an [ArrayList] and its Add() method. Please do not use the += operator in any design for this 16M row file. Eeeeyouch!
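The remark about += can be checked with a rough timing sketch. It assumes a small stand-in of 20,000 items rather than the real 16M rows, and the variable names are only for illustration. Actual timings depend on the machine and the PowerShell version, so no numbers are quoted here.

```powershell
$count = 20000   # small stand-in for the 16M rows; an assumption for this demo

# Array with += : each append builds a new array and copies the old elements over.
$slow = Measure-Command {
    $array = @()
    foreach ($n in 1..$count) { $array += "row$n" }
}

# ArrayList with Add(): appends in place and only occasionally grows its buffer.
$fast = Measure-Command {
    $list = [System.Collections.ArrayList]::new()
    foreach ($n in 1..$count) { [void]$list.Add("row$n") }
}

"+= took    {0:N0} ms" -f $slow.TotalMilliseconds
"Add() took {0:N0} ms" -f $fast.TotalMilliseconds
```

Raising `$count` should widen the gap, since the cost of += grows with the size of the array being copied.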

Really? The OP didn’t even post enough information to make a meaningful suggestion at all!?!?! Why can’t you wait until the OP has answered the questions and actually explained what the challenge is? :roll_eyes: :roll_eyes:

Hi, thank you very much! When reading online, I saw feedback that Get-Content is rather slow with large files and that a stream reader is preferred. I can’t speak to that as I’m not well versed in PS, but what are your thoughts on that?

I’m running your logic now, and it’s been 30 minutes and it’s still chugging along. Will let you know how I make out! Thank you!

So you were just waiting for someone to do your work for you, weren’t you? :thinking:

I was away all day yesterday, apologies for not being logged on. Have a great day, Olaf!

In your initial question you just asked if it’s possible - not about speed.

You might watch this if your actual question was about speed
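On the stream reader question raised earlier in the thread, here is a minimal sketch of that approach. It is not the method from any post above: it swaps in a .NET `StreamReader` to read one line at a time and a `HashSet[string]` to keep only distinct values, so the 16M rows are never held in memory at once. The function name `Get-UniqueField`, its parameters, and the path in the usage comment are hypothetical.

```powershell
function Get-UniqueField {
    param(
        [Parameter(Mandatory)] [string] $Path,   # full path to the tab-delimited file
        [int] $Index = 7                          # zero-based field index; 7 = 8th field
    )

    $unique = [System.Collections.Generic.HashSet[string]]::new()
    $reader = [System.IO.StreamReader]::new($Path)
    try {
        # ReadLine() returns $null at end of file.
        while ($null -ne ($line = $reader.ReadLine())) {
            $fields = $line.Split("`t")
            if ($fields.Length -gt $Index) {
                # Add() ignores values that are already in the set.
                [void]$unique.Add($fields[$Index])
            }
        }
    }
    finally {
        $reader.Dispose()
    }
    $unique
}

# Usage (hypothetical path):
# Get-UniqueField -Path 'C:\data\royalties.txt'
```

Pass a full path, because .NET resolves relative paths against the process working directory, which is not always the current PowerShell location. Rows with fewer than eight fields are skipped rather than reported.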

…what are your thoughts on that?

@simms7400, I answered your original post, with a proof, based on what I’d read. Many approaches may be considered for the solution. Loosely, the future behavior of a system can be determined by an observation of its current state (Richard C. Tolman). However, it’s a rare thing to know all the items and events that constitute the system. This creates a demand for sufficient detail in the problem to be solved. In the end, it’s only possible to choose what best fits the current situation while anticipating that future conditions will favor those choices.

I hope that you have fun exploring and making discoveries along the way! Somewhere, I have some pithy statement from some important person saying that there is no such thing as an expected discovery because such a thing is nonsensical. And so… adventure!