Recording File Names

I have thousands of unique strings that I need to capture from unstructured files.
These files are stored on 9 UNC shares. The unique string that I need to capture is actually part of the file names on these shares, so this should be possible if you have the know-how.
The file names stored on these shares have only two naming formats:
PN_DDMMYYYY_HHMMSS_PARTNUMBER.txt
bookings_PARTNUMBER_randomtext.txt

The directory structure is, e.g., \\share1\customer\saleID\. Note that there can be another subdirectory underneath saleID. Each customer has their own directory, and each saleID folder is unique.
I need to capture only every unique PARTNUMBER from these file names on the 9 shares.
Several files may share the same part number, but I only need to record each one once (i.e. remove duplicates) and save the result to a .csv file.
Help appreciated as I have no clue where to start!

Do ALL the part numbers have a fixed length? Or does it vary?

And is the part number strictly numeric? Or is it mixed?

Hi, thanks. Variable length, alphanumeric, but always in the same position in the file name, as in the examples given.

List your shares in the $shares variable and it will export a CSV to your user profile folder.

# Get Files    
$shares = "\\share1\","\\share2\"
$files = Get-ChildItem -Path $shares -Recurse -Include PN_*.txt,bookings_*.txt

# Match part numbers and add to collection
$partnumbers = {@()}.Invoke()
foreach ($file in $files){
# Match against the file name only, so underscores in folder names cannot interfere
If ($file.Name -match "(bookings_(?'pn'.*)_.*_|bookings_(?'pn'.*)_|PN_.*_(?'pn'.*)\.txt)") {$partnumbers.Add($Matches.pn)}
} # End Foreach

# Export part numbers
$partnumbers | select -Unique | Out-File $env:USERPROFILE\partnumber.csv

In this sample I’ve simulated your converged filename list. I am assuming at this point you know how to get the list of files. I use RegEx to extract the part numbers from the file names per your example and information above. I collect all the part numbers in an array, then I sort it for unique numbers and dump it to a text file.

$data = @"
PN_31012016_112345_123456.txt
bookings_23456_randomtext.txt
PN_25062015_204500_345.txt
bookings_6666A6666_randomtext.txt
PN_02031999_153000_222Q1.txt
bookings_911zzz_randomtext.txt
PN_04071776_120000_345.txt
bookings_456789_randomtext.txt
"@ -split "\r?\n"   # regex split, so it works with CRLF or LF line endings

$pnPattern = "PN_\d{8}_\d{6}_(.+)\."
$bookingsPattern = "bookings_(.+)_"
$partNumber = New-Object -TypeName System.Collections.ArrayList

foreach ($item in $data) {
    Write-Verbose "Working on $item"
    if ($item -match $pnPattern)
    {
        $partNumber.Add($Matches[1]) | Out-Null
    } elseif ($item -match $bookingsPattern)
    {
        $partNumber.Add($Matches[1]) | Out-Null
    }
}
$partNumber | sort -Unique | Out-File -FilePath .\foo.txt
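
To connect the sample above to real file names and a CSV, here is a minimal sketch rather than a finished script. Get-PartNumber and the partnumbers.csv output name are made up for illustration, and the lazy bookings pattern assumes a part number never contains an underscore.

```powershell
# Hypothetical helper: return the part number from one file name,
# or nothing if the name matches neither format.
function Get-PartNumber {
    param([Parameter(Mandatory)][string]$FileName)
    # PN_DDMMYYYY_HHMMSS_PARTNUMBER.txt
    if ($FileName -match '^PN_\d{8}_\d{6}_(.+)\.txt$') { return $Matches[1] }
    # bookings_PARTNUMBER_anything.txt (stops at the first underscore)
    if ($FileName -match '^bookings_(.+?)_')           { return $Matches[1] }
}

# Stand-in for real input. With live shares this could instead be
#   (Get-ChildItem -Path $shares -Recurse -Include PN_*.txt,bookings_*.txt).Name
$names = 'PN_31012016_112345_123456.txt',
         'bookings_23456_randomtext.txt',
         'PN_04071776_120000_345.txt',
         'PN_25062015_204500_345.txt'

# Extract every part number, then keep each one once.
$unique = $names | ForEach-Object { Get-PartNumber -FileName $_ } | Sort-Object -Unique

# Write a one-column CSV with a PartNumber header.
$unique |
    ForEach-Object { [pscustomobject]@{ PartNumber = $_ } } |
    Export-Csv -Path 'partnumbers.csv' -NoTypeInformation
```

Export-Csv is used here instead of Out-File so the output has a header row and proper CSV quoting; for a single column either would open in a spreadsheet.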

This is great - thank you both (Bob McCoy and random commandline)

@Bob McCoy
The bookingsPattern is not working, maybe because of the file names. I said random text, but it is actually a date and time, written in a different format from the one used in the PN*.txt files.

Here is an example:

bookings_SKU56780-13_11-29-2011_01-43-47.txt

Does the $bookingsPattern = "bookings_(.+)_" need to change to match this?

Thanks again.

@random commandline

Get Files section works as expected

Match part numbers and add to collection does not work as expected - and I have made the change you requested. The output for $partnumbers is:

ÿþ

Export part numbers works as expected

Ok, I modified my script. I forgot an underscore after the date. It should work for you now.

In my script make the following change.

$bookingsPattern = "bookings_(.+?)_"
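
To show why the added ? matters, here is a quick check against the example file name from above. It assumes the part number itself never contains an underscore; the variable names are only for illustration.

```powershell
$name = 'bookings_SKU56780-13_11-29-2011_01-43-47.txt'

# Greedy: (.+) grabs as much as it can, so the capture runs to the LAST underscore.
$null = $name -match 'bookings_(.+)_'
$greedy = $Matches[1]    # SKU56780-13_11-29-2011

# Lazy: (.+?) grabs as little as it can, so the capture stops at the FIRST underscore.
$null = $name -match 'bookings_(.+?)_'
$lazy = $Matches[1]      # SKU56780-13

"greedy: $greedy"
"lazy:   $lazy"
```

With the original "randomtext" names there was only one underscore after the part number, so both patterns gave the same result; the difference only shows up once the suffix contains underscores of its own.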

Thank you both!