Outputting object from helper function

I’ve started on a script to find Personally Identifiable Information (PII) in a set list of file types. For now I’m only testing .docx files. The object returned by the Find-PIIWord helper function isn’t output the way I expected: as shown at the very bottom, the objects are all displayed together at the very end, after all the verbose and warning output, rather than each time the function is called.

Should I be returning the object from Find-PIIWord back to the main function and outputting the object from there?
Is a function call from a switch statement really the right approach here?
Any other critiques would be greatly appreciated.

Function Find-PII {

    Param (
        [Parameter(Mandatory = $true,
                   ValueFromPipeline = $true,
                   ValueFromPipelineByPropertyName = $true)]
        [string[]] $Path = $PWD
    )

    Begin {

        #Converts relative path to absolute path
        $Path = Convert-Path $Path

        #Has 9 digits, may be split as xxx-xx-xxxx by dashes or spaces
        $patternSocial = '(\d{3}[-| ]\d{2}[-| ]\d{4})|(\d{9})'

        #Starts with a 4 and have 16 digits, may be split as xxxx-xxxx-xxxx-xxxx by dashes or spaces
        $patternVisa = '(4\d{3}[-| ]\d{4}[-| ]\d{4}[-| ]\d{4})|(4\d{15})'

        #Starts with 51-55 and have 16 digits, may be split as xxxx-xxxx-xxxx-xxxx by dashes or spaces
        $patternMC = '(5[1-5]\d{2}[-| ]\d{4}[-| ]\d{4}[-| ]\d{4})|(5[1-5]\d{14})'

        #Starts with 34 or 37 and have 15 digits, may be split as xxxx-xxxxxx-xxxxx by dashes or spaces
        $patternAMEX = '(3[47]\d{2}[-| ]\d{6}[-| ]\d{5})|(3[47]\d{13})'

        #Start with 6011 or 65 and have 16 digits, may be split as xxxx-xxxx-xxxx-xxxx by dashes or spaces
        $patternDiscover = '(6(?:011|5\d{2})[-| ]\d{4}[-| ]\d{4}[-| ]\d{4})|(6(?:011|5\d{2})\d{12})'

        $PIITemp = Get-Item -Path "$env:TEMP\FindPII"
    }

    Process {
        $files = Get-ChildItem -Path $Path -Include '*.docx' -Recurse
        #$files = Get-ChildItem -Path $Path -Include '*.docx', '*.xlsx', '*.pdf', '*.pptx', '*.txt' -Recurse

        foreach ($file in $files) {
            switch ($file.Extension) {
                .docx {Find-PIIWord -InputObject $file}
                #.xlsx {Find-PIIExcel}
                #.pptx {Find-PIIPowerPoint}
                #.pdf {Find-PIIPdf}
                #.txt {Find-PIITxt}
                #default {break}
            }
        }
    }

    End {
    }
}

Function Find-PIIWord {
    param (
        [Parameter(ValueFromPipeline = $true)]
        [System.IO.FileInfo] $InputObject
    )

    Write-Verbose "Looking for PII in $($InputObject.Name)"

    $docxTemp = "$PIITemp\$($InputObject.Name)"

    New-Item -Path "$PIITemp\docx" -ItemType Directory -Force | Out-Null
    Copy-Item -Path $InputObject.FullName -Destination "$docxTemp.zip" -Force
    Expand-Archive -Path "$docxTemp.zip" -DestinationPath "$PIITemp\docx\" -Force | Out-Null

    [xml] $docx = Get-Content -Path "$PIITemp\docx\word\document.xml"
    $PIIFound = $docx.document.body.p.r.t | Select-String -Pattern $patternSocial, $patternVisa -Quiet

    if ($PIIFound) {

        $obj = [pscustomobject] @{
            'Name' = $InputObject.Name;
            'Length' = $InputObject.Length;
            'LastWriteTime' = $InputObject.LastWriteTime;
            'FullName' = $InputObject.FullName
        }

        Write-Warning "PII found in $($InputObject.name)"
        Write-Output $obj
    }

    #Remove-Item -Path "$PIITemp\docx" -Recurse -Force
}

Function New-PIITempFolder {
    if (-not (Test-Path -Path "$env:TEMP\FindPII")) {
        New-Item -Path $env:TEMP -Name FindPII -ItemType Directory | Out-Null
    }
}

Function Remove-PIITempFolder {
    if (Test-Path -Path "$env:TEMP\FindPII") {
        Remove-Item -Path "$env:TEMP\FindPII" -Recurse -Force
    }
}

The output of my test is below:

PS G:\Microsoft\Powershell> Find-PII -Path . -Verbose
VERBOSE: Looking for PII in Resume.docx
WARNING: PII found in Resume.docx

VERBOSE: Looking for PII in test.docx
WARNING: PII found in test.docx
Name        Length LastWriteTime          FullName                           
----        ------ -------------          --------                           
Resume.docx  27689 12/12/2017 12:22:58 AM G:\Microsoft\Powershell\Resume.docx
test.docx    11970 12/12/2017 12:26:21 AM G:\Microsoft\Powershell\test.docx

Quick question: what is your rationale for doing this manually, other than as a learning effort or an offline forensic effort?

Don’t get me wrong, there is nothing wrong with doing this; MS has specific articles on the topic…

Security Watch Where Is My PII?

But the enterprise approach would be a Windows Server File Server Resource Manager (FSRM) / File Classification Infrastructure (FCI) deployment. It scans your storage resources for whatever strings you find prudent and takes action on the matching files: move them, protect them with RMS policies, etc.

FSRM and FCI: Frequently Asked Questions


Classifying files based on location and content using the File Classification Infrastructure (FCI) in Windows Server 2008 R2

Automating the doc protection using FCI integration with RMS bulk protection tool

Protect everything: using FCI to protect files of any type with Windows Server 2012

You can even use AIP (basically RMS and FCI in the cloud) to do this without any additional server needs.

Even consumers can do this with the free Azure AIP service.
RMS for individuals and Azure Information Protection

All of the above allow you to scan data for content strings and do what you will from there.

As for your question…
‘Should I be returning the object from Find-PIIWord back to the main function and outputting the object from there?’

Are you saying you are not getting the results you’d expect, or that you just want to push it out a different way?
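If it is the former, the ordering you see is most likely a display effect and not a sign that the object is being returned from the wrong place. As far as I know, Windows PowerShell 5 and later hold back implicitly table-formatted output for a short moment (roughly 300 ms) to work out column widths, while warning and verbose messages go straight to the host. Here is a minimal sketch of that behavior; Test-StreamOrder and the paths in it are hypothetical stand-ins, not part of your script.

```powershell
# Hypothetical helper that mimics Find-PIIWord: one warning, then one object per name.
function Test-StreamOrder {
    [CmdletBinding()]
    param ([string[]] $Name)

    foreach ($n in $Name) {
        Write-Warning "PII found in $n"
        [pscustomobject] @{
            Name     = $n
            Length   = $n.Length                # character count of the name, just filler data
            FullName = "C:\hypothetical\$n"     # made-up path for illustration
        }
    }
}

# Implicit table formatting: the rows may only appear after the warnings.
Test-StreamOrder -Name 'Resume.docx', 'test.docx'

# Explicit list formatting should not need to wait for column widths,
# so each object is expected to appear right after its warning.
Test-StreamOrder -Name 'Resume.docx', 'test.docx' | Format-List

# Capturing the result shows the function emits one object per call either way.
$found = Test-StreamOrder -Name 'Resume.docx', 'test.docx' -WarningAction SilentlyContinue
$found.Count
```

If this is what is happening, returning the object from Find-PIIWord and letting it flow out of Find-PII is fine as written; only the console rendering is delayed.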

There is always more than one way to do X or Y.

The question is, does it work for you and your organization as is?

Is it easily understood by those you’d pass it on to?
Is it designed to be as extensible as possible (if needed, of course)?
Is it easily maintainable as is?
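On the extensibility point, the switch works, but one alternative is a lookup table that maps each extension to its scanner, so adding a file type means adding one entry. This is only a sketch under my own assumptions: Find-PIIWordStub, Find-PIITxtStub, Invoke-PIIScan and the paths are hypothetical names standing in for your real functions.

```powershell
# Hypothetical stand-ins for the real scanners; each just reports what it was given.
function Find-PIIWordStub { param ([string] $Path) "docx scan: $Path" }
function Find-PIITxtStub  { param ([string] $Path) "txt scan: $Path" }

# Map each extension to the script block that handles it.
$scanners = @{
    '.docx' = { param ($p) Find-PIIWordStub -Path $p }
    '.txt'  = { param ($p) Find-PIITxtStub -Path $p }
}

function Invoke-PIIScan {
    param ([string] $Path)

    # Normalize the extension so '.TXT' and '.txt' hit the same entry.
    $extension = [System.IO.Path]::GetExtension($Path).ToLowerInvariant()

    if ($scanners.ContainsKey($extension)) {
        & $scanners[$extension] $Path
    }
    else {
        Write-Verbose "No scanner registered for '$extension'; skipping $Path"
    }
}

Invoke-PIIScan -Path 'C:\hypothetical\Resume.docx'
Invoke-PIIScan -Path 'C:\hypothetical\notes.TXT'
Invoke-PIIScan -Path 'C:\hypothetical\photo.png'    # no entry, so nothing is emitted
```

Whether this is clearer than the switch is a matter of taste; for a handful of file types the switch is perfectly readable.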

There can be more elegant ways to do things, but that too comes down largely to opinion, personal stance, habits, beliefs, etc.
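One small critique that is less a matter of opinion: inside a character class the | is a literal pipe, so [-| ] accepts a dash, a pipe or a space, which is wider than the "dashes or spaces" your comments describe. A quick sketch of the difference, using the dashed half of your SSN pattern; the tightened pattern with \b word boundaries is my suggestion, not something from your script.

```powershell
# Dashed half of the original SSN pattern; the class [-| ] also matches a literal pipe.
$original = '\d{3}[-| ]\d{2}[-| ]\d{4}'

# Suggested variant: only dash or space as separator, anchored on word boundaries.
$tightened = '\b\d{3}[- ]\d{2}[- ]\d{4}\b'

'123|45|6789' -match $original     # True: the pipe is accepted as a separator
'123|45|6789' -match $tightened    # False
'123-45-6789' -match $tightened    # True
'123 45 6789' -match $tightened    # True
```

The same change would apply to the card patterns. The bare \d{9} alternative will also match any nine consecutive digits inside a longer number, so expect false positives there.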

Here’s a similar post and function…