How to skip export of empty rows?

I have the following code that takes in an excel file, and converts it to a csv file. its working perfectly, except, i’d like to add an enhancement: dont export empty lines/rows, because some exported csv files, i have to go in manually and delete the blank/empty rows/lines and it would be a lot easier if this is done automatically as part of the export. How can i specify that?

function Convert-ExcelSheetsToCsv {
    param(
        [Parameter(Mandatory, ValueFromPipelineByPropertyName, Position=1)]
        [ValidateNotNullOrEmpty()]
        [Alias('FullName')]
        [string]$Path,
        [Parameter(Mandatory = $false, Position=0)]
        [bool]$AppendFileName,
        [Parameter(Mandatory = $false, Position=2)]
        [bool]$ExcludeHiddenSheets,
        [Parameter(Mandatory = $false, Position=3)]
        [bool]$ExcludeHiddenColumns,
        [Parameter(Mandatory = $false, Position=4)]
        $ExcludeWorkSheets,
        [Parameter(Mandatory = $false, Position=5)]
        [bool]$AppendToColumnName,
        [Parameter(Mandatory = $false, Position=6)]
        $columnNameSuffix
    )
    Begin {
        "<code>r</code>nTerminating Excel process (if any) to prevent 'Can't access' (file in use) exception..."

        Get-Process 'exce[l]' | Stop-Process -Force #this is to avoid "Can't access" exception if file is opened or in use
        #Only 'excel' will be matched, but because a wildcard [] is used, not finding a match will not generate an error.

        $excel = New-Object -ComObject Excel.Application -Property @{
            Visible       = $false
            DisplayAlerts = $false
        }
    }
    Process {
        #$root = Split-Path -Path $Path
        $filename = [System.IO.Path]::GetFileNameWithoutExtension($Path)

        "<code>r</code>nGenerating CSV files from '$Path' Worksheet(s)..."

        $workbook = $excel.Workbooks.Open($Path)

        foreach ($worksheet in ($workbook.Worksheets | Where-Object { <# $_.Visible -eq -1 #> !($_.Name -in (($ExcludeWorkSheets).Split('(.+?)(?:,|$)'))) })) {       
            if($ExcludeHiddenColumns) {
                $ColumnsCount = $worksheet.UsedRange.Columns.Count
                for ($i=1; $i -le $ColumnsCount; $i++)
                {
                    $column = $worksheet.Columns.Item($i).EntireColumn #$worksheet.sheets.columns.entirecolumn.hidden=$true
                    if ($column.hidden -eq $true)
                    {   
                        $columnname = $column.cells.item(1,$i).value2

                        if ($worksheet.Visible -eq 0) #worksheet hidden
                        {
                            "<code>r</code>nHidden column [{0}] found in hidden [{1}] worksheet. Deleting..." -f $columnname, $($worksheet.name)
                        }
                        else {
                            "<code>r</code>nHidden column [{0}] found in [{1}] worksheet. Deleting..." -f $columnname, $($worksheet.name)
                        }

                        try {
                            $column.Delete() | out-null

                            "<code>r</code>nHidden column [{0}] was Deleted! Proceeding with Export to CSV operation...<code>r</code>n" -f $columnname
                        }
                        catch {
                            "<code>r</code>nFailed to Delete hidden column [$columnname] from [$($worksheet.name)] worksheet! $($error[0])<code>r</code>n$($error[0].InvocationInfo.PositionMessage)<code>r</code>n$($error[0].ScriptStackTrace)"
                            #$_ | Select *
                        }

                        #$i = $i - 1
                    }
                }
            }

            if ($ExcludeHiddenSheets) {
                if ($worksheet.Visible -eq -1) #worksheet visible
                {
                    if($AppendToColumnName) {
                        Rename-WorkSheetColumns $worksheet $columnNameSuffix
                    }

                    $ws = $worksheet
                }
            }
            else {
                if($AppendToColumnName) {
                    Rename-WorkSheetColumns $worksheet $columnNameSuffix
                }

                $ws = $worksheet
            }

            if ($AppendFileName) {
                $name = Join-Path -Path $csvPATH <# $root #> -ChildPath "${filename}_$($ws.Name).csv"
            }
            else {
                $name = Join-Path -Path $csvPATH <# $root #> -ChildPath "$($ws.Name).csv"
            }

            try {
                $ws.SaveAs($name, 6) #6 to ignore formatting and convert to pure text, otherwise, file could end up containing rubbish

                #"<code>r</code>n'$name' generated successfully!"
            } 
            catch {
                "<code>r</code>nFailed to save csv! Path: '$name'. $($error[0])<code>r</code>n$($error[0].InvocationInfo.PositionMessage)<code>r</code>n$($error[0].ScriptStackTrace)"
            }
        }
    }
    End {
        $excel.Quit()
        $null = [System.Runtime.InteropServices.Marshal]::ReleaseComObject($excel)
    }
}

i am thinking it would be done here, but i dont know the value to specify not to export empty lines

$ws.SaveAs($name, 6)

Meaning the line is just a bunch of commas?

Blank rows is a bit of a case by case basis and sometime they are there for formatting. Recommend taking a look at VBA code forums as they are built as macros and will provide the basic programmatic methods to do what you require. This looked like it has several options depending on the selection which is going to be the tricky part. You’d need code to select only where the content is located as you don’t want to process the entire workbook, but again that is most like found in VBA code:

https://www.ablebits.com/office-addins-blog/2018/12/19/delete-blank-lines-excel/

[quote quote=231157]Blank rows is a bit of a case by case basis and sometime they are there for formatting. Recommend taking a look at VBA code forums as they are built as macros and will provide the basic programmatic methods to do what you require. This looked like it has several options depending on the selection which is going to be the tricky part. You’d need code to select only where the content is located as you don’t want to process the entire workbook, but again that is most like found in VBA code:

How to delete blank rows in Excel quickly and safely
<iframe class="wp-embedded-content" style="position: absolute; clip: rect(1px, 1px, 1px, 1px);" title="“How to delete blank rows in Excel quickly and safely” — Excel tutorials, functions and formulas for beginners and advanced users - Ablebits.com Blog" src="https://www.ablebits.com/office-addins-blog/2018/12/19/delete-blank-lines-excel/embed/#?secret=cgIdzDrruv" width="600" height="338" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" sandbox="allow-scripts" data-secret="cgIdzDrruv" data-mce-fragment="1"></iframe>

[/quote]
hmm you bring up a good point, currently, it takes a long time to export just 10 sheets (about 15-20 mins)…

[quote quote=231148]Meaning the line is just a bunch of commas?

[/quote]
yessir. so for example, one csv file has the following contents after the export:

cat,dog,mouse,yara

,

,

the frist line/row would be considered fine, but the other ones are just empty/blank records which i have to go in manually and delete, because we are using these csv files to insert into a table afterwards and as you know, with a primary key, you can only have unique records, which means those empty lines count as duplicates

So are you only wanting to remove the lines if all columns are lacking values?

[quote quote=231172]So are you only wanting to remove the lines if all columns are lacking values?

[/quote]
yessir.

In my opinion it would be extraordinarily more difficult to handle this in the excel com object. I would simply take the CSV and run it through one more function to remove the blank lines. Something like this works fine in my tests, others may have better suggestions.

$CSVIN = 'c:\temp\csvwithemptyrows.csv'
$CSVOUT = 'c:\temp\noblankrows.csv'
Get-Content $CSVIN | where {$_.trim(',') -ne ""} | set-content $CSVOUT

Hope this helps

[quote quote=231187]In my opinion it would be extraordinarily more difficult to handle this in the excel com object. I would simply take the CSV and run it through one more function to remove the blank lines. Something like this works fine in my tests, others may have better suggestions.

PowerShell
4 lines
<textarea class="ace_text-input" style="opacity: 0; height: 17.9048px; width: 7.2381px; left: 45px; top: 0px;" spellcheck="false" wrap="off"></textarea>
1
2
3
4
$CSVIN = 'c:\temp\csvwithemptyrows.csv'
$CSVOUT = 'c:\temp\noblankrows.csv'
Get-Content $CSVIN | where {$_.trim(',') -ne ""} | set-content $CSVOUT
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Hope this helps

[/quote]
I thought that may be the case, ya I agree, it makes more sense to parse the csv file itself, plus with powershell it makes it much faster since the com object takes forever

Be sure to confirm this works as expected.

[quote quote=231202][/quote]
ok so this kinda works, but there is one problem: i dont want to have to rename the file name to a different name. in other words, i want the csvin and csvout file names to be the same. but i would get this error if i keep them the same name…

$CSVIN = ‘.\test.csv’

$CSVOUT = ‘.\test.csv’

Get-Content $CSVIN | where {$_.trim(’,’) -ne “”} | set-content $CSVOUT

set-content : The process cannot access the file ‘C:\test.csv’ because it is being used by another process.

Surely you can figure out how to work around that.