export data from csv to sql database

im trying to take info from a csv (9 column headers) and export it to an sql database using powershell. been looking at tons of examples from the internet and im confused.
the brunt of the data will be a one time thing… then, it will be a daily thing where new student names will be added as needed. hopefully doing a check to see if student id is in the database… if not… add this info to the database.
suggestions would be appreciated.

do you have a table designed yet?

I have been looking at suggestions for that and I did design a table but was unable to get that to work correctly saw I deleted that table ( but could easily create one again)… then saw scripts that created the table prior to inputting the data. so I have been trying both ways.

# Database variables 
$sqlserver = "pc1\sqlexpress" 
$database = "students" 
$table = "info" 
  
# CSV variables 
$csvfile = "C:\pathtofile\hs-stu-pass.csv" 
$csvdelimiter = "," 
$firstRowColumns = $false 
  
################### No need to modify anything below ################### 
Write-Host "Script started..." 
$elapsed = [System.Diagnostics.Stopwatch]::StartNew()  
[void][Reflection.Assembly]::LoadWithPartialName("System.Data") 
[void][Reflection.Assembly]::LoadWithPartialName("System.Data.SqlClient") 
  
# 50k worked fastest and kept memory usage to a minimum 
$batchsize = 50000 
  
# Build the sqlbulkcopy connection, and set the timeout to infinite 
$connectionstring = "Data Source=$sqlserver;Integrated Security=true;Initial Catalog=$database;" 
$bulkcopy = New-Object Data.SqlClient.SqlBulkCopy($connectionstring, [System.Data.SqlClient.SqlBulkCopyOptions]::TableLock) 
$bulkcopy.DestinationTableName = $table 
$bulkcopy.bulkcopyTimeout = 0 
$bulkcopy.batchsize = $batchsize 
  
# Create the datatable, and autogenerate the columns. 
$datatable = New-Object System.Data.DataTable 
  
# Open the text file from disk 
$reader = New-Object System.IO.StreamReader($csvfile) 
$columns = (Get-Content $csvfile -First 1).Split($csvdelimiter) 
if ($firstRowColumns -eq $true) { $null = $reader.readLine() } 
  
foreach ($column in $columns) {  
    $null = $datatable.Columns.Add() 
} 
  
# Read in the data, line by line 
while (($line = $reader.ReadLine()) -ne $null)  { 
    
    $null = $datatable.Rows.Add($line.Split($csvdelimiter)) 
 
    $i++; if (($i % $batchsize) -eq 0) {  
        $bulkcopy.WriteToServer($datatable)  
        Write-Host "$i rows have been inserted in $($elapsed.Elapsed.ToString())." 
        $datatable.Clear()  
    }  
}  
  
# Add in all the remaining rows since the last clear 
if($datatable.Rows.Count -gt 0) { 
    $bulkcopy.WriteToServer($datatable) 
    $datatable.Clear() 
} 
  
# Clean Up 
$reader.Close(); $reader.Dispose() 
$bulkcopy.Close(); $bulkcopy.Dispose() 
$datatable.Dispose() 
  
Write-Host "Script complete. $i rows have been inserted into the database." 
Write-Host "Total Elapsed Time: $($elapsed.Elapsed.ToString())" 
# Sometimes the Garbage Collector takes too long to clear the huge datatable. 
[System.GC]::Collect()

the furthest I got was with this script… but confused with table existing before running the script or letting the script build the tables.

You can go either way. T-SQL has an “IF EXISTS” statement you can run, so that a CREATE TABLE statement would only run if the table didn’t exist. The difficulty here is really more in using T-SQL than PowerShell per se, I think.

Take a look at our “Building Trend and Historical Reports” ebook - the accompanying module (in our GitHub repo) has examples that might prove useful.

ok… thanks don… i’ll look into that. thanks for the direction… its appreciated.

im having a hard time finding the examples… is there a link?

Here’s an example of inserting a row into a table.

$dbconn = New-Object System.Data.SqlClient.SqlConnection
$dbconn.connectionstring = “Server=$sqlserver; Database=‘Reporting’; Integrated Security=TRUE”
$dbconn.Open()
$dbwrite = $dbconn.CreateCommand()
$dbwrite.CommandText = “INSERT INTO $table (UPN,AzureRMS,Office_PP,Lync,Office_Online,Sharepoint,Exchange,Visio,Project,ExchangeArchive,Department,LastModification,Added_by)`
VALUES (’$col1’,’$col2’,’$col3’,’$col4’,’$col5’,’$col6’,’$col7’,’$col8’,’$col9’,’$col10’,’$col11’,’$col12’,’$col13’)”
$dbwrite.ExecuteNonQuery()
$dbconn.Close()

ok…so UPN,AzureRMS,etc would be column names. and VALUES (’$col1’,’$col2’, … the $col1 would be the values to be inserted into those columns?

$logQuery = new-object -ComObject "MSUtil.LogQuery"

$inputFormat = new-object -comobject "MSUtil.LogQuery.CSVInputFormat"

$outputFormat = new-object -comobject "MSUtil.LogQuery.SQLOutputFormat"

$outputFormat.server = "HP600-WIN8-MG\sqlexpress"

$outputFormat.database = "studentinfo"

$outputFormat.driver = "SQL Server"

$outputFormat.createTable = $true

$query = "SELECT stuid, lastname, firstname, gradelevel, status, homeroom, hrteacher, newpassword INTO stuinfo FROM c:\sqlproject\HS-Stu-pass.csv"

$null = $logQuery.ExecuteBatch($query,$inputFormat,$outputFormat)

ok…everyones suggestions and thoughts put me in this direction…much appreciated. this is what I did to get the csv info into the sql database ( with no tables in it) along with “logparser”
worked great… created the table along with the columns, and 1400 lines exported to sql in 4 seconds.

going forward im going to download a new csv on a daily basis that will contain new students along with existing students ( as far as the sql database is concerned)
I will need to export new students into the database, and possibly change the data in the gradelevel, status, homeroom, hrteacher columns of the existing students in the sql database.
my question is what would be the best way to approach this?

Personally, I prefer Dan’s approach. Using this allows you to dynamically generate the values from the variables. Also, it then means that all you really need to worry about is getting the right SQL code.

Why not forget about the PowerShell code for a bit, and try to build up the different type of TSQL that you’d need for these actions? You can use SQL Server Management Studio for this and validate your results straight away. Then once you’ve got the right TSQL, all you need to do is modify your code accordingly with the right queries.

thanks for the response Tim.

my sql/tsql skills are in the beginning stages. so using sql server management studio for this is difficult at best… I just don’t have enough grasp of it yet… but im reading and learning in between posts here. combining this with powershell is really challenging but very enjoyable.
is my assumption of dan’s coding correct?
UPN, AzureRMS, etc would be column names. and then VALUES (’$col1’,’$col2’, …. the $col1 would be the values to be inserted into those columns?

if so… how would I assign values to those variables? im guessing a foreach block taking the info from a csv… is this correct?

thanks again for the response… I appreciate the time you and everyone has taken to give pointers.

right

$users = import-csv userlist.csv

$dbconn = New-Object System.Data.SqlClient.SqlConnection
$dbconn.connectionstring = “Server=$sqlserver; Database=‘Reporting’; Integrated Security=TRUE”
$dbconn.Open()

Foreach($i in $users){

$col1 = $i.upn
$col2 = $i.azurerms
and so on.

$dbwrite = $dbconn.CreateCommand()
$dbwrite.CommandText = “INSERT INTO $table (UPN,AzureRMS,Office_PP,Lync,Office_Online,Sharepoint,Exchange,Visio,Project,ExchangeArchive,Department,LastModification,Added_by)`
VALUES (’$col1’,’$col2’,’$col3’,’$col4’,’$col5’,’$col6’,’$col7’,’$col8’,’$col9’,’$col10’,’$col11’,’$col12’,’$col13’)”
$dbwrite.ExecuteNonQuery()

}

$dbconn.Close()

Or something like it.

thanks dan for the response. its appreciated.

just wanted to drop a thank you to everyone that responded with pointers. my head has been buried in powershell / sql and I have accomplished so much. but of course there is more to learn and conquer.
thanks again Dan, Tim, and Don for all suggestions.