s-laverty (Feb 23, 2024):

Hey,
So I've got this PowerShell script which imports ~5.5M rows across ~150 CSVs into SQL Server. The script works fine, but it takes approximately 5 hours to complete, which seems crazy long to me. Does anyone have some example times for importing from CSV to SQL via PowerShell that I can compare against? This is the first time I've done something like this.
The data itself is a mixture of booleans, decimals, and strings; some of the strings are quite long.
The relevant section of the script that does the importing is below, in case anyone can spot something that might help with the times. Most of the delay seems to come from importing the CSV into memory prior to the Write-SqlTableData command; once it gets to Write-SqlTableData, that part completes fairly quickly.
\"Getting row count for table $table\"\n $CSVRowCount = Import-CSV $DataCSV | Measure-Object | Select -ExpandProperty Count\n\n \"Deleting database table $table\"\n If ($TableExists) {\n $DeleteDB = \"USE [$SQLDatabase]\n GO\n IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[$table]') AND type in (N'U'))\n DROP TABLE [dbo].[$table]\n GO\"\n Invoke-SQLCmd -Query $DeleteDB -ServerInstance $SQLInstance -Username $SQLUsername -Password $SQLPassword\n }\n\n ### Performing batch import from CSV to SQL\n $SQLImportBatch = 50000\n $LoopsReq = [math]::ceiling($CSVRowCount / $SQLImportBatch)\n For ($i=0; $i -lt $LoopsReq; $i++) {\n ,(Import-Csv -Path $DataCSV | Select -Skip ([Int]($i * $SQLImportBatch)) -First $SQLImportBatch) | Write-SqlTableData -ServerInstance $SQLInstance -DatabaseName $SQLDatabase -SchemaName \"dbo\" -TableName $Table -Force -ConnectionTimeout 14400 -Timeout 0\n }\n<\/code><\/pre>\nServer this runs on is running off spinning disks which I suspect is a large contributer to the slowness. Other than that, CPU sits around 20-40%, memory about 75-85%.<\/p>\n
Cheers!
Rod-IT (Feb 23, 2024):

It likely will be IO, as you noted: the script has to read the data in, convert it, and write it to the SQL tables, and that means a lot of little IOs, which are not great for performance.
Large sequential writes are best and small random ones are not, on any media, but HDDs will be the worst.
I can't add anything about the import process or times, but I do think you are right about the issue being the HDDs.
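If you want to confirm where the time actually goes, something like this rough sketch (untested, reusing the variables from your script) would time the CSV read separately from the SQL write for a single file:

```powershell
# Untested sketch: time the CSV read and the SQL write separately for one file,
# reusing $DataCSV, $SQLInstance, $SQLDatabase and $Table from the original script.
$sw = [System.Diagnostics.Stopwatch]::StartNew()
$rows = Import-Csv -Path $DataCSV
$sw.Stop()
"CSV read:  $([int]$sw.Elapsed.TotalSeconds) s for $($rows.Count) rows"

$sw.Restart()
,$rows | Write-SqlTableData -ServerInstance $SQLInstance -DatabaseName $SQLDatabase `
    -SchemaName "dbo" -TableName $Table -Force -Timeout 0
$sw.Stop()
"SQL write: $([int]$sw.Elapsed.TotalSeconds) s"
```

If the read side dominates, that points at the disks (or at Import-Csv itself) rather than at SQL.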
dannyh2 (Feb 23, 2024):

Your script is looping one line at a time and most likely calling the INSERT INTO statement 5 million times. A better option is to use the BULK INSERT command, as described in BULK INSERT (Transact-SQL) - SQL Server | Microsoft Learn.
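For example, a minimal sketch of driving BULK INSERT from PowerShell (untested, and assuming the target table already exists with columns matching the CSV, the file has a header row with comma delimiters, and $DataCSV is a path the SQL Server service account can read, since BULK INSERT reads the file server-side):

```powershell
# Untested sketch: let SQL Server read the file itself instead of pushing rows
# over from PowerShell. Assumes [dbo].[$table] already exists with matching
# columns, and $DataCSV is readable by the SQL Server service account.
$BulkInsert = @"
BULK INSERT [dbo].[$table]
FROM '$DataCSV'
WITH (
    FIRSTROW = 2,            -- skip the CSV header row
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    TABLOCK,                 -- table lock, enables minimal logging
    BATCHSIZE = 50000
);
"@
Invoke-SqlCmd -Query $BulkInsert -ServerInstance $SQLInstance -Database $SQLDatabase `
    -Username $SQLUsername -Password $SQLPassword
```

One caveat: if those long strings contain embedded commas or quotes, a plain FIELDTERMINATOR won't cope; on SQL Server 2017 and later you can add FORMAT = 'CSV' and FIELDQUOTE = '"' to the WITH clause so quoted fields are parsed properly.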