I have a client GL file (Excel) that needs to be regularly imported to SQL Server (as tab-delimited txt).
Sample (simplified):
PayCode  PayDesc   Dept1  Dept2  Dept3
========================================
REG      Regular   00001  00001  00002
VAC      Vacation  00011  00012  00013
HOL      Holiday   00021  00022  00022
However, the number of columns (departments) may change over time, and SQL Server can't import a file with an uncertain column count. So rather than recode the SQL import each time the layout changes, I decided to "unpivot" the data prior to import, so the resulting table structure will always be static:
PayCode  PayDesc   DeptCode  GLCode
====================================
REG      Regular   Dept1     00001
REG      Regular   Dept2     00001
REG      Regular   Dept3     00002
VAC      Vacation  Dept1     00011
VAC      Vacation  Dept2     00012
VAC      Vacation  Dept3     00013
HOL      Holiday   Dept1     00021
HOL      Holiday   Dept2     00022
HOL      Holiday   Dept3     00022
I've got the script functioning, but being a PS lightweight, I thought I would solicit enhancement ideas. I feel like I'm missing good pipeline opportunities due to inexperience.
Code updated using feedback from briantist
$srcDir = "C:"
$srcTab = "Sheet1"
$srcFile = Join-Path -Path $srcDir -ChildPath "ClientGL.xlsx"
$outFile = Join-Path -Path $srcDir -ChildPath "ClientGL.txt"
$delim = "`t"
## Export from Excel to tab-delimited .txt
Write-Host "Launch Excel"
$Excel = New-Object -comobject Excel.Application
$Excel.Visible = $False
$Excel.displayalerts = $False
Write-Host "Open $srcFile"
$Workbook = $Excel.Workbooks.Open($srcFile)
$Workbook.Worksheets.Item($srcTab).Select()
Write-Host "Export $outFile as Tab-Delimited TXT"
$Workbook.SaveAs("$outFile",-4158)
Write-Host "Quit Excel"
$Excel.Quit()
If(Get-Process -Name excel -ErrorAction SilentlyContinue){Stop-Process -Name excel}
## Import data and headers
Write-Host "Import $outFile data"
$inData = Import-Csv -Path $outFile -Delimiter $delim
Write-Host "Import $outFile headers"
$inHeaders = (Get-Content -Path $outFile -TotalCount 1) -split $delim
## Build output datatable
Write-Host "Build output datatable"
$outData = New-Object System.Data.DataTable
[Void]$outData.Columns.Add("PayCode",[string])
[Void]$outData.Columns.Add("PayDesc",[string])
[Void]$outData.Columns.Add("DeptCode",[string])
[Void]$outData.Columns.Add("GLCode",[string])
## Unpivot Department columns, and add results to datatable
Write-Host "Unpivot each Dept column and add data to datatable"
for($col=2; $col -lt $inHeaders.Count; $col++)
{
    $dept = $inHeaders[$col] # current $col header name
    foreach($row in $inData)
    {
        $dr = $outData.NewRow()
        $dr.Item("PayCode") = $row.PayCode
        $dr.Item("PayDesc") = $row.PayDesc
        $dr.Item("DeptCode") = $dept # header name
        $dr.Item("GLCode") = $row.$dept # row.col value
        $outData.Rows.Add($dr)
    }
}
## Output modified file
Write-Host "Output modified $outFile"
$outData | Export-Csv -Path $outFile -Delimiter $delim -notypeinformation -Force
Write-Host "Remove quotes from file"
(Get-Content -Path $outFile) | ForEach-Object {$_ -replace '"', ""} | Out-File -FilePath $outFile -Force
1 Answer
Path Joining
$srcDir = "C:"
$srcFile = "$SrcDir\ClientGL.xlsx"
$srcTab = "Sheet1"
$outFile = "$SrcDir\ClientGL.txt"
Although it may seem unnecessary in this case, you should know that `Join-Path` exists.
$srcDir = "C:"
$srcFile = $SrcDir | Join-Path -ChildPath "ClientGL.xlsx"
$srcTab = "Sheet1"
$outFile = Join-Path $SrcDir "ClientGL.txt"
The nice thing about using this (again, unnecessary here) is that it takes care of leading and trailing path separators, so the following:
Join-Path "C:\" "\Windows"
Join-Path "C:" "Windows"
will both produce `C:\Windows`. That makes it great for user input.
Write-Host vs Write-Verbose
This can be debated, but some (myself included) take a cautious approach with `Write-Host`. The reason is that it forces the output to the screen, instead of letting the caller decide whether they want status messages.
Ultimately, you take this on a case-by-case basis; sometimes `Write-Host` is the right tool, and maybe it is for you in this case.
Using `Write-Verbose` instead, you would normally not see the output unless you call your script with `-Verbose`, and then you will see it.
To add support for this, you need to add a parameter block to your script, even if you aren't going to take any parameters.
Top of script:
[CmdletBinding()]
param()
That will automatically add support for `-Verbose`.
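For example, here's a minimal sketch of the idea applied to this script's status messages (the script filename in the comment is illustrative):

```powershell
[CmdletBinding()]
param()

# With [CmdletBinding()] in place, these messages are hidden by default...
Write-Verbose "Launch Excel"
Write-Verbose "Open ClientGL.xlsx"

# ...and shown only when the caller opts in:
#   .\MyScript.ps1 -Verbose
```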
Aliases
If(ps excel){kill -name excel}
and
(Get-Content $outFile) | % {$_ -replace '"', ""} | Out-File -FilePath $outFile -Force
`ps` is an alias for `Get-Process`.
`kill` is an alias for `Stop-Process`.
`%` is an alias for `ForEach-Object`.
I love PowerShell's aliases, but when it comes to writing a script that's meant to be re-used often or might ever be read by other people, I am of the school that one should use the full cmdlet/function name, and use full parameter names.
I use aliases when I'm using PowerShell as a shell and typing commands to run immediately, or to quickly prototype something.
For me it's a matter of clarity and readability, but this is of course personal preference (I have met at least one person who feels differently and uses aliases everywhere).
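For instance, the aliased `%` line above, expanded to full cmdlet and parameter names:

```powershell
# Same behavior, but readable without knowing the aliases:
(Get-Content -Path $outFile) | ForEach-Object { $_ -replace '"', "" } |
    Out-File -FilePath $outFile -Force
```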
Reading the first line of a file
$inHeaders = (Get-Content $outFile | Select-Object -First 1).Split("`t")
If you're using PowerShell 2.0, the above line will read every line of the file, and even though `Select-Object` will only give you back the first one, it has no way of stopping the pipeline. If your file is large, this could take a very long time.
If you're using PowerShell 3+, then the above will actually only read the first line, because `Select-Object` will stop the pipeline.
Note that `Get-Content` has a `-TotalCount` parameter though, so you can use:
$inHeaders = (Get-Content $outFile -TotalCount 1).Split("`t")
and it will read just the first line.
.Split() vs -Split
$inHeaders = (Get-Content $outFile | Select-Object -First 1).Split("`t")
Here you're using the `.Split()` method of a `[String]`. PowerShell also has a `-split` operator. The operator has a lot of nifty features that, from what I've seen, no one ever uses, because they don't know they exist or (in my case) they always forget about them when needed.
But the main difference that you probably will remember is that the operator takes a regular expression, and the method takes a string.
There's nothing wrong with either, just wanted you to be aware of the differences.
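A quick sketch of the difference and a couple of those operator features (behavior assumes PowerShell 3+):

```powershell
# The method takes a literal string:
"1.2.3".Split(".")       # -> 1, 2, 3

# The operator takes a regular expression, so '.' must be escaped:
"1.2.3" -split "\."      # -> 1, 2, 3

# Operator extra: a maximum number of substrings...
"a,b,c,d" -split ",", 2  # -> "a" and "b,c,d"

# ...and a case-sensitive variant:
"aXbxc" -csplit "X"      # -> "a" and "bxc"
```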
Accepting Parameters
$srcDir = "C:"
$srcFile = "$SrcDir\ClientGL.xlsx"
$srcTab = "Sheet1"
$outFile = "$SrcDir\ClientGL.txt"
Consider making these parameters that can be passed to the script. You could also make them have defaults that way.
Top of script:
[CmdletBinding()]
param(
    [Parameter(
        Mandatory=$false
    )]
    [String]
    $Path = "C:",

    [Parameter(
        Mandatory=$false
    )]
    [String]
    $srcTab = "Sheet1"
)
$srcFile = "$Path\ClientGL.xlsx"
$outFile = "$Path\ClientGL.txt"
Now you can call the script like so:
.\MyScript.ps1 -Path "C:\Some\Other\Path\" -srcTab "Expenses"
Now you can see why using `Join-Path` would save you some headache, because you're accepting user input!
Just to break down this `param()` block a bit:
`[Parameter()]` can take multiple comma-separated options; I've shown you `Mandatory`. Mandatory means mandatory to be supplied by the caller, not mandatory to have a value, so when you use a default value like we are here, you don't want to make the parameter mandatory. Parameters are optional by default, so this was unnecessary; I was just demonstrating it.
The entire `[Parameter()]` attribute is also optional, and could be left out.
`[String]` specifies the data type for the parameter. PowerShell will try to coerce input values into this type if possible. The data type is also optional.
`$Path`: this is the name of the parameter, and it's the only part that's mandatory. I used `$Path` instead of `$srcDir` because PowerShell has conventions for certain names to represent certain things. So generally any cmdlet or function that accepts a path will use `-Path`; same for `-ComputerName`, `-Identity`, and `-Credential`.
Also note that all of these pieces could have been on a single line. I like to separate them for clarity.
For more on these, have a look at Create a Truly Advanced PowerShell Function and the about_Functions_Advanced and about_Functions_Advanced_Parameters help topics.
- Good stuff. Since most of my code is cobbled together from what I could find online, these are helpful tips. – Matt, Jun 4, 2015 at 20:29
- Edited to add a section on accepting parameters. – briantist, Jun 4, 2015 at 20:44
- Considering converting to a generic parameterized script, just as you suggest, once I get the base code cleaned up. – Matt, Jun 4, 2015 at 20:52