I have always been a big fan of hash tables. Very
powerful things.
I had a task where I needed to match the dates from some
custom data to the Date Dimension from the NetApp OnCommand Insight Data
Warehouse.
365 days of data x 3 (there were 3 rows in my custom data
table for each date) = 1095 rows
Initially I had a where clause in my FOR loop of 1095
repeats like this:
$dateTK = ($DateDimension | Where{($_.dateYear -eq
$dateYear) -and ($_.monthNum -eq $monthNum) -and ($_.dayInMonth -eq
$dayInMonth) -and ($_.repDay -eq 1)}).tk
If you are curious, I had got Date Dimension from
running -
select * from dwh_capacity.date_dimension;
- in MySQL Workbench and exporting as a CSV. Then
importing the CSV into PowerShell as:
$DateDimension = Import-CSV
date_dimension.csv
Date Dimension in this case was say 6 years or 2190
individual dates (I was only interested in days which explains the $_.repDay -eq 1 in the PowerShell
above).
And my PowerShell program ran very slowly, which makes
sense, as for each of my 1095 loops, it had got to search through another array
of 2190 until it finds a match.
The solution to speed my program up was to create a DateTK
hash table (Date TK was what I was looking for in Date Dimension) prior to
running my 1095 loops.
This is what I did:
$DateTK_HT = @()
$DateDimension | Foreach{
[String]$DY =
$_.dateYear
[String]$DM =
$_.monthNum
[String]$DD =
$_.dayInMonth
If(!($DateTK_HT.$DY)){$DateTK_HT.$DY
= @{}}
If(!($DateTK_HT.$DY.$DM)){$DateTK_HT.$DY.$DM
= @{}}
If($_.repDay
-eq 1){
[String]$DateTK_HT.$DY.$DM.$DD
= $_.tk
}
}
Getting DateTK from my hash table is instantaneous:
[String]$dateTK = $DateTK_HT.$dateYear.$monthNum.$dayInMonth
Instead of having say 2000 loops inside each of my 1095
loops, it went down to 1 simple request from the hash table per loop.
Image: Using PowerShell Hash Tables for Instantaneous
Search
Comments
Post a Comment