Saturday, 16 May 2020

[PowerShell] Using PowerShell Hash Tables for Instantaneous Search

I have always been a big fan of hash tables. Very powerful things.

I had a task where I needed to match the dates from some custom data to the Date Dimension from the NetApp OnCommand Insight Data Warehouse.

365 days of data x 3 (there were 3 rows in my custom data table for each date) = 1095 rows

Initially I had a where clause in my FOR loop of 1095 repeats like this:


$dateTK = ($DateDimension | Where{($_.dateYear -eq $dateYear) -and ($_.monthNum -eq $monthNum) -and ($_.dayInMonth -eq $dayInMonth) -and ($_.repDay -eq 1)}).tk


If you are curious, I had got Date Dimension from running -


select * from dwh_capacity.date_dimension;


- in MySQL Workbench and exporting as a CSV. Then importing the CSV into PowerShell as:


$DateDimension = Import-CSV date_dimension.csv


Date Dimension in this case was say 6 years or 2190 individual dates (I was only interested in days which explains the $_.repDay -eq 1 in the PowerShell above).

And my PowerShell program ran very slowly, which makes sense, as for each of my 1095 loops, it had got to search through another array of 2190 until it finds a match.

The solution to speed my program up was to create a DateTK hash table (Date TK was what I was looking for in Date Dimension) prior to running my 1095 loops.

This is what I did:


$DateTK_HT = @()
$DateDimension | Foreach{
  [String]$DY = $_.dateYear
  [String]$DM = $_.monthNum
  [String]$DD = $_.dayInMonth
  If(!($DateTK_HT.$DY)){$DateTK_HT.$DY = @{}}
  If(!($DateTK_HT.$DY.$DM)){$DateTK_HT.$DY.$DM = @{}}
  If($_.repDay -eq 1){
    [String]$DateTK_HT.$DY.$DM.$DD = $_.tk
  }
}


Getting DateTK from my hash table is instantaneous:


[String]$dateTK = $DateTK_HT.$dateYear.$monthNum.$dayInMonth


Instead of having say 2000 loops inside each of my 1095 loops, it went down to 1 simple request from the hash table per loop.

Image: Using PowerShell Hash Tables for Instantaneous Search

No comments:

Post a comment