Wednesday, 25 June 2014

PowerShell Script to Get Text In Between Quotes

The following PowerShell script reads an input file and outputs the strings, sentences, or paragraphs that appear between quotation marks. It works even when the opening and closing quotation marks are on different lines. As an extra, it replaces tabs with spaces and collapses runs of spaces, so the output sentences don’t have ugly gaps in them.

It was developed for getting ‘Descriptions’ out of MIB files.

Note: The script has been formatted for Blogger by removing leading tabs.

The Script

###################
# START OF SCRIPT #
###################

####################################
# Text in between quotes extractor #
####################################

$inputDoc = Get-Content netapp.mib
# this is the input file (change as required)

New-Item output.txt -ItemType File -Force
# output.txt is our output file

$recording = 0
# This variable is:
# 0 if we're not recording, and
# 1 if we're recording between quotation marks

$justTurnedOn = 0
# This variable is 1 if we've just turned on recording
# and don't want the first quotation mark in our output,
# otherwise it is zero

[string]$extractedDescription = ""
# This string holds the text extracted so far (it may span several lines)

$spaceCount = 0
# We also count consecutive spaces since we don't want loads of spaces in the output.
# Only the first space in a run is recorded; any extra spaces are skipped.

foreach($line in $inputDoc){ # START foreach

$line = $line -replace "`t"," "
# Replacing each tab in the line with a space

$lineLength = $line.length
# Recording the line length.
# In the do loop we check each character in the line.

$i = 0
# Count starts at 0

do { ## while ($i -le $lineLength)

# IF the character is a quotation mark, and we're not recording
# start recording ($recording = 1)
# and flag that we've just started recording ($justTurnedOn = 1)

if (($line[$i] -eq '"') -and ($recording -eq 0)){
$recording = 1
$justTurnedOn = 1
}

# IF the character is a quotation mark, we're recording,
# and not just turned on recording
# stop recording ($recording = 0)
# Add to our output file the extracted string
# Reset the extracted string to ""

if (($line[$i] -eq '"') -and ($recording -eq 1) -and ($justTurnedOn -ne 1)){
$recording = 0
Add-Content output.txt "$extractedDescription"
[string]$extractedDescription = ""
}

# IF the character is a space, increase space count
# IF the character is not a space, $spaceCount = 0

if ($line[$i] -eq " "){$spaceCount++}
if ($line[$i] -ne " "){$spaceCount = 0}

# IF recording is turned on ($recording -eq 1)
# we've not just turned on recording
# and space count is less than or equal to 1
# Add the character to our extracted string

if (($recording -eq 1) -and ($justTurnedOn -ne 1) -and ($spaceCount -le 1)){
$extractedDescription += $line[$i]
}
             
$justTurnedOn = 0
# Clears the "just turned on recording" flag (sets it back to 0)

$i++
# Increment $i (the character number in the line)
      
} while ($i -le $lineLength) ## END of do      

} # END foreach($line in $inputDoc)

notepad output.txt
# Opens our output in Notepad

#################
# END OF SCRIPT #
#################
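
For comparison, here is a minimal sketch of a shorter, regex-based alternative. It is not the script above, just another way to get a similar result. The function name Get-QuotedText and the sample MIB-style text are made up for illustration. It assumes the whole input fits in memory as a single string. Unlike the script above, it also treats line breaks as whitespace and trims leading and trailing spaces from each extracted string.

```powershell
# Hypothetical helper (not part of the original script).
# Returns every piece of text found between a pair of double quotes.
function Get-QuotedText {
    param([string]$Text)

    # (?s) lets '.' match line breaks, so a quote can span several lines.
    # '.*?' is lazy, so each match stops at the next quotation mark.
    foreach ($m in [regex]::Matches($Text, '(?s)"(.*?)"')) {
        # Collapse each run of whitespace (spaces, tabs, line breaks)
        # into a single space, then trim both ends.
        ($m.Groups[1].Value -replace '\s+', ' ').Trim()
    }
}

# Made-up sample input in the style of a MIB DESCRIPTION clause
$sample = @'
sysDescr OBJECT-TYPE
    DESCRIPTION "A textual   description
        of the entity."
'@

Get-QuotedText -Text $sample
# Outputs: A textual description of the entity.
```

To try it against a real file, something like Get-QuotedText -Text (Get-Content netapp.mib -Raw) should work on PowerShell 3.0 or later, where Get-Content has the -Raw switch.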
