Monday, April 11, 2011

PowerShell Memory Usage

Hi guys,

I'm a bit of a noob to PowerShell so please don't chastise me :-) I've got some rather large log files (600 MB) that I need to process. My script essentially strips out the lines that contain "Message Received", then tokenises those lines and outputs a few of the tokens to an output file.

The logic of the script is fine (although I'm sure it could be more efficient), but the problem is that as I write lines to the output file and the file subsequently grows larger, the amount of memory that PowerShell uses also increases, to the point of memory exhaustion.

Can anyone suggest how I can stop this occurring? I thought about breaking the log up into temporary files of only, say, 10 MB each and processing those instead?

Here's my code; any help you guys could give would be fantastic :-)

Get-Date | Add-Content -Path d:\scripting\logparser\testoutput.txt


$a = Get-Content D:\scripting\logparser\importsample.txt 


foreach ($l in $a) {
    #$l | Select-String -Pattern "Message Received." | Add-Content -Path d:\scripting\logparser\testoutput.txt
    if ($l | Select-String -Pattern "Message Received." -Quiet)
    {
        #Add-Content -Path d:\scripting\logparser\testoutput.txt -value $l
        $var1,$var2,$var3,$var4,$var5,$var6,$var7,$var8,$var9,$var10,$var11,$var12,$var13,$var14,$var15,$var16,$var17,$var18,$var19,$var20 = [regex]::Split($l, '\s+')
        Add-Content -Path d:\scripting\logparser\testoutput.txt -Value "$var1 $var2 $var3 $var4 $var16 $var18"
    }
}
Get-Date | Add-Content -Path d:\scripting\logparser\testoutput.txt
From Stack Overflow:
  • You're effectively storing the entire log file in memory instead of accessing it sequentially, bit by bit.

    Assuming that your log file has some internal delimiter for each entry (maybe a newline), you'd read in one entry at a time, not keeping more in memory than absolutely necessary.

    You won't be able to rely on the built-in PowerShell stuff because it's in effect stupid.

    You'll have to forgive my code sample; my PowerShell is a bit rusty.

    $reader = New-Object System.IO.StreamReader "testoutput.txt"
    while ($null -ne ($s = $reader.ReadLine()))
    {
        # do something with '$s',
        # which would contain individual log entries.
    }
    $reader.Close()
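Put together with the question's script, the streaming version might look like this (a sketch: the paths, the "Message Received" filter, and the field positions are taken from the question; the function name is made up so the logic is easy to test):

```powershell
# Stream the log line by line; only the current line is held in memory,
# so the 600 MB file never gets loaded wholesale.
function Convert-Log($inPath, $outPath) {
    $reader = New-Object System.IO.StreamReader $inPath
    $writer = New-Object System.IO.StreamWriter $outPath
    try {
        while ($null -ne ($line = $reader.ReadLine())) {
            if ($line -match "Message Received") {
                $fields = [regex]::Split($line, '\s+')
                # Same tokens the original script emitted ($var1..$var4, $var16, $var18).
                $writer.WriteLine(($fields[0,1,2,3,15,17] -join " "))
            }
        }
    }
    finally {
        $reader.Close()
        $writer.Close()
    }
}

# Convert-Log "D:\scripting\logparser\importsample.txt" "D:\scripting\logparser\testoutput.txt"
```

Using a StreamWriter for the output also avoids reopening the output file on every Add-Content call, which speeds things up considerably.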
    
  • If you do everything in the pipe, only one object at a time (one line from the file in your case) needs to be in memory.

    Get-Content $inputFile | Where-Object { $_ -match "Message Received" } |
        ForEach-Object {
            $fields = [regex]::Split($_, '\s+') # An array is created
            Add-Content -Path $outputFile -Value ([String]::Join(" ", $fields[0,1,2,3,15,17]))
        }
    

    The $fields[0,1,2,3,15,17] creates an array containing the elements of $fields at the given indices.

    This could also be done in a single pipeline using an expression rather than a property name passed to Select-Object, but would be less clear.
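The Select-Object variant alluded to above might look like the following (a sketch: the calculated-property name "Summary" is made up, and an in-memory sample stands in for the log; replace the sample with Get-Content $inputFile to stream a real file):

```powershell
# Two sample log lines; only the first matches the filter.
$sample = @(
    "2011-04-11 10:00:00 INFO Message Received a b c d e f g h i j k l m n",
    "2011-04-11 10:00:01 INFO Heartbeat"
)

$sample |
    Where-Object { $_ -match "Message Received" } |
    # A calculated property: an expression rather than an existing property name.
    Select-Object @{ Name = "Summary"; Expression = { ([regex]::Split($_, '\s+'))[0,1,2,3,15,17] -join " " } } |
    ForEach-Object { $_.Summary }
# -> 2011-04-11 10:00:00 INFO Message k m
```

As the answer says, this is more compact but arguably less clear than the explicit ForEach-Object body.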
