Wednesday, April 13, 2011

A regular expression to retrieve the previous line in a log file

My log files contain the following:

2009-03-12T12:44:27+0000 something was logged
2009-03-12T12:45:36+0000 127.0.0.1 127.0.0.1 <auth.info> last message repeated 2 times

I can write a regular expression that retrieves the line with the "last message repeated..." statement. However, that line is meaningless without the line that precedes it.

Does anyone know of a regular expression that would retrieve both lines whenever the "last message repeated..." statement is detected?

From stackoverflow
  • I would do it this way: search for a pattern that includes two groups. The first group is a whole line; the second group is the next line, which contains the "last message repeated" text. The content of the first group is then the text you are looking for.

    Something like this (an overly simplified regex):

    \n(.*)\n(.*)last message repeated
    

    The first group now contains the line you are interested in.

    Huuuze : What would that regex look like?
    goldenmean : @David: Can u give the regex with two groups as u said?
    David Pokluda : Regexes added to the answer. Those are simple but you get the idea. They work - I verified them in Regex Buddy.
  • Edited to be a two-group matching regex. You can try it out at RegexLib.

    Less than optimized, but this:

    ([\r\n].*?)(?:=?\r|\n)(.*?(?:last message repeated).*)
    

    Should work to get results out of something like this:

    2009-03-12T12:44:27+0000 something1 was logged
    2009-03-12T12:44:27+0000 something2 was logged
    2009-03-12T12:45:36+0000 127.0.0.1 127.0.0.1 <auth.info> last message repeated 2 times
    2009-03-12T12:44:27+0000 something3 was logged
    2009-03-12T12:44:27+0000 something4 was logged
    2009-03-12T12:44:27+0000 something5 was logged
    2009-03-12T12:45:36+0000 127.0.0.1 127.0.0.1 <auth.info> last message repeated 2 times
    

    Resulting in:

    Matches
    First Match, First Group: 2009-03-12T12:44:27+0000 something2 was logged
    First Match, Second Group: 2009-03-12T12:45:36+0000 127.0.0.1 127.0.0.1 <auth.info> last message repeated 2 times
    Second Match, First Group: 2009-03-12T12:44:27+0000 something5 was logged 
    Second Match, Second Group: 2009-03-12T12:45:36+0000 127.0.0.1 127.0.0.1 <auth.info> last message repeated 2 times
    
  • Does it have to be a regex? grep lets you print context lines before and after a match (the -B NUM and -A NUM options).

    Huuuze : Good answer, but, yes, it has to be a regex.
  • The pattern ^.*$ matches a whole line, provided multiline mode is enabled so that ^ and $ match at line boundaries. Translation: Start Of Line, followed by any number of non-newline characters, followed by End Of Line. So perhaps you can search for "any line, followed by" (the pattern you have there).
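
The two-group idea and the line-anchored pattern from the answers above can be combined. The following is only a minimal sketch, assuming Python's re module, "\n" line endings, and a log small enough to hold in memory; the names LOG, PAIR and repeated_pairs are made up for illustration and do not come from any of the answers.

```python
import re

# Sample data copied from the answer above.
# Assumption: lines are separated by "\n" (not "\r\n").
LOG = (
    "2009-03-12T12:44:27+0000 something1 was logged\n"
    "2009-03-12T12:44:27+0000 something2 was logged\n"
    "2009-03-12T12:45:36+0000 127.0.0.1 127.0.0.1 <auth.info> last message repeated 2 times\n"
    "2009-03-12T12:44:27+0000 something3 was logged\n"
    "2009-03-12T12:44:27+0000 something4 was logged\n"
    "2009-03-12T12:44:27+0000 something5 was logged\n"
    "2009-03-12T12:45:36+0000 127.0.0.1 127.0.0.1 <auth.info> last message repeated 2 times\n"
)

# re.MULTILINE makes ^ and $ match at every line boundary.
# Group 1: the preceding line.
# Group 2: the line containing "last message repeated".
PAIR = re.compile(r"^(.*)\n(.*last message repeated.*)$", re.MULTILINE)


def repeated_pairs(text):
    """Return (previous_line, repeated_line) tuples found in text."""
    return PAIR.findall(text)


for previous, repeated in repeated_pairs(LOG):
    print(previous)
    print(repeated)
```

Because the pattern is anchored with ^ rather than a leading \n, it also finds a pair that starts on the very first line of the file. Matches do not overlap, so if two "last message repeated" lines follow each other directly, only the first one is reported with its preceding line.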
