log_parser.py
author Tero Marttila <terom@fixme.fi>
Wed, 11 Feb 2009 22:24:55 +0200
changeset 111 95c0c49d76aa
parent 110 37e67ec434f3
permissions -rw-r--r--
implement prev/next_date in LogSource
"""
    Parse log data into log_events
"""

import re
import datetime

from log_line import LogTypes, LogLine

class LogParseError (Exception) :
    """
        Parsing some line failed
    """

    def __init__ (self, line, offset, message) :
        super(LogParseError, self).__init__("%r@%s: %s" % (line, offset, message))
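
As a rough illustration of how the error text above is assembled, the sketch below re-declares the exception on its own and raises it for a made-up line. The sample line, the offset and the `describe_failure` helper are hypothetical and not part of log_parser.py; the quoting of the line in the output assumes Python 3 string reprs.

```python
class LogParseError(Exception):
    """Parsing some line failed (standalone copy, for illustration only)."""

    def __init__(self, line, offset, message):
        # same "%r@%s: %s" layout as above: repr of the line, its offset, the reason
        super(LogParseError, self).__init__("%r@%s: %s" % (line, offset, message))


def describe_failure(line, offset):
    """Hypothetical helper: raise for `line` and return the formatted error text."""
    try:
        raise LogParseError(line, offset, "Line did not match any type")
    except LogParseError as error:
        return str(error)


print(describe_failure("garbage", 12))
# prints: 'garbage'@12: Line did not match any type
```

Using `%r` for the line keeps stray whitespace and control characters visible in the message, which helps when tracking down which logfile line failed.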

class LogParser (object) :
    """
        Abstract interface
    """

    def __init__ (self, tz, timestamp_fmt="%H:%M:%S") :
        """
            Set up the parser to use the given format for line timestamps, which are in the given timezone
        """

        self.tz = tz
        self.timestamp_fmt = timestamp_fmt

    def parse_lines (self, channel, lines, date=None, starting_offset=None) :
        """
            Parse the given (iterable) lines of unicode text, without trailing newlines, into LogEvents.

            Channel is the LogChannel that these lines belong to.

            Offset is the starting offset, and may be None to not use it.

            Giving date lets the parser build full timestamps. Without it, and unless the line timestamps
            carry full date information, event timestamps will have a date component of 1900/1/1.
        """

        # abstract method, subclasses must override this
        raise NotImplementedError("parse_lines must be implemented by a subclass")
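
The date handling described in the parse_lines docstring can be sketched with the standard library alone. This is only an illustration under assumptions: `build_timestamp` is a hypothetical helper, not a method of LogParser, and timezone localization is left out.

```python
import datetime


def build_timestamp(text, timestamp_fmt="%H:%M:%S", date=None):
    """Hypothetical helper: parse a time-only stamp, optionally pinning it to `date`."""
    # strptime fills the missing date fields with 1900-01-01
    dt = datetime.datetime.strptime(text, timestamp_fmt)

    if date is not None:
        # replace the placeholder date with the real one
        dt = dt.replace(year=date.year, month=date.month, day=date.day)

    return dt


print(build_timestamp("22:24:55"))
# prints: 1900-01-01 22:24:55
print(build_timestamp("22:24:55", date=datetime.date(2009, 2, 11)))
# prints: 2009-02-11 22:24:55
```

The first call shows where the 1900/1/1 date component comes from; the second shows the effect of passing a date.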

class IrssiParser (LogParser) :
    """
        A parser for irssi logfiles
    """

    # timestamp prefix, followed by optional whitespace
    _TS = r'(?P<timestamp>[a-zA-Z0-9: ]+[a-zA-Z0-9])\s*'

    # subexpression parts
    _NICK = r'(?P<nickname>.+?)'
    _NICK2 = r'(?P<nickname2>.+?)'
    _TARGET = r'(?P<target>.+?)'
    _CHAN = r'(?P<channel>.+?)'
    _CHAN2 = r'(?P<channel2>.+?)'
    _USERHOST = r'(?P<username>.*?)@(?P<hostname>.*?)'
    _MSG = r'(?P<message>.*)'
    _SRV1 = r'(?P<server1>.+?)'
    _SRV2 = r'(?P<server2>.+?)'

    # regular expressions for matching lines, by type
    # (every fragment containing a backslash is a raw string, so the regex escapes reach re.compile unchanged)
    TYPE_EXPRS = (
        (   LogTypes.LOG_OPEN,      r'--- Log opened (?P<datetime>.+)' ),
        (   LogTypes.LOG_CLOSE,     r'--- Log closed (?P<datetime>.+)' ),
        (   LogTypes.MSG,           _TS + r'<(?P<flags>.)' + _NICK + r'> ' + _MSG ),
        (   LogTypes.NOTICE,        _TS + r'-' + _NICK + r':' + _CHAN + r'- ' + _MSG ),
        (   LogTypes.ACTION,        _TS + r'\* ' + _NICK + r' ' + _MSG ),
        (   LogTypes.JOIN,          _TS + r'-!- ' + _NICK + r' \[' + _USERHOST + r'\] has joined ' + _CHAN ),
        (   LogTypes.PART,          _TS + r'-!- ' + _NICK + r' \[' + _USERHOST + r'\] has left ' + _CHAN + r' \[(?P<message>.*?)\]' ),
        (   LogTypes.KICK,          _TS + r'-!- ' + _TARGET + r' was kicked from ' + _CHAN + r' by ' + _NICK + r' \[(?P<message>.*?)\]' ),
        # XXX: use hostname instead of nickname for ServerMode
        (   LogTypes.MODE,          _TS + r'-!- (mode|ServerMode)/' + _CHAN + r' \[(?P<mode>.+?)\] by (?P<nickname>\S+)' ),
        (   LogTypes.NICK,          _TS + r'-!- ' + _NICK + r' is now known as (?P<target>\S+)' ),
        (   LogTypes.QUIT,          _TS + r'-!- ' + _NICK + r' \[' + _USERHOST + r'\] has quit \[(?P<message>.*?)\]' ),
        (   LogTypes.TOPIC,         _TS + r'-!- (' + _NICK + r' changed the topic of ' + _CHAN + r' to: (?P<topic>.*)|Topic unset by ' + _NICK2 + r' on ' + _CHAN2 + r')' ),

        (   LogTypes.SELF_NOTICE,   _TS + r'\[notice\(' + _CHAN + r'\)\] ' + _MSG ),
        (   LogTypes.SELF_NICK,     _TS + r"-!- You're now known as (?P<target>\S+)" ),

        (   LogTypes.NETSPLIT_START,    _TS + r'-!- Netsplit ' + _SRV1 + r' <-> ' + _SRV2 + r' quits: (?P<nick_list>[^(]+)( \(\+(?P<count>\d+) more,\S+\))?' ),
        (   LogTypes.NETSPLIT_END,      _TS + r'-!- Netsplit over, joins: (?P<nick_list>[^(]+)( \(\+(?P<count>\d+) more\))?' ),

        (   'DAY_CHANGED',          r'--- Day changed (?P<date>.+)' ),
    )

    # precompile
    TYPE_REGEXES = [(type, re.compile(expr)) for type, expr in TYPE_EXPRS]

    def parse_line (self, channel, line, date, offset=None) :
        """
            Parse a single line and return the resulting LogLine, or None if the line should be ignored.

            Uses self.TYPE_REGEXES to do the matching
        """

        # empty line
        if not line :
            return

        # look for match
        match = type = None

        # test each type
        for type, regex in self.TYPE_REGEXES :
            # attempt to match
            match = regex.match(line)

            # found, break
            if match :
                break

        # no match found?
        if not match :
            raise LogParseError(line, offset, "Line did not match any type")

        # match groups
        groups = match.groupdict(None)

        # parse timestamp
        if 'datetime' in groups :
            # parse datetime using default asctime() format
            dt = datetime.datetime.strptime(groups['datetime'], '%a %b %d %H:%M:%S %Y')

        elif 'timestamp' in groups :
            # parse timestamp into naive datetime
            dt = datetime.datetime.strptime(groups['timestamp'], self.timestamp_fmt)

            # override date?
            if date :
                dt = dt.replace(year=date.year, month=date.month, day=date.day)

        elif 'date' in groups :
            # parse date-only datetime
            dt = datetime.datetime.strptime(groups['date'], '%a %b %d %Y')

        else :
            # no timestamp !?
            raise LogParseError(line, offset, "No timestamp")

        # now localize with timezone
        dtz = self.tz.localize(dt)

        # channel, currently unused
        channel_name = (groups.get('channel') or groups.get('channel2'))

        # source
        if 'server1' in groups :
            source = (None, None, groups.get('server1'), None)

        else :
            source = (
                groups.get('nickname') or groups.get('nickname2'),
                groups.get('username'),
                groups.get('hostname'),
                groups.get('flags'),
            )

        # target
        if 'server2' in groups :
            target = groups.get('server2')

        else :
            target = groups.get('target')

        # data
        if 'message' in groups :
            data = groups['message']

        elif 'mode' in groups :
            data = groups['mode']

        elif 'topic' in groups :
            data = groups['topic']

        elif 'nick_list' in groups :
            # split into components (named nicks to avoid shadowing the builtin list)
            nicks = groups['nick_list'].split(', ')

            # additional count?
            if 'count' in groups and groups['count'] :
                nicks.append('+%d' % int(groups['count']))

            # join
            data = ' '.join(nicks)

        else :
            data = None
97
6165f1ba458d implement parser/formatter netsplits and day-change
Tero Marttila <terom@fixme.fi>
parents: 92
diff changeset
   186
        
6165f1ba458d implement parser/formatter netsplits and day-change
Tero Marttila <terom@fixme.fi>
parents: 92
diff changeset
   187
        # custom types?
6165f1ba458d implement parser/formatter netsplits and day-change
Tero Marttila <terom@fixme.fi>
parents: 92
diff changeset
   188
        if type == 'DAY_CHANGED' :
6165f1ba458d implement parser/formatter netsplits and day-change
Tero Marttila <terom@fixme.fi>
parents: 92
diff changeset
   189
            # new date
6165f1ba458d implement parser/formatter netsplits and day-change
Tero Marttila <terom@fixme.fi>
parents: 92
diff changeset
   190
            date = dtz
109
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   191
        
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   192
        else :
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   193
            # build+return (date, LogLine)
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   194
            return date, LogLine(channel, offset, type, dtz, source, target, data)
86
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   195
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   196
    def parse_lines (self, channel, lines, date=None, starting_offset=None) :
50
f13cf27a360b implement more LogSource features (logs for date, cleanup last_logs), implement irssi parser, formatter, other misc. stuff
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   197
        """
f13cf27a360b implement more LogSource features (logs for date, cleanup last_logs), implement irssi parser, formatter, other misc. stuff
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   198
            Parse the given lines, yielding a LogLine for each line that parses to an event.
f13cf27a360b implement more LogSource features (logs for date, cleanup last_logs), implement irssi parser, formatter, other misc. stuff
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   199
        """
65
8b50694f841e improve search further
Tero Marttila <terom@fixme.fi>
parents: 64
diff changeset
   200
64
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents: 50
diff changeset
   201
        for offset, line in enumerate(lines) :
83
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   202
            # offset?
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   203
            if starting_offset is not None :
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   204
                offset = starting_offset + offset
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   205
50
f13cf27a360b implement more LogSource features (logs for date, cleanup last_logs), implement irssi parser, formatter, other misc. stuff
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   206
            else :
83
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   207
                offset = None
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   208
            
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   209
            # try and parse
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   210
            try :
109
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   211
                # get None or (date, line)
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   212
                line_info = self.parse_line(channel, line, date, offset)
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   213
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   214
            # pass LogParseErrors through unchanged
86
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   215
            except LogParseError :
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   216
                raise
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   217
            
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   218
            # wrap other errors as LogParseError
83
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   219
            except Exception as e :
86
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   220
                raise LogParseError(line, offset, "Parsing line failed: %s" % e)
83
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   221
            
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   222
            else :
109
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   223
                # nothing?
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   224
                if not line_info :
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   225
                    continue
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   226
                
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   227
                # unpack, update date
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   228
                date, line = line_info
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   229
                
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   230
                # yield
ca82d0fee336 fix handling of custom types by parser/formatter
Tero Marttila <terom@fixme.fi>
parents: 103
diff changeset
   231
                yield line
64
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents: 50
diff changeset
   232
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents: 50
diff changeset
   233
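The offset tracking and error wrapping in parse_lines can be tried in isolation with the minimal sketch below. It is not the module's actual implementation. It is written for Python 3, whereas the file above is Python 2. The LogParseError class here is a hypothetical stand-in for the one in this module, and toy_parse_line with its tuple result is invented for illustration. The sketch also assumes that a starting_offset of 0 counts as a real offset.

```python
class LogParseError(Exception):
    """Hypothetical stand-in for the module's LogParseError."""

    def __init__(self, line, offset, message):
        super().__init__("%r@%s: %s" % (line, offset, message))
        self.line = line
        self.offset = offset


def parse_lines(parse_line, lines, starting_offset=None):
    """Yield parse results, skipping lines for which parse_line returns None."""
    for index, line in enumerate(lines):
        # only track offsets when the caller supplied a starting point
        if starting_offset is not None:
            offset = starting_offset + index
        else:
            offset = None

        try:
            result = parse_line(line, offset)
        except LogParseError:
            # already the right type, pass it through unchanged
            raise
        except Exception as e:
            # wrap anything else so callers only need to handle one error type
            raise LogParseError(line, offset, "Parsing line failed: %s" % e)

        if result is None:
            continue

        yield result


def toy_parse_line(line, offset):
    """Hypothetical parser for 'HH:MM text' lines; blank lines yield nothing."""
    if not line.strip():
        return None

    time, _, text = line.partition(' ')
    hour, minute = time.split(':')
    return (offset, int(hour), int(minute), text)
```

Calling list(parse_lines(toy_parse_line, ["12:34 hello", "", "12:35 world"], starting_offset=10)) skips the blank line but still counts it towards the offset, so the second event carries offset 12. A line without a timestamp raises LogParseError carrying the raw line and its offset.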