log_parser.py
author Tero Marttila <terom@fixme.fi>
Wed, 11 Feb 2009 04:19:10 +0200
changeset 104 34c65a8c8b94
parent 103 0e829e6275dc
child 109 ca82d0fee336
permissions -rw-r--r--
split scripts/search-index options into groups
"""
    Parse log data into log_events
"""

import re
import datetime

from log_line import LogTypes, LogLine

class LogParseError (Exception) :
    """
        Parsing some line failed
    """

    def __init__ (self, line, offset, message) :
        super(LogParseError, self).__init__("%r@%s: %s" % (line, offset, message))
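
As a rough illustration of the message this constructor builds, the sketch below repeats the exception class on its own and raises it once. The sample line and the offset 42 are made-up values for the example, not data from a real log.

```python
class LogParseError(Exception):
    """Parsing some line failed."""

    def __init__(self, line, offset, message):
        # %r renders the raw line with quotes and escapes, so stray
        # whitespace or control characters stay visible in the error text
        super(LogParseError, self).__init__("%r@%s: %s" % (line, offset, message))


# hypothetical input: a line that matches no known type, at offset 42
try:
    raise LogParseError("garbage line", 42, "Line did not match any type")
except LogParseError as err:
    error_text = str(err)
```

The resulting text has the shape `<repr of line>@<offset>: <message>`, which should be enough to locate the offending line in the logfile.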

class LogParser (object) :
    """
        Abstract interface
    """

    def __init__ (self, tz, timestamp_fmt="%H:%M:%S") :
        """
            Set up the parser to use the given format for line timestamps, which are in the given timezone
        """

        self.tz = tz
        self.timestamp_fmt = timestamp_fmt

    def parse_lines (self, channel, lines, date=None, starting_offset=None) :
        """
            Parse the given iterable of unicode text lines (no trailing newlines) into log events.

            channel is the LogChannel that these lines belong to.

            starting_offset is the starting offset, and may be None to not use it.

            Giving date lets the parser build full timestamps. Otherwise, unless the line timestamps carry full date
            information, event timestamps will have a date component of 1900/1/1.
        """

        # abstract method, subclasses must implement this
        raise NotImplementedError()
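
The 1900/1/1 date mentioned in the docstring above comes from how `datetime.datetime.strptime` fills in fields that the format string does not cover. A minimal sketch, assuming the default `"%H:%M:%S"` format and a made-up log date (`log_date` is a hypothetical name used only here):

```python
import datetime

# time-only format, the default timestamp_fmt of LogParser
timestamp_fmt = "%H:%M:%S"

# strptime fills the missing date fields with 1900-01-01
dt = datetime.datetime.strptime("04:19:10", timestamp_fmt)

# hypothetical date of the logfile being parsed
log_date = datetime.date(2009, 2, 11)

# overriding year/month/day yields a full (still naive) timestamp
full_dt = dt.replace(year=log_date.year, month=log_date.month, day=log_date.day)
```

The result is still a naive datetime; attaching the timezone is a separate step that this sketch leaves out.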

class IrssiParser (LogParser) :
    """
        A parser for irssi logfiles
    """

    # subexpression parts
    _TS = r'(?P<timestamp>\S+)'
    _NICK = r'(?P<nickname>.+?)'
    _NICK2 = r'(?P<nickname2>.+?)'
    _TARGET = r'(?P<target>.+?)'
    _CHAN = r'(?P<channel>.+?)'
    _CHAN2 = r'(?P<channel2>.+?)'
    _USERHOST = r'(?P<username>.*?)@(?P<hostname>.*?)'
    _MSG = r'(?P<message>.*)'
    _SRV1 = r'(?P<server1>.+?)'
    _SRV2 = r'(?P<server2>.+?)'

    # regular expressions for matching lines, by type
    # every fragment containing a backslash is a raw string, so that regex escapes like \[ or \S reach re.compile unchanged
    TYPE_EXPRS = (
        (   LogTypes.LOG_OPEN,      r'--- Log opened (?P<datetime>.+)'                              ),
        (   LogTypes.LOG_CLOSE,     r'--- Log closed (?P<datetime>.+)'                              ),
        (   LogTypes.MSG,           _TS + r' <(?P<flags>.)' + _NICK + '> ' + _MSG                   ),
        (   LogTypes.NOTICE,        _TS + r' -' + _NICK + ':' + _CHAN + '- ' + _MSG                 ),
        (   LogTypes.ACTION,        _TS + r'  \* ' + _NICK + ' ' + _MSG                             ),
        (   LogTypes.JOIN,          _TS + r' -!- ' + _NICK + r' \[' + _USERHOST + r'\] has joined ' + _CHAN                             ),
        (   LogTypes.PART,          _TS + r' -!- ' + _NICK + r' \[' + _USERHOST + r'\] has left ' + _CHAN + r' \[(?P<message>.*?)\]'    ),
        (   LogTypes.KICK,          _TS + r' -!- ' + _TARGET + ' was kicked from ' + _CHAN + ' by ' + _NICK + r' \[(?P<message>.*?)\]'  ),
        # XXX: use hostname instead of nickname for ServerMode
        (   LogTypes.MODE,          _TS + r' -!- (mode|ServerMode)/' + _CHAN + r' \[(?P<mode>.+?)\] by (?P<nickname>\S+)'               ),
        (   LogTypes.NICK,          _TS + r' -!- ' + _NICK + r' is now known as (?P<target>\S+)'                                        ),
        (   LogTypes.QUIT,          _TS + r' -!- ' + _NICK + r' \[' + _USERHOST + r'\] has quit \[(?P<message>.*?)\]'                   ),
        (   LogTypes.TOPIC,         _TS + r' -!- (' + _NICK + ' changed the topic of ' + _CHAN + ' to: (?P<topic>.*)|Topic unset by ' + _NICK2 + ' on ' + _CHAN2 + ')'    ),

        (   LogTypes.SELF_NOTICE,   _TS + r' \[notice\(' + _CHAN + r'\)\] ' + _MSG                  ),
        (   LogTypes.SELF_NICK,     _TS + r' -!- You\'re now known as (?P<target>\S+)'              ),

        (   LogTypes.NETSPLIT_START,    _TS + r' -!- Netsplit ' + _SRV1 + ' <-> ' + _SRV2 + r' quits: (?P<nick_list>[^(]+)( \(\+(?P<count>\d+) more,\S+\))?'),
        (   LogTypes.NETSPLIT_END,      _TS + r' -!- Netsplit over, joins: (?P<nick_list>[^(]+)( \(\+(?P<count>\d+) more\))?'              ),

        (   'DAY_CHANGED',          r'--- Day changed (?P<date>.+)'                                 ),
    )

    # precompile
    TYPE_REGEXES = [(type, re.compile(expr)) for type, expr in TYPE_EXPRS]

    def parse_line (self, channel, line, date, offset=None) :
        """
            Parse a single line, and return a (date, LogLine) tuple, or None to ignore the line.

            The returned date is the given date, unless the line is a day-change marker, in which case it is the new date.

            Uses self.TYPE_REGEXES to do the matching
        """

        # empty line
        if not line :
            return

        # look for match
        match = type = None

        # test each type
        for type, regex in self.TYPE_REGEXES :
            # attempt to match
            match = regex.match(line)

            # found, break
            if match :
                break

        # no match found?
        if not match :
            raise LogParseError(line, offset, "Line did not match any type")

        # match groups
        groups = match.groupdict(None)

        # parse timestamp
        if 'datetime' in groups :
            # parse datetime using default asctime() format
            dt = datetime.datetime.strptime(groups['datetime'], '%a %b %d %H:%M:%S %Y')

        elif 'timestamp' in groups :
            # parse timestamp into naive datetime
            dt = datetime.datetime.strptime(groups['timestamp'], self.timestamp_fmt)

            # override date?
            if date :
                dt = dt.replace(year=date.year, month=date.month, day=date.day)

        elif 'date' in groups :
            # parse date-only datetime
            dt = datetime.datetime.strptime(groups['date'], '%a %b %d %Y')

        else :
            # no timestamp !?
            raise LogParseError(line, offset, "No timestamp")

        # now localize with timezone
        dtz = self.tz.localize(dt)

        # channel, currently unused
        channel_name = (groups.get('channel') or groups.get('channel2'))

        # source, as a (nickname, username, hostname, flags) tuple
        if 'server1' in groups :
            # netsplit: the first server goes into the hostname slot
            source = (None, None, groups.get('server1'), None)

        else :
            source = (groups.get('nickname') or groups.get('nickname2'), groups.get('username'), groups.get('hostname'), groups.get('flags'))

        # target
        if 'server2' in groups :
            target = groups.get('server2')

        else :
            target = groups.get('target')

        # data
        if 'message' in groups :
            data = groups['message']

        elif 'mode' in groups :
            data = groups['mode']

        elif 'topic' in groups :
            data = groups['topic']

        elif 'nick_list' in groups :
            # split into components
            nicks = groups['nick_list'].split(', ')

            # additional count?
            if 'count' in groups and groups['count'] :
                nicks.append('+%d' % int(groups['count']))

            # join
            data = ' '.join(nicks)

        else :
            data = None

        # custom types?
6165f1ba458d implement parser/formatter netsplits and day-change
Tero Marttila <terom@fixme.fi>
parents: 92
diff changeset
   186
        if type == 'DAY_CHANGED' :
6165f1ba458d implement parser/formatter netsplits and day-change
Tero Marttila <terom@fixme.fi>
parents: 92
diff changeset
   187
            # new date
6165f1ba458d implement parser/formatter netsplits and day-change
Tero Marttila <terom@fixme.fi>
parents: 92
diff changeset
   188
            date = dtz
86
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   189
97
6165f1ba458d implement parser/formatter netsplits and day-change
Tero Marttila <terom@fixme.fi>
parents: 92
diff changeset
   190
        # build+return (date, LogLine)
6165f1ba458d implement parser/formatter netsplits and day-change
Tero Marttila <terom@fixme.fi>
parents: 92
diff changeset
   191
        return date, LogLine(channel, offset, type, dtz, source, target, data)
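
The nick_list branch above can be sketched on its own. This is only an illustration, assuming `groups` is a plain dict of named regex groups (for example from `match.groupdict()`); the helper name `format_nick_list` is hypothetical and not part of log_parser.py.

```python
def format_nick_list(groups):
    """Hypothetical helper mirroring the nick_list branch of parse_line."""
    # split the comma-separated nick list into individual nicks
    nicks = groups['nick_list'].split(', ')

    # append "+N" when the log line reported N additional nicks
    if groups.get('count'):
        nicks.append('+%d' % int(groups['count']))

    # join into one space-separated data string
    return ' '.join(nicks)


print(format_nick_list({'nick_list': 'alice, bob', 'count': '3'}))
print(format_nick_list({'nick_list': 'alice', 'count': None}))
```

With a count present the result ends in a `+N` token; without one, only the nicks are joined.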
86
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   192
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   193
    def parse_lines (self, channel, lines, date=None, starting_offset=None) :
50
f13cf27a360b implement more LogSource features (logs for date, cleanup last_logs), implement irssi parser, formatter, other misc. stuff
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   194
        """
f13cf27a360b implement more LogSource features (logs for date, cleanup last_logs), implement irssi parser, formatter, other misc. stuff
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   195
            Parse the given lines, yielding LogLine objects; errors are raised as LogParseError.
f13cf27a360b implement more LogSource features (logs for date, cleanup last_logs), implement irssi parser, formatter, other misc. stuff
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   196
        """
65
8b50694f841e improve search further
Tero Marttila <terom@fixme.fi>
parents: 64
diff changeset
   197
64
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents: 50
diff changeset
   198
        for offset, line in enumerate(lines) :
83
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   199
            # offset?
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   200
            if starting_offset is not None :
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   201
                offset = starting_offset + offset
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   202
50
f13cf27a360b implement more LogSource features (logs for date, cleanup last_logs), implement irssi parser, formatter, other misc. stuff
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   203
            else :
83
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   204
                offset = None
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   205
            
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   206
            # try and parse
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   207
            try :
97
6165f1ba458d implement parser/formatter netsplits and day-change
Tero Marttila <terom@fixme.fi>
parents: 92
diff changeset
   208
                # update date as needed
6165f1ba458d implement parser/formatter netsplits and day-change
Tero Marttila <terom@fixme.fi>
parents: 92
diff changeset
   209
                date, line = self.parse_line(channel, line, date, offset)
86
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   210
            
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   211
            # pass LogParseErrors through unchanged
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   212
            except LogParseError :
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   213
                raise
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   214
            
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   215
            # wrap other errors as LogParseError
83
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   216
            except Exception as e :
86
645cf9c4441e implement full parser+formatter for irssi
Tero Marttila <terom@fixme.fi>
parents: 83
diff changeset
   217
                raise LogParseError(line, offset, "Parsing line failed: %s" % e)
83
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   218
            
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   219
            else :
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   220
                # yield unless None
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   221
                if line :
a34e9f56ddda improve parser resilience, improve get_month_days, add 'Channel' item to general menu
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   222
                    yield line
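
The offset handling and error wrapping in parse_lines can be tried in isolation. The following is a minimal sketch under stated assumptions: the `LogParseError` class here is a stand-in whose constructor signature is guessed from the call above, the parser is passed in as a plain function, and `toy_parse` is a hypothetical parser used only for the demonstration. The sketch checks `is not None` so that a starting offset of 0 still counts as an offset.

```python
class LogParseError(Exception):
    """Stand-in for the real exception; the constructor signature is assumed."""
    def __init__(self, line, offset, message):
        Exception.__init__(self, message)
        self.line = line
        self.offset = offset


def parse_lines(parse_line, lines, starting_offset=None):
    """Yield parsed results, wrapping unexpected errors as LogParseError."""
    for index, line in enumerate(lines):
        # compute an absolute offset only when a starting offset is known
        if starting_offset is not None:
            offset = starting_offset + index
        else:
            offset = None

        try:
            parsed = parse_line(line, offset)
        except LogParseError:
            # already the right type, re-raise untouched
            raise
        except Exception as e:
            # wrap anything else, keeping the raw line and its offset
            raise LogParseError(line, offset, "Parsing line failed: %s" % e)
        else:
            # skip lines that parse to nothing
            if parsed:
                yield parsed


def toy_parse(line, offset):
    # hypothetical parser: blank lines give None, "bad" raises ValueError
    if not line:
        return None
    if line == 'bad':
        raise ValueError("unrecognized line")
    return (offset, line.upper())


print(list(parse_lines(toy_parse, ['hi', '', 'yo'], starting_offset=10)))
```

Because the function is a generator, an error on a later line surfaces only once iteration reaches that line; results yielded before it have already been delivered to the caller.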
64
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents: 50
diff changeset
   223
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents: 50
diff changeset
   224