log_search.py
author Tero Marttila <terom@fixme.fi>
Thu, 12 Feb 2009 22:34:54 +0200
changeset 119 df859bfdd3be
parent 118 f530c158aa07
child 121 86aebc9cb60b
permissions -rw-r--r--
add version string
"""
    Full-text searching of logs
"""

import datetime, calendar, pytz
import os.path

import HyperEstraier as hype

import log_line, utils, config

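# Illustrative sketch, not part of the original module.
# The documents stored by LogSearchIndex carry an '@uri' of the form
# channel/date/line and a 'timestamp' attribute holding a UTC timestamp.
# This is a minimal sketch of how those two values could be derived from a
# timezone-aware datetime. The names sketch_utc_timestamp and sketch_doc_keys
# are hypothetical, and the real utils.to_utc_timestamp lives in another
# module and may be implemented differently.

import calendar

def sketch_utc_timestamp (dt) :
    """
        Whole seconds since the UNIX epoch for a timezone-aware datetime
    """

    # utctimetuple() normalizes an aware datetime to UTC, timegm() reads that tuple as UTC
    return calendar.timegm(dt.utctimetuple())

def sketch_doc_keys (channel_id, dt, offset) :
    """
        Return the (@uri, timestamp) attribute values for one log line, both as str
    """

    # the date part uses the datetime's own (local) date, as insert_line does
    uri = "%s/%s/%d" % (channel_id, dt.date().strftime('%Y-%m-%d'), offset)

    return uri, str(sketch_utc_timestamp(dt))
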
class LogSearchError (Exception) :
    """
        General search error
    """

    pass

class NoResultsFound (LogSearchError) :
    """
        No results found
    """

    pass

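# Illustrative sketch, not part of the original module.
# LogSearchIndex.__init__ accepts a mode string made of one base char
# (r/w/a/c) followed by optional modifiers ('trunc', '+', '?'). The helper
# below sketches that decomposition, assuming the modifiers may be combined
# freely, as the "Additional chars" wording of the docstring suggests. The
# flag names are plain strings standing in for the hype.Database.DB*
# constants so that this runs without HyperEstraier; sketch_parse_mode and
# the two tables are hypothetical names.

_SKETCH_BASE_MODES = {
    'r':    ('DBREADER', ),
    'w':    ('DBWRITER', 'DBCREAT'),
    'a':    ('DBWRITER', ),
    'c':    ('DBCREAT', ),
}

_SKETCH_MODIFIERS = (
    ('?',       'DBLCKNB'),
    ('+',       'DBREADER'),
    ('trunc',   'DBTRUNC'),
)

def sketch_parse_mode (mode) :
    """
        Return the set of symbolic flag names selected by the given mode string
    """

    if not mode or mode[0] not in _SKETCH_BASE_MODES :
        raise ValueError("Unknown index mode: %r" % (mode, ))

    flags = set(_SKETCH_BASE_MODES[mode[0]])

    # modifiers are looked up in the remainder of the string, each one independently
    rest = mode[1:]

    for token, flag in _SKETCH_MODIFIERS :
        if token in rest :
            flags.add(flag)

    return flags
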
class LogSearchIndex (object) :
    """
        An index on the logs for a group of channels.

        This uses Hyper Estraier to handle searching, whereby each log line is a document (yes, I have a powerful server).

        These log documents have the following attributes:
            @uri                - channel/date/line
            channel             - channel code
            type                - the LogType id
            timestamp           - UTC timestamp
            source_nickname     - source nickname
            source_username     - source username
            source_hostname     - source hostname
            source_chanflags    - source channel flags
            target_nickname     - target nickname

        Each document then has a single line of data, which is the log data message
    """

    def __init__ (self, channels, path, mode='r') :
        """
            Open the database at the given path, with the given mode:
                First char:
                    r       - read, error if not exists
                    w       - write, create if not exists
                    a       - write, error if not exists
                    c       - create, error if exists

                Additional chars:
                    trunc   - truncate if exists
                    +       - read as well as write
                    ?       - non-blocking lock open, i.e. it fails if already open

            Channels is the ChannelList.
        """

        # store
        self.channels = channels
        self.path = path
        self.mode = mode

        # check it does not already exist? only the first char selects the base mode
        if mode.startswith('c') and os.path.exists(path) :
            raise LogSearchError("Index already exists: %s" % (path, ))

        # mapping of { mode -> flags }
        mode_to_flag = {
            'r':    hype.Database.DBREADER,
            'w':    hype.Database.DBWRITER | hype.Database.DBCREAT,
            'a':    hype.Database.DBWRITER,
            'c':    hype.Database.DBCREAT,
        }

        # flags to use, standard modes
        flags = mode_to_flag[mode[0]]

        # mode-flags, each additional char applies independently of the others
        if '?' in mode :
            # non-blocking locking
            flags |= hype.Database.DBLCKNB

        if '+' in mode :
            # read
            flags |= hype.Database.DBREADER

        if 'trunc' in mode :
            # truncate. Dangerous!
            flags |= hype.Database.DBTRUNC

        # make instance
        self.db = hype.Database()

        # open
        if not self.db.open(path, flags) :
            raise Exception("Index open failed: %s, mode=%s, flags=%#06x: %s" % (path, mode, flags, self.db.err_msg(self.db.error())))

    def insert (self, channel, lines) :
        """
            Adds a sequence of LogLines from the given LogChannel to the index, and returns the number of added items
        """

        # count from zero
        count = 0

        # iterate
        for line in lines :
            # insert
            self.insert_line(channel, line)

            # count
            count += 1

        # return
        return count

    def insert_line (self, channel, line) :
        """
            Adds a single LogLine for the given LogChannel to the index
        """

        # validate the LogChannel
        assert channel.id

        # validate the LogLine
        assert line.offset
        assert line.timestamp

        # create new document
        doc = hype.Document()

        # line date
        date = line.timestamp.date()

        # ensure that it's not 1900
        assert date.year != 1900

        # add URI
        doc.add_attr('@uri',        "%s/%s/%d" % (channel.id, date.strftime('%Y-%m-%d'), line.offset))

        # add channel id
        doc.add_attr('channel',     channel.id)

        # add type
        doc.add_attr('type',        str(line.type))

        # add UTC timestamp
        doc.add_attr('timestamp',   str(utils.to_utc_timestamp(line.timestamp)))

        # add source attribute?
        if line.source :
            source_nickname, source_username, source_hostname, source_chanflags = line.source

            if source_nickname :
                doc.add_attr('source_nickname', source_nickname.encode('utf8'))

            if source_username :
                doc.add_attr('source_username', source_username.encode('utf8'))

            if source_hostname :
                doc.add_attr('source_hostname', source_hostname.encode('utf8'))

            if source_chanflags :
                doc.add_attr('source_chanflags', source_chanflags.encode('utf8'))

        # add target attributes?
        if line.target :
            target_nickname = line.target

            if target_nickname :
                doc.add_attr('target_nickname', target_nickname.encode('utf8'))

        # add data
        if line.data :
            doc.add_text(line.data.encode('utf8'))

        # put, "clean up dispensable regions of the overwritten document"
        if not self.db.put_doc(doc, hype.Database.PDCLEAN) :
            raise Exception("Index put_doc failed")

    def search_cond (self, cond) :
        """
            Search using a raw hype.Condition. Raises NoResultsFound if there aren't any results
        """

        # execute search, unused 'flags' arg stays zero
        results = self.db.search(cond, 0)

        # no results?
        if not results :
            raise NoResultsFound()

        # iterate over the document IDs
        for doc_id in results :
            # load document, this throws an exception...
            # option constants are hype.Database.GDNOATTR/GDNOTEXT
            doc = self.db.get_doc(doc_id, 0)

            # load the attributes/text
            channel         = self.channels.lookup(doc.attr('channel'))
39915772f090 update LogSearchIndex to use new LogLine fields
Tero Marttila <terom@fixme.fi>
parents: 74
diff changeset
   208
            type            = int(doc.attr('type'))
89
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   209
            timestamp       = utils.from_utc_timestamp(int(doc.attr('timestamp')))
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   210
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   211
            # source
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   212
            source = (doc.attr('source_nickname'), doc.attr('source_username'), doc.attr('source_hostname'), doc.attr('source_chanflags'))
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   213
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   214
            # target
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   215
            target = doc.attr('target_nickname')
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   216
            
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   217
            # message text
87
39915772f090 update LogSearchIndex to use new LogLine fields
Tero Marttila <terom@fixme.fi>
parents: 74
diff changeset
   218
            message         = doc.cat_texts().decode('utf8')
64
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   219
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   220
            # build and yield as a LogLine
89
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   221
            yield log_line.LogLine(channel, None, type, timestamp, source, target, message)
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   222
    
89
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   223
    def search (self, options=None, channel=None, attrs=None, phrase=None, order=None, max=None, skip=None) :
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   224
        """
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   225
            Search with flexible parameters
64
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   226
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   227
                options     - bitmask of hype.Condition.*
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   228
                channel     - LogChannel object
89
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   229
                attrs       - raw attribute expressions
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   230
                phrase      - the search query phrase
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   231
                order       - order attribute expression
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   232
                max         - number of results to return
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   233
                skip        - number of results to skip
64
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   234
        """
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   235
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   236
        # build condition
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   237
        cond = hype.Condition()
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   238
        
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   239
        if options :
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   240
            # set options
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   241
            cond.set_options(options)
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   242
        
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   243
        if channel :
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   244
            # add channel attribute
118
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   245
            cond.add_attr(("channel STREQ %s" % channel.id).encode('utf8'))
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   246
        
89
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   247
        if attrs :
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   248
            # add attributes
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   249
            for attr in attrs :
118
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   250
                cond.add_attr(attr.encode('utf8'))
89
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   251
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   252
        if phrase :
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   253
            # add phrase
118
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   254
            cond.set_phrase(phrase.encode('utf8'))
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   255
        
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   256
        if order :
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   257
            # set order
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   258
            cond.set_order(order)
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   259
        
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   260
        if max :
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   261
            # set max
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   262
            cond.set_max(max)
64
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   263
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   264
        if skip :
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   265
            # set skip
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   266
            cond.set_skip(skip)
64
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   267
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   268
        # execute
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   269
        return self.search_cond(cond)
64
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   270
118
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   271
    def search_simple (self, channel, query, count=None, offset=None, search_msg=True, search_nick=False) :
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   272
        """
118
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   273
            Search for lines from the given channel for the given simple query.
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   274
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   275
            The search_* params define which attributes to search (using fulltext search for the message, STRINC for
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   276
            attributes).
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   277
        """
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   278
        
118
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   279
        # search attributes
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   280
        attrs = []
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   281
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   282
        # nickname target query
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   283
        if search_nick :
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   284
            attrs.append("source_nickname STRINC %s" % query)
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   285
#            attrs.append("target_nickname STRINC %s" % query)
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   286
        
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   287
        # use search(), backwards
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   288
        results = list(self.search(
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   289
            # simplified phrase
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   290
            options     = hype.Condition.SIMPLE,
64
cdb6403c2498 beginnings of a LogSearchIndex class
Tero Marttila <terom@fixme.fi>
parents:
diff changeset
   291
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   292
            # specific channel
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   293
            channel     = channel,
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   294
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   295
            # given phrase
118
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   296
            phrase      = query if search_msg else None,
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   297
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   298
            # attributes defined above
f530c158aa07 implement some basic search-targets for message and nickname
Tero Marttila <terom@fixme.fi>
parents: 99
diff changeset
   299
            attrs       = attrs,
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   300
89
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   301
            # order by timestamp, descending (backwards)
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   302
            order       = "timestamp NUMD",
66
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   303
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   304
            # count/offset
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   305
            max         = count,
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   306
            skip        = offset,
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   307
        ))
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   308
        
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   309
        # reverse
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   310
        return reversed(results)
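
search_simple() asks the index for results ordered by timestamp descending, applies count/offset to that ordering, and then reverses the page so the caller sees it in chronological order. The sketch below shows the same pattern on a plain list, without HyperEstraier; latest_page and its sample input are hypothetical names used only for illustration.

```python
def latest_page(timestamps, count=None, offset=None):
    """Return one page of the newest items, in chronological order."""
    # newest first, comparable to ordering by "timestamp NUMD"
    newest_first = sorted(timestamps, reverse=True)

    # skip `offset` results, then keep at most `count` of them
    start = offset or 0
    end = start + count if count else None
    page = newest_first[start:end]

    # reverse the page so it reads oldest to newest
    return list(reversed(page))


first_page = latest_page([1, 5, 3, 2, 4], count=2)
second_page = latest_page([1, 5, 3, 2, 4], count=2, offset=2)
```

Paging backwards from the newest line this way means offset 0 always returns the most recent matches, while each individual page still reads top to bottom in time order.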
090ed78ec8fa add count/skip to search results, requires modifications to the swig bindings for HyperEstraier...
Tero Marttila <terom@fixme.fi>
parents: 65
diff changeset
   311
89
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   312
    def list (self, channel, date, count=None, skip=None) :
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   313
        """
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   314
            List all indexed log items for the given UTC date
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   315
        """
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   316
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   317
        # start/end dates
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   318
        dt_start = datetime.datetime(date.year, date.month, date.day, 0, 0, 0, 0)
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   319
        dt_end   = datetime.datetime(date.year, date.month, date.day, 23, 59, 59, 999999)
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   320
        
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   321
        # search
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   322
        return self.search(
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   323
            # specific channel
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   324
            channel     = channel,
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   325
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   326
            # specific date range
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   327
            attrs       = [
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   328
                "timestamp NUMBT %d %d" % (utils.to_utc_timestamp(dt_start), utils.to_utc_timestamp(dt_end))
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   329
            ],
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   330
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   331
            # order correctly
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   332
            order       = "timestamp NUMA",
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   333
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   334
            # max count/offset
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   335
            max         = count,
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   336
            skip        = skip
2dc6de43f317 add utils.to/from_utc_timestamp functions, fix LogSearchIndex to store all LogLine attributes, add list() method to get LogLines for a given date, and improve scripts/search-index
Tero Marttila <terom@fixme.fi>
parents: 87
diff changeset
   337
        )
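
list() limits results to a single UTC day by turning the day's first and last second into a "timestamp NUMBT" range. A minimal sketch of that computation follows, assuming timestamps are stored as integer seconds since the epoch in UTC. utc_day_range is a hypothetical helper, and it uses calendar.timegm from the standard library in place of utils.to_utc_timestamp, whose implementation is not shown here.

```python
import calendar
import datetime


def utc_day_range(date):
    """Return (first, last) second of the given date as UTC epoch timestamps."""
    start = datetime.datetime(date.year, date.month, date.day, 0, 0, 0)
    end = datetime.datetime(date.year, date.month, date.day, 23, 59, 59)

    # timegm() interprets the time tuple as UTC, unlike time.mktime()
    return calendar.timegm(start.timetuple()), calendar.timegm(end.timetuple())


lo, hi = utc_day_range(datetime.date(2009, 2, 12))

# attribute expression in the same form that list() passes to search()
expr = "timestamp NUMBT %d %d" % (lo, hi)
```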

# global read-only index, opened lazily by get_index()
_index = None

def get_index () :
    """
        Returns the default read-only index, suitable for searching
    """
    
    global _index
    
    # open the index on first use; later calls reuse the cached instance
    if _index is None :
        _index = LogSearchIndex(config.LOG_CHANNELS, config.SEARCH_INDEX_PATH, 'r')

    # return the shared instance
    return _index
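The lazy-open pattern used by get_index() can be tried in isolation with the sketch below. It is a minimal illustration, not the module's actual behaviour: FakeIndex, the channel list and the index path are hypothetical stand-ins for LogSearchIndex, config.LOG_CHANNELS and config.SEARCH_INDEX_PATH, so that the example runs without HyperEstraier installed.

```python
# Hypothetical stand-in for LogSearchIndex; it only counts how often it is opened.
class FakeIndex (object) :
    opened = 0

    def __init__ (self, channels, path, mode) :
        FakeIndex.opened += 1
        self.channels = channels
        self.path = path
        self.mode = mode

# module-level cache, empty until the first lookup
_index = None

def get_index () :
    """
        Returns the shared read-only index, opening it on first use
    """

    global _index

    # assumed values; the real module reads these from config
    if _index is None :
        _index = FakeIndex(["#example"], "/tmp/example-index", 'r')

    return _index

# both calls return the same object, and the index is only opened once
first = get_index()
second = get_index()
```

Comparing against None rather than testing truthiness means the cache still works if the index object ever evaluates as false, for example by defining a length of zero.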