89:1f3986bca1a3
|
2011-02-07 |
Paul Boddie |
|
Introduced record-oriented reading and writing of files where an array is
populated in a single read from a file or flushed to a buffer in a single write
operation.
Moved various data representation operations into the data module, removing
explicit object size concerns from the higher-level modules, replacing them with
usage of adder and subtractor functions where appropriate.
Made the vint caches lists instead of dictionaries.
Enforced tuples as the input representation of serialised sequence values. |
|
|
iixr/data.py iixr/fields.py iixr/files.py iixr/positions.py iixr/terms.py test.py
|
|
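The record-oriented reading and writing mentioned in changeset 89 can be illustrated with Python's standard `array` module. This is only a minimal sketch, not the iixr implementation: the function names `write_record` and `read_record` and the choice of unsigned byte arrays are assumptions made for illustration.

```python
from array import array
import io

def write_record(f, values):
    """Flush a whole sequence of byte values in a single write operation."""
    record = array("B", values)      # "B" = unsigned byte (assumed item type)
    f.write(record.tobytes())
    return len(record)

def read_record(f, count):
    """Populate an array of `count` bytes in a single read from the file."""
    record = array("B")
    record.frombytes(f.read(count))
    return record

# Usage with an in-memory file object (hypothetical data).
buf = io.BytesIO()
written = write_record(buf, (1, 2, 3, 255))
buf.seek(0)
record = read_record(buf, written)
```

Reading a whole record at once replaces many small per-value reads with one call, which is the kind of saving the changeset description suggests.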
88:4c35f0aa339c
|
2011-02-03 |
Paul Boddie |
|
Changed the files to have an internal array for reading and writing data. |
|
|
iixr/fields.py iixr/files.py iixr/positions.py iixr/terms.py test.py
|
|
87:b50ba4291c5c
|
2011-01-28 |
Paul Boddie |
|
Made the read_sequence method simpler to follow and perhaps slightly more
efficient.
Fixed the PhraseFilter to handle out-of-sequence tokens properly as well as
iterators for different tokens contributing identical positions. |
|
|
iixr/files.py iixr/phrases.py
|
|
86:34f535fe8cb0
|
2011-01-25 |
Paul Boddie |
|
Introduced various optimisation attempts. |
|
|
iixr/data.py iixr/files.py iixr/terms.py
|
|
85:c4da9505f73e
|
2011-01-25 |
Paul Boddie |
|
Added a threshold or interval which causes the term dictionary to be flushed
when a certain number of document positions have been recorded.
Updated the copyright information. |
|
|
docs/COPYING.txt iixr/index.py
|
|
84:80df3e7605a4
|
2011-01-21 |
Paul Boddie |
|
For large numbers of positions, sorting afterwards is likely to be much quicker. |
|
|
iixr/phrases.py
|
|
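The remark in changeset 84 about sorting afterwards can be illustrated by comparing two ways of building an ordered list of positions. The log does not say what the earlier approach was, so this sketch assumes it kept the list ordered on every insertion; both function names are hypothetical and not taken from iixr.

```python
from bisect import insort

def collect_sorted_incrementally(positions):
    """Keep the list ordered on every insertion (assumed earlier approach)."""
    result = []
    for p in positions:
        insort(result, p)   # binary search, then an O(n) shift of list items
    return result

def collect_then_sort(positions):
    """Append everything first, then sort once at the end."""
    result = list(positions)
    result.sort()           # one O(n log n) sort over all positions
    return result

sample = [9, 3, 7, 3, 1, 8]
incremental = collect_sorted_incrementally(sample)
afterwards = collect_then_sort(sample)
```

Both functions return the same ordered list; for large inputs the single final sort avoids the repeated shifting of list items that each `insort` call incurs.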
83:3ddb93334c95
|
2011-01-11 |
Paul Boddie |
changeset
files
shortlog
graph
|
Permit fields for documents to be spread across partitions, potentially because
documents have been added more than once to an index. |
|
|
iixr/merging.py
|
|
82:9867931a9269
|
2010-12-17 |
Paul Boddie |
changeset
files
shortlog
graph
|
Avoid identical adjacent tokens being matched to the same document token. |
|
|
iixr/phrases.py
|
|
81:ea2944f51430
|
2010-11-26 |
Paul Boddie |
|
Introduced support for higher-level sequential access to indexes. |
|
|
iixr/index.py iixr/terms.py
|
|
80:e0bd00412dbc
|
2010-11-26 |
Paul Boddie |
|
Introduced parameterisation of phrase discovery using different phrase filters
to that provided. |
|
|
iixr/phrases.py
|
|