79:2f94fb23bcff
|
2010-11-26 |
Paul Boddie |
|
Updated the copyright and licensing information. |
|
|
iixr/__init__.py
|
|
78:489129c7f225
|
2010-11-26 |
Paul Boddie |
|
Changed the from_document method to remember the current document and positions,
although the positions iterator will not be reset upon repeated invocations
involving the same document number. |
|
|
iixr/positions.py
|
|
77:7e79dd580a62
|
2010-11-23 |
Paul Boddie |
|
Added support for phrase searching where document positions are specified using
sequences of values, with the first value in each sequence being the token
index/position.
Added more tests of document numbers and position values being specified using
sequences. |
|
|
iixr/phrases.py test.py
|
|
76:f1cbbf5ef885
|
2010-11-22 |
Paul Boddie |
|
Made partition discovery more widely available, adding code to find the next
partition number to use, thus avoiding overwriting index data when opening a
writer on an existing index.
Made sure that term and field dictionaries are always written out: this might
not occur if the underlying writers have been obtained from an index writer and
then used to write data directly. |
|
|
iixr/filesystem.py iixr/index.py
|
|
75:8d35240236b2
|
2010-11-22 |
Paul Boddie |
|
Added integrity checks for appropriate term and position ordering. |
|
|
iixr/terms.py
|
|
74:d308dc25f5a2
|
2010-11-21 |
Paul Boddie |
|
Introduced support for specifying sequences for document numbers and positions,
with the latter being "monotonic" sequences whose elements contain items that
are always greater than or equal to the items in the same position in each
preceding element of the sequence.
Fixed the get_terms method of the term dictionary reader to refer to the
iterator over term information (and not the list of terms provided by the term
index).
Expanded the tests to cover sequences as document numbers and positions. |
|
|
iixr/fields.py iixr/files.py iixr/positions.py iixr/terms.py test.py
|
|
73:6dd92daca068
|
2010-11-20 |
Paul Boddie |
|
Introduced code to handle index merging where a large number of partitions
exist, combining the term and field dictionary merging into a common method
which is then parameterised for each kind of data. |
|
|
iixr/index.py
|
|
72:1cccc03f183e (parent: 70:4614ef99dbe1)
|
2010-11-20 |
Paul Boddie |
|
Added get_terms convenience methods to the index and term dictionary readers.
Introduced safer closure of mergers. |
|
|
iixr/index.py iixr/merging.py iixr/terms.py
|
|
71:00995a70f535
|
2010-11-20 |
Paul Boddie |
changeset
files
shortlog
graph
|
An experiment adding preceding text to position records. |
|
|
iixr/phrases.py iixr/positions.py
|
|
70:4614ef99dbe1 (children: 71:00995a70f535, 72:1cccc03f183e)
|
2010-11-20 |
Paul Boddie |
|
Added a string serialisation function.
Fixed a parameter/argument name. |
|
|
iixr/data.py iixr/index.py
|
|