iixr

Graph

Added comments and docstrings. (default, tip)
Added comment/warning.
Fixed the itermerge optimisation for single element collections of iterators.
Introduced last term/positions caching for all iterators.
Fix the index bisection and attempt to prevent unnecessary seeking and scanning.
Moved the IndexReader to the terms module, renaming it to MultipleReader.
Removed numerous classes, simplifying the package and focusing on combined term
Changed cache slicing to record pointer updating.
Removed unnecessary recursion when flushing records/caches.
Introduced read and write caches in order to investigate performance changes.
Removed redundant data attribute.
Introduced size declarations for sequences employed by readers and writers.
Moved the record handling into reset methods in order to have records encompass
Introduced record-oriented reading and writing of files where an array is
Changed the files to have an internal array for reading and writing data.
Made the read_sequence method simpler to follow and perhaps slightly more
Introduced various optimisation attempts.
Added a threshold or interval which causes the term dictionary to be flushed
For large numbers of positions, sorting afterwards is likely to be much quicker.
Permit fields for documents to be spread across partitions, potentially because
Avoid identical adjacent tokens being matched to the same document token.
Introduced support for higher-level sequential access to indexes.
Introduced parameterisation of phrase discovery using different phrase filters
Updated the copyright and licensing information.
Changed the from_document method to remember the current document and positions,
Added support for phrase searching where document positions are specified using
Made partition discovery more widely available, adding code to find the next
Added integrity checks for appropriate term and position ordering.
Introduced support for specifying sequences for document numbers and positions,
Introduced code to handle index merging where a large number of partitions
Added get_terms convenience methods to the index and term dictionary readers.
An experiment adding preceding text to position records.
Added a string serialisation function.
Introduced position dictionary, file and index iterators which capture the
Removed iterators and openers with the intention of having synchronised reading
Added a document cache, used when reading fields.
Fixed field interval configuration.
Fixed field interval configuration.
Changed indexing interval configuration to use the Index initialiser.
Simplified the IndexWriter document cache, adopting a list of items instead of a
Added proper phrase searching.
Changed find_positions methods to return an empty list instead of None where no
Added elementary phrase searching support.
Added support for updating empty indexes.
Added measures for the closure of position iterators.
Introduced array usage when writing position index entries.
Introduced separate vint functions for strings and byte arrays.
Introduced various optimisations: increasing the vint cache and introducing
Simplified vint implementation, taking advantage of the cache.
Removed Pyrex extension result.
Use file methods directly.
Replaced the partial Pyrex vint implementation with a cache.
Removed caching since it does not seem to help significantly.
Switched the write caches in FileWriter instances to StringIO instances.
Added copyright and licensing information.
Added iterator reuse for sequential term dictionary access, along with iterator
Added a cache offset attribute to better track available cached data.
Removed old module.
Made iixr a package with several submodules.
Added constants for various measures.