This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

Reduce incremental indexing time of words_prefix_position_docids DB
This database can easily contain millions of entries. Thus, iterating
over it can be very expensive.

For regular `documentAdditionOrUpdate` tasks, `del_prefix_fst_words`
will always be empty. Thus, we can save a significant amount of time
by adding this `if !del_prefix_fst_words.is_empty()` condition.

The code's behaviour remains completely unchanged.
loiclec committed Jan 31, 2023
1 parent 33f61d2 commit a2690ea
Showing 1 changed file with 11 additions and 7 deletions.
18 changes: 11 additions & 7 deletions milli/src/update/words_prefix_position_docids.rs
@@ -140,16 +140,20 @@ impl<'t, 'u, 'i> WordPrefixPositionDocids<'t, 'u, 'i> {
 
         // We remove all the entries that are no more required in this word prefix position
         // docids database.
-        let mut iter =
-            self.index.word_prefix_position_docids.iter_mut(self.wtxn)?.lazily_decode_data();
-        while let Some(((prefix, _), _)) = iter.next().transpose()? {
-            if del_prefix_fst_words.contains(prefix.as_bytes()) {
-                unsafe { iter.del_current()? };
+        // We also avoid iterating over the whole `word_prefix_position_docids` database if we know in
+        // advance that the `if del_prefix_fst_words.contains(prefix.as_bytes()) {` condition below
+        // will always be false (i.e. if `del_prefix_fst_words` is empty).
+        if !del_prefix_fst_words.is_empty() {
+            let mut iter =
+                self.index.word_prefix_position_docids.iter_mut(self.wtxn)?.lazily_decode_data();
+            while let Some(((prefix, _), _)) = iter.next().transpose()? {
+                if del_prefix_fst_words.contains(prefix.as_bytes()) {
+                    unsafe { iter.del_current()? };
+                }
             }
+            drop(iter);
         }
 
-        drop(iter);
-
         // We finally write all the word prefix position docids into the LMDB database.
         sorter_into_lmdb_database(
             self.wtxn,
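
The guard introduced by this commit can be illustrated outside of milli with a minimal, standard-library-only sketch. It makes several assumptions that are not part of the original code: a `BTreeMap` stands in for the LMDB database, a `BTreeSet<String>` stands in for the `del_prefix_fst_words` FST, and the function name `remove_deleted_prefixes` and the visited-entry counter are hypothetical, added only to make the skipped scan observable.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Removes every entry whose prefix appears in `deleted_prefixes`.
/// Returns the number of entries visited, so the cost of the scan is observable.
/// (Hypothetical stand-in for the deletion loop in the diff; not milli's API.)
fn remove_deleted_prefixes(
    db: &mut BTreeMap<(String, u32), Vec<u32>>,
    deleted_prefixes: &BTreeSet<String>,
) -> usize {
    // Fast path: with nothing to delete, the membership test below could never
    // succeed, so the full scan is skipped entirely.
    if deleted_prefixes.is_empty() {
        return 0;
    }

    let mut visited = 0;
    // Full scan: keep only the entries whose prefix is not marked as deleted.
    db.retain(|(prefix, _position), _docids| {
        visited += 1;
        !deleted_prefixes.contains(prefix)
    });
    visited
}

fn main() {
    let mut db = BTreeMap::new();
    db.insert(("he".to_string(), 0), vec![1, 2]);
    db.insert(("he".to_string(), 1), vec![3]);
    db.insert(("wo".to_string(), 0), vec![4]);

    // Empty deletion set: no entry is visited and the map is left untouched.
    let visited = remove_deleted_prefixes(&mut db, &BTreeSet::new());
    println!("visited {} entries, {} remain", visited, db.len());

    // Non-empty deletion set: every entry is visited and matching ones are removed.
    let deleted: BTreeSet<String> = ["he".to_string()].into_iter().collect();
    let visited = remove_deleted_prefixes(&mut db, &deleted);
    println!("visited {} entries, {} remain", visited, db.len());
}
```

In both cases the final contents of the map are the same as they would be without the early return; only the amount of work differs, which mirrors the commit's claim that behaviour is unchanged.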
