b6c0dec to
bb179a6
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
bb179a6 to
3c56d41
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
89355de to
0734ed6
quickwit/quickwit-metastore/migrations/postgresql/26_add-split-soft-deleted-doc-ids.down.sql
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
c86fbfb to
817c329
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
rdettai-sk
left a comment
Not 100% through with the review (I only glanced over the search part). I think an integration test would be really nice.
eb8afbf to
c2300ad
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
c2300ad to
a2b75d8
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
rdettai-sk
left a comment
My biggest remaining concern is the list splits endpoint. It is actually very slow (a couple of seconds) on big indexes.
    metastore: &MetastoreServiceClient,
    progress: &Progress,
) {
    let list_splits_request = match ListSplitsRequest::try_from_index_uid(index_uid.clone()) {
This can gather a lot of splits from the metastore and needs to be called at every merge upload. We should try to cap the overhead. We can easily:
- filter on the current node
- filter on published splits only
I think we can also filter on immature splits. Not 100% sure how that works, but it would be the most efficient.
Of course, if we bring it to the metastore, we can more easily get only the splits we need 😄
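A minimal sketch of the three filters suggested above (current node, published only, immature only). The types and the `merge_candidates` function are hypothetical stand-ins written for illustration, not Quickwit's actual metastore API; in practice the same predicates would ideally be pushed into the list splits query itself rather than applied after fetching.

```rust
use std::time::SystemTime;

/// Hypothetical, simplified split state (not Quickwit's actual type).
#[derive(Debug, Clone, PartialEq)]
enum SplitState {
    Staged,
    Published,
    MarkedForDeletion,
}

/// Hypothetical, simplified split metadata (not Quickwit's actual type).
#[derive(Debug, Clone)]
struct SplitInfo {
    split_id: String,
    node_id: String,
    state: SplitState,
    /// Assumption: time at which the split becomes mature.
    /// `None` means the split is already mature.
    mature_at: Option<SystemTime>,
}

/// Keeps only the splits that are owned by `current_node`, published,
/// and still immature at `now`.
fn merge_candidates<'a>(
    splits: &'a [SplitInfo],
    current_node: &str,
    now: SystemTime,
) -> Vec<&'a SplitInfo> {
    splits
        .iter()
        // Filter on the current node.
        .filter(|split| split.node_id == current_node)
        // Filter on published splits only.
        .filter(|split| split.state == SplitState::Published)
        // Filter on immature splits: maturity time is still in the future.
        .filter(|split| matches!(split.mature_at, Some(t) if t > now))
        .collect()
}
```

Applying the filters on the metastore side would avoid transferring splits that are discarded anyway, which is where most of the overhead on big indexes is likely to come from.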
    replaced_splits.push(ReplacedSplit {
        split_id: metadata.split_id().to_string(),
        soft_deleted_doc_ids: metadata.soft_deleted_doc_ids.clone(),
    });
I think you are telling this poor split to replace itself
You are right, 4 hours lost on 1 wrong line, not bad 😅
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
a24023f to
d945645
Description
This PR allows soft-deleting documents from an index.
Tasks