A User should know why documents are returned #8

@mtbarta

Description

As a user, I want to understand why certain documents are returned, so that I can formulate the LLM context better.

One of the benefits of late interaction is having per-token scores that can help us explain why a certain document is returned. This also means that we can calculate a "highlight" span that is most similar to the query.

See https://blog.vespa.ai/announcing-colbert-embedder-in-vespa/ for Vespa's explainer.
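As a rough illustration of how per-token scores and a highlight span could fall out of late-interaction embeddings (a minimal numpy sketch; the function names and the fixed-window highlight heuristic are our own, not Vespa's):

```python
import numpy as np

def per_token_scores(query_emb: np.ndarray, doc_emb: np.ndarray):
    """MaxSim: for each query token, the best similarity over all document
    tokens, plus the index of the document token that produced it."""
    sim = query_emb @ doc_emb.T            # (n_query, n_doc) similarities
    return sim.max(axis=1), sim.argmax(axis=1)

def highlight_span(matched_idx: np.ndarray, doc_len: int, window: int = 5):
    """Hypothetical heuristic: pick the fixed-size window of document token
    positions that covers the most query-token matches."""
    counts = np.bincount(matched_idx, minlength=doc_len)
    best_start, best_count = 0, -1
    for start in range(max(1, doc_len - window + 1)):
        c = counts[start:start + window].sum()
        if c > best_count:
            best_count, best_start = c, start
    return best_start, min(best_start + window, doc_len)
```

The per-query-token score and the matched document-token index together are what the acceptance criteria below would surface; the span is just one cheap way to turn the matched positions into a highlight.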

Acceptance Criteria

  1. Token scores are returned in the search results.
  2. Token forms are returned in the search results.

Note: This requires us to know which token is stored in the index and to hydrate it for the results. We could couple this with the model's vocabulary and store the vocab id, or we could store the raw token.
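A minimal sketch of the vocab-id option: store each indexed token's vocab id, then hydrate the surface form when building the result. The vocabulary fragment and ids below are made-up placeholders, not a real model's vocab:

```python
# Hypothetical vocabulary fragment mapping vocab ids to surface forms.
VOCAB = {101: "[CLS]", 2397: "late", 8290: "interaction", 102: "[SEP]"}

def hydrate_tokens(token_ids, vocab=VOCAB):
    """Map stored vocab ids back to token forms for the search response;
    unknown ids fall back to an explicit placeholder."""
    return [vocab.get(tid, "[UNK]") for tid in token_ids]
```

The trade-off: storing vocab ids keeps the index compact (one small integer per token) but ties results to one model's vocabulary; storing the raw token avoids that coupling at the cost of index size.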

Metadata

Labels

    User Story