Querying & Retrieval
Once your data is chunked, embedded, and stored, you can query it via the /api/rag/query endpoint.
The Retrieval Process
When you send a query:
- Query Embedding: Jabrod converts your query string into a vector using the pipeline’s configured embedding model.
- Vector Search: Jabrod searches the vector database for chunks whose vectors are closest (most similar) to the query vector.
- Filtering: Any results below the pipeline’s configured Similarity Threshold are discarded.
- Ranking: The top K results are returned to you, ordered by similarity score.
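The search, filtering, and ranking steps can be sketched in a few lines of Python. This is only an illustration of the idea, not Jabrod's implementation: the function names, the toy two-dimensional vectors, and the in-memory list standing in for the vector database are all assumptions, and the embedding step is skipped by passing in a query vector directly.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)

def retrieve(query_vector, chunks, top_k, threshold):
    """Score, filter, and rank chunks against an already-embedded query.

    `chunks` is a list of (text, vector) pairs (hypothetical storage format).
    """
    # Vector search: score every chunk against the query vector.
    scored = [(text, cosine_similarity(query_vector, vec)) for text, vec in chunks]
    # Filtering: discard anything below the similarity threshold.
    kept = [(text, score) for text, score in scored if score >= threshold]
    # Ranking: highest similarity first, then keep the top K.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]

# Toy 2-dimensional "embeddings", purely for illustration.
chunks = [
    ("refund policy", [1.0, 0.0]),
    ("shipping times", [0.8, 0.6]),
    ("company history", [0.0, 1.0]),
]
results = retrieve([1.0, 0.0], chunks, top_k=2, threshold=0.7)
```

Here "company history" scores 0.0 against the query and is dropped by the threshold, while the other two chunks are returned with the closest match first.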
Top K
The topK parameter determines how many chunks are returned.
- A low topK (e.g., 3) returns only the most relevant chunks, which is cheaper and faster if you are feeding the results into an LLM.
- A high topK (e.g., 10-20) returns more context, which is useful if the answer is spread across multiple documents, but increases token usage if passed to an LLM.
A default is set via retrievalTopK on the pipeline itself, but you can override it on a per-request basis in the API.
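A per-request override might look like the sketch below. Only the /api/rag/query path and the topK name come from this page; the base URL, the Bearer-token header, the "query" field name, and the exact JSON body shape are assumptions made for illustration, so check the API reference for the real request format. The code builds the request without sending it.

```python
import json
import urllib.request

def build_query_request(base_url, api_key, query, top_k=None):
    """Build (but do not send) a POST request to the query endpoint.

    The body fields and auth header are hypothetical placeholders.
    """
    body = {"query": query}
    if top_k is not None:
        # Only include topK when overriding the pipeline's default.
        body["topK"] = top_k
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/rag/query",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

# Uses the pipeline's default number of chunks.
default_req = build_query_request(
    "https://jabrod.example", "YOUR_API_KEY", "How do refunds work?"
)
# Overrides the default for this one request.
narrow_req = build_query_request(
    "https://jabrod.example", "YOUR_API_KEY", "How do refunds work?", top_k=3
)
```

Leaving topK out of the body keeps the pipeline default in effect, so callers only pay attention to it when a specific request needs more or less context.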
Similarity Threshold
Vectors are compared using Cosine Similarity, which results in a score between 0 and 1.
- 1.0 means identical meaning.
- 0.0 means completely unrelated.
For example, if the threshold is set to 0.7, Jabrod will ignore any chunks that score below 0.7. This helps prevent “hallucinations” by ensuring that if no relevant data exists in the knowledge base, the API returns an empty array rather than vaguely related garbage.
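On the client side, an empty result is a useful signal in its own right. The sketch below shows one way to handle it before calling an LLM; the response shape (a list of objects with "text" and "score" fields) and both function names are assumptions, not the documented format.

```python
def build_context(results, max_chunks=5):
    """Join chunk texts into prompt context, or return None if nothing matched."""
    if not results:
        return None
    return "\n\n".join(item["text"] for item in results[:max_chunks])

def answer_or_fallback(results):
    """Return an LLM prompt, or a fixed fallback when retrieval came back empty."""
    context = build_context(results)
    if context is None:
        # Nothing cleared the similarity threshold, so skip the LLM call.
        return "I could not find anything about that in the knowledge base."
    return "Answer using only this context:\n\n" + context

empty_reply = answer_or_fallback([])
prompt = answer_or_fallback([{"text": "Refunds take 5 days.", "score": 0.91}])
```

Short-circuiting on an empty array avoids sending the model a question with no supporting context, which is the situation where it is most likely to invent an answer.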