Perform a Retrieval
Fetch ranked context passages without generating an LLM answer.
POST
Overview
Retrieval-only requests return ranked context passages without LLM generation. Use this endpoint when you want to:
- Build your own prompting layer
- Inspect retrieval quality before LLM generation
- Implement custom response logic
- Reduce costs (no LLM API calls)
Performance Benefit: Retrieval-only requests are typically 5-10x faster than full query requests since they skip LLM generation.
Events are recorded with is_query=false in the history, distinguishing them from full query executions.
Authentication
Requires a valid JWT token or session authentication. You must own the target corpus.
Request Body
corpora - ID of the corpus to search. Must be fully indexed (indexing_status: "IND").
user_query - Natural-language query used to fetch relevant context passages.
How retrieval works:
- Query is analyzed for intent and complexity
- Multiple retrieval strategies run in parallel (semantic, keyword, hybrid)
- Results are reranked using ensemble methods
- Top-ranked chunks are returned with metadata
Query tips:
- Be specific for better precision
- Use natural language (not keyword stuffing)
- Phrase queries as questions for best results
- Include relevant details to give the query context
Example request
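The original request sample did not survive on this page, so what follows is a minimal sketch rather than the official example. The base URL and endpoint path (`https://api.example.com`, `/api/retrievals/`) are placeholders and purely hypothetical; substitute the values for your deployment. The field names `corpora` and `user_query` come from the 400 error description on this page, and the Bearer header follows the Authorizations section.

```python
import json
import urllib.request

# Hypothetical values: the real base URL and endpoint path are not shown on this page.
BASE_URL = "https://api.example.com"
RETRIEVAL_PATH = "/api/retrievals/"


def build_retrieval_request(corpus_id: str, user_query: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a retrieval-only POST request."""
    payload = {"corpora": corpus_id, "user_query": user_query}
    return urllib.request.Request(
        url=BASE_URL + RETRIEVAL_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_retrieval_request(
        "123e4567-e89b-12d3-a456-426614174000",
        "How do I rotate API keys?",
        token="YOUR_TOKEN",
    )
    print(req.get_method(), req.full_url)
    # Sending is left to the caller, e.g. urllib.request.urlopen(req).
```

Building the request separately from sending it makes the payload easy to inspect or log before any network call is made.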
Response
Unique identifier for this retrieval execution.
ISO 8601 timestamp when retrieval was executed.
Last update timestamp (usually same as created_at).
ID of the corpus that was searched.
The query used for retrieval.
Total execution time in milliseconds (includes search, ranking, and reranking).
Structured retrieval results with ranked context passages.
Example Response
Client examples
- Python
- TypeScript / JavaScript
- Java
Use Cases
Custom LLM Integration
Retrieve context and use your own LLM:
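As one possible shape for this, the sketch below turns retrieved passages into a prompt that can be sent to any LLM. The chunk fields `text` and `score` are assumptions, since the response schema is not spelled out on this page; rename them to match what your retrieval actually returns.

```python
from typing import Dict, List


def build_prompt(user_query: str, chunks: List[Dict], max_chunks: int = 5) -> str:
    """Assemble a grounded prompt from the highest-scoring passages."""
    # Assumed chunk shape: {"text": str, "score": float}; adapt to the real response.
    top = sorted(chunks, key=lambda c: c["score"], reverse=True)[:max_chunks]
    context = "\n\n".join(f"[{i}] {c['text']}" for i, c in enumerate(top, start=1))
    return (
        "Answer the question using only the numbered context passages.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {user_query}\nAnswer:"
    )
```

Numbering the passages lets the downstream model cite them by index, which keeps the answer traceable to specific chunks.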
Quality Assessment
Test retrieval quality before full deployment:
- Submit test queries
- Inspect returned chunks and scores
- Verify relevance of top results
- Adjust corpus content if needed
- Run A/B tests with different configurations
Citation-Only Responses
Return citations without LLM generation:
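A minimal sketch of this use case, assuming each returned chunk carries `source`, `text`, and `score` fields (hypothetical names, not confirmed by this page):

```python
from typing import Dict, List


def format_citations(chunks: List[Dict], max_snippet: int = 80) -> List[str]:
    """Render retrieved chunks as numbered citation lines."""
    # Assumed chunk shape: {"source": str, "text": str, "score": float}.
    lines = []
    for i, chunk in enumerate(chunks, start=1):
        snippet = chunk["text"].strip()
        if len(snippet) > max_snippet:
            # Truncate long passages so each citation stays on one line.
            snippet = snippet[:max_snippet].rstrip() + "..."
        lines.append(f"[{i}] {chunk['source']} (score {chunk['score']:.2f}): {snippet}")
    return lines
```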
Cost Optimization
Reduce API costs by:
- Caching retrieval results
- Generating answers client-side
- Using cheaper LLMs with retrieved context
- Implementing custom prompt logic
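The caching point can be sketched as a small in-memory cache with a time-to-live. Everything here is illustrative: `fetch` stands in for whatever function performs the actual retrieval call, and the normalization (lowercasing and collapsing whitespace) is an assumption about which queries you consider equivalent.

```python
import time
from typing import Any, Callable, Dict, Tuple


class RetrievalCache:
    """Small in-memory TTL cache keyed by (corpus ID, normalized query)."""

    def __init__(
        self,
        fetch: Callable[[str, str], Any],  # hypothetical function that calls the API
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Tuple[str, str], Tuple[float, Any]] = {}

    def get(self, corpus_id: str, user_query: str) -> Any:
        # Treat queries that differ only in case or spacing as the same key.
        key = (corpus_id, " ".join(user_query.lower().split()))
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        result = self._fetch(corpus_id, user_query)
        self._entries[key] = (now, result)
        return result
```

A short TTL is a reasonable default when the corpus is still being updated, since cached passages can go stale after re-indexing.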
Retrieval History
List Retrieval Events
View past retrievals for a corpus:
Query parameters:
- order_by - Sort by created_at or -created_at (newest first, default)
- start_date - Filter after this date (UTC, YYYY-MM-DD format)
- end_date - Filter before this date (UTC, YYYY-MM-DD format)
- page - Page number (default: 1)
- page_size - Results per page (default: 20)
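The parameters above can be assembled into a query string as in the sketch below. Only the query string is built, because the history endpoint path is not shown on this page; the helper name `history_query_string` is made up for illustration.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode


def history_query_string(
    order_by: str = "-created_at",
    start_date: Optional[date] = None,
    end_date: Optional[date] = None,
    page: int = 1,
    page_size: int = 20,
) -> str:
    """Encode the documented history filters as a URL query string."""
    if order_by not in ("created_at", "-created_at"):
        raise ValueError("order_by must be 'created_at' or '-created_at'")
    params = {"order_by": order_by, "page": page, "page_size": page_size}
    # date.isoformat() yields the YYYY-MM-DD form the parameters expect.
    if start_date is not None:
        params["start_date"] = start_date.isoformat()
    if end_date is not None:
        params["end_date"] = end_date.isoformat()
    return urlencode(params)
```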
Delete Retrieval Records
Remove individual retrievals from history:
Retrieval entries are immutable snapshots. PUT and PATCH operations return 403 Forbidden.
Error Handling
400 Bad Request
Causes:
- Missing corpora or user_query fields
- Corpus is not indexed (indexing_status != IND)
- Invalid UUID format
Solutions:
- Verify all required fields are present
- Check corpus indexing status: GET /api/corpora/{id}/
- Ensure corpus has indexed resources
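Two of the causes above (missing fields and a malformed UUID) can be caught before the request is sent. The following is a sketch of such a client-side check, not part of the API; it assumes `corpora` holds a single corpus ID string. The indexing status cannot be verified locally and still needs the corpus lookup.

```python
import uuid
from typing import Dict, List


def validate_retrieval_payload(payload: Dict) -> List[str]:
    """Return a list of problems that would likely trigger a 400 response."""
    problems = []
    for field in ("corpora", "user_query"):
        value = payload.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    corpus_id = payload.get("corpora")
    if isinstance(corpus_id, str) and corpus_id.strip():
        try:
            # Note: uuid.UUID also accepts forms such as braces or no hyphens,
            # so the server may be stricter than this check.
            uuid.UUID(corpus_id)
        except ValueError:
            problems.append("corpora is not a valid UUID")
    return problems
```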
404 Not Found
Causes:
- Corpus doesn’t exist
- You don’t own the corpus
Solutions:
- Verify corpus ID is correct
- List your corpora: GET /api/corpora/
- Check authentication credentials
500 Internal Server Error
Causes:
- Vector database connectivity issues
- Retrieval pipeline errors
- Reranker service failures
Solutions:
- Retry with exponential backoff
- Check system health: GET /sys-health
- Contact support if errors persist
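Retrying with exponential backoff might look like the sketch below. `ServerError` is a hypothetical exception that your own client code would raise on a 5xx response, and the delay values are illustrative defaults rather than recommendations from the service.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ServerError(Exception):
    """Hypothetical exception a client would raise for a 5xx response."""


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying on ServerError with doubling delays plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except ServerError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Random jitter spreads out retries from concurrent clients.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

Only server-side failures are retried here; 400 and 404 responses indicate a problem with the request itself and would fail again unchanged.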
Performance Tips
Optimize Retrieval Speed
Improve performance with these strategies:
- Corpus size - Smaller, focused corpora retrieve faster
- Query specificity - Precise queries require less computation
- Result limits - Request fewer chunks if you don’t need many
- Caching - Cache frequent queries on your end
- Parallel requests - Run multiple retrievals concurrently
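For the parallel-requests tip, a thread pool is one straightforward option because retrieval calls are network-bound. In this sketch `fetch` is a placeholder for your own function that performs a single retrieval; the pool size of 4 is an arbitrary starting point, best kept small unless you know the service tolerates more concurrency.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def retrieve_many(
    fetch: Callable[[str, str], Any],  # hypothetical single-retrieval function
    corpus_id: str,
    queries: List[str],
    max_workers: int = 4,
) -> Dict[str, Any]:
    """Run several retrievals concurrently and map each query to its result."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with queries.
        results = list(pool.map(lambda q: fetch(corpus_id, q), queries))
    return dict(zip(queries, results))
```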
Understanding Scores
Relevance scores explained:
- 0.9-1.0: Highly relevant, exact match or very close
- 0.7-0.9: Relevant, good semantic match
- 0.5-0.7: Moderately relevant, partial match
- < 0.5: Low relevance, consider filtering out
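The bands above can be applied in code as in the following sketch. The listed ranges share their endpoints (0.9 and 0.7 each appear in two bands), so this version assigns a boundary value to the higher band; that choice, and the `score` field name, are assumptions.

```python
from typing import Dict, List


def relevance_label(score: float) -> str:
    """Map a relevance score to the band described in this section."""
    if score >= 0.9:
        return "highly relevant"
    if score >= 0.7:
        return "relevant"
    if score >= 0.5:
        return "moderately relevant"
    return "low relevance"


def filter_chunks(chunks: List[Dict], min_score: float = 0.5) -> List[Dict]:
    """Drop chunks below the threshold (assumed field name: "score")."""
    return [c for c in chunks if c["score"] >= min_score]
```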
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
Response
201 - application/json
The date and time the retrieval was created
Last updated time
The query that was executed to get the results
Time taken to process and retrieve data (in milliseconds)
Metadata retrieved from the query
The corpora to which the query belongs

