Overview
Retrieval-only requests return ranked context passages without LLM generation. Use this endpoint when you want to:
Build your own prompting layer
Inspect retrieval quality before LLM generation
Implement custom response logic
Reduce costs (no LLM API calls)
Performance Benefit: Retrieval-only requests are typically 5-10x faster than full query requests since they skip LLM generation.
Events are recorded with is_query=false in the history, distinguishing them from full query executions.
Authentication
Requires valid JWT token or session authentication. You must own the target corpus.
Request Body
corpora - ID of the corpus to search. Must be fully indexed (indexing_status: "IND").
user_query - Natural-language query used to fetch relevant context passages. How retrieval works:
Query is analyzed for intent and complexity
Multiple retrieval strategies run in parallel (semantic, keyword, hybrid)
Results are reranked using ensemble methods
Top-ranked chunks are returned with metadata
Query tips:
Be specific for better precision
Use natural language (not keyword stuffing)
Phrase as questions for best results
Context is king - include relevant details
Example request
curl -X POST https://{your-host}/api/retrieve/ \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "corpora": "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd",
    "user_query": "Summarize onboarding prerequisites"
  }'
Response
id - Unique identifier for this retrieval execution.
created_at - ISO 8601 timestamp when retrieval was executed.
updated_at - Last update timestamp (usually same as created_at).
corpora - ID of the corpus that was searched.
user_query - The query used for retrieval.
retrieval_time - Total execution time in milliseconds (includes search, ranking, and reranking).
retrieve_data - Structured retrieval results with ranked context passages.
retrieve_data.result - Array of retrieved chunks, ranked by relevance.
result[].text - The actual content of the retrieved chunk.
result[].metadata - Context metadata about the source.
metadata.document_title - Title of the source document.
metadata.section_summary - AI-generated summary of the containing section.
metadata.corpora - Corpus name (normalized format).
Additional metadata may include:
excerpt_keywords - Extracted keywords
questions - Sample questions the chunk can answer
file_name - Original filename (if from file)
url - Source URL (if from web)
result[].score - Relevance score (0-1) after ensemble reranking. Higher is more relevant.
result[].rank - Zero-indexed position in ranked results (0 = most relevant).
Example Response
{
  "id": "11c7d0d5-f3bd-4dfb-9157-bf51c38e62fd",
  "created_at": "2024-09-01T13:29:06.102Z",
  "updated_at": "2024-09-01T13:29:06.102Z",
  "corpora": "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd",
  "user_query": "Summarize onboarding prerequisites",
  "retrieval_time": 311.8,
  "retrieve_data": {
    "result": [
      {
        "text": "Before onboarding, ensure SSO is configured...",
        "metadata": {
          "document_title": "Onboarding checklist",
          "section_summary": "Environment prerequisites",
          "corpora": "support_playbooks"
        },
        "score": 0.77,
        "rank": 0
      }
    ]
  }
}
Client examples
Python
TypeScript / JavaScript
Java
import os

import requests

BASE_URL = "https://your-soar-instance.com"
TOKEN = os.environ["SOAR_LABS_TOKEN"]
CORPUS_ID = "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd"

payload = {
    "corpora": CORPUS_ID,
    "user_query": "Summarize onboarding prerequisites",
}

response = requests.post(
    f"{BASE_URL}/api/retrieve/",
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=60,
)
response.raise_for_status()
retrieval = response.json()
const BASE_URL = "https://your-soar-instance.com";
const token = process.env.SOAR_LABS_TOKEN!;

async function retrieveContext(corpusId: string, userQuery: string) {
  const response = await fetch(`${BASE_URL}/api/retrieve/`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({ corpora: corpusId, user_query: userQuery }),
  });

  if (!response.ok) {
    throw new Error(`Retrieve failed: ${response.status}`);
  }

  return response.json();
}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

var BASE_URL = "https://your-soar-instance.com";
var token = System.getenv("SOAR_LABS_TOKEN");
var corpusId = "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd";

var json = "{"
    + "\"corpora\": \"" + corpusId + "\","
    + "\"user_query\": \"Summarize onboarding prerequisites\""
    + "}";

var request = HttpRequest.newBuilder(URI.create(BASE_URL + "/api/retrieve/"))
    .header("Authorization", "Bearer " + token)
    .header("Content-Type", "application/json")
    .POST(HttpRequest.BodyPublishers.ofString(json))
    .build();

var response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
if (response.statusCode() >= 400) {
    throw new RuntimeException("Retrieve failed: " + response.statusCode());
}
var body = response.body();
Use Cases
Retrieve context and use your own LLM:
# Get relevant chunks
retrieval = client.post("/api/retrieve/", json={
    "corpora": corpus_id,
    "user_query": "How do I configure SSL?"
})

# Extract context
context = "\n\n".join([
    chunk["text"]
    for chunk in retrieval["retrieve_data"]["result"]
])

# Use your own LLM
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "Answer using this context: " + context},
        {"role": "user", "content": retrieval["user_query"]}
    ]
)
Test retrieval quality before full deployment:
Submit test queries
Inspect returned chunks and scores
Verify relevance of top results
Adjust corpus content if needed
Run A/B tests with different configurations
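The quality-testing steps above can be sketched as a small helper that aggregates scores across a batch of test queries. This is an illustrative sketch, not part of the API: the metric names (`mean_top_score`, `hit_rate`) are assumptions, and each `retrieval` dict is expected to have the `retrieve_data.result` shape shown in the example response.

```python
# Sketch: summarize retrieval quality across a batch of test queries.
# Assumes each `retrieval` has the retrieve_data.result shape above.

def quality_summary(retrievals, threshold=0.6):
    """Return the mean top score and the fraction of queries whose
    best chunk clears the threshold (hypothetical metric names)."""
    top_scores = []
    for retrieval in retrievals:
        chunks = retrieval["retrieve_data"]["result"]
        # An empty result list counts as a miss with score 0.
        top_scores.append(chunks[0]["score"] if chunks else 0.0)
    n = len(top_scores)
    return {
        "mean_top_score": sum(top_scores) / n if n else 0.0,
        "hit_rate": sum(s >= threshold for s in top_scores) / n if n else 0.0,
    }
```

Tracking these two numbers before and after a corpus change gives a quick signal for the A/B comparisons described above.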
Return citations without LLM generation:
{
  "answer": "See relevant documentation:",
  "sources": [
    {
      "title": "Onboarding checklist",
      "excerpt": "Before onboarding, ensure SSO is configured...",
      "url": "/docs/onboarding"
    }
  ]
}
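A payload like the one above can be assembled directly from a retrieval response. The helper below is a sketch, not an endpoint feature; it assumes the response shape documented earlier and includes `url` only when that optional metadata key is present.

```python
# Sketch: turn a /api/retrieve/ response into a citations payload.
# `build_citations` is a hypothetical helper, not part of the API.

def build_citations(retrieval, answer="See relevant documentation:"):
    sources = []
    for chunk in retrieval["retrieve_data"]["result"]:
        meta = chunk.get("metadata", {})
        source = {
            "title": meta.get("document_title", "Untitled"),
            # Truncate the chunk text to keep excerpts short.
            "excerpt": chunk["text"][:200],
        }
        if "url" in meta:  # url is optional per the metadata docs
            source["url"] = meta["url"]
        sources.append(source)
    return {"answer": answer, "sources": sources}
```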
Reduce API costs by:
Caching retrieval results
Generating answers client-side
Using cheaper LLMs with retrieved context
Implementing custom prompt logic
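The caching strategy listed above can be as simple as an in-memory map keyed by corpus and query. This is a minimal sketch under assumed names (`RetrievalCache`, `fetch`); in production you would likely use a shared cache with expiry instead of a process-local dict.

```python
import hashlib
import json

# Sketch: client-side cache for retrieval results, keyed by corpus ID
# and query text. `fetch` stands in for the HTTP call to /api/retrieve/.

class RetrievalCache:
    def __init__(self, fetch):
        self._fetch = fetch
        self._store = {}

    def _key(self, corpus_id, user_query):
        # Stable hash of the (corpus, query) pair.
        raw = json.dumps([corpus_id, user_query], sort_keys=True)
        return hashlib.sha256(raw.encode()).hexdigest()

    def get(self, corpus_id, user_query):
        key = self._key(corpus_id, user_query)
        if key not in self._store:
            self._store[key] = self._fetch(corpus_id, user_query)
        return self._store[key]
```

Repeated queries then hit the local store instead of the API, which is where most of the cost savings for high-volume workloads come from.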
Savings: ~80-90% reduction in LLM API costs for high-volume use cases.
Retrieval History
View past retrievals for a corpus:
curl -X GET "https://{your-host}/api/retrieve/?corpora_id={corpus-uuid}" \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN"
Query parameters:
order_by - Sort by created_at or -created_at (newest first, default)
start_date - Filter after this date (UTC, YYYY-MM-DD format)
end_date - Filter before this date (UTC, YYYY-MM-DD format)
page - Page number (default: 1)
page_size - Results per page (default: 20)
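The query parameters above can be assembled in a client like this. A sketch using requests; `BASE_URL` and the `history_params` helper are assumptions for illustration, not part of any official client.

```python
import os

import requests

BASE_URL = "https://your-soar-instance.com"  # replace with your host

def history_params(corpus_id, order_by="-created_at", start_date=None,
                   end_date=None, page=1, page_size=20):
    """Build the documented query-string parameters; None values are
    dropped so the server-side defaults apply."""
    params = {
        "corpora_id": corpus_id,
        "order_by": order_by,
        "start_date": start_date,
        "end_date": end_date,
        "page": page,
        "page_size": page_size,
    }
    return {k: v for k, v in params.items() if v is not None}

def fetch_history(corpus_id, **kwargs):
    response = requests.get(
        f"{BASE_URL}/api/retrieve/",
        headers={"Authorization": f"Bearer {os.environ['SOAR_LABS_TOKEN']}"},
        params=history_params(corpus_id, **kwargs),
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```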
Remove individual retrievals from history:
curl -X DELETE "https://{your-host}/api/retrieve/{retrieval-id}/" \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN"
Retrieval entries are immutable snapshots. PUT and PATCH operations return 403 Forbidden.
Error Handling
400 Bad Request
Causes:
Missing corpora or user_query fields
Corpus is not indexed (indexing_status != IND)
Invalid UUID format
Resolution:
Verify all required fields are present
Check corpus indexing status: GET /api/corpora/{id}/
Ensure corpus has indexed resources
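The indexing-status check above can be done as a guard before calling the retrieval endpoint. A sketch under assumed names (`ensure_indexed`); the corpus dict is expected to come from `GET /api/corpora/{id}/`.

```python
# Sketch: fail fast if a corpus is not fully indexed, instead of
# receiving a 400 from /api/retrieve/. `ensure_indexed` is a
# hypothetical helper, not part of the API.

def ensure_indexed(corpus):
    """Raise if the corpus is not fully indexed (status "IND")."""
    status = corpus.get("indexing_status")
    if status != "IND":
        raise ValueError(
            f"Corpus {corpus.get('id')} not indexed (status={status!r}); "
            "retrieval would return 400."
        )
    return corpus
```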
404 Not Found
Causes:
Corpus doesn’t exist
You don’t own the corpus
Resolution:
Verify corpus ID is correct
List your corpora: GET /api/corpora/
Check authentication credentials
500 Internal Server Error
Causes:
Vector database connectivity issues
Retrieval pipeline errors
Reranker service failures
Resolution:
Retry with exponential backoff
Check system health: GET /sys-health
Contact support if errors persist
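The retry-with-exponential-backoff resolution above can be sketched as a small wrapper. This is an illustration, not a prescribed client: `call` is any zero-argument function that performs the request and raises on a 5xx response, and `sleep` is injectable so the delay can be skipped in tests.

```python
import random
import time

# Sketch: retry a failing call with exponentially growing, jittered
# delays. Retries only the exception types listed in `retry_on`.

def with_backoff(call, retries=4, base_delay=0.5,
                 retry_on=(RuntimeError,), sleep=time.sleep):
    for attempt in range(retries):
        try:
            return call()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts, surface the error
            # Delay doubles each attempt, with random jitter added.
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```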
Typical Response Times:
Small corpora (< 1000 chunks): 100-300ms
Medium corpora (1000-10000 chunks): 300-800ms
Large corpora (10000+ chunks): 800-2000ms
Improve performance with these strategies:
Corpus size - Smaller, focused corpora retrieve faster
Query specificity - Precise queries require less computation
Result limits - Request fewer chunks if you don’t need many
Caching - Cache frequent queries on your end
Parallel requests - Run multiple retrievals concurrently
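The parallel-requests strategy above can be sketched with a thread pool, since the work is I/O-bound. `retrieve` here is any function taking a query string and returning the parsed response (an assumption for illustration).

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch: run several retrievals concurrently. Results come back in
# the same order as the input queries.

def retrieve_many(retrieve, queries, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(retrieve, queries))
```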
Relevance scores explained:
0.9-1.0: Highly relevant, exact match or very close
0.7-0.9: Relevant, good semantic match
0.5-0.7: Moderately relevant, partial match
< 0.5: Low relevance, consider filtering out
Tip: Set a minimum score threshold (e.g., 0.6) to filter low-quality results.
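That threshold filter can be a one-liner over the response shape documented above; `filter_chunks` is a hypothetical helper, shown as a sketch.

```python
# Sketch: drop low-scoring chunks before building a prompt, using the
# suggested 0.6 minimum score.

def filter_chunks(retrieval, min_score=0.6):
    return [
        chunk
        for chunk in retrieval["retrieve_data"]["result"]
        if chunk["score"] >= min_score
    ]
```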