POST /api/retrieve/

Overview

Retrieval-only requests return ranked context passages without LLM generation. Use this endpoint when you want to:
  • Build your own prompting layer
  • Inspect retrieval quality before LLM generation
  • Implement custom response logic
  • Reduce costs (no LLM API calls)
Performance Benefit: Retrieval-only requests are typically 5-10x faster than full query requests since they skip LLM generation.
Events are recorded with is_query=false in the history, distinguishing them from full query executions.

Authentication

Requires a valid JWT token or session authentication. You must own the target corpus.

Request Body

corpora
UUID
required
ID of the corpus to search. Must be fully indexed (indexing_status: "IND").
user_query
string
required
Natural-language query used to fetch relevant context passages.
How retrieval works:
  1. Query is analyzed for intent and complexity
  2. Multiple retrieval strategies run in parallel (semantic, keyword, hybrid)
  3. Results are reranked using ensemble methods
  4. Top-ranked chunks are returned with metadata
Query tips:
  • Be specific for better precision
  • Use natural language (not keyword stuffing)
  • Phrase as questions for best results
  • Context is king - include relevant details

Example request

curl -X POST https://{your-host}/api/retrieve/ \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "corpora": "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd",
    "user_query": "Summarize onboarding prerequisites"
  }'

Response

id
UUID
Unique identifier for this retrieval execution.
created_at
timestamp
ISO 8601 timestamp when retrieval was executed.
updated_at
timestamp
Last update timestamp (usually same as created_at).
corpora
UUID
ID of the corpus that was searched.
user_query
string
The query used for retrieval.
retrieval_time
float
Total execution time in milliseconds (includes search, ranking, and reranking).
retrieve_data
object
Structured retrieval results with ranked context passages.

Example Response

{
  "id": "11c7d0d5-f3bd-4dfb-9157-bf51c38e62fd",
  "created_at": "2024-09-01T13:29:06.102Z",
  "updated_at": "2024-09-01T13:29:06.102Z",
  "corpora": "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd",
  "user_query": "Summarize onboarding prerequisites",
  "retrieval_time": 311.8,
  "retrieve_data": {
    "result": [
      {
        "text": "Before onboarding, ensure SSO is configured...",
        "metadata": {
          "document_title": "Onboarding checklist",
          "section_summary": "Environment prerequisites",
          "corpora": "support_playbooks"
        },
        "score": 0.77,
        "rank": 0
      }
    ]
  }
}

Client examples

import os
import requests

BASE_URL = "https://your-soar-instance.com"
TOKEN = os.environ["SOAR_LABS_TOKEN"]
CORPUS_ID = "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd"

payload = {
    "corpora": CORPUS_ID,
    "user_query": "Summarize onboarding prerequisites",
}

response = requests.post(
    f"{BASE_URL}/api/retrieve/",
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=60,
)
response.raise_for_status()
retrieval = response.json()
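The parsed response mirrors the example response above; assuming retrieve_data follows that shape, you can print the ranked passages directly:

for chunk in retrieval["retrieve_data"]["result"]:
    print(chunk["rank"], round(chunk["score"], 2), chunk["text"][:80])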

Use Cases

Retrieve context and use your own LLM:
# Get relevant chunks (reuses BASE_URL, TOKEN, and CORPUS_ID from the client example)
resp = requests.post(
    f"{BASE_URL}/api/retrieve/",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "corpora": CORPUS_ID,
        "user_query": "How do I configure SSL?",
    },
    timeout=60,
)
resp.raise_for_status()
retrieval = resp.json()

# Extract context
context = "\n\n".join(
    chunk["text"]
    for chunk in retrieval["retrieve_data"]["result"]
)

# Use your own LLM (OpenAI's client shown as one option)
from openai import OpenAI

llm = OpenAI()
response = llm.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "Answer using this context: " + context},
        {"role": "user", "content": retrieval["user_query"]},
    ],
)
Test retrieval quality before full deployment (a sketch follows these steps):
  1. Submit test queries
  2. Inspect returned chunks and scores
  3. Verify relevance of top results
  4. Adjust corpus content if needed
  5. Run A/B tests with different configurations
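A minimal sketch of steps 1-3, reusing the imports and constants from the client example above; the 0.6 review threshold is an arbitrary starting point, not a platform rule:

test_queries = [
    "Summarize onboarding prerequisites",
    "How do I configure SSL?",
]

for query in test_queries:
    resp = requests.post(
        f"{BASE_URL}/api/retrieve/",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"corpora": CORPUS_ID, "user_query": query},
        timeout=60,
    )
    resp.raise_for_status()
    chunks = resp.json()["retrieve_data"]["result"]
    # Flag queries whose best chunk scores below the review bar
    top = max((c["score"] for c in chunks), default=0.0)
    marker = "  <- review corpus content" if top < 0.6 else ""
    print(f"{query!r}: top score {top:.2f}{marker}")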
Return citations without LLM generation:
{
  "answer": "See relevant documentation:",
  "sources": [
    {
      "title": "Onboarding checklist",
      "excerpt": "Before onboarding, ensure SSO is configured...",
      "url": "/docs/onboarding"
    }
  ]
}
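A sketch of building that payload from a retrieval result; note the url field in the JSON above would come from your own document-to-URL mapping, not from the retrieval payload, so it is omitted here:

citation_payload = {
    "answer": "See relevant documentation:",
    "sources": [
        {
            "title": chunk["metadata"]["document_title"],
            "excerpt": chunk["text"][:120],
        }
        for chunk in retrieval["retrieve_data"]["result"]
    ],
}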
Reduce API costs by:
  • Caching retrieval results (see the sketch after this list)
  • Generating answers client-side
  • Using cheaper LLMs with retrieved context
  • Implementing custom prompt logic
Savings: ~80-90% reduction in LLM API costs for high-volume use cases.
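As one illustration of caching, a small in-memory cache that skips the network round-trip for repeated queries (cached_retrieve is a hypothetical helper; cache size is arbitrary; reuses the requests setup above):

from functools import lru_cache

@lru_cache(maxsize=1024)
def cached_retrieve(corpus_id: str, user_query: str) -> str:
    # Return the concatenated context for a (corpus, query) pair;
    # identical calls are served from memory after the first request.
    resp = requests.post(
        f"{BASE_URL}/api/retrieve/",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"corpora": corpus_id, "user_query": user_query},
        timeout=60,
    )
    resp.raise_for_status()
    chunks = resp.json()["retrieve_data"]["result"]
    return "\n\n".join(chunk["text"] for chunk in chunks)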

Retrieval History

View past retrievals for a corpus:
curl -X GET "https://{your-host}/api/retrieve/?corpora_id={corpus-uuid}" \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN"
Query parameters:
  • order_by - Sort by created_at or -created_at (default: -created_at, newest first)
  • start_date - Filter after this date (UTC, YYYY-MM-DD format)
  • end_date - Filter before this date (UTC, YYYY-MM-DD format)
  • page - Page number (default: 1)
  • page_size - Results per page (default: 20)
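Putting these together in Python, a history call with filters and paging might look like this (parameter names as listed above; reuses the client-example constants):

history = requests.get(
    f"{BASE_URL}/api/retrieve/",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={
        "corpora_id": CORPUS_ID,
        "order_by": "-created_at",  # newest first (the default)
        "start_date": "2024-01-01",
        "page": 1,
        "page_size": 20,
    },
    timeout=60,
)
history.raise_for_status()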
Remove individual retrievals from history:
curl -X DELETE "https://{your-host}/api/retrieve/{retrieval-id}/" \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN"
Retrieval entries are immutable snapshots. PUT and PATCH operations return 403 Forbidden.

Error Handling

400 Bad Request

Causes:
  • Missing corpora or user_query fields
  • Corpus is not indexed (indexing_status != IND)
  • Invalid UUID format
Resolution:
  • Verify all required fields are present
  • Check corpus indexing status: GET /api/corpora/{id}/
  • Ensure corpus has indexed resources

404 Not Found

Causes:
  • Corpus doesn’t exist
  • You don’t own the corpus
Resolution:
  • Verify corpus ID is correct
  • List your corpora: GET /api/corpora/
  • Check authentication credentials

500 Internal Server Error

Causes:
  • Vector database connectivity issues
  • Retrieval pipeline errors
  • Reranker service failures
Resolution:
  • Retry with exponential backoff (sketched below)
  • Check system health: GET /sys-health
  • Contact support if errors persist
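A minimal backoff sketch for transient 500s, reusing the requests setup above (retrieve_with_retry is a hypothetical helper; attempt count and delays are arbitrary):

import time

def retrieve_with_retry(payload: dict, attempts: int = 4) -> dict:
    for attempt in range(attempts):
        resp = requests.post(
            f"{BASE_URL}/api/retrieve/",
            headers={"Authorization": f"Bearer {TOKEN}"},
            json=payload,
            timeout=60,
        )
        # Retry only on server-side failures; surface client errors immediately
        if resp.status_code < 500:
            resp.raise_for_status()
            return resp.json()
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
    resp.raise_for_status()  # give up: raise the last 5xx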

Performance Tips

Typical Response Times:
  • Small corpora (< 1000 chunks): 100-300ms
  • Medium corpora (1000-10000 chunks): 300-800ms
  • Large corpora (10000+ chunks): 800-2000ms
Improve performance with these strategies:
  1. Corpus size - Smaller, focused corpora retrieve faster
  2. Query specificity - Precise queries require less computation
  3. Result limits - Request fewer chunks if you don’t need many
  4. Caching - Cache frequent queries on your end
  5. Parallel requests - Run multiple retrievals concurrently (sketched below)
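For the parallel-requests strategy, a minimal sketch using a thread pool (worker count is arbitrary; reuses the requests setup above):

from concurrent.futures import ThreadPoolExecutor

queries = [
    "Summarize onboarding prerequisites",
    "How do I configure SSL?",
]

def retrieve(query: str) -> dict:
    resp = requests.post(
        f"{BASE_URL}/api/retrieve/",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"corpora": CORPUS_ID, "user_query": query},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(retrieve, queries))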
Relevance scores explained:
  • 0.9-1.0: Highly relevant, exact match or very close
  • 0.7-0.9: Relevant, good semantic match
  • 0.5-0.7: Moderately relevant, partial match
  • < 0.5: Low relevance, consider filtering out
Tip: Set a minimum score threshold (e.g., 0.6) to filter low-quality results.
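For example, dropping low-scoring chunks before building a prompt (0.6 is a suggested starting point, not a fixed rule):

MIN_SCORE = 0.6
strong_chunks = [
    chunk for chunk in retrieval["retrieve_data"]["result"]
    if chunk["score"] >= MIN_SCORE
]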

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

user_query
string
required

The query that was executed to get the results

corpora
string<uuid> | null
required

The corpora to which the query belongs

Response

201 - application/json
id
string<uuid>
required
created_at
string<date-time>
required

The date and time the retrieval record was created

updated_at
string<date-time>
required

Last updated time

user_query
string
required

The query that was executed to get the results

retrieval_time
number<double> | null
required

Time taken to process and retrieve data (in milliseconds)

retrieve_data
any | null
required

Metadata retrieved from the query

corpora
string<uuid> | null
required

The corpora to which the query belongs