Overview
Retrieval-only requests return ranked context passages without LLM generation. Use this endpoint when you want to:
Build your own prompting layer
Inspect retrieval quality before LLM generation
Implement custom response logic
Reduce costs (no LLM API calls)
Performance Benefit: Retrieval-only requests are typically 5-10x faster than full query requests since they skip LLM generation.
Events are recorded with is_query=false in the history, distinguishing them from full query executions.
Authentication
Requires valid JWT token or session authentication. You must own the target corpus.
Request Body
corpora - ID of the corpus to search. Must be fully indexed (indexing_status: "IND").
user_query - Natural-language query used to fetch relevant context passages. How retrieval works:
Query is analyzed for intent and complexity
Multiple retrieval strategies run in parallel (semantic, keyword, hybrid)
Results are reranked using ensemble methods
Top-ranked chunks are returned with metadata
Query tips:
Be specific for better precision
Use natural language (not keyword stuffing)
Phrase as questions for best results
Context is king - include relevant details
Example request
curl -X POST https://{your-host}/api/retrieve/ \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "corpora": "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd",
    "user_query": "Summarize onboarding prerequisites"
  }'
Response
id - Unique identifier for this retrieval execution.
created_at - ISO 8601 timestamp when retrieval was executed.
updated_at - Last update timestamp (usually same as created_at).
corpora - ID of the corpus that was searched.
user_query - The query used for retrieval.
retrieval_time - Total execution time in milliseconds (includes search, ranking, and reranking).
retrieve_data - Structured retrieval results with ranked context passages.
retrieve_data.result - Array of retrieved chunks, ranked by relevance.
result[].text - The actual content of the retrieved chunk.
result[].metadata - Context metadata about the source.
metadata.document_title - Title of the source document.
metadata.section_summary - AI-generated summary of the containing section.
metadata.corpora - Corpus name (normalized format).
Additional metadata may include:
excerpt_keywords - Extracted keywords
questions - Sample questions the chunk can answer
file_name - Original filename (if from file)
url - Source URL (if from web)
result[].score - Relevance score (0-1) after ensemble reranking. Higher is more relevant.
result[].rank - Zero-indexed position in ranked results (0 = most relevant).
Example Response
{
  "id": "11c7d0d5-f3bd-4dfb-9157-bf51c38e62fd",
  "created_at": "2024-09-01T13:29:06.102Z",
  "updated_at": "2024-09-01T13:29:06.102Z",
  "corpora": "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd",
  "user_query": "Summarize onboarding prerequisites",
  "retrieval_time": 311.8,
  "retrieve_data": {
    "result": [
      {
        "text": "Before onboarding, ensure SSO is configured...",
        "metadata": {
          "document_title": "Onboarding checklist",
          "section_summary": "Environment prerequisites",
          "corpora": "support_playbooks"
        },
        "score": 0.77,
        "rank": 0
      }
    ]
  }
}
Client examples
Python
TypeScript / JavaScript
Java
import os

import requests

BASE_URL = "https://your-soar-instance.com"
TOKEN = os.environ["SOAR_LABS_TOKEN"]
CORPUS_ID = "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd"

payload = {
    "corpora": CORPUS_ID,
    "user_query": "Summarize onboarding prerequisites",
}

response = requests.post(
    f"{BASE_URL}/api/retrieve/",
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=60,
)
response.raise_for_status()
retrieval = response.json()
const BASE_URL = "https://your-soar-instance.com";
const token = process.env.SOAR_LABS_TOKEN!;

async function retrieveContext(corpusId: string, userQuery: string) {
  const response = await fetch(`${BASE_URL}/api/retrieve/`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({ corpora: corpusId, user_query: userQuery }),
  });

  if (!response.ok) {
    throw new Error(`Retrieve failed: ${response.status}`);
  }

  return response.json();
}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

var BASE_URL = "https://your-soar-instance.com";
var token = System.getenv("SOAR_LABS_TOKEN");
var corpusId = "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd";

var json = "{"
    + "\"corpora\": \"" + corpusId + "\","
    + "\"user_query\": \"Summarize onboarding prerequisites\""
    + "}";

var request = HttpRequest.newBuilder(URI.create(BASE_URL + "/api/retrieve/"))
    .header("Authorization", "Bearer " + token)
    .header("Content-Type", "application/json")
    .POST(HttpRequest.BodyPublishers.ofString(json))
    .build();

var response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
if (response.statusCode() >= 400) {
    throw new RuntimeException("Retrieve failed: " + response.statusCode());
}
var body = response.body();
Use Cases
Retrieve context and use your own LLM:
# Get relevant chunks
retrieval = client.post("/api/retrieve/", json={
    "corpora": corpus_id,
    "user_query": "How do I configure SSL?"
})

# Extract context
context = "\n\n".join([
    chunk["text"]
    for chunk in retrieval["retrieve_data"]["result"]
])

# Use your own LLM
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "Answer using this context: " + context},
        {"role": "user", "content": retrieval["user_query"]}
    ]
)
Test retrieval quality before full deployment:
Submit test queries
Inspect returned chunks and scores
Verify relevance of top results
Adjust corpus content if needed
Run A/B tests with different configurations
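The quality-testing steps above can be sketched as a small helper that aggregates scores across a batch of test queries. This is an illustrative sketch, not part of the API: the metric names (`mean_top_score`, `hit_rate`) are assumptions, and each `retrieval` dict is expected to have the `retrieve_data.result` shape shown in the example response.

```python
# Sketch: summarize retrieval quality across a batch of test queries.
# Assumes each `retrieval` has the retrieve_data.result shape above.

def quality_summary(retrievals, threshold=0.6):
    """Return the mean top score and the fraction of queries whose
    best chunk clears the threshold (hypothetical metric names)."""
    top_scores = []
    for retrieval in retrievals:
        chunks = retrieval["retrieve_data"]["result"]
        # An empty result list counts as a miss with score 0.
        top_scores.append(chunks[0]["score"] if chunks else 0.0)
    n = len(top_scores)
    return {
        "mean_top_score": sum(top_scores) / n if n else 0.0,
        "hit_rate": sum(s >= threshold for s in top_scores) / n if n else 0.0,
    }
```

Tracking these two numbers before and after a corpus change gives a quick signal for the A/B comparisons described above.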
Return citations without LLM generation:
{
  "answer": "See relevant documentation:",
  "sources": [
    {
      "title": "Onboarding checklist",
      "excerpt": "Before onboarding, ensure SSO is configured...",
      "url": "/docs/onboarding"
    }
  ]
}
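A payload like the one above can be assembled directly from a retrieval response. The helper below is a sketch, not an endpoint feature; it assumes the response shape documented earlier and includes `url` only when that optional metadata key is present.

```python
# Sketch: turn a /api/retrieve/ response into a citations payload.
# `build_citations` is a hypothetical helper, not part of the API.

def build_citations(retrieval, answer="See relevant documentation:"):
    sources = []
    for chunk in retrieval["retrieve_data"]["result"]:
        meta = chunk.get("metadata", {})
        source = {
            "title": meta.get("document_title", "Untitled"),
            # Truncate the chunk text to keep excerpts short.
            "excerpt": chunk["text"][:200],
        }
        if "url" in meta:  # url is optional per the metadata docs
            source["url"] = meta["url"]
        sources.append(source)
    return {"answer": answer, "sources": sources}
```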
Reduce API costs by:
Caching retrieval results
Generating answers client-side
Using cheaper LLMs with retrieved context
Implementing custom prompt logic
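The caching strategy listed above can be as simple as an in-memory map keyed by corpus and query. This is a minimal sketch under assumed names (`RetrievalCache`, `fetch`); in production you would likely use a shared cache with expiry instead of a process-local dict.

```python
import hashlib
import json

# Sketch: client-side cache for retrieval results, keyed by corpus ID
# and query text. `fetch` stands in for the HTTP call to /api/retrieve/.

class RetrievalCache:
    def __init__(self, fetch):
        self._fetch = fetch
        self._store = {}

    def _key(self, corpus_id, user_query):
        # Stable hash of the (corpus, query) pair.
        raw = json.dumps([corpus_id, user_query], sort_keys=True)
        return hashlib.sha256(raw.encode()).hexdigest()

    def get(self, corpus_id, user_query):
        key = self._key(corpus_id, user_query)
        if key not in self._store:
            self._store[key] = self._fetch(corpus_id, user_query)
        return self._store[key]
```

Repeated queries then hit the local store instead of the API, which is where most of the cost savings for high-volume workloads come from.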
Savings: ~80-90% reduction in LLM API costs for high-volume use cases.
Retrieval History
View past retrievals for a corpus:
curl -X GET "https://{your-host}/api/retrieve/?corpora_id={corpus-uuid}" \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN"
Query parameters:
order_by - Sort by created_at or -created_at (newest first, default)
start_date - Filter after this date (UTC, YYYY-MM-DD format)
end_date - Filter before this date (UTC, YYYY-MM-DD format)
page - Page number (default: 1)
page_size - Results per page (default: 20)
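The query parameters above can be assembled in a client like this. A sketch using requests; `BASE_URL` and the `history_params` helper are assumptions for illustration, not part of any official client.

```python
import os

import requests

BASE_URL = "https://your-soar-instance.com"  # replace with your host

def history_params(corpus_id, order_by="-created_at", start_date=None,
                   end_date=None, page=1, page_size=20):
    """Build the documented query-string parameters; None values are
    dropped so the server-side defaults apply."""
    params = {
        "corpora_id": corpus_id,
        "order_by": order_by,
        "start_date": start_date,
        "end_date": end_date,
        "page": page,
        "page_size": page_size,
    }
    return {k: v for k, v in params.items() if v is not None}

def fetch_history(corpus_id, **kwargs):
    response = requests.get(
        f"{BASE_URL}/api/retrieve/",
        headers={"Authorization": f"Bearer {os.environ['SOAR_LABS_TOKEN']}"},
        params=history_params(corpus_id, **kwargs),
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```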
Remove individual retrievals from history:
curl -X DELETE "https://{your-host}/api/retrieve/{retrieval-id}/" \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN"
Retrieval entries are immutable snapshots. PUT and PATCH operations return 403 Forbidden.
Error Handling
400 Bad Request
Causes:
Missing corpora or user_query fields
Corpus is not indexed (indexing_status != IND)
Invalid UUID format
Resolution:
Verify all required fields are present
Check corpus indexing status: GET /api/corpora/{id}/
Ensure corpus has indexed resources
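The indexing-status check above can be done as a guard before calling the retrieval endpoint. A sketch under assumed names (`ensure_indexed`); the corpus dict is expected to come from `GET /api/corpora/{id}/`.

```python
# Sketch: fail fast if a corpus is not fully indexed, instead of
# receiving a 400 from /api/retrieve/. `ensure_indexed` is a
# hypothetical helper, not part of the API.

def ensure_indexed(corpus):
    """Raise if the corpus is not fully indexed (status "IND")."""
    status = corpus.get("indexing_status")
    if status != "IND":
        raise ValueError(
            f"Corpus {corpus.get('id')} not indexed (status={status!r}); "
            "retrieval would return 400."
        )
    return corpus
```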
404 Not Found
Causes:
Corpus doesn’t exist
You don’t own the corpus
Resolution:
Verify corpus ID is correct
List your corpora: GET /api/corpora/
Check authentication credentials
500 Internal Server Error
Causes:
Vector database connectivity issues
Retrieval pipeline errors
Reranker service failures
Resolution:
Retry with exponential backoff
Check system health: GET /sys-health
Contact support if errors persist
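The retry-with-exponential-backoff resolution above can be sketched as a small wrapper. This is an illustration, not a prescribed client: `call` is any zero-argument function that performs the request and raises on a 5xx response, and `sleep` is injectable so the delay can be skipped in tests.

```python
import random
import time

# Sketch: retry a failing call with exponentially growing, jittered
# delays. Retries only the exception types listed in `retry_on`.

def with_backoff(call, retries=4, base_delay=0.5,
                 retry_on=(RuntimeError,), sleep=time.sleep):
    for attempt in range(retries):
        try:
            return call()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts, surface the error
            # Delay doubles each attempt, with random jitter added.
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```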
Typical Response Times:
Small corpora (< 1000 chunks): 100-300ms
Medium corpora (1000-10000 chunks): 300-800ms
Large corpora (10000+ chunks): 800-2000ms
Improve performance with these strategies:
Corpus size - Smaller, focused corpora retrieve faster
Query specificity - Precise queries require less computation
Result limits - Request fewer chunks if you don’t need many
Caching - Cache frequent queries on your end
Parallel requests - Run multiple retrievals concurrently
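The parallel-requests strategy above can be sketched with a thread pool, since the work is I/O-bound. `retrieve` here is any function taking a query string and returning the parsed response (an assumption for illustration).

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch: run several retrievals concurrently. Results come back in
# the same order as the input queries.

def retrieve_many(retrieve, queries, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(retrieve, queries))
```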
Relevance scores explained:
0.9-1.0: Highly relevant, exact match or very close
0.7-0.9: Relevant, good semantic match
0.5-0.7: Moderately relevant, partial match
< 0.5: Low relevance, consider filtering out
Tip: Set a minimum score threshold (e.g., 0.6) to filter low-quality results.
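That threshold filter can be a one-liner over the response shape documented above; `filter_chunks` is a hypothetical helper, shown as a sketch.

```python
# Sketch: drop low-scoring chunks before building a prompt, using the
# suggested 0.6 minimum score.

def filter_chunks(retrieval, min_score=0.6):
    return [
        chunk
        for chunk in retrieval["retrieve_data"]["result"]
        if chunk["score"] >= min_score
    ]
```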