Add a String

Overview

Strings are lightweight resources for ingesting raw text without creating files. The endpoint accepts batches so you can import multiple snippets in one request. Each record is queued for ingestion, then chunked and indexed for retrieval just like uploaded files.

Perfect for: Code snippets, FAQs, short documents, configuration examples, or any text content that doesn’t require file uploads.

Strings are processed asynchronously. Monitor indexing_status to track when content is ready for querying.

Authentication

Requires a valid JWT token or session authentication. You must be the owner of the target corpus.

Request Body

corpora

UUID

required

ID of the corpus that will own these text strings. Must be a corpus you created and have access to.

strings

array<object>

required

Array of text snippets to ingest. Each object represents one string resource.

Batch size recommendations:

Optimal: 10-50 strings per request
Maximum: Check your instance configuration (typically 100-200)

strings[].title

string

required

Display title for the text snippet (max 100 characters). Used for identification and search context.

strings[].string

string

required

The raw text content to index. Can be any length, but very long texts (>10,000 chars) may be split into multiple chunks.

Best practices:

Use clear, self-contained content
Include relevant context in each string
Format code blocks with proper syntax
Keep related information together

Example request

curl -X POST https://{your-host}/api/data/strings/ \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "corpora": "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd",
    "strings": [
      {"title": "Escalation policy", "string": "Escalate to L2 after 30 minutes."},
      {"title": "SLA definition", "string": "Critical tickets must be responded to within 10m."}
    ]
  }'

Response

Returns an array of string objects (one for each submitted string):

id

UUID

Unique identifier for the string resource. Use for tracking, retrieval, or deletion.

created_at

timestamp

ISO 8601 timestamp when the string was created.

updated_at

timestamp

Last update timestamp. Changes when indexing status updates.

indexed_on

timestamp | null

Timestamp when indexing completed successfully. null while processing.

indexing_status

string

Current processing status:

PND - Pending (queued for processing)
PRS - Processing (chunking and indexing in progress)
IND - Indexed (ready for queries)
ERR - Error (processing failed)

title

string

The display title you provided for this string.

string

string

The raw text content being indexed.

character_count

integer

Total character count of the string content. Useful for tracking corpus size.

corpora

UUID

ID of the parent corpus containing this string.

Example Response

[
  {
    "id": "b43bbf35-9819-47a8-8aff-d4eb4b3e8219",
    "created_at": "2024-09-01T10:19:59.709807Z",
    "updated_at": "2024-09-01T10:19:59.709922Z",
    "indexed_on": null,
    "indexing_status": "PRS",
    "title": "Escalation policy",
    "string": "Escalate to L2 after 30 minutes.",
    "character_count": 32,
    "corpora": "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd"
  },
  {
    "id": "2f3c3748-173a-4ace-875d-cff12d5d71ee",
    "created_at": "2024-09-01T10:19:59.710042Z",
    "updated_at": "2024-09-01T10:19:59.710057Z",
    "indexed_on": null,
    "indexing_status": "PRS",
    "title": "SLA definition",
    "string": "Critical tickets must be responded to within 10m.",
    "character_count": 49,
    "corpora": "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd"
  }
]
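
As a rough illustration of working with this response, the sketch below groups the returned records by indexing_status so you can see at a glance which IDs are still processing. The helper name group_by_status is hypothetical and not part of any client library; the input is a trimmed copy of the example response.

```python
import json
from collections import defaultdict


def group_by_status(records):
    """Map each indexing_status code to the ids of the records in that state."""
    groups = defaultdict(list)
    for record in records:
        groups[record["indexing_status"]].append(record["id"])
    return dict(groups)


# Trimmed copy of the example response (only the fields used here).
body = json.loads("""
[
  {"id": "b43bbf35-9819-47a8-8aff-d4eb4b3e8219", "indexing_status": "PRS"},
  {"id": "2f3c3748-173a-4ace-875d-cff12d5d71ee", "indexing_status": "PRS"}
]
""")

summary = group_by_status(body)
```

Here summary maps "PRS" to both IDs, which is a convenient starting point for deciding what still needs to be polled.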

Important Notes

Batch Validation: The entire request fails if any string in the batch has missing title or string fields. Validate all entries before submitting.
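
Because one bad entry rejects the whole batch, it can help to check the payload locally first. The following is a minimal sketch, not the server's actual validation logic: validate_batch is a hypothetical helper, the 100-character title limit comes from the Request Body section, and treating whitespace-only values as missing is an assumption.

```python
import uuid

MAX_TITLE_LENGTH = 100  # title limit stated in the Request Body section


def validate_batch(corpus_id, strings):
    """Return a list of problems; an empty list means the batch looks valid."""
    problems = []
    try:
        uuid.UUID(str(corpus_id))
    except ValueError:
        problems.append("corpora is not a valid UUID")

    for index, item in enumerate(strings):
        title = item.get("title")
        text = item.get("string")
        # Assumption: blank or whitespace-only values count as missing.
        if not isinstance(title, str) or not title.strip():
            problems.append(f"strings[{index}]: missing title")
        elif len(title) > MAX_TITLE_LENGTH:
            problems.append(
                f"strings[{index}]: title exceeds {MAX_TITLE_LENGTH} characters"
            )
        if not isinstance(text, str) or not text.strip():
            problems.append(f"strings[{index}]: missing string")
    return problems
```

Calling validate_batch(CORPUS_ID, payload["strings"]) before the POST and submitting only when the result is empty avoids a round trip for errors you can catch client-side.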

Track ingestion progress via GET /api/data/strings/?corpora={id}. Each record moves from PRS to IND when indexing completes, or to ERR if processing fails.
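
The progress check described above can be sketched as a small polling loop. This is an illustration under stated assumptions: wait_until_done is a hypothetical helper, fetch_records stands for any callable you write that returns the list of string records (for example a wrapper around the GET request above), and only IND and ERR are treated as final. If your instance reports other codes from the full status enum, extend TERMINAL_STATUSES accordingly.

```python
import time

TERMINAL_STATUSES = {"IND", "ERR"}  # Indexed / Error, per the status list above


def wait_until_done(fetch_records, interval=5.0, max_attempts=24, sleep=time.sleep):
    """Poll fetch_records() until every record is IND or ERR.

    fetch_records: caller-supplied callable returning a list of record dicts.
    Raises TimeoutError if the records are not final after max_attempts polls.
    """
    for attempt in range(max_attempts):
        records = fetch_records()
        if records and all(
            record["indexing_status"] in TERMINAL_STATUSES for record in records
        ):
            return records
        if attempt < max_attempts - 1:
            sleep(interval)  # pause before the next poll
    raise TimeoutError(f"indexing not finished after {max_attempts} polls")
```

The sleep parameter is injectable so the loop can be exercised in tests without real delays. Records that end in ERR are returned alongside indexed ones, so inspect the result before querying.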

Client examples

Python

import os
import requests

BASE_URL = "https://your-soar-instance.com"
TOKEN = os.environ["SOAR_LABS_TOKEN"]
CORPUS_ID = "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd"

payload = {
    "corpora": CORPUS_ID,
    "strings": [
        {"title": "Escalation policy", "string": "Escalate to L2 after 30 minutes."},
        {"title": "SLA definition", "string": "Critical tickets must be responded to within 10m."},
    ],
}

response = requests.post(
    f"{BASE_URL}/api/data/strings/",
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=30,
)
response.raise_for_status()
strings = response.json()

TypeScript / JavaScript

const BASE_URL = "https://your-soar-instance.com";
const token = process.env.SOAR_LABS_TOKEN!;

async function addStrings(corpusId: string) {
  const response = await fetch(`${BASE_URL}/api/data/strings/`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({
      corpora: corpusId,
      strings: [
        { title: "Escalation policy", string: "Escalate to L2 after 30 minutes." },
        { title: "SLA definition", string: "Critical tickets must be responded to within 10m." },
      ],
    }),
  });

  if (!response.ok) {
    throw new Error(`Add strings failed: ${response.status}`);
  }

  return response.json();
}

Java

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

var BASE_URL = "https://your-soar-instance.com";
var token = System.getenv("SOAR_LABS_TOKEN");
var corpusId = "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd";

var json = "{" +
    "\"corpora\":\"" + corpusId + "\"," +
    "\"strings\":[{" +
        "\"title\":\"Escalation policy\",\"string\":\"Escalate to L2 after 30 minutes.\"" +
    "},{" +
        "\"title\":\"SLA definition\",\"string\":\"Critical tickets must be responded to within 10m.\"" +
    "}]" +
"}";

var request = HttpRequest.newBuilder(URI.create(BASE_URL + "/api/data/strings/"))
    .header("Authorization", "Bearer " + token)
    .header("Content-Type", "application/json")
    .POST(HttpRequest.BodyPublishers.ofString(json))
    .build();

var response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());

if (response.statusCode() >= 400) {
    throw new RuntimeException("Add strings failed: " + response.statusCode());
}

Best Practices

Optimize Content Structure

Structure strings for optimal retrieval:

Use descriptive titles - Helps with search context and user navigation
Keep content focused - One topic per string for better semantic matching
Include relevant keywords - Natural language is best, avoid keyword stuffing
Add contextual information - Background details improve answer quality
Format code properly - Use markdown code blocks for syntax highlighting

Example - Good Structure:

{
  "title": "JWT Token Refresh Endpoint",
  "string": "To refresh an expired JWT token, send a POST request to /api/auth/token/refresh/ with your refresh token in the request body. The endpoint returns a new access token valid for 1 hour."
}

Batch Import Strategies

Efficient batch importing techniques:

Group related content - Import FAQ sets or documentation sections together
Use consistent naming - Helps with organization and search
Start small - Test with 5-10 strings before bulk importing
Monitor progress - Poll status endpoint after each batch
Handle failures gracefully - Retry failed batches with corrections

Typical workflow:

Prepare 20-50 strings in a batch
Submit batch via POST request
Wait 5-10 seconds for initial processing
Poll status endpoint until all show IND
Proceed to next batch
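
The workflow above can be sketched as follows. This is a minimal outline rather than a prescribed client: chunked and import_in_batches are hypothetical helpers, and submit and wait are placeholders you supply (for instance the POST request from the client examples and a status polling loop).

```python
def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def import_in_batches(strings, submit, wait, batch_size=20):
    """Submit strings batch by batch, finishing each batch before the next.

    submit(batch) -> list of created records (caller-supplied)
    wait(records) -> list of final records (caller-supplied)
    """
    finished = []
    for batch in chunked(strings, batch_size):
        created = submit(batch)          # step 2: POST the batch
        finished.extend(wait(created))   # steps 3-4: wait, then poll
    return finished
```

Keeping the HTTP calls behind submit and wait makes the batching logic easy to test on its own and lets you change the batch size without touching the request code.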

Common Use Cases

Ideal scenarios for string resources:

Developer Documentation:

API endpoint descriptions
Code examples and snippets
Configuration templates
Error messages and solutions

Knowledge Base:

FAQ answers
Policy statements
Product specifications
Troubleshooting steps

Training Content:

Glossary definitions
Best practice guidelines
Process documentation
Quick reference guides

Error Prevention

Avoid common issues:

Validation Errors:

Ensure every object has both title and string
Check that each title is at most 100 characters
Verify that the corpus ID is a valid UUID

Processing Failures:

Avoid extremely long strings (>50,000 chars)
Remove special control characters
Use UTF-8 encoding for text with unicode
Test special characters in small batches first
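
One way to remove control characters before submitting is sketched below. Which characters the ingestion pipeline actually rejects is not specified here, so this is an assumption: the hypothetical helper strip_control_chars drops everything in Unicode category Cc except newline and tab, which are usually worth keeping in code snippets.

```python
import unicodedata

KEEP = {"\n", "\t"}  # whitespace controls preserved for code and lists


def strip_control_chars(text):
    """Drop Unicode control characters (category Cc) except newline and tab."""
    return "".join(
        ch for ch in text
        if ch in KEEP or unicodedata.category(ch) != "Cc"
    )


cleaned = strip_control_chars("line one\x00\nline\x07 two\tend\r")
```

Note that carriage returns are also removed, so Windows-style line endings collapse to plain newlines. Ordinary non-ASCII text such as accented letters is left untouched.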

Performance Issues:

Don’t exceed your instance’s batch limit; 100 strings per request is a safe ceiling if you are unsure
Wait for previous batch to complete before sending next
Implement exponential backoff on errors
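
Exponential backoff can be sketched in a few lines. The helper retry_with_backoff below is hypothetical, and the default delays are illustrative values, not limits documented for this API. Only the exception types listed in retry_on are retried, so validation failures surface immediately; pass your HTTP library's connection and timeout exceptions as needed.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # A little random jitter keeps parallel clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
```

With the defaults, the waits are roughly 1, 2, 4, and 8 seconds before the fifth and final attempt.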

Management Operations

List All Strings

Retrieve all strings in a corpus:

curl -X GET "https://{your-host}/api/data/strings/?corpora={corpus-id}" \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN"

Supports pagination with page and page_size parameters.
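
As a small illustration of the pagination parameters, the sketch below builds the list URL with corpora, page, and page_size. The helper list_strings_url is hypothetical, and the defaults (pages starting at 1, 50 items per page) are assumptions; check your instance for the actual values.

```python
from urllib.parse import urlencode


def list_strings_url(base_url, corpus_id, page=1, page_size=50):
    """Build the list URL with corpora, page and page_size query parameters."""
    query = urlencode({"corpora": corpus_id, "page": page, "page_size": page_size})
    return f"{base_url.rstrip('/')}/api/data/strings/?{query}"


url = list_strings_url(
    "https://your-soar-instance.com",
    "8d0f0a5d-4b5e-4c09-9db6-0e9d2aa8a9fd",
    page=2,
    page_size=25,
)
```

Using urlencode rather than manual string concatenation keeps the query string correctly escaped if you later add other filter values.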

Update a String

Modify an existing string’s content or title:

curl -X PATCH "https://{your-host}/api/data/strings/{string-id}/" \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Updated Title",
    "string": "Updated content"
  }'

Note: Updates trigger re-indexing of the content.

Delete Strings

Remove strings from the corpus:

curl -X DELETE "https://{your-host}/api/data/strings/{string-id}/" \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN"

Warning: Deletion is immediate, removes the string’s vector embeddings, and cannot be undone.

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

strings

object[]

required

string

string

required

Content of the string

corpora

string<uuid>

required

Corpus to which the string belongs

title

string | null

Title of the string

Maximum string length: 100

Response

201 - application/json

id

string<uuid>

required

created_at

string<date-time>

required

The date and time the string was created

updated_at

string<date-time>

required

Last updated time

indexed_on

string<date-time> | null

required

indexing_status

enum<string>

required

PND - Pending
IQE - In Queue
PRS - Processing
DEX - Data Extracted Successfully
DER - Data Extraction Error
IND - Indexed
CMP - Completed
ERR - Error

string

string

required

Content of the string

character_count

integer | null

required

Number of characters in the string

corpora

string<uuid>

required

Corpus to which the string belongs

title

string | null

Title of the string

Maximum string length: 100
