DELETE /api/data/files/{id}

Overview

Permanently delete a file resource along with its stored binary, metadata, and all derived vector embeddings. This immediately removes the file’s content from search results and retrieval operations.
Irreversible: Deletion cannot be undone. The file, its storage blob, and all vector embeddings are permanently removed.
Use cases: Removing outdated documentation, retracting superseded content, managing storage costs, or complying with data deletion requests.

Authentication

Requires a valid JWT token or session authentication. You must own the parent corpus of the file.
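The Python examples on this page pass a `headers` dictionary without showing how it is built. A minimal sketch, assuming the JWT is stored in the `SOAR_LABS_TOKEN` environment variable (the name used in the curl example); `build_auth_headers` is a hypothetical helper, not part of any SDK:

```python
import os


def build_auth_headers(env=None):
    """Build the Bearer authorization header from an environment mapping.

    Hypothetical helper: reads SOAR_LABS_TOKEN and fails early if it is missing.
    """
    env = os.environ if env is None else env
    token = env.get("SOAR_LABS_TOKEN")
    if not token:
        raise RuntimeError("SOAR_LABS_TOKEN is not set")
    return {"Authorization": f"Bearer {token}"}
```

Calling `headers = build_auth_headers()` once and reusing the result keeps the token out of the source code.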

Path Parameters

id (UUID, required)
File resource identifier. Must belong to a corpus you own.
Example: 0f75c73e-91bb-4e2b-9ff2-6820a8636ad8
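Because the identifier must be a UUID, a client can reject malformed values before sending an irreversible request. A minimal sketch using the standard-library `uuid` module; `is_canonical_uuid` is a hypothetical helper name, and whether the server also accepts non-hyphenated forms is not stated here, so the check below is deliberately strict:

```python
import uuid


def is_canonical_uuid(value):
    """Return True only for hyphenated UUID strings such as the example above."""
    if not isinstance(value, str):
        return False
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        return False
    # uuid.UUID also parses braces, "urn:uuid:" prefixes, and bare hex;
    # comparing against the canonical form rejects those variants.
    return str(parsed) == value.lower()
```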

Example request

curl -X DELETE https://{your-host}/api/data/files/0f75c73e-91bb-4e2b-9ff2-6820a8636ad8/ \
  -H "Authorization: Bearer $SOAR_LABS_TOKEN"

Response Codes

204 No Content
File successfully deleted. No response body returned.
404 Not Found
File does not exist or belongs to a corpus you don’t own.
409 Conflict
Backend could not delete the storage blob or vector embeddings. A retry usually succeeds once the storage subsystem catches up.
Common causes:
  • Storage service temporarily unavailable
  • Vector database connection timeout
  • Concurrent deletion attempt
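The three documented status codes call for different client reactions. A minimal sketch of that mapping; the function name and the returned labels are illustrative choices, not part of the API:

```python
def interpret_delete_status(status_code):
    """Map a DELETE response status to a suggested client action."""
    if status_code == 204:
        return "deleted"      # success, nothing to parse
    if status_code == 404:
        return "not_found"    # missing, already deleted, or not owned by the caller
    if status_code == 409:
        return "retry"        # backend conflict, usually transient
    return "unexpected"       # anything else should be surfaced to the caller
```

Treating 404 as a terminal result rather than an error can make repeated cleanup jobs idempotent, since a file that is already gone needs no further action.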

What Gets Deleted

  • File record - Database entry with metadata
  • Storage blob - Actual file binary in cloud/local storage
  • Vector embeddings - All chunks removed from Qdrant
  • Indexing metadata - Processing status and timestamps
Immediate effect: Deletion prevents the file from appearing in future retrievals. Vector embeddings are removed before the database record is deleted.
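One way to confirm the effect described above is to request the file again after deleting it. The sketch below assumes that a GET on `/api/data/files/{id}/` returns 404 once the file is gone; that behavior is an assumption for the GET endpoint and is not documented on this page. `file_is_gone` is a hypothetical helper that takes the HTTP function as a parameter, so `requests.get` can be passed in directly:

```python
def file_is_gone(http_get, base_url, headers, file_id):
    """Return True if fetching the file now yields 404.

    http_get is any callable with the signature of requests.get.
    Assumption: the file detail endpoint answers 404 for deleted files.
    """
    response = http_get(
        f"{base_url}/api/data/files/{file_id}/",
        headers=headers,
        timeout=30,
    )
    return response.status_code == 404
```

Typical usage would be `file_is_gone(requests.get, base_url, headers, file_id)` right after a 204 response.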

Common Use Cases

Check file details before permanently deleting:
import requests

# base_url, headers, and file_id are assumed to be defined by the caller.
# Get file info first
response = requests.get(
    f"{base_url}/api/data/files/{file_id}/",
    headers=headers,
    timeout=30,
)
response.raise_for_status()
file = response.json()

print(f"File: {file['metadata'].get('filename', 'Unknown')}")
print(f"Size: {file['file_size'] / 1024:.2f} KB")
print(f"Uploaded: {file['created_at']}")
print(f"Status: {file['indexing_status']}")

# Confirm deletion
if input("Delete this file? (yes/no): ").strip().lower() == "yes":
    response = requests.delete(
        f"{base_url}/api/data/files/{file_id}/",
        headers=headers,
        timeout=30,
    )
    if response.status_code == 204:
        print("File deleted successfully")
    else:
        print(f"Deletion failed with status {response.status_code}")
Remove multiple files in one pass:
import requests

# Get all files in corpus (base_url, headers, and corpus_id are defined by the caller)
response = requests.get(
    f"{base_url}/api/data/files/?corpora={corpus_id}",
    headers=headers,
    timeout=30,
)
response.raise_for_status()
files = response.json()

# Delete files matching criteria
deleted_count = 0
for file in files["results"]:
    # Example: delete old files
    if not should_delete(file):  # Your custom logic
        continue
    try:
        response = requests.delete(
            f"{base_url}/api/data/files/{file['id']}/",
            headers=headers,
            timeout=30,
        )
    except requests.RequestException as e:
        print(f"Failed: {e}")
        continue
    if response.status_code == 204:
        deleted_count += 1
        print(f"Deleted: {file['metadata'].get('filename')}")
    else:
        print(f"Not deleted ({response.status_code}): {file['id']}")

print(f"Total deleted: {deleted_count}")
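The `should_delete` call in the loop above is left as custom logic. One possible implementation is an age cutoff based on the `created_at` field. This is a sketch only: it assumes `created_at` is an ISO 8601 timestamp, which this page does not specify, and the `max_age_days` and `now` parameters are illustrative:

```python
from datetime import datetime, timedelta, timezone


def should_delete(file, max_age_days=365, now=None):
    """Return True if the file record is older than max_age_days.

    Assumption: file["created_at"] is an ISO 8601 string, optionally ending in "Z".
    """
    now = now or datetime.now(timezone.utc)
    # Older Python versions do not parse a trailing "Z", so normalize it first.
    created = datetime.fromisoformat(file["created_at"].replace("Z", "+00:00"))
    if created.tzinfo is None:
        # Treat naive timestamps as UTC so the subtraction below is valid.
        created = created.replace(tzinfo=timezone.utc)
    return now - created > timedelta(days=max_age_days)
```

Passing `now` explicitly keeps the cutoff stable across a long-running batch and makes the rule easy to test.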
Remove old version before uploading new:
import requests

# Step 1: Delete old file and confirm it succeeded
response = requests.delete(
    f"{base_url}/api/data/files/{old_file_id}/",
    headers=headers,
    timeout=30,
)
if response.status_code != 204:
    raise RuntimeError(f"Delete failed with status {response.status_code}")

# Step 2: Upload new version
with open("updated_document.pdf", "rb") as f:
    files = {"files": ("updated_document.pdf", f, "application/pdf")}
    data = {"corpora": corpus_id}

    response = requests.post(
        f"{base_url}/api/data/files/",
        headers={"Authorization": f"Bearer {token}"},
        files=files,
        data=data,
        timeout=60,
    )
response.raise_for_status()

new_file = response.json()[0]
print(f"New file ID: {new_file['id']}")
Best practice: Track file versions by naming convention or metadata to easily identify and replace outdated content.
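As an illustration of a naming convention, the sketch below assumes filenames end in `_v<number>` before the extension (for example `guide_v2.pdf`) and that each file record exposes `id` and `metadata.filename` as in the examples above. Both the convention and the `find_outdated` helper are hypothetical:

```python
import re
from collections import defaultdict

# Hypothetical convention: "<stem>_v<number><optional extension>"
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)_v(?P<version>\d+)(?P<ext>\.[^.]+)?$")


def find_outdated(files):
    """Return IDs of every file that is not the highest version of its document."""
    groups = defaultdict(list)
    for record in files:
        name = record.get("metadata", {}).get("filename", "")
        match = VERSION_PATTERN.match(name)
        if not match:
            continue  # files outside the convention are never selected
        key = (match["stem"], match["ext"] or "")
        groups[key].append((int(match["version"]), record["id"]))

    outdated = []
    for versions in groups.values():
        versions.sort()  # numeric order, so v10 sorts after v2
        outdated.extend(file_id for _, file_id in versions[:-1])
    return outdated
```

The returned IDs can then be passed one by one to the delete endpoint.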
Implement retry logic for 409 errors:
import time

import requests

def safe_delete_file(base_url, headers, file_id, max_retries=3):
    """Delete a file, retrying with exponential backoff on 409 or network errors."""
    for attempt in range(max_retries):
        try:
            response = requests.delete(
                f"{base_url}/api/data/files/{file_id}/",
                headers=headers,
                timeout=60
            )
        except requests.RequestException as e:
            # Network-level failure: treat as retryable
            print(f"Request error: {e}")
        else:
            if response.status_code == 204:
                return True
            if response.status_code == 404:
                print("File not found or already deleted")
                return False
            if response.status_code != 409:
                # Unexpected status: retrying is unlikely to help
                print(f"Unexpected status: {response.status_code}")
                return False
            print("Conflict while deleting")

        if attempt < max_retries - 1:
            wait_time = 2 ** attempt
            print(f"Retrying in {wait_time}s...")
            time.sleep(wait_time)

    print("Deletion failed after retries")
    return False
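The `2 ** attempt` wait in `safe_delete_file` produces delays of 1 s and 2 s for the default of three attempts. When several clients retry at once, a cap and some random jitter can spread the load. The sketch below only computes the schedule; `backoff_delays` and its parameters are illustrative and not part of the API:

```python
import random


def backoff_delays(max_retries, base=1.0, cap=30.0, jitter=0.0, rng=random.random):
    """Return the wait in seconds before each retry (max_retries - 1 waits).

    Each delay doubles from `base`, is limited to `cap`, and is then
    stretched by up to `jitter` (a fraction, e.g. 0.5 for +50%).
    """
    delays = []
    for attempt in range(max_retries - 1):
        delay = min(cap, base * 2 ** attempt)
        delay += delay * jitter * rng()
        delays.append(delay)
    return delays
```

With the defaults, `backoff_delays(3)` reproduces the schedule used in the retry function above.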
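Instead of deleting, move files to an archive corpus:
import requests

# Step 1: Download file metadata
file = requests.get(
    f"{base_url}/api/data/files/{file_id}/",
    headers=headers,
    timeout=30,
).json()

# Step 2: Download the file blob
# (Assuming you have a file download endpoint)
# This step must set local_path, filename, and content_type for Step 3.

# Step 3: Re-upload to archive corpus
with open(local_path, "rb") as f:
    files = {"files": (filename, f, content_type)}
    data = {"corpora": archive_corpus_id}

    upload = requests.post(
        f"{base_url}/api/data/files/",
        headers={"Authorization": f"Bearer {token}"},
        files=files,
        data=data,
        timeout=60,
    )
# Stop here if the archive upload failed, so the original is not lost
upload.raise_for_status()

# Step 4: Delete from original corpus
response = requests.delete(
    f"{base_url}/api/data/files/{file_id}/",
    headers=headers,
    timeout=30,
)

if response.status_code == 204:
    print("File archived successfully")
else:
    print(f"Archived copy uploaded, but delete returned {response.status_code}")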
Benefits:
  • Preserves data for potential future use
  • Maintains audit trail
  • Can restore if needed
Storage optimization: Regular cleanup of obsolete files reduces storage costs and improves corpus query performance.

Client examples

import os
import requests

BASE_URL = "https://your-soar-instance.com"
TOKEN = os.environ["SOAR_LABS_TOKEN"]
FILE_ID = "0f75c73e-91bb-4e2b-9ff2-6820a8636ad8"

response = requests.delete(
    f"{BASE_URL}/api/data/files/{FILE_ID}/",
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
if response.status_code != 204:
    response.raise_for_status()

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

id (string<uuid>, required)

A UUID string identifying this File.

Response

204

No response body