Upload a File
Attach one or more files to a corpus so they can be processed and indexed.
POST
Overview
This endpoint handles binary file uploads and triggers asynchronous ingestion (content extraction, chunking, and vector indexing). Only corpora you own can accept uploads, and files must be in a supported format.
Supported Formats: PDF, DOCX, TXT, CSV, JSON, Markdown (MD), Excel (XLSX/XLS), HTML/HTM, LOG files
Authentication
Requires a valid JWT token or session authentication. You must be the owner of the target corpus.
Request Body
ID of the corpus that will own these files. Must be a corpus you created and have access to.
One or more files to upload. Send as multipart/form-data with multiple files fields.
Processing pipeline:
- File validation (format, size)
- Upload to cloud storage
- Content extraction (text, tables, images)
- Chunking and metadata generation
- Vector embedding and indexing
Example request
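The request described above could be assembled roughly as follows. This is a minimal sketch using only the Python standard library, not the official client. The URL `https://api.example.com/files/` and the form field name `corpus` are placeholders assumed for illustration, since this page does not state them; the `files` field name follows the request body description, and the Bearer header follows the Authorizations section.

```python
import uuid
import urllib.request

# Hypothetical endpoint: the real path is not given on this page.
UPLOAD_URL = "https://api.example.com/files/"


def build_upload_request(token, corpus_id, files, url=UPLOAD_URL):
    """Build a multipart/form-data POST carrying one or more files.

    `files` is a list of (filename, content_bytes) tuples.
    """
    boundary = uuid.uuid4().hex
    parts = []

    # Form field with the target corpus ID ("corpus" is an assumed name).
    corpus_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="corpus"\r\n\r\n'
        f"{corpus_id}\r\n"
    )
    parts.append(corpus_part.encode("utf-8"))

    # One "files" field per uploaded file.
    for filename, content in files:
        head = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        parts.append(head.encode("utf-8") + content + b"\r\n")

    # Closing boundary marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        url,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately from sending it keeps the multipart layout easy to inspect before any network call is made.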
Response
Returns an array of file objects (one for each uploaded file):
Unique identifier for the file resource. Use for tracking, retrieval, or deletion.
ISO 8601 timestamp when the file was uploaded.
Last update timestamp. Changes when indexing status updates.
Timestamp when indexing completed successfully. null while processing.
Current processing status of the file:
- PRS - Processing (extraction and indexing in progress)
- IND - Indexed (ready for queries)
- ERR - Error (processing failed, check logs)
- PND - Pending (queued for processing)
Original filename as uploaded.
Detected file extension/type (e.g., pdf, docx, csv).
Cloud storage URL where the file is persisted. Access requires authentication.
File size in bytes.
ID of the parent corpus containing this file.
Example Response
Post-Upload Operations
Monitor Indexing Status
Poll the files endpoint to track processing progress:
Status progression: PND → PRS → IND (success) or ERR (failure)
Typical processing times:
- Small text files (< 1MB): 5-15 seconds
- PDFs with images (5-10MB): 30-60 seconds
- Large documents (20MB+): 2-5 minutes
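A polling loop for the status progression above might look like the sketch below. `fetch_status` is a hypothetical caller-supplied function that takes a file ID and returns the current status code as a string; how it calls the files endpoint is left open, because this page does not give the path. The 300-second default timeout is an assumption based on the upper end of the typical processing times.

```python
import time

# IND means success and ERR means failure, per the status progression.
TERMINAL_STATUSES = {"IND", "ERR"}


def wait_for_indexing(fetch_status, file_id, interval=10.0, timeout=300.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the file reaches IND or ERR, or raise TimeoutError.

    `sleep` and `clock` are injectable so the loop can be exercised
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status(file_id)
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"file {file_id} still {status} after {timeout} seconds"
            )
        sleep(interval)
```

Any code outside the terminal set, including the extra values listed in the response schema, is treated as still in progress here, which is an assumption rather than documented behavior.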
Retrieve File Details
Get detailed information about a specific file:
Returns full metadata including chunking statistics and extraction details.
Delete Files
Remove files from the corpus and vector index:
Important: Deletion is immediate and irreversible. The operation:
- Removes the file from cloud storage
- Deletes all associated vector embeddings
- Updates corpus size metadata
- Cannot be undone - you’ll need to re-upload if deleted accidentally
Handle Processing Errors
If a file shows indexing_status: "ERR", common causes include:
- Corrupted or invalid file format - Re-export and try again
- Unsupported encoding - Convert to UTF-8 for text files
- Password-protected PDFs - Remove protection before uploading
- Extremely large files - Split into smaller chunks
- Unsupported content - Check if file type is in supported list
Best Practices
Optimize File Preparation
Before uploading:
- Remove unnecessary pages - Reduce file size by excluding cover pages, blank pages
- Use OCR for scanned PDFs - Convert image-based PDFs to searchable text
- Clean up formatting - Remove excessive whitespace, fix broken tables
- Verify encoding - Ensure text files use UTF-8 encoding
- Test file opens - Verify files aren’t corrupted before upload
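Some of these checks can be automated before upload. The sketch below covers only two of them: whether the extension appears in the Supported Formats list, and whether text files decode as UTF-8. The function name `preflight` and the choice of which extensions count as text are assumptions for illustration; the service's own validation may differ.

```python
from pathlib import Path

# Extensions taken from the Supported Formats list in the Overview.
SUPPORTED_EXTENSIONS = {
    "pdf", "docx", "txt", "csv", "json", "md",
    "xlsx", "xls", "html", "htm", "log",
}
# Assumed subset that should be plain UTF-8 text.
TEXT_EXTENSIONS = {"txt", "csv", "json", "md", "html", "htm", "log"}


def preflight(path):
    """Return a list of problems; an empty list means no issue was found."""
    path = Path(path)
    problems = []
    ext = path.suffix.lower().lstrip(".")

    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if not path.is_file():
        problems.append("file does not exist")
        return problems

    if ext in TEXT_EXTENSIONS:
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            problems.append("text is not valid UTF-8")
    return problems
```

An empty result does not guarantee that ingestion will succeed. It only rules out two of the failure causes listed on this page.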
Organize by Corpus
Create separate corpora for different content types or use cases:
- Internal Documentation - Company policies, procedures
- Product Knowledge - Technical specs, user guides
- Customer Support - FAQs, troubleshooting guides
- Training Materials - Onboarding docs, tutorials
Monitor Upload Queue
For bulk uploads:
- Upload in reasonable batches (10-20 files per request)
- Poll status every 10-30 seconds
- Implement exponential backoff if servers are busy
- Log file IDs for tracking and error recovery
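The batching and backoff advice above can be sketched with two small helpers. The names `batches` and `with_backoff`, the retry count, and the delay bounds are illustrative assumptions. Treating every exception as "server busy" is a simplification; a real client would retry only on specific errors.

```python
import random
import time


def batches(items, size=10):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def with_backoff(call, retries=5, base_delay=1.0, max_delay=30.0,
                 sleep=time.sleep):
    """Run `call()`, retrying with exponentially growing, jittered delays."""
    for attempt in range(retries):
        try:
            return call()
        except Exception:
            # Give up and re-raise after the final attempt.
            if attempt == retries - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from concurrent clients.
            sleep(delay + random.uniform(0, delay / 2))
```

A bulk upload could then loop over `batches(paths, size=10)` and wrap each upload call in `with_backoff`, recording the returned file IDs after every batch.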
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
Response
201 - application/json
The date and time the file was created
Last updated time
- PND - Pending
- IQE - In Queue
- PRS - Processing
- DEX - Data Extracted Successfully
- DER - Data Extraction Error
- IND - Indexed
- CMP - Completed
- ERR - Error
Available options: PND, IQE, PRS, DEX, DER, IND, CMP, ERR
Original, user-facing name of the uploaded file.
Type of the file
Location of the file on Remote Storage
bytes
Corpus to which the file maps

