Everything you need for production RAG

Deploy intelligent AI agents with advanced retrieval, multi-LLM support, and enterprise-grade features. All in one powerful platform.

Features that set Skald apart

Production-grade capabilities designed for real-world AI applications

Fully Customizable

Configurable RAG pipeline

Fine-tune every aspect of your RAG system through our API. Adjust chunking strategies, embedding models, reranking algorithms, and retrieval parameters to optimize for your specific use case.

Custom chunking strategies
Embedding model selection
Reranking configuration
Retrieval parameters
chat-example.ts
// Assumes `skald` is an initialized Skald client instance.
const response = await skald.chat({
  query: 'What are our Q4 goals?',
  rag_config: {
    llmProvider: 'anthropic',
    queryRewrite: { enabled: true },
    vectorSearch: {
      topK: 20,
      similarityThreshold: 0.5
    },
    reranking: {
      enabled: true,
      topK: 10
    },
    references: { enabled: true }
  }
});
End-to-End Pipeline

Advanced document parsing

We handle the entire ingestion pipeline for you, from extracting text, tables, and structure from PDFs and Word documents to chunking, embedding, and indexing. Everything is processed and ready for retrieval automatically.

PDF & Word document parsing
Table and chart extraction
Automatic chunking & embedding
Production-ready indexing
Example: financial_report_Q4.pdf
Extracted: 247 pages, 18 tables, 52 charts (text, tables, images, metadata)

STRUCTURED OUTPUT
✓ Text chunks: 1,243
✓ Embeddings generated
✓ Metadata extracted
✓ Ready for retrieval
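The chunking step above can be pictured as a simple splitter with overlap. This is a minimal character-based sketch; the chunk size and overlap shown are illustrative values, not Skald's actual defaults:

```typescript
// Split a document into overlapping character-based chunks.
// chunkSize and overlap are illustrative, not Skald's defaults.
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    // Step forward by less than a full chunk so adjacent
    // chunks share `overlap` characters of context.
    start += chunkSize - overlap;
  }
  return chunks;
}
```

Overlap keeps sentences that straddle a chunk boundary retrievable from either side.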
Trustworthy AI

Built-in citations

Every answer includes automatic source tracking and references to original documents. Build trust with transparent, verifiable AI responses.

Why citations matter:
  • Verify AI responses against source material
  • Build user trust with transparency
  • Meet compliance requirements
AI: Based on the Q4 financial report, revenue increased by 23% compared to Q3, primarily driven by enterprise subscriptions.

Sources:
1. financial_report_Q4.pdf, p.12
2. revenue_analysis.docx, section 3.2
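Assuming the chat response exposes its references as an array of document/location pairs (the field names here are hypothetical, not Skald's actual response schema), rendering numbered citations is a one-liner:

```typescript
// Hypothetical shape of a cited source returned with an answer;
// the real Skald response fields may differ.
interface SourceRef {
  document: string;
  location: string;
}

// Render numbered citations like "1. financial_report_Q4.pdf, p.12".
function formatSources(refs: SourceRef[]): string[] {
  return refs.map((r, i) => `${i + 1}. ${r.document}, ${r.location}`);
}
```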

Complete RAG infrastructure

Everything you need to build, deploy, and scale production AI applications

Advanced Document Parsing

Intelligent parsing powered by Docling for PDFs, Word documents, and more. Extract text, tables, and structure with precision for optimal RAG performance.

Multi-LLM Provider Support

Bring your own LLM provider or run inference locally. Support for OpenAI, Anthropic, and any custom LLM server. Complete flexibility without vendor lock-in.

Citations & Source References

Automatic source tracking and citation generation. Every answer includes references to the original documents, ensuring transparency and trustworthiness.

Flexible Knowledge Management

Store and organize memos with automatic metadata extraction. Support for notes, documents, code, and any text-based content with powerful tagging capabilities.

Semantic Search

Powerful vector search with configurable embeddings. Find relevant context instantly with query rewriting and intelligent ranking.
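At its core, semantic search ranks stored embeddings by similarity to a query vector. A minimal cosine-similarity top-k sketch (an illustration of the technique, not Skald's internal implementation):

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Score every stored embedding against the query and keep the top k.
function topK(
  query: number[],
  docs: { id: string; vec: number[] }[],
  k: number
): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosine(query, d.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

This corresponds to the `topK` and `similarityThreshold` knobs exposed in the `vectorSearch` config shown earlier.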

Chat & Retrieval APIs

Production-ready chat API with built-in RAG. Chat history management, context retrieval, and response generation in one unified endpoint.

Configurable RAG Pipeline

Fine-tune every aspect: chunking strategies, reranking algorithms, vector search parameters, and system prompts. Adapt to your exact requirements.

Knowledge Filtering

Restrict accessible knowledge per query with powerful filtering. Improve accuracy and performance by scoping context to relevant documents.
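Scoping context can be pictured as a tag predicate applied before retrieval runs. A minimal sketch with hypothetical field names (not Skald's actual filter syntax):

```typescript
// Hypothetical memo shape; field names are illustrative.
interface Memo {
  id: string;
  tags: string[];
  content: string;
}

// Keep only memos carrying every required tag, so search
// runs over a smaller, more relevant slice of the knowledge base.
function scopeKnowledge(memos: Memo[], requiredTags: string[]): Memo[] {
  return memos.filter((m) => requiredTags.every((t) => m.tags.includes(t)));
}
```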

Built-in Evaluation

Experiment with configurations and measure performance. A/B test different RAG strategies and optimize for your specific use case.

Multi-Language SDKs

Production-ready SDKs for Node.js, Python, Ruby, Go, PHP, and .NET. Consistent APIs across languages with comprehensive documentation.

MCP Integration

Connect your AI agents directly to Skald using the official Model Context Protocol server. Seamless integration with agent frameworks.

Self-Hosted or Cloud

Deploy on your infrastructure with full data control or use our managed cloud. MIT licensed with no vendor lock-in.

Fast to Production

Get started in minutes with sensible defaults. Push context and chat out of the box, then fine-tune as you scale.