Partio is a containerized summarization, chunking and embeddings controller that simplifies document processing for RAG pipelines, semantic search, and AI applications. Three containers. One API. Every language. Ollama, OpenAI, Gemini, and vLLM all fit.
# Start Partio with Docker Compose
$ curl -o compose.yaml \
https://raw.githubusercontent.com/jchristn/Partio/main/docker/compose.yaml
$ docker compose up -d
# Summarize, chunk, and embed a document
$ curl -X POST http://localhost:8400/v1.0/process \
  -H "Authorization: Bearer default" \
  -H "Content-Type: application/json" \
  -d '{
    "Type": "Text",
    "Text": "Partio makes chunking and embedding easy.",
    "SummarizationConfiguration": {
      "CompletionEndpointId": "cep_default",
      "Order": "TopDown"
    },
    "ChunkingConfiguration": {
      "Strategy": "FixedTokenCount",
      "FixedTokenCount": 256
    },
    "EmbeddingConfiguration": {
      "EmbeddingEndpointId": "eep_default"
    }
  }'
{"Text":"Partio makes chunking...","Chunks":[{"Text":"...","Embeddings":[0.012,-0.045,...]}]}
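Once you have chunks with embeddings like the response above, retrieval is a nearest-neighbor search. Here's a minimal, illustrative cosine-similarity ranking in Python over toy two-dimensional vectors; this helper is not part of Partio, just a sketch of what a vector store does downstream.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_embedding, chunks, k=3):
    # Rank chunk dicts (shaped like Partio's response) by similarity to the query.
    ranked = sorted(chunks, key=lambda c: cosine(query_embedding, c["Embeddings"]),
                    reverse=True)
    return ranked[:k]

# Toy data shaped like the "Chunks" array in the response above.
chunks = [
    {"Text": "chunking", "Embeddings": [1.0, 0.0]},
    {"Text": "embedding", "Embeddings": [0.0, 1.0]},
]
best = top_k([0.9, 0.1], chunks, k=1)
print(best[0]["Text"])  # chunking
```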
Stop building summarization, chunking, and embedding infrastructure from scratch. Partio gives you production-ready document processing in minutes.
Just docker compose up and pull your models. No cluster management, no complex infrastructure.
Generate intelligent summaries of your content before chunking and embedding. Choose between top-down or bottom-up strategies with configurable prompts and retry logic.
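To make the two orders concrete, here is a rough sketch of bottom-up summarization over a tree of cells, with a stub standing in for the LLM call. In Partio the summary would come from the configured completion endpoint; the dict shape and the `summarize` stub here are assumptions for illustration only.

```python
def summarize(text: str) -> str:
    # Stub: a real deployment would call an LLM completion endpoint here.
    return text[:30]

def bottom_up(cell):
    # Summarize leaf cells first; a parent's summary incorporates its children's.
    child_summaries = [bottom_up(c) for c in cell.get("Children", [])]
    combined = " ".join([cell["Text"]] + child_summaries)
    cell["Summary"] = summarize(combined)
    return cell["Summary"]

doc = {"Text": "Report overview.", "Children": [
    {"Text": "Section one details."},
    {"Text": "Section two details."},
]}
bottom_up(doc)
print(doc["Summary"])
```

Top-down is the mirror image: summarize the parent first, then pass that summary down as context when summarizing each child.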
Process text, code, lists, tables, hyperlinks, images, and more. Partio understands your data structure and chunks it intelligently based on content type.
Fixed token count, sentence-based, paragraph-based, regex-based, table rows, key-value pairs, and more. Use overlap and sliding windows for context continuity.
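The overlap / sliding-window idea can be sketched in a few lines of Python. Note this sketch uses whitespace-split words as a stand-in for tokens; Partio's fixed-token strategy counts real tokens using the cl100k_base encoding.

```python
def chunk_with_overlap(tokens, size, overlap):
    # Sliding window: each chunk shares `overlap` tokens with the previous one,
    # so context carries across chunk boundaries.
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
    return chunks

tokens = "one two three four five six seven eight".split()
for c in chunk_with_overlap(tokens, size=4, overlap=1):
    print(" ".join(c))
```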
Isolate teams, customers, or environments with built-in multi-tenancy. Each tenant gets their own users, credentials, and embedding endpoints.
Native SDKs for C#, Python, and JavaScript. Zero-dependency JS client using native fetch. Full async support across all languages.
Works with Ollama, OpenAI, Gemini, vLLM, Azure OpenAI, LocalAI, or other compatible APIs. Background health checks ensure endpoint reliability.
Partio handles the hard parts of summarization, chunking, and embedding so you can focus on building your application.
Partio sits between your application and your providers, handling summarization, chunking, embedding, and provider management across Ollama, OpenAI, Gemini, and vLLM.
Whether you're building RAG pipelines, semantic search, or AI-powered applications, Partio fits right in.
AI Engineers: Chunk documents into semantically meaningful pieces and generate embeddings in a single API call. Feed directly into your vector store for retrieval-augmented generation.
Data Scientists: Generate high-quality embeddings for search indexes. Support multiple chunking strategies to optimize recall and precision for your specific corpus.
Developers: Process tables, code blocks, lists, and mixed-format documents with type-aware chunking. Each content type gets the strategy it deserves.
Developers: Batch-process your knowledge base through a REST API. Partio handles the chunking and embedding so your ingestion pipeline stays clean and focused.
DevOps: Serve multiple customers from a single deployment with isolated tenants, scoped credentials, and per-tenant embedding endpoints. Built-in audit logging included.
Data Scientists: Switch between embedding providers without changing application code. Compare Ollama, OpenAI, Gemini, and vLLM side by side using the same chunking strategies.
AI Engineers: Five specialized table chunking strategies: row-based, row with headers, grouped rows, key-value pairs, and whole-table. Perfect for CSV, Excel, and database exports.
DevOps: Full request history with upstream embedding call tracking. See exactly what was sent to your provider, what came back, and how long it took.
You could build summarization, chunking, and embedding infrastructure yourself. Here's what that looks like compared to Partio.
| Capability | DIY Approach | Partio |
|---|---|---|
| Setup time | Days to weeks | Minutes |
| Summarization | Custom LLM integration | Built-in top-down & bottom-up strategies |
| Chunking strategies | Build each one yourself | 11 strategies, configurable |
| Embedding & inference providers | Hard-coded per provider | Ollama, OpenAI, Gemini, and vLLM |
| Multi-tenancy | Custom auth & isolation | Built-in tenant isolation |
| Health monitoring | Build from scratch | Automatic background checks |
| Request auditing | Custom logging | Full audit trail with upstream tracking |
| Admin dashboard | Build or skip | React dashboard included |
| SDKs | Write your own clients | C#, Python, JS ready to go |
| Database support | Pick one, build schema | SQLite, PostgreSQL, MySQL, SQL Server |
| Deployment | Custom Docker/CI | docker compose up |
| Maintenance | You own it forever | Community-maintained, MIT licensed |
Summarize, chunk, and embed documents in just a few lines of code, in any language.
# Summarize, chunk, and embed a text document
curl -X POST http://localhost:8400/v1.0/process \
  -H "Authorization: Bearer default" \
  -H "Content-Type: application/json" \
  -d '{
    "Type": "Text",
    "Text": "RAG improves LLM responses by grounding them...",
    "SummarizationConfiguration": {
      "CompletionEndpointId": "cep_default",
      "Order": "TopDown"
    },
    "ChunkingConfiguration": { "Strategy": "SentenceBased" },
    "EmbeddingConfiguration": { "EmbeddingEndpointId": "eep_default" }
  }'

# Process without summarization (chunking + embedding only)
curl -X POST http://localhost:8400/v1.0/process \
  -H "Authorization: Bearer default" \
  -H "Content-Type: application/json" \
  -d '{
    "Type": "Text",
    "Text": "# Introduction\nWelcome.\n# Setup\nInstall deps.",
    "ChunkingConfiguration": { "Strategy": "RegexBased", "RegexPattern": "^# " },
    "EmbeddingConfiguration": { "EmbeddingEndpointId": "eep_default" }
  }'
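To see what the RegexBased request above does conceptually, here is a local sketch that splits the same text at lines matching `^# `, keeping each heading with the content that follows it. The `regex_chunks` helper is hypothetical, not part of any Partio SDK.

```python
import re

def regex_chunks(text, pattern):
    # Split at the start of each line matching the pattern; the matched
    # delimiter stays with the chunk that follows it.
    starts = [m.start() for m in re.finditer(pattern, text, flags=re.MULTILINE)]
    if not starts or starts[0] != 0:
        starts = [0] + starts
    bounds = starts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]

text = "# Introduction\nWelcome.\n# Setup\nInstall deps."
for chunk in regex_chunks(text, r"^# "):
    print(repr(chunk))
```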
from partio_sdk import PartioClient
# Connect to Partio
with PartioClient("http://localhost:8400", "partioadmin") as client:
# Check server health
health = client.health()
print(f"Server: {health['Status']} v{health['Version']}")
# Summarize, chunk, and embed a text document
result = client.process({
"Type": "Text",
"Text": "RAG improves LLM responses by grounding them...",
"SummarizationConfiguration": {
"CompletionEndpointId": "cep_default",
"Order": "TopDown"
},
"ChunkingConfiguration": { "Strategy": "SentenceBased" },
"EmbeddingConfiguration": { "EmbeddingEndpointId": "eep_default" }
})
# Access chunks and embeddings
for chunk in result["Chunks"]:
print(f"Chunk: {chunk['Text'][:50]}...")
print(f"Embedding dims: {len(chunk['Embeddings'])}")
import { PartioClient } from './partio-sdk.js';
// Connect to Partio (zero dependencies, native fetch)
const client = new PartioClient('http://localhost:8400', 'partioadmin');
// Summarize, chunk, and embed a text document
const result = await client.process({
  Type: 'Text',
  Text: 'RAG improves LLM responses by grounding them...',
  SummarizationConfiguration: {
    CompletionEndpointId: 'cep_default',
    Order: 'TopDown'
  },
  ChunkingConfiguration: { Strategy: 'FixedTokenCount', FixedTokenCount: 256 },
  EmbeddingConfiguration: { EmbeddingEndpointId: 'eep_default' }
});
console.log(`Chunks: ${result.Chunks.length}`);
using Partio.Sdk;
using Partio.Sdk.Models;
// Connect to Partio
using var client = new PartioClient("http://localhost:8400", "partioadmin");
// Summarize, chunk, and embed a text document
var response = await client.ProcessAsync(new SemanticCellRequest
{
    Type = "Text",
    Text = "RAG improves LLM responses...",
    SummarizationConfiguration = new SummarizationConfiguration
    {
        CompletionEndpointId = "cep_default",
        Order = "TopDown"
    },
    ChunkingConfiguration = new ChunkingConfiguration
    {
        Strategy = "SentenceBased"
    },
    EmbeddingConfiguration = new EmbeddingConfiguration
    {
        EmbeddingEndpointId = "eep_default"
    }
});
Optionally summarize your content before chunking and embedding. Partio uses LLM-powered inference endpoints to generate intelligent summaries.
Top-down: Summarize parent cells first, then children. Context flows downward, providing each child with the parent's summary for richer, contextually aware results. Works with all cell types.
Bottom-up: Summarize leaf cells first, then work upward. Parent summaries incorporate child summaries, building a comprehensive understanding from the ground up. Works with all cell types.
Choose the right chunking strategy for your content type. Mix and match across different document sections.
Fixed token count (all types): Split into chunks of N tokens using cl100k_base encoding. Supports configurable overlap.
Sentence-based (all types): Split at sentence boundaries for natural, readable chunks that preserve meaning.
Paragraph-based (all types): Split at paragraph boundaries to keep related content together in each chunk.
Whole list (lists only): Treat the entire list as a single chunk, preserving the complete list context.
List items (lists only): Each list item becomes its own chunk, ideal for FAQ-style or itemized content.
Table rows (tables only): Each table row becomes a space-separated chunk without headers.
Rows with headers (tables only): Each row becomes a markdown table with column headers repeated, providing context per chunk.
Grouped rows (tables only): Groups of N rows with headers, balancing context against chunk size.
Key-value pairs (tables only): Rows formatted as "key: value" pairs, natural for structured data extraction.
Whole table (tables only): The entire table as a single markdown-formatted chunk.
Regex-based (all types): Split at boundaries defined by a user-supplied regular expression. Ideal for Markdown headings, log timestamps, LaTeX sections, or any structured format.
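As an illustration of the key-value table strategy described above, here is a minimal sketch: one chunk per row, each cell rendered as "header: value". The exact formatting Partio emits may differ; this helper is illustrative only.

```python
def key_value_chunks(headers, rows):
    # One chunk per table row; each cell becomes a "header: value" line.
    return ["\n".join(f"{h}: {v}" for h, v in zip(headers, row)) for row in rows]

headers = ["name", "role"]
rows = [["Ada", "engineer"], ["Grace", "admiral"]]
for chunk in key_value_chunks(headers, rows):
    print(chunk)
    print("---")
```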
Choose your preferred deployment method. Partio runs anywhere Docker runs.
curl -o compose.yaml \
https://raw.githubusercontent.com/jchristn/Partio/main/docker/compose.yaml
curl -o partio.json \
https://raw.githubusercontent.com/jchristn/Partio/main/docker/partio.json
docker compose up -d
# Bash / macOS / Linux
curl http://localhost:11434/api/pull -d '{"name": "all-minilm"}'
# Windows Terminal (cmd)
curl http://localhost:11434/api/pull -d "{\"name\": \"all-minilm\"}"
# Bash / macOS / Linux
curl http://localhost:11434/api/pull -d '{"name": "gemma3:4b"}'
# Windows Terminal (cmd)
curl http://localhost:11434/api/pull -d "{\"name\": \"gemma3:4b\"}"
# Server health check
curl http://localhost:8400/
{"Status":"Healthy","Version":"0.2.0"}
# Open the dashboard
open http://localhost:8401
# Server on :8400, dashboard on :8401
# Default admin user: partioadmin, API auth: Bearer default
docker run -d \
  --name partio-server \
  -p 8400:8400 \
  -v ./data:/app/data \
  -v ./logs:/app/logs \
  jchristn77/partio-server:latest

docker run -d \
  --name partio-dashboard \
  -p 8401:8401 \
  jchristn77/partio-dashboard:latest
git clone https://github.com/jchristn/Partio.git
cd Partio
# Requires .NET 10.0 SDK
cd src/Partio.Server
dotnet build
dotnet run
# Requires Node.js 18+
cd dashboard
npm install
npm run dev
A full-featured React admin dashboard for managing tenants, users, credentials, endpoints, and processing — all from your browser.
Create and manage tenants with labels, tags, and isolated resources. Full CRUD with pagination and filtering.
Manage users per tenant with role-based access. Generate and revoke bearer token credentials with one click.
Configure Ollama, OpenAI, Gemini, and vLLM endpoints. Set health check intervals, thresholds, and monitor endpoint status in real time.
Test chunking and embedding directly in the browser. Select strategies, configure parameters, and inspect results with a built-in JSON viewer.
Browse all API requests with full audit details. Inspect request/response bodies and track upstream embedding calls for debugging.
Visual health status for all embedding endpoints. Automatic background checks with configurable thresholds for healthy/unhealthy state transitions.
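The healthy/unhealthy threshold idea can be sketched as a small state machine: flip state only after N consecutive opposing results, so one blip doesn't flap an endpoint's status. The thresholds and exact semantics below are illustrative assumptions, not Partio's implementation.

```python
class EndpointHealth:
    # Track endpoint state; transition only after a run of consecutive
    # results that oppose the current state (semantics assumed).
    def __init__(self, healthy_threshold=2, unhealthy_threshold=3):
        self.healthy = True
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.streak = 0  # consecutive results opposing the current state

    def record(self, ok: bool) -> bool:
        opposes = ok != self.healthy
        self.streak = self.streak + 1 if opposes else 0
        threshold = (self.unhealthy_threshold if self.healthy
                     else self.healthy_threshold)
        if opposes and self.streak >= threshold:
            self.healthy = ok
            self.streak = 0
        return self.healthy

h = EndpointHealth()
print([h.record(r) for r in [False, False, False, True, True]])
```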
Native clients with full API coverage. 30+ methods per SDK, consistent interfaces, and idiomatic patterns for each language.
// C#
using var client = new PartioClient(endpoint, key);
var result = await client.ProcessAsync(request);

# Python (supports the with statement; requires requests>=2.28.0)
with PartioClient(endpoint, key) as client:
    result = client.process(request)

// JavaScript
const client = new PartioClient(endpoint, key);
const result = await client.process(request);
A clean, consistent API with full CRUD operations, pagination, and comprehensive error handling.
/
/v1.0/health
/v1.0/whoami
/v1.0/process
/v1.0/process/batch
/v1.0/tenants
/v1.0/users
/v1.0/credentials
/v1.0/endpoints/embedding
/v1.0/endpoints/completion
/v1.0/endpoints/embedding/{id}/health
/v1.0/endpoints/embedding/health
/v1.0/endpoints/completion/{id}/health
/v1.0/endpoints/completion/health
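Since the API is plain HTTP, you don't strictly need an SDK. Here's a hypothetical zero-dependency helper that builds URLs and headers for the endpoints listed above; the paths and bearer-token format come from the quickstart, but `MiniPartio` itself is not a real client, just a sketch. Sending the request is left to whatever HTTP library you prefer.

```python
class MiniPartio:
    # Hypothetical helper: builds request URLs and auth headers for the
    # Partio endpoints listed above. It does not perform any network I/O.
    def __init__(self, base_url: str, token: str):
        self.base_url = base_url.rstrip("/")
        self.token = token

    def request(self, path: str) -> dict:
        return {
            "url": f"{self.base_url}{path}",
            "headers": {
                "Authorization": f"Bearer {self.token}",
                "Content-Type": "application/json",
            },
        }

client = MiniPartio("http://localhost:8400", "default")
print(client.request("/v1.0/process")["url"])
```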
Partio supports four database backends. Start with SQLite for zero-config development, scale to PostgreSQL or SQL Server for production.
SQLite (default): Zero configuration, file-based. Perfect for development, testing, and single-node deployments.
PostgreSQL: Production-grade relational database. Ideal for high-throughput multi-tenant deployments.
MySQL: Widely deployed and well-supported. Great when you're already running MySQL infrastructure.
SQL Server: Enterprise-ready with full feature parity. Fits naturally into Microsoft-centric environments.
Partio is free, open source, and MIT licensed. Get started in minutes with Docker or build from source.