Open Source · MIT License · v0.2.0

Summarize. Chunk. Embed.
Ship Faster.

Partio is a containerized summarization, chunking, and embedding controller that simplifies document processing for RAG pipelines, semantic search, and AI applications. Three containers. One API. Every language. Works with Ollama, OpenAI, Gemini, and vLLM.


MIT License · Docker Ready · C#, Python, and JS SDKs · Multi-Tenant · RESTful API · React Dashboard · LLM Summarization


# Start Partio with Docker Compose
$ curl -o compose.yaml \
    https://raw.githubusercontent.com/jchristn/Partio/main/docker/compose.yaml
$ docker compose up -d

# Summarize, chunk, and embed a document
$ curl -X POST http://localhost:8400/v1.0/process \
    -H "Authorization: Bearer default" \
    -H "Content-Type: application/json" \
    -d '{
      "Type": "Text",
      "Text": "Partio makes chunking and embedding easy.",
      "SummarizationConfiguration": {
        "CompletionEndpointId": "cep_default",
        "Order": "TopDown"
      },
      "ChunkingConfiguration": {
        "Strategy": "FixedTokenCount",
        "FixedTokenCount": 256
      },
      "EmbeddingConfiguration": {
        "EmbeddingEndpointId": "eep_default"
      }
    }'

# Response (abbreviated)
{"Text":"Partio makes chunking...","Chunks":[{"Text":"...","Embeddings":[0.012,-0.045,...]}]}

Why Partio?

Stop building summarization, chunking, and embedding infrastructure from scratch. Partio gives you production-ready document processing in minutes.

Three Containers, Done

Just docker compose up and pull your models. No cluster management, no complex infrastructure.

LLM-Powered Summarization

Generate intelligent summaries of your content before chunking and embedding. Choose between top-down or bottom-up strategies with configurable prompts and retry logic.
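To make the two orders concrete, here is a minimal sketch of one plausible reading: top-down visits a parent cell before its children, bottom-up visits children first. The `Name` and `Children` keys and the `visit_order` helper are hypothetical and exist only for illustration; this is not Partio's actual traversal code.

```python
def visit_order(cell, order="TopDown"):
    """List cell names in the order a summarizer might process them."""
    if order not in ("TopDown", "BottomUp"):
        raise ValueError(f"unknown order: {order}")
    names = []
    if order == "TopDown":
        names.append(cell["Name"])       # parent before its children
    for child in cell.get("Children", []):
        names.extend(visit_order(child, order))
    if order == "BottomUp":
        names.append(cell["Name"])       # children before their parent
    return names


# Hypothetical cell hierarchy: a document with two sections, one nested.
document = {
    "Name": "doc",
    "Children": [
        {"Name": "intro"},
        {"Name": "setup", "Children": [{"Name": "install"}]},
    ],
}

print(visit_order(document, "TopDown"))   # ['doc', 'intro', 'setup', 'install']
print(visit_order(document, "BottomUp"))  # ['intro', 'install', 'setup', 'doc']
```

Under this reading, top-down lets a child summary draw on its parent's context, while bottom-up lets a parent summary be built from its children's summaries.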

Semantic Cell Processing

Process text, code, lists, tables, hyperlinks, images, and more. Partio understands your data structure and chunks it intelligently based on content type.

11 Chunking Strategies

Fixed token count, sentence-based, paragraph-based, regex-based, table rows, key-value pairs, and more. Use overlap and sliding windows for context continuity.

Multi-Tenant by Design

Isolate teams, customers, or environments with built-in multi-tenancy. Each tenant gets its own users, credentials, and embedding endpoints.

SDKs for Every Stack

Native SDKs for C#, Python, and JavaScript. Zero-dependency JS client using native fetch. Full async support across all languages.

Bring Your Own Provider

Works with Ollama, OpenAI, Gemini, vLLM, Azure OpenAI, LocalAI, or other compatible APIs. Background health checks ensure endpoint reliability.

Everything You Need

Partio handles the hard parts of summarization, chunking, and embedding so you can focus on building your application.

Processing

LLM-powered summarization with top-down and bottom-up strategies
Hierarchical semantic cells with parent-child relationships
Completion endpoint management for inference/summarization
8 semantic cell types (Text, Code, List, Table, Hyperlink, Meta, Binary, Image)
11 chunking strategies with configurable overlap
Sliding window overlap for context continuity
Context prefix injection per chunk
Batch processing for high-throughput workloads
L2 normalization for cosine similarity
Deterministic, reproducible chunking output
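Why L2 normalization matters for cosine similarity: once vectors are scaled to unit length, a plain dot product gives the same number as cosine similarity, so vector stores can use the cheaper operation. The sketch below uses only the standard library, and the helper names are illustrative rather than part of Partio.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity computed from raw (unnormalized) vectors."""
    denom = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / denom if denom else 0.0


a, b = [3.0, 4.0], [4.0, 3.0]
unit_a, unit_b = l2_normalize(a), l2_normalize(b)

# For unit vectors, the dot product and the cosine similarity agree.
assert math.isclose(dot(unit_a, unit_b), cosine_similarity(a, b))
```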

Infrastructure

Docker containers for server and dashboard
Multi-architecture builds (amd64, arm64)
4 database backends (SQLite, PostgreSQL, MySQL, SQL Server)
Zero-config SQLite default for quick start
Stateless server for horizontal scaling
Background health checks with configurable thresholds
Automatic request gating for unhealthy endpoints
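The idea behind threshold-based health checks and request gating can be sketched as a small state machine: an endpoint flips to unhealthy after a run of failed probes and back to healthy after a run of successful ones. The class, field names, and default thresholds below are assumptions made for illustration, not Partio's configuration schema.

```python
from dataclasses import dataclass


@dataclass
class EndpointHealth:
    """Tracks consecutive probe results for one endpoint (hypothetical model)."""
    unhealthy_threshold: int = 2   # consecutive failures before gating requests
    healthy_threshold: int = 2     # consecutive successes before reopening
    healthy: bool = True
    _successes: int = 0
    _failures: int = 0

    def record(self, probe_ok: bool) -> None:
        """Update the state with the result of one background probe."""
        if probe_ok:
            self._successes += 1
            self._failures = 0
            if not self.healthy and self._successes >= self.healthy_threshold:
                self.healthy = True
        else:
            self._failures += 1
            self._successes = 0
            if self.healthy and self._failures >= self.unhealthy_threshold:
                self.healthy = False

    def allow_request(self) -> bool:
        """Gate: only forward requests while the endpoint is healthy."""
        return self.healthy


state = EndpointHealth()
for ok in [True, False, False]:
    state.record(ok)
print(state.allow_request())  # False: two consecutive failures tripped the gate
```

Requiring consecutive results in both directions keeps a single flaky probe from toggling the endpoint on and off.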

Security & Operations

Multi-tenant isolation at the database level
Three-tier authentication (Admin, Credential, Public)
Scoped bearer tokens per tenant and user
Request history with full audit logging
Upstream embedding and completion call tracking
Configurable retention and automatic cleanup
Postman collection with 50+ pre-built requests

Built for Your Workflow

Whether you're building RAG pipelines, semantic search, or AI-powered applications, Partio fits right in.

RAG Pipelines

Chunk documents into semantically meaningful pieces and generate embeddings in a single API call. Feed directly into your vector store for retrieval-augmented generation.

AI Engineers
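As a rough sketch of the "feed into your vector store" step, the code below flattens a response shaped like the one in the quick-start example (`Chunks`, each with `Text` and `Embeddings`) into records and ranks them against a query vector. The in-memory list stands in for a real vector store, the two-dimensional vectors are made up, and `to_records` and `top_k` are hypothetical helpers.

```python
def to_records(response, doc_id):
    """Flatten a process response into records ready for a vector store."""
    return [
        {"id": f"{doc_id}:{i}", "text": chunk["Text"], "vector": chunk["Embeddings"]}
        for i, chunk in enumerate(response["Chunks"])
    ]


def top_k(records, query_vector, k=3):
    """Rank records by dot product (equal to cosine for unit-length vectors)."""
    def score(record):
        return sum(x * y for x, y in zip(record["vector"], query_vector))
    return sorted(records, key=score, reverse=True)[:k]


# Made-up response with tiny unit vectors, for illustration only.
sample_response = {
    "Text": "Chunking splits documents. Embeddings encode meaning.",
    "Chunks": [
        {"Text": "Chunking splits documents.", "Embeddings": [1.0, 0.0]},
        {"Text": "Embeddings encode meaning.", "Embeddings": [0.0, 1.0]},
        {"Text": "Both feed retrieval.", "Embeddings": [0.6, 0.8]},
    ],
}

records = to_records(sample_response, "doc1")
for hit in top_k(records, [0.0, 1.0], k=2):
    print(hit["id"], hit["text"])
```

In a real pipeline the query vector would come from the same embedding endpoint as the chunks, so that both live in the same vector space.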

Semantic Search

Generate high-quality embeddings for search indexes. Support multiple chunking strategies to optimize recall and precision for your specific corpus.

Data Scientists

Document Processing

Process tables, code blocks, lists, and mixed-format documents with type-aware chunking. Each content type gets the strategy it deserves.

Developers

Knowledge Base Ingestion

Batch-process your knowledge base through a REST API. Partio handles the chunking and embedding so your ingestion pipeline stays clean and focused.

Developers

Multi-Tenant SaaS

Serve multiple customers from a single deployment with isolated tenants, scoped credentials, and per-tenant embedding endpoints. Built-in audit logging included.

DevOps

Model Experimentation

Switch between embedding providers without changing application code. Compare Ollama, OpenAI, Gemini, and vLLM side by side using the same chunking strategies.

Data Scientists

Table & Structured Data

Five specialized table chunking strategies: row-based, row with headers, grouped rows, key-value pairs, and whole-table. Perfect for CSV, Excel, and database exports.

AI Engineers

Observability & Debugging

Full request history with upstream embedding call tracking. See exactly what was sent to your provider, what came back, and how long it took.

DevOps

Why Not Roll Your Own?

You could build summarization, chunking, and embedding infrastructure yourself. Here's what that looks like compared to Partio.

Capability	DIY Approach	Partio
Setup time	Days to weeks	Minutes
Summarization	Custom LLM integration	Built-in top-down & bottom-up strategies
Chunking strategies	Build each one yourself	11 strategies, configurable
Embedding & inference providers	Hard-coded per provider	Ollama, OpenAI, Gemini, and vLLM
Multi-tenancy	Custom auth & isolation	Built-in tenant isolation
Health monitoring	Build from scratch	Automatic background checks
Request auditing	Custom logging	Full audit trail with upstream tracking
Admin dashboard	Build or skip	React dashboard included
SDKs	Write your own clients	C#, Python, JS ready to go
Database support	Pick one, build schema	SQLite, PostgreSQL, MySQL, SQL Server
Deployment	Custom Docker/CI	docker compose up
Maintenance	You own it forever	Community-maintained, MIT licensed

Simple by Design

Summarize, chunk, and embed documents in just a few lines of code, in any language.

# Summarize, chunk, and embed a text document
curl -X POST http://localhost:8400/v1.0/process \
  -H "Authorization: Bearer default" \
  -H "Content-Type: application/json" \
  -d '{
    "Type": "Text",
    "Text": "RAG improves LLM responses by grounding them...",
    "SummarizationConfiguration": {
      "CompletionEndpointId": "cep_default",
      "Order": "TopDown"
    },
    "ChunkingConfiguration": { "Strategy": "SentenceBased" },
    "EmbeddingConfiguration": { "EmbeddingEndpointId": "eep_default" }
  }'

# Process without summarization (chunking + embedding only)
curl -X POST http://localhost:8400/v1.0/process \
  -H "Authorization: Bearer default" \
  -H "Content-Type: application/json" \
  -d '{
    "Type": "Text",
    "Text": "# Introduction\nWelcome.\n# Setup\nInstall deps.",
    "ChunkingConfiguration": { "Strategy": "RegexBased", "RegexPattern": "^# " },
    "EmbeddingConfiguration": { "EmbeddingEndpointId": "eep_default" }
  }'

from partio_sdk import PartioClient

# Connect to Partio
with PartioClient("http://localhost:8400", "partioadmin") as client:

    # Check server health
    health = client.health()
    print(f"Server: {health['Status']} v{health['Version']}")

    # Summarize, chunk, and embed a text document
    result = client.process({
        "Type": "Text",
        "Text": "RAG improves LLM responses by grounding them...",
        "SummarizationConfiguration": {
            "CompletionEndpointId": "cep_default",
            "Order": "TopDown"
        },
        "ChunkingConfiguration": { "Strategy": "SentenceBased" },
        "EmbeddingConfiguration": { "EmbeddingEndpointId": "eep_default" }
    })

    # Access chunks and embeddings
    for chunk in result["Chunks"]:
        print(f"Chunk: {chunk['Text'][:50]}...")
        print(f"Embedding dims: {len(chunk['Embeddings'])}")

import { PartioClient } from './partio-sdk.js';

// Connect to Partio (zero dependencies, native fetch)
const client = new PartioClient('http://localhost:8400', 'partioadmin');

// Summarize, chunk, and embed a text document
const result = await client.process({
  Type: 'Text',
  Text: 'RAG improves LLM responses by grounding them...',
  SummarizationConfiguration: {
    CompletionEndpointId: 'cep_default',
    Order: 'TopDown'
  },
  ChunkingConfiguration: { Strategy: 'FixedTokenCount', FixedTokenCount: 256 },
  EmbeddingConfiguration: { EmbeddingEndpointId: 'eep_default' }
});

console.log(`Chunks: ${result.Chunks.length}`);

using Partio.Sdk;
using Partio.Sdk.Models;

// Connect to Partio
using var client = new PartioClient("http://localhost:8400", "partioadmin");

// Summarize, chunk, and embed a text document
var response = await client.ProcessAsync(new SemanticCellRequest
{
    Type = "Text",
    Text = "RAG improves LLM responses...",
    SummarizationConfiguration = new SummarizationConfiguration
    {
        CompletionEndpointId = "cep_default",
        Order = "TopDown"
    },
    ChunkingConfiguration = new ChunkingConfiguration
    {
        Strategy = "SentenceBased"
    },
    EmbeddingConfiguration = new EmbeddingConfiguration
    {
        EmbeddingEndpointId = "eep_default"
    }
});

Chunking Strategies

Choose the right chunking strategy for your content type. Mix and match across different document sections.

FixedTokenCount

Split into chunks of N tokens using cl100k_base encoding. Supports configurable overlap.

All types
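Here is a minimal sketch of fixed-size chunking with overlap. It treats whitespace-separated words as tokens, which is a stand-in for the cl100k_base encoding, and `chunk_fixed` with its parameter names is illustrative rather than Partio's API.

```python
def chunk_fixed(tokens, size, overlap=0):
    """Split tokens into windows of `size`, each sharing `overlap` tokens with the previous one."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


words = "one two three four five six seven".split()
for chunk in chunk_fixed(words, size=4, overlap=1):
    print(" ".join(chunk))
# one two three four
# four five six seven
```

The repeated token ("four") is what carries context across the chunk boundary.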

SentenceBased

Split at sentence boundaries for natural, readable chunks that preserve meaning.

All types

ParagraphBased

Split at paragraph boundaries to keep related content together in each chunk.

All types

WholeList

Treat the entire list as a single chunk, preserving the complete list context.

Lists only

ListEntry

Each list item becomes its own chunk, ideal for FAQ-style or itemized content.

Lists only

Row

Each table row as a space-separated chunk without headers.

Tables only

RowWithHeaders

Each row as a markdown table with column headers repeated, providing context per chunk.

Tables only

RowGroupWithHeaders

Groups of N rows with headers. Balance between context and chunk size.

Tables only

KeyValuePairs

Rows formatted as "key: value" pairs, natural for structured data extraction.

Tables only

WholeTable

The entire table as a single markdown-formatted chunk.

Tables only
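To show how the table strategies differ, the sketch below renders the same two-row table as Row, RowWithHeaders, and KeyValuePairs chunks. The exact output formatting, including the comma separator between key-value pairs, is a guess based on the descriptions above, and the function names are hypothetical.

```python
headers = ["Name", "Role"]
rows = [["Ada", "Engineer"], ["Grace", "Admiral"]]


def row_chunks(rows):
    """Row: each row as space-separated values, no headers."""
    return [" ".join(row) for row in rows]


def row_with_headers_chunks(headers, rows):
    """RowWithHeaders: each row as its own small markdown table."""
    head = "| " + " | ".join(headers) + " |"
    rule = "| " + " | ".join("---" for _ in headers) + " |"
    return ["\n".join([head, rule, "| " + " | ".join(row) + " |"]) for row in rows]


def key_value_chunks(headers, rows):
    """KeyValuePairs: each row as 'key: value' pairs."""
    return [", ".join(f"{h}: {v}" for h, v in zip(headers, row)) for row in rows]


print(row_chunks(rows)[0])                        # Ada Engineer
print(key_value_chunks(headers, rows)[0])         # Name: Ada, Role: Engineer
print(row_with_headers_chunks(headers, rows)[0])  # three-line markdown table
```

Row is the most compact, while the header-carrying variants trade chunk size for context that survives when a chunk is retrieved on its own.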

RegexBased

Split at boundaries defined by a user-supplied regular expression. Ideal for Markdown headings, log timestamps, LaTeX sections, or any structured format.

All types
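For a feel of regex-based splitting, this sketch cuts the Markdown sample from the earlier curl example at each `^# ` match. Multiline matching and keeping the heading with the chunk that follows it are assumptions about the behavior, and `split_at_matches` is a hypothetical helper.

```python
import re


def split_at_matches(text, pattern):
    """Start a new chunk at every position where `pattern` matches."""
    starts = [m.start() for m in re.finditer(pattern, text, flags=re.MULTILINE)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any text that precedes the first match
    starts.append(len(text))
    pieces = [text[a:b].strip() for a, b in zip(starts, starts[1:])]
    return [piece for piece in pieces if piece]


text = "# Introduction\nWelcome.\n# Setup\nInstall deps."
for chunk in split_at_matches(text, r"^# "):
    print(repr(chunk))
# '# Introduction\nWelcome.'
# '# Setup\nInstall deps.'
```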

Get Started in Minutes

Choose your preferred deployment method. Partio runs anywhere Docker runs.

Download the Compose file

curl -o compose.yaml \
  https://raw.githubusercontent.com/jchristn/Partio/main/docker/compose.yaml
curl -o partio.json \
  https://raw.githubusercontent.com/jchristn/Partio/main/docker/partio.json

Start the services

docker compose up -d

Pull an embedding model

# Bash / macOS / Linux
curl http://localhost:11434/api/pull -d '{"name": "all-minilm"}'

# Windows Command Prompt (cmd)
curl http://localhost:11434/api/pull -d "{\"name\": \"all-minilm\"}"

Pull a completion model (for summarization)

# Bash / macOS / Linux
curl http://localhost:11434/api/pull -d '{"name": "gemma3:4b"}'

# Windows Command Prompt (cmd)
curl http://localhost:11434/api/pull -d "{\"name\": \"gemma3:4b\"}"

Verify it's running

# Server health check
curl http://localhost:8400/
{"Status":"Healthy","Version":"0.2.0"}

# Open the dashboard (macOS; on Linux use xdg-open, on Windows use start)
open http://localhost:8401

Default ports: Server on :8400, Dashboard on :8401
Default admin key: partioadmin
Default credential: Bearer default
Default database: SQLite (zero configuration)

Run the Partio server

docker run -d \
  --name partio-server \
  -p 8400:8400 \
  -v ./data:/app/data \
  -v ./logs:/app/logs \
  jchristn77/partio-server:latest

Run the dashboard

docker run -d \
  --name partio-dashboard \
  -p 8401:8401 \
  jchristn77/partio-dashboard:latest

Images: jchristn77/partio-server · jchristn77/partio-dashboard
Architectures: linux/amd64, linux/arm64

Clone the repository

git clone https://github.com/jchristn/Partio.git
cd Partio

Build and run the server

# Requires .NET 10.0 SDK
cd src/Partio.Server
dotnet build
dotnet run

Build and run the dashboard

# Requires Node.js 18+
cd dashboard
npm install
npm run dev

Prerequisites: .NET 10.0 SDK, Node.js 18+
Repository: github.com/jchristn/Partio

Manage Everything from the Dashboard

A full-featured React admin dashboard for managing tenants, users, credentials, endpoints, and processing — all from your browser.

Tenant Management

Create and manage tenants with labels, tags, and isolated resources. Full CRUD with pagination and filtering.

User & Credential Admin

Manage users per tenant with role-based access. Generate and revoke bearer token credentials with one click.

Endpoint Configuration

Configure Ollama, OpenAI, Gemini, and vLLM endpoints. Set health check intervals, thresholds, and monitor endpoint status in real time.

Live Processing Tool

Test chunking and embedding directly in the browser. Select strategies, configure parameters, and inspect results with a built-in JSON viewer.

Request History

Browse all API requests with full audit details. Inspect request/response bodies and track upstream embedding calls for debugging.

Health Monitoring

Visual health status for all embedding endpoints. Automatic background checks with configurable thresholds for healthy/unhealthy state transitions.

SDKs for Every Language

Native clients with full API coverage. 30+ methods per SDK, consistent interfaces, and idiomatic patterns for each language.

Partio.Sdk

.NET 8.0 / 10.0

Full async/await with Task-based API
IDisposable pattern for clean resource management
Nullable reference types enabled
XML documentation for IntelliSense

using var client = new PartioClient(endpoint, key);
var result = await client.ProcessAsync(request);


partio_sdk

Python 3.7+

Context manager support (with statement)
Session-based HTTP with persistent headers
Single dependency: requests>=2.28.0
Custom PartioError exception class

with PartioClient(endpoint, key) as client:
    result = client.process(request)


partio-sdk

Node.js 18+ / ES6

Zero external dependencies (native fetch)
ES module with full async/await
Works in Node.js and modern browsers
Custom PartioError class

const client = new PartioClient(endpoint, key);
const result = await client.process(request);


RESTful API

A clean, consistent API with full CRUD operations, pagination, and comprehensive error handling.

Health & Identity

GET /

Health check

GET /v1.0/health

Health with version

GET /v1.0/whoami

Current user identity

Processing

POST /v1.0/process

Summarize, chunk & embed a single document

POST /v1.0/process/batch

Batch process multiple documents
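If you would rather not pull in an SDK, the process endpoint can be called with nothing but the Python standard library. The sketch below mirrors the curl example (same URL, bearer token, and payload fields); the helper names are illustrative, and the request is built separately from sending so the payload can be inspected without a running server.

```python
import json
import urllib.request


def build_process_request(text, token="default", base_url="http://localhost:8400"):
    """Build (but do not send) a POST /v1.0/process request."""
    payload = {
        "Type": "Text",
        "Text": text,
        "ChunkingConfiguration": {"Strategy": "SentenceBased"},
        "EmbeddingConfiguration": {"EmbeddingEndpointId": "eep_default"},
    }
    return urllib.request.Request(
        f"{base_url}/v1.0/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def process(text):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_process_request(text)) as response:
        return json.loads(response.read())


request = build_process_request("RAG improves LLM responses by grounding them...")
print(request.get_method(), request.full_url)
```

Calling `process(...)` against a local deployment should return the same `Chunks` structure that the SDK examples iterate over.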

Administration

PUT /v1.0/tenants

Create tenant

PUT /v1.0/users

Create user

PUT /v1.0/credentials

Create credential

PUT /v1.0/endpoints/embedding

Create embedding endpoint

PUT /v1.0/endpoints/completion

Create completion endpoint

Endpoint Health

GET /v1.0/endpoints/embedding/{id}/health

Single embedding endpoint health

GET /v1.0/endpoints/embedding/health

All embedding endpoints health

GET /v1.0/endpoints/completion/{id}/health

Single completion endpoint health

GET /v1.0/endpoints/completion/health

All completion endpoints health

The full API reference, with request/response examples, authentication details, and error codes, is available in the API docs and the Postman collection.
