# Integrations
Drop-in adapters for popular AI frameworks. All integrations use your project API key and talk to the same Dewey REST API.
## LangChain
`langchain-dewey` provides a drop-in retriever, a full `VectorStore`, and an agent research tool. Install it alongside your existing LangChain setup:
```bash
pip install langchain-dewey
```

### DeweyRetriever
Drop-in `BaseRetriever` backed by Dewey's hybrid semantic + BM25 search. Each returned `Document` carries full citation metadata: score, filename, section title, and section level.
```python
from langchain_dewey import DeweyRetriever
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA

retriever = DeweyRetriever(
    api_key="dwy_live_...",
    collection_id="3f7a1b2c-...",
    k=8,
)

qa = RetrievalQA.from_chain_type(llm=ChatOpenAI(), retriever=retriever)
answer = qa.invoke("What are the main findings?")
```

### DeweyVectorStore
Full `VectorStore` implementation. Dewey manages its own embeddings, so the `embedding` argument is accepted for interface compatibility but unused.
```python
from langchain_dewey import DeweyVectorStore

# Wrap an existing collection
store = DeweyVectorStore(
    api_key="dwy_live_...",
    collection_id="3f7a1b2c-...",
)
docs = store.similarity_search("retrieval augmented generation", k=5)

# Build from texts (creates a new collection, uploads, waits for ready)
store = DeweyVectorStore.from_texts(
    texts=["Neural networks learn via backpropagation.", "..."],
    embedding=None,
    api_key="dwy_live_...",
    collection_name="my-docs",
)

# Build from LangChain document loaders
from langchain_community.document_loaders import PyPDFLoader

pages = PyPDFLoader("research_paper.pdf").load_and_split()
store = DeweyVectorStore.from_documents(
    pages, embedding=None, api_key="dwy_live_...", collection_name="research"
)
```

### Research tool
`create_research_tool` returns a LangChain `@tool` that runs a full Dewey research query (searching, reading, and synthesising across multiple documents) for use with any LangChain agent.
```python
from langchain_dewey import create_research_tool
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate

research = create_research_tool(
    api_key="dwy_live_...",
    collection_id="3f7a1b2c-...",
    depth="balanced",  # "quick" | "balanced" | "deep" | "exhaustive"
)

llm = ChatOpenAI(model="gpt-4o")
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful research assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
agent = create_openai_tools_agent(llm, [research], prompt)
executor = AgentExecutor(agent=agent, tools=[research])
executor.invoke({"input": "Summarise the key findings across all documents."})
```

Requires `meetdewey >= 1.0` and `langchain-core >= 0.3`.

## LlamaIndex
`llama-index-retrievers-dewey` provides a drop-in retriever that plugs directly into LlamaIndex query engines and agents:
```bash
pip install llama-index-retrievers-dewey
```

### DeweyRetriever
Drop-in `BaseRetriever` backed by Dewey's hybrid semantic + BM25 search. Returned nodes carry full citation metadata: score, filename, section title, and section level.
```python
from llama_index.retrievers.dewey import DeweyRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.llms.openai import OpenAI

retriever = DeweyRetriever(
    api_key="dwy_live_...",
    collection_id="3f7a1b2c-...",
    k=8,
)

# Use with any LlamaIndex query engine
query_engine = RetrieverQueryEngine.from_args(
    retriever=retriever,
    llm=OpenAI(model="gpt-4o-mini"),
)
response = query_engine.query("What are the main findings?")
print(response)

# Direct retrieval
nodes = retriever.retrieve("attention mechanism scaling")
for n in nodes:
    print(f"[{n.score:.3f}] {n.node.metadata['filename']} — {n.node.text[:120]}")
```

### Agent tool
Wrap the retriever as a `RetrieverTool` and hand it to any LlamaIndex agent:
```python
from llama_index.core.tools import RetrieverTool
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

tool = RetrieverTool.from_defaults(
    retriever=retriever,
    description="Search company documents for relevant information.",
)
agent = ReActAgent.from_tools([tool], llm=OpenAI(model="gpt-4o-mini"), verbose=True)
agent.chat("Summarise the key findings across all documents.")
```

Requires `meetdewey >= 1.0` and `llama-index-core >= 0.12`.

## Haystack
`dewey-haystack` provides a document store, a retriever component, and a full agentic research component for Haystack 2.0 pipelines:
```bash
pip install dewey-haystack
```

### DeweyDocumentStore
Implements the Haystack `DocumentStore` protocol. Upload Haystack `Document` objects directly; Dewey handles chunking and embeddings automatically.
```python
from haystack_integrations.document_stores.dewey import DeweyDocumentStore
from haystack.utils import Secret

store = DeweyDocumentStore(
    api_key=Secret.from_env_var("DEWEY_API_KEY"),
    collection_id="3f7a1b2c-...",
)
```

### DeweyRetriever
A Haystack `@component` that queries the document store and returns ranked chunks. Plugs into any Haystack pipeline.
```python
from haystack import Pipeline
from haystack_integrations.document_stores.dewey import DeweyDocumentStore
from haystack_integrations.components.retrievers.dewey import DeweyRetriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.utils import Secret

store = DeweyDocumentStore(
    api_key=Secret.from_env_var("DEWEY_API_KEY"),
    collection_id="3f7a1b2c-...",
)

prompt_template = """
Answer using the provided context.
Context: {% for doc in documents %}{{ doc.content }}{% endfor %}
Question: {{ query }}
"""

pipeline = Pipeline()
pipeline.add_component("retriever", DeweyRetriever(document_store=store, top_k=5))
pipeline.add_component("prompt", PromptBuilder(template=prompt_template))
pipeline.add_component("llm", OpenAIGenerator(model="gpt-4o-mini"))
pipeline.connect("retriever.documents", "prompt.documents")
pipeline.connect("prompt.prompt", "llm.prompt")

result = pipeline.run({
    "retriever": {"query": "What are the key findings?"},
    "prompt": {"query": "What are the key findings?"},
})
print(result["llm"]["replies"][0])
```

### DeweyResearchComponent
A drop-in replacement for an LLM generator that runs Dewey's full agentic research loop — searching, reading, and synthesising across multiple documents. Returns a grounded Markdown answer and the source chunks cited.
```python
from haystack import Pipeline
from haystack_integrations.components.retrievers.dewey import DeweyResearchComponent
from haystack.utils import Secret

pipeline = Pipeline()
pipeline.add_component(
    "research",
    DeweyResearchComponent(
        api_key=Secret.from_env_var("DEWEY_API_KEY"),
        collection_id="3f7a1b2c-...",
        depth="balanced",  # "quick" | "balanced" | "deep" | "exhaustive"
    ),
)

result = pipeline.run({"research": {"query": "What were the key findings?"}})
print(result["research"]["answer"])
for source in result["research"]["sources"]:
    print(f"  [{source.meta['filename']}] {source.content[:80]}")
```

Requires `meetdewey >= 1.0` and `haystack-ai >= 2.0`. The `deep` and `exhaustive` depths require a Pro plan and a BYOK key on your project.

## Vercel AI SDK
`@meetdewey/vercel-ai` provides two ready-to-use Vercel AI SDK tools that connect your application to a Dewey collection:
```bash
npm install @meetdewey/vercel-ai
```

### deweyRetrievalTool
Wraps Dewey's hybrid semantic + BM25 search as an AI SDK `tool()`. The model decides when to search, receives ranked chunks as tool output, and can call the tool multiple times per response.
```ts
import { streamText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
import { deweyRetrievalTool } from '@meetdewey/vercel-ai'

const result = streamText({
  model: anthropic('claude-haiku-4-5-20251001'),
  system: 'Answer questions using the search tool. Always cite your sources.',
  messages,
  tools: {
    search: deweyRetrievalTool({
      apiKey: process.env.DEWEY_API_KEY,
      collectionId: process.env.DEWEY_COLLECTION_ID,
      limit: 8,
    }),
  },
  maxSteps: 5,
})
```

### deweyResearchTool
Delegates a question to Dewey's agentic research endpoint. Dewey runs a multi-step loop internally — searching, reading sections, and synthesising — and returns a cited answer and source list. Useful for complex multi-document questions where you want Dewey to handle the retrieval loop rather than the outer model.
```ts
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { deweyResearchTool } from '@meetdewey/vercel-ai'

const { text } = await generateText({
  model: openai('gpt-4o-mini'),
  system: 'You are an assistant. Use the research tool for document questions.',
  messages,
  tools: {
    research: deweyResearchTool({
      apiKey: process.env.DEWEY_API_KEY,
      collectionId: process.env.DEWEY_COLLECTION_ID,
      depth: 'balanced', // 'quick' | 'balanced' | 'deep' | 'exhaustive'
    }),
  },
  maxSteps: 3,
})
```

Requires `ai >= 4.0` (peer dependency). Works with any AI SDK-compatible model provider.

## Streamlit Demo
A ready-to-run Streamlit app that demonstrates the full Dewey API — collections, document upload, hybrid search, and streaming agentic research. Use it as a starting point for your own document-intelligence app.
### Quickstart
```bash
# 1. Clone the demo
git clone https://github.com/meetdewey/streamlit-demo
cd streamlit-demo

# 2. Create a virtual environment and install dependencies
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt

# 3. Set your API key
cp .env.example .env
# Edit .env and set DEWEY_API_KEY=dwy_live_...

# 4. Run
streamlit run Home.py
```

The app opens at http://localhost:8501. You can also enter the API key directly in the sidebar without touching `.env`.
### Features
| Page | What it demonstrates |
|---|---|
| Home | Project usage meters, collection overview |
| Collections | Create, inspect (stats), and delete collections |
| Documents | Paginated document list with status, multi-file upload |
| Search | Hybrid semantic + keyword search with scored chunk results |
| Research | Streaming agentic research with tool-call trace and cited sources |
The app reads three environment variables: `DEWEY_API_KEY` (required), `DEWEY_COLLECTION_ID` (pre-selects a default collection), and `DEWEY_API_URL` (override for local development).