concept

Assistant thread

An assistant thread is a server-side conversation object that manages messages, tool calls, and run state for a long-running LLM session. The pattern was pioneered by the OpenAI Assistants API and, as of 2026, has counterparts at other providers such as Mistral.
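
To make the definition concrete, here is a minimal in-memory sketch of the objects involved. It is not any provider's API: `Thread`, `Run`, `RunStatus`, and `echo_model` are hypothetical names chosen for illustration, tool calls are left out, and the "model" is a placeholder function rather than a real LLM call.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class RunStatus(Enum):
    QUEUED = "queued"
    COMPLETED = "completed"


@dataclass
class Message:
    role: str      # "user" or "assistant" in this sketch
    content: str


@dataclass
class Run:
    status: RunStatus = RunStatus.QUEUED


@dataclass
class Thread:
    """Hypothetical stand-in for a provider-side thread object."""
    messages: list[Message] = field(default_factory=list)
    runs: list[Run] = field(default_factory=list)

    def add_message(self, role: str, content: str) -> None:
        self.messages.append(Message(role, content))

    def run(self, model: Callable[[list[Message]], str]) -> Run:
        # The thread already holds the history, so the caller sends nothing extra.
        run = Run()
        self.runs.append(run)
        reply = model(self.messages)
        self.add_message("assistant", reply)
        run.status = RunStatus.COMPLETED
        return run


def echo_model(history: list[Message]) -> str:
    # Placeholder for a real LLM call: reports how many messages it was given.
    return f"seen {len(history)} message(s)"


thread = Thread()
thread.add_message("user", "Hello")
thread.run(echo_model)
thread.add_message("user", "And again")
thread.run(echo_model)
print([m.content for m in thread.messages])
```

The second run sees three messages even though the caller only appended one, which is the point of the abstraction: the history lives with the thread, not with the client.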

Before threads, developers had to resend the full message history on every turn, which was expensive and error-prone and left no single source of truth for the conversation. Threads move that state to the provider: the client appends a message and starts a "run", the model executes (possibly over several steps with tool calls), and the thread accumulates the resulting messages. By 2026 several major LLM APIs expose some thread or session abstraction, though not all of them do: OpenAI has Assistants threads and Mistral has conversations, while Anthropic's Messages API remains stateless, so the client resends history and keeps any persistent state in its own systems (optionally exposed to the model through MCP). The trade-off is convenience and managed truncation against vendor lock-in, opaque billing, and harder migration. For production agents that need portability, bring-your-own state (a database plus your own truncation logic) still wins.
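
The bring-your-own-state alternative can be sketched with the standard-library `sqlite3` module. This is one possible shape under stated assumptions, not a recommended schema: the table layout, the helper names (`open_store`, `append_message`, `build_context`), and the rough four-characters-per-token estimate are all illustrative, and a real system would use the target model's tokenizer.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "conversation_id TEXT NOT NULL, "
        "role TEXT NOT NULL, "
        "content TEXT NOT NULL)"
    )
    return conn


def append_message(conn: sqlite3.Connection, conversation_id: str,
                   role: str, content: str) -> None:
    conn.execute(
        "INSERT INTO messages (conversation_id, role, content) VALUES (?, ?, ?)",
        (conversation_id, role, content),
    )
    conn.commit()


def estimate_tokens(text: str) -> int:
    # Crude assumption: about 4 characters per token. Swap in a real tokenizer.
    return max(1, len(text) // 4)


def build_context(conn: sqlite3.Connection, conversation_id: str,
                  token_budget: int) -> list[dict]:
    """Return the newest messages that fit in token_budget, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM messages "
        "WHERE conversation_id = ? ORDER BY id DESC",
        (conversation_id,),
    ).fetchall()
    kept, used = [], 0
    for role, content in rows:          # newest to oldest
        cost = estimate_tokens(content)
        if used + cost > token_budget:
            break                       # drop this message and everything older
        kept.append({"role": role, "content": content})
        used += cost
    kept.reverse()                      # restore chronological order
    return kept


conn = open_store()
append_message(conn, "c1", "user", "a" * 40)       # about 10 tokens
append_message(conn, "c1", "assistant", "b" * 40)  # about 10 tokens
append_message(conn, "c1", "user", "c" * 40)       # about 10 tokens
context = build_context(conn, "c1", token_budget=20)
print([m["role"] for m in context])
```

Because `build_context` returns plain role/content dictionaries, the same stored history can be sent to any stateless chat endpoint, which is where the portability comes from. The cost is that truncation policy becomes your responsibility.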

When to use an assistant thread

Use a thread for quick-build agents that will only ever run on OpenAI, and for long-running conversation surfaces such as assistant chatbots.

Common mistakes

Using threads when you need portability: the lock-in is real and migration is painful. Forgetting that a thread's cost keeps accumulating: old turns are billed again on every run.

FAQ

What is an assistant thread?

A server-side conversation object that stores messages, tool calls, and run state for a long-running LLM session, so the client does not have to resend the full history on every turn. The OpenAI Assistants API pioneered the pattern.

When should I use an assistant thread?

When you are quickly building an agent that will only ever run on OpenAI, or when you need a long-running conversation surface such as an assistant chatbot.

What are the most common mistakes with assistant threads?

Using threads when you need portability: the lock-in is real and migration is painful. Forgetting that a thread's cost keeps accumulating: unless the history is truncated, old turns are billed again as input on every run.
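
The cost point is easier to see with arithmetic. The sketch below assumes the simplest billing model, in which every run reads the entire stored history as input and nothing is truncated or cached; real providers may truncate or discount cached tokens, so treat the numbers as an upper-bound illustration. The function name and the 500-tokens-per-turn figure are hypothetical.

```python
def cumulative_input_tokens(tokens_per_turn: list[int]) -> int:
    """Total input tokens billed if every run re-reads the whole history."""
    total = 0
    history = 0
    for turn_tokens in tokens_per_turn:
        history += turn_tokens   # the thread grows by this turn's tokens
        total += history         # the run reads everything stored so far
    return total


turns = [500] * 20                       # assumed: 20 turns of 500 tokens each
print(cumulative_input_tokens(turns))    # 105000
print(sum(turns))                        # 10000
```

Under these assumptions, 10,000 tokens of actual conversation produce 105,000 billed input tokens, because billed input grows roughly with the square of the number of turns.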

Last updated: 2026-06-01. Raw markdown: https://promtable.com/glossary/assistant-thread.md.