# Instruction hierarchy

**Source:** https://promtable.com/glossary/instruction-hierarchy

> Instruction hierarchy is a model's trained ordering of trust — system prompt outranks user message which outranks retrieved content — used to resist prompt injection and jailbreak attempts.

---
Instruction hierarchy is a model's trained ordering of trust — system prompt outranks user message which outranks retrieved content — used to resist prompt injection and jailbreak attempts.

Introduced by OpenAI in 2024 and now widely adopted, instruction hierarchy explicitly trains models to weight different sources of instructions differently. The system prompt (highest trust) sets policy; the user message (medium trust) can request actions within policy; tool output and retrieved content (lowest trust) provide information but should not be obeyed as instructions. The technique meaningfully reduces indirect prompt-injection success rates but does not eliminate them. Combine with input/output guardrails and tight tool surfaces for production safety.

## Common mistakes

- Treating instruction hierarchy as complete protection — it raises the bar, not closes the door.
- Putting trust-sensitive policy in the user message instead of the system prompt.

## Related terms

- [prompt-injection](https://promtable.com/glossary/prompt-injection)
- [system-prompt](https://promtable.com/glossary/system-prompt)
- [guardrails](https://promtable.com/glossary/guardrails)
- [jailbreak](https://promtable.com/glossary/jailbreak)

*Last updated: 2026-06-01*
---

Original page: https://promtable.com/glossary/instruction-hierarchy
Maintained by Promtable (https://promtable.com). Content: CC BY 4.0. Cite as "Promtable — https://promtable.com/glossary/instruction-hierarchy".
Contact: info@vibecodingturkey.com.