Cheatsheet

Gemini prompt patterns cheatsheet (2026 production playbook)

Production-tested Gemini prompt patterns: leveraging the 1M context window, structured-output mode, native multimodal input, Gemini's quirks vs Claude/GPT, and the Workspace integration tricks.

Use the 1M context window properly

Gemini 2 Pro's headline feature is 1M tokens. Make sure you're getting value from it.

ItemDescriptionExample
Put critical content at the head + tailRecall is best at boundaries; the middle drops accuracy.
Use needle-in-haystack tests on your dataPublic benchmarks don't predict your domain — run your own.
Prompt cache the static frameIf you reuse a long context block, the cached prefix is materially cheaper.

Structured output

ItemDescriptionExample
response_mime_type: 'application/json'Force JSON output.
response_schema: { ... }Constrained generation against a JSON schema. The cleanest way to extract structured data.
Function callingSame shape as OpenAI / Anthropic. Works well for tool-use loops.

Multimodal input

ItemDescriptionExample
Pass images as inline_dataBase64 + mime type. Easy for image input.
Video understandingGemini 2 Pro natively summarises and answers questions over video files.
Audio inputNative audio transcription + understanding (no separate ASR needed for many flows).

Gemini quirks vs Claude / GPT

ItemDescriptionExample
Refusal rate is higherGemini refuses more edge prompts than Claude or GPT-4o. Rephrase if you hit a refusal on a benign task.
Tighter system-prompt adherence than 2024 versionsWorth a re-test if you last evaluated Gemini in 2024.
Workspace contextInside Workspace (Docs, Sheets, Slides, Gmail) Gemini sees the document — exploit this for long-document workflows.

Gemini Deep Research

ItemDescriptionExample
Use forLong-form research synthesis across 20+ web sources.
TipTight scope wins — "Compare X vs Y on A, B, C with cited sources" beats "tell me about X".

FAQ

How do I force JSON output on Gemini?

Set response_mime_type='application/json' and supply response_schema for constrained generation.

Does Gemini really use the full 1M context?

Recall past ~600K shows degradation. For summarisation and chunk-level QA, 1M still works well; for needle-in-haystack reliability, keep critical content at head + tail.

Should I use Gemini for coding?

Capable but trails Claude 4.6 Sonnet for serious code work in 2026. Strong as a long-context code-reading model.

Last updated: 2026-06-01.