Token Saving Techniques

CLI-focused techniques for Claude Code, Cursor Agent (agent), Codex CLI, and Gemini CLI. Output tokens cost 5× more than input across all Claude models (Sonnet 4.6: $3 in / $15 out per MTok). Thinking tokens bill as output. Codex and Gemini show 4×–8× asymmetry. Every technique here reduces one or both.

Claude Code

Cursor Agent CLI

Codex CLI

Gemini CLI

● /slash commands

● --flags & env vars

● files & config paths

● prompt examples

● notes

🧹

Context Hygiene

Clear Context Between Tasks

All

Reset when switching tasks. Stale context re-sends every turn.

Claude Code

/clear

Cursor

/clear

Codex

/new

Gemini

/clear

Impact

Keep Your Rules File Lean

All

Your rules file is re-read on every turn. A 200-line file across a 20-turn session costs 4,000 lines of input. Keep it short — move workflows to skills, use scoped rules, and split by directory.

Claude Code

CLAUDE.md

Keep under 200 lines
Move workflows to skills

Cursor

.cursor/rules/*.mdc

Each .mdc targets specific
files (e.g. *.tsx) — skipped
unless you touch those files

Codex

AGENTS.md

Put one per directory —
child dirs see parent rules
plus their own additions

Gemini

GEMINI.md

@import to split files
Hierarchical inheritance

Impact

Surgical File References

All

Point to specific files — don't dump whole directories.

Claude Code

@file · /add-dir

Cursor

@file

Codex

/mention

--add-dir

Gemini

@file

Impact

Exclude Irrelevant Files

All

Stop the agent from reading noisy directories (node_modules, build output, logs). Each tool uses a config file to define what's excluded.

Claude Code

.claude/settings.json

permissions.deny list

Cursor

.cursorignore

Same syntax as .gitignore

Codex

~/.codex/config.toml

sandbox_permissions list

Gemini

.geminiignore

.gitignore also respected

Impact

Compact Before Hitting the Limit

All

Summarise conversation to free tokens. All four support manual + auto compaction.

Claude Code

/compact

Cursor

/compress

Codex

/compact

Gemini

/compress

Impact

Tame Command Output Bloat

All

Pipe through tail/head. Delegate verbose tasks to subagents. Paste only errors, not full logs.

Claude Code

| tail -30

Cursor

| tail -30

Codex

| tail -30

Gemini

| tail -30

Impact

Mind the Prompt Cache TTL

Claude Code cached reads cost 10% of input price. 5-min default TTL. Cache miss = full re-read at full price. Docs →

Claude Code

⏱ 5-min TTL
Cache = 10% of input

Cursor

Managed internally

Codex

Responses API caching

Gemini

Managed internally

Impact

🔔 Enable Completion Sounds

All

Audio notifications free you from watching the terminal. Different sounds for "done" vs "needs approval". Prevents cache timeouts — you hear the bell, respond promptly, cache stays warm.

Claude Code

~/.claude.json

preferredNotifChannel

Set to "terminal_bell"

Cursor

No CLI config available

Codex

/statusline

Terminal bell on completion

Gemini

~/.gemini/settings.json

enableNotifications

Terminal bell on completion

Impact

🎯

Prompting Strategy

Use Plan Mode First

All

Read-only analysis before coding. Uses fewer tokens. Prevents trial-and-error loops.

Claude Code

/plan

Cursor

/plan

Codex

/plan

Gemini

/plan

Impact

Feed Precise Context

All

Front-load constraints and file paths. Vague prompts cause exploring — burning tokens to find what you already know.

Claude Code

"In src/auth.ts add JWT
using types/auth.d.ts"

Cursor

@src/auth.ts "Add JWT"

Codex

/mention files + prompt

Gemini

@file + specific prompt

Impact

Don't Paste Large Blobs

All

Pasting full docs, logs, images, or long stack traces floods your context and gets re-sent every turn. Instead: point to files by path, pipe logs through tail/head, paste only the relevant error, and use @file references for docs. Let the agent read what it needs.

Claude Code

As per title

Cursor

As per title

Codex

As per title

Gemini

As per title

Impact

Batch Related Changes

All

Combine related changes in one prompt. Each turn re-sends full context — fewer turns = fewer re-reads.

Claude Code

"Add validation, handler,
and tests for /users"

Cursor

Same — batch prompts

Codex

Same — batch prompts

Gemini

Same — batch prompts

Impact

Watch, Don't Interrupt

All

Each interruption re-ingests context. Queue feedback, deliver in one shot. Corrected twice and still wrong? Rewrite instructions, start fresh.

Claude Code

As per title

Cursor

As per title

Codex

As per title

Gemini

As per title

Impact

Use Ask / Read-Only Mode

All

Explore code without changes — cheaper than full agent mode.

Claude Code

/plan

Cursor

/ask

Codex

/plan

Gemini

/plan

Impact

Use /btw for Side Questions

Ask a quick question without adding to conversation history. Uses parent cache so minimal cost.

Claude Code

/btw

Doesn't add to history

Cursor

No equivalent

Codex

No equivalent

Gemini

No equivalent

Impact

⚡

Inspect Before You Commit

Claude Code: /context · /cost Cursor: /usage · Usage panel Codex: /status · /statusline Gemini: /stats · /stats model
These cost almost nothing — the insight saves entire conversations.

🔀

Model Routing

Use the right model for the job. All tools: /model to switch.

Flagship

Opus 4.6 · GPT-5.4
Gemini 3.1 Pro

Planning or complex reasoning

Balanced

Sonnet 4.6 · GPT-5.3-Codex
Gemini 3 Pro

Implementation

Fast / Light

Haiku 4.5 · GPT-5.1-Codex-mini
Gemini 3 Flash · Auto Mode

Subagents · cheap

🏗

Agent Architecture

Subagents = Fresh Context

All

Each subagent starts with clear context.

Claude Code

Use prompt text to ask to
run task in subagent

Cursor

Use prompt text to ask to
run task in subagent

Codex

Use prompt text to ask to
run task in subagent

Gemini

Use prompt text to ask to
run task in subagent

Impact

Disconnect Unused MCPs

All

Every MCP injects tool schemas into context. Disable what you're not using. All four support per-server tool filtering.

Claude Code

/mcp

Cursor

/mcp

Codex

/mcp

Gemini

/mcp

Impact

Prefer Native CLIs Over MCPs

All

git, npm, docker — run directly. No tool schema overhead.

Claude Code

As per title

Cursor

As per title

Codex

As per title

Gemini

As per title

Impact

Manage Plugins

All

Keep plugins to a minimum. Each one injects tool schemas into your context window every turn.

Claude Code

/plugins

Manage from here

Cursor

/plugins

Manage from here

Codex

/plugins

Manage from here

Gemini

No plugin system

Impact

Use Skills

All

Skills load on-demand instead of every turn. Move workflows into skills to keep base context small.

Claude Code

.claude/skills/*/SKILL.md

Cursor

.cursor/skills/

Codex

.codex/skills/*/SKILL.md

Gemini

.gemini/skills/

Impact

Reduce Thinking Effort

All

Thinking/reasoning tokens bill as output (5× input). Lower effort for simple tasks.

Claude Code

/effort

Menu to set low/med/high/max

Cursor

/max-mode [on|off]

Codex

/model

Menu to pick model + effort

Gemini

/model manage

Menu to pick model + effort

Impact

Automate With Hooks

All

Hooks run shell commands on agent events — block expensive tool calls, filter verbose output, trigger early compaction. All four CLIs support them.

Claude Code

.claude/settings.json

PreCompact · PreToolUse
30+ events · 4 hook types

Cursor

.cursor/hooks.json

beforeShellExecution
beforeReadFile · afterFileEdit

Codex

.codex/hooks.json

PreToolUse · PostToolUse
5 events · command type

Gemini

.gemini/settings.json

PreCompress · BeforeTool
11 events · migrate from CC

Impact

💰

Cost & Limit Management

Persist Decisions & Learnings

All

Without this, next session re-derives everything from scratch.

Claude Code

CLAUDE.md

/memory

Cursor

.cursor/rules/

Codex

AGENTS.md

Gemini

GEMINI.md

/memory add

Impact

Track Spend Per Session

All

Know your burn rate. Each tool has a built-in cost inspector.

Claude Code

/cost

Cursor

/usage

Codex

/status · /statusline

Gemini

/stats · /footer

/footer shows token info
in prompt

Impact

Cap Context Window & Thinking Budget

Adaptive thinking burns output tokens at 5× input price. Tune context window and thinking budget via env vars. Docs →

Claude Code

CLAUDE_CODE_DISABLE_1M_CONTEXT=1
CLAUDE_CODE_AUTO_COMPACT_WINDOW=50000
CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1
MAX_THINKING_TOKENS=<n>

Add to .zshrc / .bashrc

Cursor

No equivalent

Codex

No equivalent

Gemini

No equivalent

Impact

Resume Instead of Rebuilding

All

Resume a previous session with full transcript instead of re-explaining from scratch.

Claude Code

/resume

Cursor

/resume

Codex

/resume

Gemini

/resume

Impact

Schedule Around Peak Hours

CC · CU

Rate limits can be tighter during peak hours. Use schedules and automations to run batch work off-peak.

Claude Code

/schedule

Use schedules

Cursor

Use automations

Codex

No equivalent

Gemini

No equivalent

Impact

🧠

The Meta-Rule

Every token is re-read every turn. A 50-line rules file across a 20-turn session is read 20 times — 1,000 lines from one file. Context isn't cost-per-message, it's cost × conversation length. Keep it tight, clear often, persist decisions, use the cheapest mode that works.

⌨

Quick Command Reference

⌘

Claude Code

/clear · /compact · /cost

/context · /plan · /effort

/model · /memory · /resume

permissions.deny

Shift+Tab

hooks

CLAUDE_CODE_*

Cursor Agent CLI

/clear · /plan · /ask

/model · /compress · /resume

@file · & msg → cloud

.cursor/rules/ · .cursorignore

/mcp · /usage

hooks.json

Codex CLI

/new · /compact · /status

/model · /statusline · /diff

/mention · /agent · /review

AGENTS.md

!cmd · Tab

hooks.json

Gemini CLI

/clear · /compress · /stats

/plan · /model · /memory

/mcp · /agents · @file

GEMINI.md · .geminiignore

!cmd · /resume

hooks