GitHub Copilot Prompt & Token Optimization
Under usage-based billing, input, output, and cached tokens are metered against your AI-credit pool. Output tokens cost roughly 5x more than input. Most input is not what you type: it is instructions, open tabs, tool schemas, and chat history replayed across turns.
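To make the input/output asymmetry concrete, here is a minimal sketch of a per-request cost estimate. The rates are placeholders chosen only so that output costs 5x input, and the cached-token rate is an assumption; they are not real GitHub prices, so substitute the figures from your own billing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    # Hypothetical rates in credits per 1,000 tokens; NOT real GitHub prices.
    input_per_1k: float = 1.0
    output_per_1k: float = 5.0  # roughly 5x input, as described above
    cached_per_1k: float = 0.1  # assumed discount for cached input tokens


def request_cost(input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0, rates: Rates = Rates()) -> float:
    """Estimate the credits consumed by one request under the assumed rates."""
    weighted = (input_tokens * rates.input_per_1k
                + output_tokens * rates.output_per_1k
                + cached_tokens * rates.cached_per_1k)
    return weighted / 1000
```

Under these placeholder rates, a request with 8,000 input tokens and 500 output tokens costs 10.5 credits: 8.0 for input and 2.5 for output. Trimming the always-on input usually saves more in absolute terms, while each output token trimmed saves five times as much as an input token.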
1. The 12 highest-ROI practices
Constrain output
Add "Code only, no explanation. Bullets over paragraphs." to repo instructions.
Shrink always-on context
Compress copilot-instructions.md and AGENTS.md to landmines only. Delete generated boilerplate the agent can rediscover.
Scope context with applyTo:
Split one large instructions file into smaller scoped files that load only for matching paths or file types.
Ask Mode for simple questions
Reserve Agent Mode for true multi-step work. Pick Ask Mode for one-shot Q&A.
Use Auto model selection
Default to Auto. Pin a higher-tier model only when the task warrants it.
Audit MCP servers
Disable MCPs you do not actively use. Each tool schema can cost tokens per agent step.
Prefer Skills over always-on MCPs
Skills load only their title and description until invoked. MCP tool schemas are loaded into agent context up front.
Use CLI tools for deterministic work
For repeatable browser, file, or command flows, pipe CLI commands instead of multi-turn agent reasoning.
Be precise in prompts
"Add a null check to getUser() and return 404" beats "Can you maybe handle errors here?"
Retune prompts per model
Vendor prompting guides differ by model and version. Ask Copilot to adapt your instructions for the model you use.
Run /chronicle improve
Run weekly in Copilot CLI or VS Code to scan recent sessions for recurring misunderstandings and generate instruction patches.
Start fresh sessions often
History is replayed every turn. When the topic changes, open a new chat.
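The cost of history replay can be sketched with simple arithmetic. The model below is a simplification under stated assumptions: a fixed always-on context, a constant prompt and reply size per turn, and no caching or history truncation. All parameter names are hypothetical.

```python
def cumulative_input_tokens(base_context: int, prompt_tokens: int,
                            reply_tokens: int, turns: int) -> int:
    """Total input tokens sent over a chat in which each turn replays all prior turns."""
    total = 0
    history = 0
    for _ in range(turns):
        # Sent this turn: always-on context + replayed history + the new prompt.
        total += base_context + history + prompt_tokens
        # The prompt and the model's reply both join the replayed history.
        history += prompt_tokens + reply_tokens
    return total


# One 20-turn thread versus four fresh 5-turn chats covering the same 20 turns.
one_long_thread = cumulative_input_tokens(2000, 100, 400, turns=20)
four_fresh_chats = 4 * cumulative_input_tokens(2000, 100, 400, turns=5)
```

With these illustrative numbers the single thread sends 137,000 input tokens and the four fresh chats send 62,000 in total. Per-turn input grows linearly with history, so the cumulative total grows roughly quadratically with thread length.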
2. Context is everything: five context surfaces
| Surface | File | When loaded | Best for |
|---|---|---|---|
| Repo instructions | copilot-instructions.md, AGENTS.md | Always, every request | Project-wide AI README |
| Custom instructions | *.instructions.md | By file pattern with applyTo: | File-specific rules |
| Prompts | *.prompt.md | When invoked | Task templates |
| Agents | *.agent.md | When mentioned | Specialist personas |
| Skills | SKILL.md | Automatically when relevant | Specialized expertise |
The further down the table a surface sits, the more demand-loaded it is and the lower its steady-state token cost. Push as much as possible from always-on repo instructions into on-demand Skills and prompts.
3. MCP vs Skills vs CLI
| Tool | Best for | Loading | Tradeoff |
|---|---|---|---|
| MCP | Remote APIs, SaaS, databases, cross-platform integrations | Tool metadata loaded into agent context | Strong auth and governance, but every tool call can become another LLM turn |
| Skills | Repeatable workflows, org knowledge, orchestration | Progressive loading: title and description first, full content on invoke | Human-readable and lower steady-state context |
| CLI | Local dev, speed, low cost, composable deterministic operations | Zero schema overhead | Best for shell pipes and batch commands; govern carefully at scale |
Every MCP or tool call can be a separate LLM turn with system prompt, chat history, and loaded tool schemas sent again. A CLI command can chain many operations in one line and return one filtered result. For deterministic work, prefer CLI. Reach for MCP when you need typed protocol, auth, sandboxing, or remote APIs.
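As a rough illustration of the per-turn overhead described above, the sketch below compares the context resent when each tool call is its own LLM turn against a single turn that runs one chained CLI command. The function names and token counts are hypothetical, and the model ignores history growth between steps and any caching.

```python
def per_call_overhead(steps: int, system_prompt: int, history: int,
                      tool_schemas: int) -> int:
    """Context tokens resent when each of `steps` tool calls is a separate LLM turn."""
    return steps * (system_prompt + history + tool_schemas)


def chained_cli_overhead(system_prompt: int, history: int) -> int:
    """Context tokens for one turn that runs a single chained CLI command.

    Assumes no tool schemas are loaded, matching the "zero schema overhead"
    row in the table above.
    """
    return system_prompt + history


# Illustrative numbers only: a six-step flow with 4,000 tokens of tool schemas.
mcp_tokens = per_call_overhead(steps=6, system_prompt=1500, history=3000,
                               tool_schemas=4000)
cli_tokens = chained_cli_overhead(system_prompt=1500, history=3000)
```

Here the six separate tool turns resend 51,000 context tokens while the single chained command sends 4,500. The gap widens with the number of steps and the size of the loaded schemas.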
4. Where tokens actually go
| Source | Always sent? | Lever |
|---|---|---|
| .github/copilot-instructions.md | Every turn | Compress |
| AGENTS.md | Every agent step | Compress |
| Open editor tabs and selection | Often | Close unused tabs |
| Chat history | Grows over time | Start fresh |
| MCP tool schemas | Per agent step | Audit or convert to Skill or CLI |
| Skill metadata | Name and description only | Prefer Skills where appropriate |
| CLI command output | Only what you return | Filter before returning with \| head or \| jq |
| Model output | Billed at higher rates than input | Constrain output |
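The "filter before returning" lever for CLI output can be sketched as two small helpers that mimic what piping through head or a simple jq projection would do. These are hypothetical stand-ins for illustration, not part of any Copilot API; in a shell you would normally use the real tools.

```python
import json


def head(text: str, max_lines: int = 20) -> str:
    """Keep only the first `max_lines` lines of command output."""
    return "\n".join(text.splitlines()[:max_lines])


def pick_fields(json_text: str, fields: list[str]) -> str:
    """Keep only the named keys from a JSON array of objects."""
    rows = json.loads(json_text)
    slim = [{key: row[key] for key in fields if key in row} for row in rows]
    # Compact separators avoid spending tokens on whitespace.
    return json.dumps(slim, separators=(",", ":"))
```

For example, projecting a verbose API response down to the two fields the model actually needs means only those fields enter the context, rather than the full payload.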
5. Mode selection
| Task | Best mode | Why |
|---|---|---|
| Explain this function | Ask | One-shot, no tools needed |
| Refactor file X | Edit / Inline | Bounded scope, fewer tools |
| Implement feature across three files and run tests | Agent | Genuinely multi-step |
| Reproducible browser or file pipeline | CLI tool | Deterministic, no LLM round-trip per step |
| Recurring confusion in past sessions | /chronicle improve | Patches instructions permanently |
6. Anti-patterns
Bloated instructions
Cost: paid every turn. Fix: strip filler and keep landmines.
Every MCP enabled just in case
Cost: large upfront and per-task overhead. Fix: disable until needed or convert to Skill/CLI.
MCP for work one shell pipe can do
Cost: one LLM turn per step. Fix: use a chained CLI command.
Agent for simple explanation
Cost: 5-10x overhead. Fix: use Ask.
One mega-thread for days
Cost: linear history replay. Fix: new chat per topic.
Verbose prompts and pinned premium models
Cost: more input, more output, and higher billed rate. Fix: direct imperatives and Auto by default.
7. Five-minute setup checklist
Add "Code only, no explanation. Bullets over paragraphs. Be terse." to repo instructions.
Delete generated boilerplate from copilot-instructions.md and AGENTS.md.
Open the MCP panel and disable unused servers.
Move repeated workflows from MCP or instructions into a Skill.
Set model selection to Auto.
Bookmark /chronicle improve for weekly use.
Close editor tabs you are not actively using.
Further reading
Anthropic public pricing, for input/output pricing asymmetry examples.
This is a practitioner resource; for official guidance, see the GitHub Copilot documentation and the GitHub billing documentation.