Office Hours quick reference · v02 · May 2026

GitHub Copilot Prompt & Token Optimization

Why this matters.

Under usage-based billing, input, output, and cached tokens are metered against your AI-credit pool. Output tokens cost roughly 5x more than input. Most input is not what you type: it is instructions, open tabs, tool schemas, and chat history replayed across turns.

1. The 12 highest-ROI practices

1

Constrain output

Add Code only, no explanation. Bullets over paragraphs. to repo instructions. Effect: 40-70% fewer output tokens on code tasks. Setup: 1 min.

2

Shrink always-on context

Compress copilot-instructions.md and AGENTS.md to landmines only. Delete generated boilerplate the agent can rediscover. Effect: 40-60% fewer instruction tokens every call. Setup: 15 min.

3

Scope context with applyTo:

Split one large instructions file into smaller scoped files that load only for matching paths or file types. Effect: 50-80% less scoped load when the file does not apply. Setup: 15 min.

4

Ask Mode for simple questions

Reserve Agent Mode for true multi-step work. Pick Ask Mode for one-shot Q&A. Effect: 60-90% lower token use vs agent loops. Setup: 0 min.

5

Use Auto model selection

Default to Auto. Pin a higher-tier model only when the task warrants it. Effect: lower billed rate on eligible usage. Setup: 0 min.

6

Audit MCP servers

Disable MCPs you do not actively use. Each tool schema can cost tokens per agent step. Effect: 5K-190K fewer tokens per agent task. Setup: 5 min.

7

Prefer Skills over always-on MCPs

Skills load title and description until invoked. MCP tool schemas are loaded into agent context. Effect: drastically lower steady-state context. Setup: 10 min.

8

Use CLI tools for deterministic work

For repeatable browser, file, or command flows, pipe CLI commands instead of multi-turn agent reasoning. Effect: fewer LLM round-trips and less tool-schema replay.

9

Be precise in prompts

Add a null check to getUser() and return 404 beats can you maybe handle errors here? Effect: better first pass, fewer follow-up turns. Setup: 0 min.

10

Retune prompts per model

Vendor prompting guides differ by model and version. Ask Copilot to adapt your instructions for the model you use. Effect: cuts rework from poor first-pass output. Setup: 10 min/model.

11

Run /chronicle improve

Run weekly in Copilot CLI or VS Code to scan recent sessions for recurring misunderstandings and generate instruction patches. Effect: 10K-30K tokens saved per repeated pattern. Setup: 2 min.

12

Start fresh sessions often

History is replayed every turn. When the topic changes, open a new chat. Effect: removes carry-over input cost. Setup: 0 min.

Figures are scoped to the mechanism named in each row and are not additive.

2. Context is everything: five context surfaces

SurfaceFileWhen loadedBest for
Repo instructionscopilot-instructions.md, AGENTS.mdAlways, every requestProject-wide AI README
Custom instructions*.instructions.mdBy file pattern with applyTo:File-specific rules
Prompts*.prompt.mdWhen invokedTask templates
Agents*.agent.mdWhen mentionedSpecialist personas
SkillsSKILL.mdAutomatically when relevantSpecialized expertise
Cost rule of thumb.

The further right the context surface is, the more demand-loaded it is and the lower the steady-state token cost. Push as much as possible from always-on repo instructions into on-demand Skills and prompts.

3. MCP vs Skills vs CLI

ToolBest forLoadingTradeoff
MCPRemote APIs, SaaS, databases, cross-platform integrationsTool metadata loaded into agent contextStrong auth and governance, but every tool call can become another LLM turn
SkillsRepeatable workflows, org knowledge, orchestrationProgressive loading: title and description first, full content on invokeHuman-readable and lower steady-state context
CLILocal dev, speed, low cost, composable deterministic operationsZero schema overheadBest for shell pipes and batch commands; govern carefully at scale
The roundtrip rule.

Every MCP or tool call can be a separate LLM turn with system prompt, chat history, and loaded tool schemas sent again. A CLI command can chain many operations in one line and return one filtered result. For deterministic work, prefer CLI. Reach for MCP when you need typed protocol, auth, sandboxing, or remote APIs.

4. Where tokens actually go

SourceAlways sent?Lever
.github/copilot-instructions.mdEvery turnCompress
AGENTS.mdEvery agent stepCompress
Open editor tabs and selectionOftenClose unused tabs
Chat historyGrows over timeStart fresh
MCP tool schemasPer agent stepAudit or convert to Skill or CLI
Skill metadataName and description onlyPrefer Skills where appropriate
CLI command outputOnly what you returnFilter before returning with | head or | jq
Model outputBilled at higher rates than inputConstrain output

5. Mode selection

TaskBest modeWhy
Explain this functionAskOne-shot, no tools needed
Refactor file XEdit / InlineBounded scope, fewer tools
Implement feature across three files and run testsAgentGenuinely multi-step
Reproducible browser or file pipelineCLI toolDeterministic, no LLM round-trip per step
Recurring confusion in past sessions/chronicle improvePatches instructions permanently

6. Anti-patterns

!

Bloated instructions

Cost: paid every turn. Fix: strip filler and keep landmines.

!

Every MCP enabled just in case

Cost: large upfront and per-task overhead. Fix: disable until needed or convert to Skill/CLI.

!

MCP for work one shell pipe can do

Cost: one LLM turn per step. Fix: use a chained CLI command.

!

Agent for simple explanation

Cost: 5-10x overhead. Fix: use Ask.

!

One mega-thread for days

Cost: linear history replay. Fix: new chat per topic.

!

Verbose prompts and pinned premium models

Cost: more input, more output, and higher billed rate. Fix: direct imperatives and Auto by default.

7. Five-minute setup checklist

1

Add Code only, no explanation. Bullets over paragraphs. Be terse. to repo instructions.

2

Delete generated boilerplate from copilot-instructions.md and AGENTS.md.

3

Open the MCP panel and disable unused servers.

4

Move repeated workflows from MCP or instructions into a Skill.

5

Set model selection to Auto.

6

Bookmark /chronicle improve for weekly use.

7

Close editor tabs you are not actively using.

Further reading

Practitioner resource; for official guidance, see GitHub Copilot documentation and GitHub billing documentation.