Usage-based billing

Getting more from every Copilot interaction

The goal is not "use less Copilot." The goal is to waste less interaction. A practical guide to tighter prompts, smaller context, shorter output, and right-sized agent workflows.

Mental model

What drives usage

Costs and latency are shaped by more than the words you type. Large context surfaces, verbose answers, model choice, and repeated agent steps all add drag.

01

Prompt input

The words, files, logs, and instructions sent into the interaction.

02

Model output

The answer itself, including repeated explanations, code, and summaries.

03

Context overhead

Open tabs, persistent instructions, chat history, generated context, and memory.

04

Tool overhead

Agent planning, tool schemas, intermediate reads, retries, and MCP catalogs.

Budgets cap spend. Habits reduce waste.

Usage-based billing makes long chats, broad agent sessions, model choice, and verbose output visible as cost drivers.

Workshop flow

From vague ask to reusable defaults

Use this flow to tighten one Copilot interaction at a time: start with the task, reduce the noise, then keep the habits that make the next loop faster.

0-5

Usage-based billing and why workflow quality matters.

5-15

Quick wins you can apply today.

15-25

Prompting patterns that reduce discovery work.

25-35

Context hygiene: instructions, tabs, history, and generated files.

35-52

Output, agent, model, and MCP/tool overhead.

52-60

Measure impact, set team defaults, and decide what sticks.

Defaults

Quick wins you can apply today

Start with defaults that change the shape of every interaction without asking developers to become billing experts.

01

Make answers terse

Set a default like: code first, minimal explanation unless asked, bullets over paragraphs.

code only · 3 bullets max · no recap

02

Pick the smaller workflow

Use Ask for bounded questions, Edit when you know the target file, and Agent only when autonomy pays for itself.

Ask · Edit · Agent

03

Default to Auto

Let model selection float unless you have a concrete reason to pin a premium model or higher effort.

Auto baseline · select deliberately

Team default:

Answer with the smallest useful response. If code is needed, lead with code.

Prompting pattern

The tight prompt recipe

Good prompts reduce search space. They tell Copilot where to look, what to change, what constraints matter, and what output shape you want.

Target

File, function, component, issue, failing test, or exact workflow.

Change

What needs to be different when the work is done.

Constraints

Tests, style, edge cases, APIs, security requirements, and boundaries.

Output

Diff, code-only, compact table, JSON, or brief rationale.

Before

Can you fix this login bug?

After

In src/auth/session.ts, find why expired refresh tokens still pass. Change only validation logic, add one regression test, and show the diff.
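The four parts of the recipe can be kept as a small template so the structure is reused rather than retyped. The sketch below is illustrative only: `TightPrompt` and its field names are hypothetical helpers, not part of Copilot, and the rendered wording is just one possible phrasing.

```python
from dataclasses import dataclass, field


@dataclass
class TightPrompt:
    """Hypothetical helper: one field per part of the recipe."""
    target: str
    change: str
    constraints: list[str] = field(default_factory=list)
    output: str = "show the diff"

    def render(self) -> str:
        # Lead with the target so the search space is narrowed first.
        parts = [f"In {self.target}, {self.change}."]
        if self.constraints:
            parts.append("Constraints: " + "; ".join(self.constraints) + ".")
        parts.append(f"Output: {self.output}.")
        return " ".join(parts)


prompt = TightPrompt(
    target="src/auth/session.ts",
    change="find why expired refresh tokens still pass",
    constraints=["change only validation logic", "add one regression test"],
)
print(prompt.render())
```

Leaving a field empty is a useful signal in itself: a prompt with no target or no output shape is usually the vague "Before" version.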

Context management

Context is the biggest hidden cost

Always-on context feels helpful until it makes every turn heavier. The fix is not zero context. The fix is scoped context.

Keep

+ Non-obvious hazards, required commands, and security-sensitive cautions.
+ Conventions that are easy to violate and expensive to rediscover.
+ Path-scoped guidance for API, UI, tests, migrations, or platform-specific areas.

Delete or move to on-demand

- Facts visible in code, duplicated docs, generated boilerplate, and outdated tool advice.
- Unrelated editor tabs, stale chat history, untrimmed logs, and vendor files.
- PR checklists, migration notes, and playbooks that only matter for one task. Move repeatable work into prompts or Skills.

20-23%

More tokens with generated context files in tested settings.

-2%

Average correctness change in cited AGENTBENCH results.

1.6x

Tool anchoring when context names a specific tool.

Context surfaces

Repo instructions (copilot-instructions.md, AGENTS.md)
  When it loads: Always.
  Use it for: Landmines, required commands, project-wide constraints.
  Cost habit: Keep short. Delete filler and generated boilerplate.

Scoped instructions (*.instructions.md)
  When it loads: When file patterns match.
  Use it for: Rules for APIs, UI, tests, migrations, or file types.
  Cost habit: Use applyTo: so rules load only where they apply.

Prompts and agents (*.prompt.md, *.agent.md)
  When it loads: When invoked.
  Use it for: Repeatable tasks and role-based workflows.
  Cost habit: Move rarely used playbooks out of always-on context.

Skills (SKILL.md)
  When it loads: Name and description first, full content on demand.
  Use it for: Specialized workflows, org knowledge, and tool orchestration.
  Cost habit: Prefer Skills over always-on instructions or unused MCP servers.
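To see how heavy the always-on surface is in a given repo, a rough size check can help. The sketch below is a minimal example under stated assumptions: it treats four characters as roughly one token, which is a coarse rule of thumb and not the tokenizer of any particular model, and `always_on_report` is a hypothetical helper name.

```python
from pathlib import Path

# Assumption: about 4 characters per token. Coarse heuristic only.
CHARS_PER_TOKEN = 4

# Always-on instruction file names listed in this section.
ALWAYS_ON = ("copilot-instructions.md", "AGENTS.md")


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN


def always_on_report(repo_root: str) -> dict[str, int]:
    """Map each always-on instruction file under repo_root to its estimate."""
    report = {}
    for path in Path(repo_root).rglob("*.md"):
        if path.name in ALWAYS_ON:
            text = path.read_text(encoding="utf-8", errors="replace")
            report[path.relative_to(repo_root).as_posix()] = estimate_tokens(text)
    return report
```

Calling `always_on_report(".")` from a repo root returns a dictionary of file paths and estimates. The absolute numbers are approximate; the trend after pruning is the useful part.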
Agent and tools

Keep the chain short

Every enabled tool brings a name, description, and parameter schema. In agent mode, that catalog can be replayed many times during one task.

Working rule

1. Name the outcome and stop condition before broad exploration starts.
2. Enable only the servers and tools needed for the task.
3. Prefer Skills for repeatable workflows that do not need every tool schema loaded every turn.
4. Use CLI commands for deterministic work: grep, jq, gh, npm, tests, and repeatable Playwright flows.
5. Batch scoped reads instead of repeated back-and-forth.

servers
x tools/server
x schema tokens
x agent steps
= overhead

heavy setup:
15 servers, 187 tools
265K overhead tokens

focused setup:
3 servers, 50 tools
75K overhead tokens
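The multiplication above can be written out as a quick calculation. This is a sketch, not a billing formula: the comparison uses only the two setups quoted above, while the inputs passed to `overhead_tokens` at the end are hypothetical values chosen to show how the factors compound.

```python
def overhead_tokens(servers: int, tools_per_server: int,
                    schema_tokens_per_tool: int, agent_steps: int) -> int:
    """servers x tools/server x schema tokens x agent steps."""
    return servers * tools_per_server * schema_tokens_per_tool * agent_steps


# Figures quoted above for the two setups.
heavy_tools, heavy_overhead = 187, 265_000
focused_tools, focused_overhead = 50, 75_000

tool_cut = 1 - focused_tools / heavy_tools
overhead_cut = 1 - focused_overhead / heavy_overhead
print(f"tools cut by {tool_cut:.0%}, overhead cut by {overhead_cut:.0%}")
# prints: tools cut by 73%, overhead cut by 72%

# Hypothetical inputs, only to show how the factors multiply.
print(overhead_tokens(servers=3, tools_per_server=10,
                      schema_tokens_per_tool=150, agent_steps=5))
# prints: 22500
```

Because the factors multiply, trimming any one of them (fewer servers, fewer tools, fewer agent steps) scales the whole product down.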
MCP
  Best for: Remote APIs, SaaS tools, databases, auth, sandboxed integrations.
  Token shape: Tool metadata is available to the model during agent turns.
  Watch out for: Each tool call can become another full LLM turn with history and schemas replayed.

Skills
  Best for: Repeatable workflows, team standards, and specialized playbooks.
  Token shape: Light metadata first, full instructions when invoked.
  Watch out for: Keep descriptions crisp so Copilot picks the right Skill at the right time.

CLI
  Best for: Deterministic local work, chained shell operations, builds, tests, and filters.
  Token shape: No tool-schema load. The model only sees the command and returned output.
  Watch out for: Filter output before returning it: | head, | jq, targeted test commands.

Hooks (Preview)
  Best for: Deterministic actions after a known lifecycle event: tests, formatting, deploys, gates.
  Token shape: Zero LLM cost. The hook is shell execution; the model is not re-invoked.
  Watch out for: Hook spam from broad triggers. Filter by tool name and file path, and pin to a known-good config.
Lifecycle automation

Hooks: zero-cost execution

Skills make decisions. Hooks execute them. When a workflow has a deterministic "if X then always Y" step, hooks let you cut LLM calls out of the loop. See how hooks compare to MCP, Skills, and CLI, or the official hooks reference.

Preview status.

Hooks are currently in Preview. Configuration format and events may change. Pin to a known-good config and re-check the docs before rolling out widely.

Common hook events

PostToolUse

After a tool completes

Run tests, format code, generate artifacts, or trigger deployments without a follow-up LLM turn.

SessionStart

New session begins

Load project context, fetch secrets, or validate the environment before the first prompt.

PreToolUse

Before a tool runs

Block dangerous commands, require approval gates, or rewrite input before execution.

How the savings work

# skills-only: LLM chain every run
user request
  ↓ Skill analyzes      tokens
  ↓ LLM decides         tokens
  ↓ Skill executes      tokens
# skills + hooks: LLM only on first run
user request
  ↓ Skill analyzes      tokens (once)
  ↓ writes decision.json

hook detects write
  ↓ shell script runs   zero tokens
  ↓ subsequent runs     zero tokens

Configuration

Hooks are defined in .github/hooks/*.json:

{
  "hooks": {
    "PostToolUse": [
      {
        "type": "command",
        "command": "./scripts/process-output.sh",
        "timeout": 30,
        "cwd": "."
      }
    ]
  }
}

Working rules

01

Skills write structured data

Emit summary.json, config.yaml, or a decision file the hook can read.

02

Hooks watch for triggers

Filter PostToolUse by tool name and file path so the hook fires when it matters.

03

Keep hooks deterministic

No LLM calls inside hook scripts. Shell, build, lint, deploy — that is the value.

04

Filter hook spam

Default behavior fires on every tool use. Constrain by event, path, and exit conditions.
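The filtering rule above (by tool name and file path) can be sketched as a small predicate. Everything specific here is an assumption for illustration: the `tool_name` and `tool_input.file_path` keys, the `edit_file` tool name, and the watched pattern are hypothetical, so check the hooks reference for the actual event payload before relying on any of them.

```python
from fnmatch import fnmatchcase

# Hypothetical filter settings; adjust to the tools and paths you care about.
WATCHED_TOOLS = {"edit_file"}
WATCHED_PATTERNS = ("src/*.ts",)  # fnmatch's * also matches "/" in a path


def should_run(event: dict) -> bool:
    """Return True only when a watched tool touched a watched path.

    The "tool_name" and "tool_input.file_path" keys are assumptions about
    the event payload, not a documented schema.
    """
    if event.get("tool_name") not in WATCHED_TOOLS:
        return False
    path = (event.get("tool_input") or {}).get("file_path", "")
    return any(fnmatchcase(path, pattern) for pattern in WATCHED_PATTERNS)


# Sample events standing in for what a hook might receive.
edit_event = {"tool_name": "edit_file",
              "tool_input": {"file_path": "src/auth/session.ts"}}
read_event = {"tool_name": "read_file",
              "tool_input": {"file_path": "src/auth/session.ts"}}
docs_event = {"tool_name": "edit_file",
              "tool_input": {"file_path": "docs/setup.md"}}

for event in (edit_event, read_event, docs_event):
    print(event["tool_name"], event["tool_input"]["file_path"], should_run(event))
```

Only the first sample passes the filter. A real hook script would exit quietly when the predicate is false, so the deterministic step runs only when it matters.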

Watch out for

! Hook spam: PostToolUse fires after every tool use. Filter by tool name and file path.

! Cloud agent constraints: Bash only, restricted network, ephemeral filesystem.

! Preview status: Format and events may change. Pin to known-good config.

! Silent failures: Add logging (echo "Hook fired" >> /tmp/hook.log).

! Timeout limits: Default 30 seconds. Increase for long-running tasks.

Decision tree
  1. Does it need LLM judgment? → Skill
  2. Is it always the same after a trigger? → Hook
  3. Is it a one-off shell command? → CLI
  4. Does it need auth to external services? → MCP
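
The decision tree can also be read as an ordered set of checks. The sketch below assumes the questions are asked in the order listed and that the first "yes" wins; `pick_mechanism` and its parameter names are hypothetical, and the tree names no fallback, so the function returns `None` when nothing matches.

```python
from typing import Optional


def pick_mechanism(needs_llm_judgment: bool,
                   same_after_trigger: bool,
                   one_off_shell_command: bool,
                   needs_external_auth: bool) -> Optional[str]:
    """Walk the four questions in order; the first 'yes' decides."""
    if needs_llm_judgment:
        return "Skill"
    if same_after_trigger:
        return "Hook"
    if one_off_shell_command:
        return "CLI"
    if needs_external_auth:
        return "MCP"
    return None  # the decision tree gives no fallback


# Example: a formatting step that always runs after a file edit.
print(pick_mechanism(needs_llm_judgment=False, same_after_trigger=True,
                     one_off_shell_command=False, needs_external_auth=False))
# prints: Hook
```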
Team and admin layer

Cap spend, then improve habits

Prompt habits improve efficiency, but admin controls provide budget enforcement. Both matter once usage is visible.

Budgets

Set alerting early

Set enterprise, cost-center, and user-level AI-credit budgets. Review usage after baselines stabilize.

Model policy

Approve deliberately

Review premium model access by workflow, team, and measurable need rather than leaving high-cost models pinned.

Review loop

Measure what changes

Track fewer turns, shorter default responses, leaner agent steps per task, and faster review loops.

The 3-pack

Do this today

Pick two or three defaults, trial them across repos and editors, then keep the workflows that move the needle.

01

Set terse defaults

Code first. Minimal explanation unless asked. Bullets over paragraphs.

02

Prune context

Keep only the instructions that prevent expensive mistakes.

03

Right-size mode

Ask for one-step work. Agent for multi-step work with clear stop conditions.

Same outcomes, fewer tokens, faster loops.

Bring prompts, context files, and agent workflows. Tighten them live, then turn the best patterns into team defaults.

Resource

Copilot interaction checklist

Two companion handouts plus a short walkthrough of the official Copilot billing preview flow.

Checklist handout

Use the PDF as a slide companion, workshop takeaway, or team onboarding resource. It includes the tactical checklist plus billing and admin checks from the companion article.

Prompt & token optimization reference

A practical Copilot quick reference with community-curated prompts, checklists, and patterns for better everyday usage.

Usage report walkthrough video

Watch the CSV upload flow: download the report, open the billing preview tool, and review the overview, users, models, products, organizations, and cost centers tabs.

Token optimization deep dive

Marco Olivo's chapter-by-chapter companion guide. Goes deeper on output control, MCP and tool costs, and team habits. Useful when the checklist isn't enough.