Expose per-model token usage (model_usage array) in the status line payload

## Feature request: expose per-model token usage in the status line payload

### Context

The configurable status line (`statusLine.command`) receives a JSON payload on stdin. Token usage is currently exposed in two ways:

```js
context_window: {
  total_input_tokens,         // ← SUM across ALL models used in the session
  total_output_tokens,        // ← same
  total_cache_read_tokens,    // ← same
  total_cache_write_tokens,   // ← same
  total_reasoning_tokens,     // ← same
  ...
  current_usage: {            // ← ONLY for the currently selected model
    input_tokens,
    output_tokens,
    cache_creation_input_tokens,
    cache_read_input_tokens
  }
}
```

The runtime tracks usage per model in its internal `modelMetrics` map (this is visible in `session.shutdown` events and surfaced by `/usage`), but **only the cross-model sum and the currently-selected model's slice are exposed** to the status line script.
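As a rough illustration of what a script can read today, the sketch below parses a payload string and formats the session totals. It assumes the fields sit under a top-level `context_window` key, as the snippet above suggests. The helper name `formatTotals` and the sample values are hypothetical.

```js
// Hypothetical helper: summarize the token fields a status line script sees today.
function formatTotals(payloadText) {
  const payload = JSON.parse(payloadText);
  const cw = payload.context_window ?? {};
  // Cross-model sums for the whole session.
  const totalIn = cw.total_input_tokens ?? 0;
  const totalOut = cw.total_output_tokens ?? 0;
  // current_usage covers only the currently selected model.
  const currentOut = cw.current_usage?.output_tokens ?? 0;
  return `in ${totalIn} | out ${totalOut} (current model out: ${currentOut})`;
}

// Made-up sample payload, standing in for the text read from stdin.
const sample = JSON.stringify({
  context_window: {
    total_input_tokens: 1000,
    total_output_tokens: 500,
    current_usage: { input_tokens: 200, output_tokens: 100 },
  },
});

console.log(formatTotals(sample));
```

Nothing in these fields says which model produced which share of the totals, and that is the gap this request is about.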

### Problem

This makes it impossible for a status line script to compute an accurate session cost when the user switches models mid-session (e.g. starts on GPT-5.5, then uses `/model` to switch to Claude Opus 4.7). The script can only apply the currently selected model's rates to all cumulative tokens, which under- or over-estimates the cost depending on which model is pricier.

Concrete example: spend 1M output tokens on GPT-5.5 ($30/M output), then switch to Opus 4.7 ($25/M output). True cost is $30. A status line script can only see "current model = Opus, total_output = 1M" and reports $25. That is $5 short: about 17% below the true cost (equivalently, the true cost is 20% above the reported figure).
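The arithmetic in this example can be checked with a short sketch. The two output rates are the ones quoted above; the variable names are only illustrative. Measured against the true cost the shortfall is about 17%, and measured against the reported figure the gap is 20%.

```js
const PER_MILLION = 1_000_000;

// Output rates in USD per million tokens, as quoted in the example above.
const outputRate = { "gpt-5.5": 30, "claude-opus-4.7": 25 };

// What actually happened: all 1M output tokens were produced on GPT-5.5.
const actualOutputTokens = { "gpt-5.5": 1_000_000, "claude-opus-4.7": 0 };

// What the script sees today: a cross-model total plus the current model.
const visible = { currentModel: "claude-opus-4.7", totalOutputTokens: 1_000_000 };

// True cost: price each model's tokens at that model's own rate.
const trueCost = Object.entries(actualOutputTokens).reduce(
  (sum, [model, tokens]) => sum + (tokens / PER_MILLION) * outputRate[model],
  0
);

// Naive cost: price the whole total at the current model's rate.
const naiveCost =
  (visible.totalOutputTokens / PER_MILLION) * outputRate[visible.currentModel];

// Shortfall relative to the true cost.
const shortfall = (trueCost - naiveCost) / trueCost;

console.log(trueCost, naiveCost, shortfall.toFixed(3)); // 30 25 0.167
```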

### Proposal

Expose the per-model breakdown alongside the existing totals:

```js
context_window: {
  total_input_tokens,
  total_output_tokens,
  total_cache_read_tokens,
  total_cache_write_tokens,
  total_reasoning_tokens,
  ...

  // NEW: array of per-model usage slices
  model_usage: [
    {
      model_id: "gpt-5.5",
      model_display_name: "GPT-5.5",
      input_tokens: 700000,
      output_tokens: 700000,
      cache_read_tokens: 700000,
      cache_write_tokens: 35000,
      reasoning_tokens: 0,
      requests: 12
    },
    {
      model_id: "claude-opus-4.7",
      model_display_name: "Claude Opus 4.7",
      input_tokens: 300000,
      output_tokens: 300000,
      cache_read_tokens: 300000,
      cache_write_tokens: 15000,
      reasoning_tokens: 5000,
      requests: 4
    }
  ]
}
```

This data already exists internally as `modelMetrics` and is exactly what `/usage` displays today. Exposing it would be a one-line payload change.

### Use case

Any status line script that surfaces session cost in USD (using the published model pricing) would become correct for multi-model sessions. The script just iterates `model_usage`, looks up each model's rates, and sums.
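A minimal sketch of that loop, assuming the proposed `model_usage` shape lands as written above. The `RATES` table is hypothetical: only the two output rates come from the example in this issue, and every other number is a placeholder. The sketch also assumes the four token counts are disjoint and billed separately, and it ignores `reasoning_tokens` on the assumption that they are already counted in `output_tokens`.

```js
// Hypothetical USD rates per million tokens. Only the two output rates come
// from the example in this issue; all other numbers are placeholders.
const RATES = {
  "gpt-5.5": { input: 5, output: 30, cacheRead: 0.5, cacheWrite: 0 },
  "claude-opus-4.7": { input: 5, output: 25, cacheRead: 0.5, cacheWrite: 6 },
};

// Sum the cost of every per-model slice in the proposed model_usage array.
// Models missing from the rate table are reported instead of silently priced.
function sessionCostUsd(contextWindow, rates) {
  let scaled = 0; // running sum of tokens * (USD per million tokens)
  const unpriced = [];
  for (const slice of contextWindow.model_usage ?? []) {
    const rate = rates[slice.model_id];
    if (!rate) {
      unpriced.push(slice.model_id);
      continue;
    }
    scaled +=
      slice.input_tokens * rate.input +
      slice.output_tokens * rate.output +
      slice.cache_read_tokens * rate.cacheRead +
      slice.cache_write_tokens * rate.cacheWrite;
  }
  return { usd: scaled / 1_000_000, unpriced };
}

// The two slices from the proposal above.
const contextWindow = {
  model_usage: [
    { model_id: "gpt-5.5", input_tokens: 700000, output_tokens: 700000,
      cache_read_tokens: 700000, cache_write_tokens: 35000,
      reasoning_tokens: 0, requests: 12 },
    { model_id: "claude-opus-4.7", input_tokens: 300000, output_tokens: 300000,
      cache_read_tokens: 300000, cache_write_tokens: 15000,
      reasoning_tokens: 5000, requests: 4 },
  ],
};

const result = sessionCostUsd(contextWindow, RATES);
console.log(`$${result.usd.toFixed(2)}`);
```

Returning the unpriced model IDs separately lets a script show a partial total with a marker rather than a confidently wrong number when a new model appears before its rates are added.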

### Current workaround (and why it's bad)

The current workaround is parsing `~/.copilot/session-state/<id>/events.jsonl` for `assistant.message.outputTokens` and `model` on each turn. This is fragile (it races with the writer), incomplete (input and cache tokens are NOT logged per call, only aggregated at `session.shutdown`), and it forces every status line author to reinvent the same incremental-parsing and caching machinery.

### Related

Companion request for `total_files_modified`: #3404

