feat: Token count for agents #860
Open
Ayaz-Microsoft wants to merge 11 commits into
- Implemented TokenUsageAccumulator to track per-request, per-agent, and per-model token usage.
- Emitted custom events to Azure Application Insights for monitoring.
- Created KQL queries for visualizing token usage metrics in Application Insights.
- Developed a workbook for easy access to token usage insights.
- Updated orchestrator to integrate token usage tracking during message processing and response handling.
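To make the per-request, per-agent, and per-model bookkeeping concrete, here is a minimal sketch of how such an accumulator could be shaped. It is not the PR's `TokenUsageAccumulator`: the class name `UsageAccumulatorSketch`, the `TokenTotals` dataclass, the `record` method, and the agent and model names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TokenTotals:
    """Running input/output token totals for one bucket."""

    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, input_tokens: int, output_tokens: int) -> None:
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens


class UsageAccumulatorSketch:
    """Hypothetical stand-in for an accumulator scoped to one request."""

    def __init__(self, request_id: str) -> None:
        self.request_id = request_id
        self.request_total = TokenTotals()
        self.by_agent = defaultdict(TokenTotals)
        self.by_model = defaultdict(TokenTotals)

    def record(self, agent: str, model: str, input_tokens: int, output_tokens: int) -> None:
        # Every call updates the request total plus the agent and model buckets.
        for bucket in (self.request_total, self.by_agent[agent], self.by_model[model]):
            bucket.add(input_tokens, output_tokens)


# Agent and model names below are placeholders, not values from the PR.
acc = UsageAccumulatorSketch(request_id="req-1")
acc.record("planner", "model-a", 120, 30)
acc.record("writer", "model-a", 200, 80)
acc.record("planner", "model-b", 50, 10)
```

Keeping three views in one object means a single flush at the end of the request can emit one event per bucket without re-aggregating raw calls.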
…nto its own template
Contributor
Pull request overview
Adds end-to-end LLM token usage telemetry for agent/workflow executions in the backend, plus Azure Monitor artifacts (workbook + KQL) to analyze usage by request, agent, model, and stage.
Changes:
- Added `TokenUsageAccumulator` + extraction helpers to capture token usage from Agent Framework responses/stream updates and emit `LLM_*_Token_Usage` App Insights custom events.
- Threaded `user_id` through orchestrator entrypoints and API handlers; added per-request ContextVar propagation to tag telemetry emitted from deeper helpers (e.g., image generation).
- Added standalone Bicep deployments for monitoring add-on resources and a “Token Usage” workbook, plus workbook JSON, KQL query pack, and docs.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| src/backend/token_usage.py | New module to extract/accumulate token counts and emit App Insights custom events. |
| src/backend/orchestrator.py | Creates/records/flushes token usage across workflow streaming, brief parsing, generation, and image paths; propagates user_id. |
| src/backend/app.py | Passes user_id into orchestrator calls for telemetry correlation. |
| infra/workbook/workbook.bicep | Standalone deployment of the Token Usage workbook targeting an App Insights resource (optional binding). |
| infra/workbook/README.md | Deployment instructions for the standalone workbook template. |
| infra/monitoring/monitoring.bicep | Standalone “add monitoring later” deployment (LA + App Insights). |
| infra/monitoring/README.md | Instructions for post-deploy monitoring enablement and wiring. |
| infra/dashboards/token-usage-workbook.json | Serialized workbook definition with tiles/queries for token usage analysis. |
| infra/dashboards/token-usage-queries.kql | KQL query pack for App Insights / Log Analytics. |
| docs/TokenUsageTelemetry.md | Documentation for emitted events, enabling telemetry, and querying/visualizing usage. |
| infra/main.bicep | Notes workbook is deployed separately; adds ACI tag hashing to force restart on monitoring config change. |
| infra/main_custom.bicep | Notes workbook deployed separately; adds ACI tag hashing; changes default gptModelCapacity. |
| infra/main.json | Recompiled ARM output with additional infra deltas beyond token telemetry. |
| .gitignore | Fixes rai_results ignore entry and adds Python coverage artifacts. |
… improve Application Insights event emission
Comment on lines +324 to +328

    input_audio_tokens=_to_int(_get(in_details, "audio_tokens")),
    input_text_tokens=_to_int(_get(in_details, "text_tokens")),
    input_cached_tokens=_to_int(_get(in_details, "cached_tokens")),
    output_audio_tokens=_to_int(_get(out_details, "audio_tokens")),
    output_text_tokens=_to_int(_get(out_details, "text_tokens")),
Comment on lines +887 to +894

    start_ns = time.perf_counter_ns()
    try:
        found = extract_usage(source) or extract_usage_from_stream_chunk(source)
    except Exception as exc:  # belt + braces; extractors are already safe
        logger.debug("TokenUsageScope.add failed: %s", exc, exc_info=True)
        return None
    finally:
        self._extract_ns += time.perf_counter_ns() - start_ns
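The pattern in this excerpt, timing the extraction with `time.perf_counter_ns()` and accumulating in `finally`, charges the elapsed time whether the extractor succeeds or raises. The self-contained sketch below shows that behavior; `TimedScopeSketch`, its `calls` counter, and the lambda extractor are hypothetical and only stand in for the PR's `TokenUsageScope`.

```python
import time


class TimedScopeSketch:
    """Hypothetical wrapper that measures time spent inside an extractor."""

    def __init__(self, extractor):
        self._extractor = extractor
        self._extract_ns = 0  # cumulative nanoseconds spent extracting
        self.calls = 0  # number of add() calls, counted on every path

    def add(self, source):
        start_ns = time.perf_counter_ns()
        try:
            return self._extractor(source)
        except Exception:
            # Telemetry must never break the request; swallow and report nothing.
            return None
        finally:
            # Runs after both the normal return and the except branch.
            self._extract_ns += time.perf_counter_ns() - start_ns
            self.calls += 1


scope = TimedScopeSketch(lambda src: src["total"])
ok = scope.add({"total": 5})  # extractor succeeds
failed = scope.add(None)  # None["total"] raises TypeError, which is swallowed
```

Because the accumulation sits in `finally`, the overhead figure also covers failed extractions, which is usually what an operator wants when judging the cost of telemetry.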
Comment on lines +18271 to 18276

    "jumpboxDcr": {
      "condition": "[and(variables('deployAdminAccessResources'), parameters('enableMonitoring'))]",
      "type": "Microsoft.Resources/deployments",
      "apiVersion": "2025-04-01",
      "name": "[take(format('avm.res.network.private-dns-zone.{0}', replace(variables('privateDnsZones')[copyIndex()], '.', '-')), 64)]",
      "name": "[take(format('avm.res.insights.data-collection-rule.{0}', variables('jumpboxDcrName')), 64)]",
      "properties": {
Comment on lines +96 to +99

    - **Out of scope (intentional).** The current implementation does not persist
      token totals to Cosmos DB and does not push real-time updates to the
      frontend. Operators add cost-estimation queries as needed by multiplying
      token counts by their negotiated per-1K-token rates.
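The cost estimation described in this excerpt is a simple multiplication of token counts by per-1K-token rates. A minimal sketch of the arithmetic follows; the function name `estimate_cost`, the token counts, and the rates are placeholders rather than real prices or values from the PR.

```python
from decimal import Decimal


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_1k: Decimal,
    output_rate_per_1k: Decimal,
) -> Decimal:
    """Cost = (tokens / 1000) * rate, summed over input and output tokens."""
    input_cost = (Decimal(input_tokens) / 1000) * input_rate_per_1k
    output_cost = (Decimal(output_tokens) / 1000) * output_rate_per_1k
    return input_cost + output_cost


# Placeholder rates; substitute the negotiated per-1K-token prices.
cost = estimate_cost(
    input_tokens=12_000,
    output_tokens=3_500,
    input_rate_per_1k=Decimal("0.0025"),
    output_rate_per_1k=Decimal("0.01"),
)
```

`Decimal` is used here to avoid binary floating-point rounding in currency math; the same formula can be expressed in a workbook query once the rates are known.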
Purpose
Count the total input and output tokens used by each agent at each stage, and show them in a workbook for analysis.
Does this introduce a breaking change?
Golden Path Validation
Deployment Validation
What to Check
Verify that the following are valid
Other Information