feat: Token count for agents by Ayaz-Microsoft · Pull Request #860 · microsoft/content-generation-solution-accelerator

Ayaz-Microsoft · 2026-05-25T11:50:01Z

Purpose

Count the total input and output tokens used by each agent at each stage, and show the results in a workbook for analysis.

Does this introduce a breaking change?

Yes
No

Golden Path Validation

I have tested the primary workflows (the "golden path") to ensure they function correctly without errors.

Deployment Validation

I have validated the deployment process successfully and all services are running as expected with this change.

What to Check

Verify that the following are valid

...

Other Information

- Implemented TokenUsageAccumulator to track per-request, per-agent, and per-model token usage.
- Emitted custom events to Azure Application Insights for monitoring.
- Created KQL queries for visualizing token usage metrics in Application Insights.
- Developed a workbook for easy access to token usage insights.
- Updated orchestrator to integrate token usage tracking during message processing and response handling.
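The per-request, per-agent, per-model tracking described above could be sketched roughly as follows. This is an illustration only: the class name `SketchTokenAccumulator`, its fields, the event name `Sketch_Token_Usage`, and the `emit` callback are all assumptions, not the PR's actual `TokenUsageAccumulator` API. In the real module the callback would presumably be whatever sends custom events to Application Insights.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Usage:
    """Running totals for one (agent, model) pair."""
    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


class SketchTokenAccumulator:
    """Hypothetical stand-in for the PR's TokenUsageAccumulator."""

    def __init__(self, request_id: str) -> None:
        self.request_id = request_id
        self._by_key: Dict[Tuple[str, str], Usage] = defaultdict(Usage)

    def record(self, agent: str, model: str,
               input_tokens: int, output_tokens: int) -> None:
        """Add one response's token counts to the matching bucket."""
        usage = self._by_key[(agent, model)]
        usage.input_tokens += input_tokens
        usage.output_tokens += output_tokens

    def flush(self, emit: Callable[[str, dict], None]) -> None:
        """Emit one event per (agent, model) pair, then reset the totals."""
        for (agent, model), usage in self._by_key.items():
            emit("Sketch_Token_Usage", {  # event name is a placeholder
                "request_id": self.request_id,
                "agent": agent,
                "model": model,
                "input_tokens": usage.input_tokens,
                "output_tokens": usage.output_tokens,
                "total_tokens": usage.total_tokens,
            })
        self._by_key.clear()


if __name__ == "__main__":
    acc = SketchTokenAccumulator("req-1")
    acc.record("planner", "model-a", 100, 20)
    acc.record("planner", "model-a", 50, 5)
    acc.flush(lambda name, props: print(name, props))
```

Passing the emitter in as a callable keeps the accumulation logic testable without a telemetry backend; whether the PR makes the same choice is not visible from this page.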

github-actions · 2026-05-25T11:51:15Z

Coverage Report

File	Stmts	Miss	Cover	Missing
src/backend
app.py	720	135	81%	47, 64, 71–76, 79, 84, 119–120, 165, 244, 262, 281, 288, 412–413, 519, 522, 525–527, 536–537, 540, 542–544, 553–556, 566–567, 570–576, 579–581, 587–588, 590, 601, 603–608, 611–613, 620, 630–631, 633, 730–734, 742, 745, 753, 756–759, 768–769, 772, 781, 783, 811, 820–821, 835–836, 855–856, 858–859, 866–867, 870–871, 1015–1017, 1021–1023, 1062–1063, 1101, 1104, 1106, 1132–1133, 1135–1137, 1139, 1159–1160, 1162–1163, 1234–1235, 1266–1267, 1344–1345, 1605–1607, 1609–1610, 1617–1619, 1621, 1756, 1774, 1836–1837, 1841–1842
llm_token_telemetry.py	412	282	31%	120–121, 123–126, 128, 141, 146, 153–156, 164–177, 183, 185, 195–196, 198–199, 202–204, 212–227, 229–233, 251, 257, 261–263, 270–283, 293–299, 307–309, 311–315, 317–318, 320, 331, 340–341, 354–370, 452–453, 459–465, 471, 489, 493, 498–504, 508–512, 519–526, 538–546, 556–566, 568–569, 573–577, 579–586, 591, 607–609, 618–620, 631–633, 649–651, 666–668, 682–684, 704–705, 708, 717–718, 720–721, 732–734, 765–766, 768–771, 776, 779, 785–786, 791–792, 798–799, 806–807, 817, 865–872, 877–878, 887–892, 894–897, 900, 903–904, 910, 915, 920, 925–927, 939–940, 948, 976, 978–979, 981–988, 990–996, 998–999, 1004, 1014, 1018
orchestrator.py	758	189	75%	40–42, 546, 549–557, 561, 567–568, 579, 584–588, 595, 599–602, 611, 616–617, 623–624, 629–630, 637, 722–723, 743–744, 758, 926, 969–970, 974, 983, 985–987, 989, 995–997, 999, 1035–1036, 1039–1045, 1074, 1099–1102, 1104–1105, 1112–1113, 1115–1117, 1120–1122, 1124, 1134–1137, 1141–1142, 1144–1145, 1156–1157, 1159–1166, 1175–1176, 1179–1180, 1205, 1244–1245, 1264–1265, 1321–1322, 1325–1326, 1336–1338, 1424, 1450, 1492–1494, 1496–1498, 1528–1531, 1621–1622, 1627–1629, 1645–1647, 1649–1651, 1665–1668, 1704–1705, 1734, 1782–1783, 1804, 1808, 1846, 1865–1866, 1882, 1884–1889, 1892, 1917–1918, 1920–1922, 1939–1940, 1962, 1965, 1971–1973, 1978–1979, 2017, 2070, 2074, 2079, 2152–2153, 2164–2172, 2202–2204
telemetry.py	46	24	47%	43–47, 54, 56–58, 60, 67–80
TOTAL	8332	761	90%

Tests	Skipped	Failures	Errors	Time
422	0 💤	0 ❌	0 🔥	13.130s ⏱️

Copilot

Pull request overview

Adds end-to-end LLM token usage telemetry for agent/workflow executions in the backend, plus Azure Monitor artifacts (workbook + KQL) to analyze usage by request, agent, model, and stage.

Changes:

- Added TokenUsageAccumulator + extraction helpers to capture token usage from Agent Framework responses/stream updates and emit LLM_*_Token_Usage App Insights custom events.
- Threaded user_id through orchestrator entrypoints and API handlers; added per-request ContextVar propagation to tag telemetry emitted from deeper helpers (e.g., image generation).
- Added standalone Bicep deployments for monitoring add-on resources and a “Token Usage” workbook, plus workbook JSON, KQL query pack, and docs.
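For readers unfamiliar with the per-request ContextVar propagation mentioned in the changes above, here is a minimal sketch of the general technique using Python's standard `contextvars` module. The variable name `current_user_id` and the functions `handle_request` and `generate_image` are hypothetical; the PR's actual names are not shown on this page.

```python
import asyncio
from contextvars import ContextVar
from typing import Optional

# Hypothetical name; the PR's actual variable is not shown in this excerpt.
current_user_id: ContextVar[Optional[str]] = ContextVar(
    "current_user_id", default=None
)


async def generate_image() -> dict:
    """Deep helper: reads user_id from context instead of a parameter."""
    return {"event": "image_generated", "user_id": current_user_id.get()}


async def handle_request(user_id: str) -> dict:
    """Entry point: sets the value for this request only."""
    token = current_user_id.set(user_id)
    try:
        await asyncio.sleep(0)  # yield so concurrent requests interleave
        return await generate_image()
    finally:
        current_user_id.reset(token)


async def main() -> list:
    # gather() runs each coroutine in its own task, and each task gets
    # its own copy of the context, so the two user IDs do not collide.
    return await asyncio.gather(
        handle_request("alice"), handle_request("bob")
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The benefit is that helpers several calls deep can tag their telemetry without every intermediate function having to accept and forward a `user_id` argument.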

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 10 comments.

Show a summary per file

File	Description
src/backend/token_usage.py	New module to extract/accumulate token counts and emit App Insights custom events.
src/backend/orchestrator.py	Creates/records/flushes token usage across workflow streaming, brief parsing, generation, and image paths; propagates `user_id`.
src/backend/app.py	Passes `user_id` into orchestrator calls for telemetry correlation.
infra/workbook/workbook.bicep	Standalone deployment of the Token Usage workbook targeting an App Insights resource (optional binding).
infra/workbook/README.md	Deployment instructions for the standalone workbook template.
infra/monitoring/monitoring.bicep	Standalone “add monitoring later” deployment (LA + App Insights).
infra/monitoring/README.md	Instructions for post-deploy monitoring enablement and wiring.
infra/dashboards/token-usage-workbook.json	Serialized workbook definition with tiles/queries for token usage analysis.
infra/dashboards/token-usage-queries.kql	KQL query pack for App Insights / Log Analytics.
docs/TokenUsageTelemetry.md	Documentation for emitted events, enabling telemetry, and querying/visualizing usage.
infra/main.bicep	Notes workbook is deployed separately; adds ACI tag hashing to force restart on monitoring config change.
infra/main_custom.bicep	Notes workbook deployed separately; adds ACI tag hashing; changes default `gptModelCapacity`.
infra/main.json	Recompiled ARM output with additional infra deltas beyond token telemetry.
.gitignore	Fixes `rai_results` ignore entry and adds Python coverage artifacts.

Copilot

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.

+        input_audio_tokens=_to_int(_get(in_details, "audio_tokens")),
+        input_text_tokens=_to_int(_get(in_details, "text_tokens")),
+        input_cached_tokens=_to_int(_get(in_details, "cached_tokens")),
+        output_audio_tokens=_to_int(_get(out_details, "audio_tokens")),
+        output_text_tokens=_to_int(_get(out_details, "text_tokens")),


+        start_ns = time.perf_counter_ns()
+        try:
+            found = extract_usage(source) or extract_usage_from_stream_chunk(source)
+        except Exception as exc:  # belt + braces; extractors are already safe
+            logger.debug("TokenUsageScope.add failed: %s", exc, exc_info=True)
+            return None
+        finally:
+            self._extract_ns += time.perf_counter_ns() - start_ns
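The notable detail in the hunk above is that elapsed time is added in `finally`, so it is counted even when the `except` branch returns early. A small self-contained sketch of that pattern follows; `TimedScope`, its attributes, and the `broken` extractor are hypothetical names used only for illustration.

```python
import time


class TimedScope:
    """Hypothetical stand-in for the scope object in the hunk above."""

    def __init__(self) -> None:
        self.extract_ns = 0
        self.calls = 0

    def add(self, extractor, source):
        start_ns = time.perf_counter_ns()
        try:
            return extractor(source)
        except Exception:
            # Swallow the error: telemetry should not break the request.
            return None
        finally:
            # Runs on both the success path and the failure path.
            self.extract_ns += time.perf_counter_ns() - start_ns
            self.calls += 1


def broken(_source):
    raise ValueError("malformed usage payload")


if __name__ == "__main__":
    scope = TimedScope()
    print(scope.add(lambda s: s["usage"], {"usage": 7}))
    print(scope.add(broken, {}))
    print(scope.calls, scope.extract_ns)
```

Measuring with `time.perf_counter_ns()` avoids float rounding when many small intervals are summed, which suits overhead accounting of this kind.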


+    "jumpboxDcr": {
+      "condition": "[and(variables('deployAdminAccessResources'), parameters('enableMonitoring'))]",
      "type": "Microsoft.Resources/deployments",
      "apiVersion": "2025-04-01",
-      "name": "[take(format('avm.res.network.private-dns-zone.{0}', replace(variables('privateDnsZones')[copyIndex()], '.', '-')), 64)]",
+      "name": "[take(format('avm.res.insights.data-collection-rule.{0}', variables('jumpboxDcrName')), 64)]",
      "properties": {


+- **Out of scope (intentional).** The current implementation does not persist
+  token totals to Cosmos DB and does not push real-time updates to the
+  frontend. Operators add cost-estimation queries as needed by multiplying
+  token counts by their negotiated per-1K-token rates.
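The cost arithmetic described in the note above is simple enough to sketch. The rates below are placeholders, not real prices, and the function name `estimate_cost` is hypothetical; operators would substitute their own negotiated per-1K-token rates, and would more likely express this in a KQL query than in Python.

```python
from decimal import Decimal

# Placeholder rates per 1,000 tokens; substitute your negotiated prices.
RATES_PER_1K = {
    "input": Decimal("0.0025"),
    "output": Decimal("0.0100"),
}


def estimate_cost(input_tokens: int, output_tokens: int,
                  rates: dict = RATES_PER_1K) -> Decimal:
    """cost = (tokens / 1000) * rate, summed over input and output."""
    thousand = Decimal(1000)
    return (
        Decimal(input_tokens) / thousand * rates["input"]
        + Decimal(output_tokens) / thousand * rates["output"]
    )


if __name__ == "__main__":
    # 12 * 0.0025 + 3 * 0.0100 = 0.06 in whatever currency the rates use.
    print(estimate_cost(input_tokens=12_000, output_tokens=3_000))
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding error when summed over many requests.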


Ayaz-Microsoft added 7 commits May 11, 2026 18:39

feat: add Token Usage Application Insights workbook for LLM monitoring

41b1e4a

feat: add monitoring configuration hash to container instance for dyn…

4f19b1c

…amic tagging

sync main_custom.bicep with main.bicep

f21cb66

feat: separate Token Usage Application Insights workbook deployment i…

98ba811

…nto its own template

Refactor code structure for improved readability and maintainability

19a98c4

restored main.bicep and azure.yaml

1d22af3

Ayaz-Microsoft temporarily deployed to production May 25, 2026 11:50 — with GitHub Actions Inactive

remove unused field import from dataclass in token_usage.py

a3bef36

Ayaz-Microsoft temporarily deployed to production May 25, 2026 12:08 — with GitHub Actions Inactive

Refactor code

5e8eac2

Ayaz-Microsoft temporarily deployed to production May 25, 2026 12:38 — with GitHub Actions Inactive

Ayaz-Microsoft marked this pull request as ready for review May 25, 2026 12:44

Copilot AI review requested due to automatic review settings May 25, 2026 12:44

Ayaz-Microsoft requested review from Avijit-Microsoft, Prajwal-Microsoft, Roopan-Microsoft, Vinay-Microsoft, aniaroramsft, malrose07, nchandhi and toherman-msft as code owners May 25, 2026 12:44

Ayaz-Microsoft temporarily deployed to production May 25, 2026 12:44 — with GitHub Actions Inactive

Copilot started reviewing on behalf of Ayaz-Microsoft May 25, 2026 12:44

Copilot AI reviewed May 25, 2026

feat: enhance token usage telemetry with conversation ID tracking and…

b6ec61b

… improve Application Insights event emission

Ayaz-Microsoft temporarily deployed to production May 25, 2026 13:47 — with GitHub Actions Inactive

aligned code with other GSAs

c3f470a

Copilot AI review requested due to automatic review settings May 27, 2026 16:14

Ayaz-Microsoft deployed to production May 27, 2026 16:14 — with GitHub Actions Active

Copilot started reviewing on behalf of Ayaz-Microsoft May 27, 2026 16:14

Copilot AI reviewed May 27, 2026

Ayaz-Microsoft wants to merge 11 commits into dev from token-count

2 participants
