[WIP] feature: shell integration 💻 by tnaum-ms · Pull Request #508 · microsoft/vscode-documentdb

tnaum-ms · 2026-02-17T14:28:11Z

Shell Integration — DocumentDB Query Language & Autocomplete

Umbrella PR for the shell integration feature: a custom documentdb-query Monaco language with intelligent autocomplete, hover docs, and validation across all query editor surfaces (filter, project, sort, aggregation, shell).

Work is organized as incremental steps, each delivered via a dedicated sub-PR merged into feature/shell-integration.

Progress

Step 1 — Schema Tool Decision — Evaluated schema analysis approaches, decided to enhance SchemaAnalyzer (JSON Schema output, incremental merge, 24 BSON types)
Step 2 — SchemaAnalyzer Refactoring · refactor: SchemaAnalyzer class + enhanced FieldEntry + new schema transformers #506 — Extracted @vscode-documentdb/schema-analyzer package, enriched FieldEntry with BSON types, added schema transformers, introduced monorepo structure
Step 3 — documentdb-constants Package · feat: add documentdb-constants package — operator metadata for autocomplete #513 — 308 operator entries (DocumentDB API query operators, update operators, stages, accumulators, BSON constructors, system variables) as static metadata for autocomplete
Step 3.5 — Monaco Language Architecture — Selected documentdb-query custom language with JS Monarch tokenizer (no TS worker), validated via POC across 8 test criteria
Step 4 — Filter CompletionItemProvider · In progress · [WIP] documentdb-query language #518 — documentdb-query language registration, per-editor model URIs, completion data store, CompletionItemProvider (filter/project/sort), HoverProvider, acorn validation, $-prefix fix, remove legacy JSON Schema pipeline
Step 5 — Legacy Scrapbook Removal
Step 6 — Scrapbook Rebuild (New Shell)
Step 7 — Shell CompletionItemProvider
Step 8 — Aggregation CompletionItemProvider

Key Architecture Decisions

Decision	Outcome
Language strategy	`documentdb-query` custom language — JS Monarch tokenizer, no TS worker (~400-600 KB saved)
Completion providers	Single `CompletionItemProvider` + URI routing (`documentdb://{editorType}/{sessionId}`)
Completion data	`documentdb-constants` bundled at build time; field data pushed via tRPC subscription
Validation	`acorn.parseExpressionAt()` for syntax errors; `acorn-walk` + `documentdb-constants` for identifier validation
Document editors	Stay on `language="json"` with JSON Schema validation
Shell/Scrapbook (future)	`language="javascript"` with full TS service + `.d.ts` via `addExtraLib()`
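
The URI routing row can be illustrated with a small parser. This is only a sketch: the `EditorType` values are taken from the editor surfaces listed in the description, and `parseEditorUri` is a hypothetical helper, not a function confirmed to exist in the extension.

```typescript
// Assumed editor surfaces, copied from the list in the PR description.
type EditorType = 'filter' | 'project' | 'sort' | 'aggregation' | 'shell';

const EDITOR_TYPES: readonly EditorType[] = ['filter', 'project', 'sort', 'aggregation', 'shell'];

interface EditorRoute {
    editorType: EditorType;
    sessionId: string;
}

// Hypothetical helper: parses documentdb://{editorType}/{sessionId}.
// Returns undefined for any other scheme, shape, or unknown editor type.
function parseEditorUri(uri: string): EditorRoute | undefined {
    const match = /^documentdb:\/\/([^/]+)\/([^/]+)$/.exec(uri);
    if (!match) {
        return undefined;
    }
    const rawType = match[1] ?? '';
    const sessionId = match[2] ?? '';
    for (const editorType of EDITOR_TYPES) {
        if (editorType === rawType) {
            return { editorType, sessionId };
        }
    }
    return undefined;
}
```

A single provider could call such a helper on the model URI and branch on `editorType`, which is presumably why one `CompletionItemProvider` is enough for all surfaces.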

Sub-PRs

PR	Step	Title	Status
#506	2	refactor: SchemaAnalyzer class + enhanced FieldEntry + new schema transformers	✅ Merged
#513	3	feat: add documentdb-constants package — operator metadata for autocomplete	✅ Merged
#518	4	[WIP] documentdb-query language	🔄 Open

… stats bugs

Group A of SchemaAnalyzer refactor:
- Fix A1: array element stats overwrite bug (isNewTypeEntry)
- Fix A2: probability >100% for array-embedded objects (x-documentsInspected)
- Rename folder: src/utils/json/mongo/ → src/utils/json/data-api/
- Rename enum: MongoBSONTypes → BSONTypes
- Rename file: MongoValueFormatters → ValueFormatters
- Add 9 new tests for array stats and probability

Group B of SchemaAnalyzer refactor:
- B1: SchemaAnalyzer class with addDocument(), getSchema(), reset(), getDocumentCount()
- B2: clone() method using structuredClone for schema branching
- B3: addDocuments() batch convenience method
- B4: static fromDocument()/fromDocuments() factories (replaces getSchemaFromDocument)
- B5: Migrate ClusterSession to use SchemaAnalyzer instance
- B6-B7: Remove old free functions (updateSchemaWithDocument, getSchemaFromDocument)
- Keep getPropertyNamesAtLevel, getSchemaAtPath, buildFullPaths as standalone exports
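
The class surface listed in Group B might look roughly like the following. This is a reduced sketch, not the package's implementation: `SchemaAnalyzerSketch`, `Doc`, and `SketchSchema` are hypothetical names, the "schema" here is only a per-field occurrence count rather than JSON Schema, and clone() is left out.

```typescript
type Doc = Record<string, unknown>;

// Simplified stand-in for the real JSON Schema output.
interface SketchSchema {
    documentsInspected: number;
    occurrences: Record<string, number>;
}

class SchemaAnalyzerSketch {
    private documentCount = 0;
    private occurrences = new Map<string, number>();

    // Static factory in the spirit of fromDocuments().
    static fromDocuments(docs: readonly Doc[]): SchemaAnalyzerSketch {
        const analyzer = new SchemaAnalyzerSketch();
        analyzer.addDocuments(docs);
        return analyzer;
    }

    // Incremental merge: each document only updates the running counts.
    addDocument(doc: Doc): void {
        this.documentCount += 1;
        for (const key of Object.keys(doc)) {
            this.occurrences.set(key, (this.occurrences.get(key) ?? 0) + 1);
        }
    }

    addDocuments(docs: readonly Doc[]): void {
        for (const doc of docs) {
            this.addDocument(doc);
        }
    }

    // Returns a fresh object so callers cannot mutate internal state.
    getSchema(): SketchSchema {
        const occurrences: Record<string, number> = {};
        this.occurrences.forEach((count, key) => {
            occurrences[key] = count;
        });
        return { documentsInspected: this.documentCount, occurrences };
    }

    getDocumentCount(): number {
        return this.documentCount;
    }

    reset(): void {
        this.documentCount = 0;
        this.occurrences.clear();
    }
}
```

Holding the state in an instance instead of free functions is what lets a session own one analyzer and feed it documents as pages arrive.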

…x properties type

Group C of SchemaAnalyzer refactor:
- C1: Add typed x-minValue, x-maxValue, x-minLength, x-maxLength, x-minDate, x-maxDate, x-trueCount, x-falseCount, x-minItems, x-maxItems, x-minProperties, x-maxProperties to JSONSchema interface
- C2: Fix properties type: properties?: JSONSchema → properties?: JSONSchemaMap
- C3: Fix downstream type errors in SchemaAnalyzer.test.ts (JSONSchemaRef casts)

…temBsonType

Group D of SchemaAnalyzer refactor:
- D1: Add bsonType to FieldEntry (dominant BSON type from x-bsonType)
- D2: Add bsonTypes[] for polymorphic fields (2+ distinct types)
- D3: Add isOptional flag (x-occurrence < parent x-documentsInspected)
- D4: Add arrayItemBsonType for array fields (dominant element BSON type)
- D5: Sort results: _id first, then alphabetical by path
- D6: Verified generateMongoFindJsonSchema still works (additive changes)
- G4: Add 7 getKnownFields tests covering all new fields
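
Items D3 and D5 can be sketched in a few lines. The shape below is a reduced, hypothetical `FieldEntrySketch` (the real FieldEntry carries more properties), and the flag uses the later name isSparse, which replaces isOptional in Group E.

```typescript
// Hypothetical, reduced shape for illustration only.
interface FieldEntrySketch {
    path: string;
    bsonType: string;
    isSparse: boolean;
}

// occurrence: documents in which the field was seen.
// inspected: documents examined at the parent level.
function makeEntry(path: string, bsonType: string, occurrence: number, inspected: number): FieldEntrySketch {
    return { path, bsonType, isSparse: occurrence < inspected };
}

// D5 ordering: _id sorts first, every other path sorts alphabetically.
function compareEntries(a: FieldEntrySketch, b: FieldEntrySketch): number {
    if (a.path === '_id') {
        return b.path === '_id' ? 0 : -1;
    }
    if (b.path === '_id') {
        return 1;
    }
    return a.path.localeCompare(b.path);
}
```

Sorting a copy with `entries.slice().sort(compareEntries)` keeps the original array untouched.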

… toFieldCompletionItems)

Group E of SchemaAnalyzer refactor:
- E1: generateDescriptions(), a post-processor adding human-readable description strings with type info, occurrence percentage, and min/max stats
- E2: toTypeScriptDefinition(), which generates TypeScript interface strings from JSONSchema for shell addExtraLib() integration
- E3: toFieldCompletionItems(), which converts FieldEntry[] to CompletionItemProvider-ready FieldCompletionData[] with insert text escaping and $ references

Also:
- Rename isOptional → isSparse in FieldEntry and FieldCompletionData (all fields are implicitly optional in MongoDB API / DocumentDB API; isSparse is a statistical observation, not a constraint)
- Fix lint errors (inline type specifiers)
- 18 new tests for transformers + updated existing tests

- Add 5 tests for clone(), reset(), fromDocument(), fromDocuments(), addDocuments()
- Mark all checklist items A-G as complete, F1-F2 as deferred
- Add Manual Test Plan section (§14) with 5 end-to-end test scenarios
- Document clone() limitation with BSON Binary types (structuredClone)

- Add monotonic version counter to SchemaAnalyzer (incremented on mutations)
- Cache getKnownFields() with version-based staleness check
- Add ClusterSession.getKnownFields() accessor (delegates to cached analyzer)
- Wire collectionViewRouter to use session.getKnownFields() instead of standalone function
- Add ext.outputChannel.trace for schema accumulation and reset events
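
The version-based staleness check described above could work along these lines. This is a sketch under assumptions: `VersionedFieldCache` is a hypothetical class, the field list is a plain string array, and `computeCount` exists only to make cache hits observable.

```typescript
class VersionedFieldCache {
    private version = 0;
    private fields: string[] = [];
    private cached: { version: number; value: readonly string[] } | undefined;

    // Exposed only so the sketch can show when a recomputation happens.
    computeCount = 0;

    addField(field: string): void {
        this.fields.push(field);
        this.version += 1; // every mutation bumps the version
    }

    reset(): void {
        this.fields = [];
        this.version += 1; // a reset is a mutation too
    }

    getKnownFields(): readonly string[] {
        // Cache hit: nothing has changed since the value was computed.
        if (this.cached && this.cached.version === this.version) {
            return this.cached.value;
        }
        this.computeCount += 1;
        const value = this.fields.slice().sort();
        this.cached = { version: this.version, value };
        return value;
    }
}
```

Comparing a counter is cheaper than diffing the schema, which matters if completion requests call getKnownFields() on every keystroke.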

Move SchemaAnalyzer, JSONSchema types, BSONTypes, ValueFormatters, and getKnownFields into packages/schema-analyzer as @vscode-documentdb/schema-analyzer.
- Set up npm workspaces (packages/*) and TS project references
- Update all extension-side imports to use the new package
- Configure Jest multi-project for both extension and package tests
- Remove @vscode/l10n dependency from core (replaced with plain Error)
- Fix strict-mode type issues (localeCompare bug, index signatures)
- Update .gitignore to include root packages/ directory
- Add packages/ to prettier glob

…itions

The bsonToTypeScriptMap emits non-built-in type names (ObjectId, Binary, Timestamp, etc.) without corresponding import statements or declare stubs. Currently harmless since the output is for display/hover only, but should be addressed if the TS definition is ever consumed by a real TS language service.

Addresses PR #506 review comment from copilot.

…ion names

- Prefix with _ when PascalCase result starts with a digit (e.g. '123abc' → '_123abcDocument')
- Fall back to 'CollectionDocument' when name is empty or only separators
- Filter empty segments from split result
- Add tests for edge cases

Addresses PR #506 review comment from copilot.
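
The naming rules in this commit can be sketched as follows. `toInterfaceName` is a hypothetical name, and treating every non-alphanumeric character as a separator is an assumption; the commit message does not say which separators the real code splits on.

```typescript
// Hypothetical helper: derives a TypeScript interface name from a collection name.
function toInterfaceName(collectionName: string): string {
    const pascal = collectionName
        .split(/[^A-Za-z0-9]+/) // assumed separator set
        .filter((segment) => segment.length > 0) // drop empty segments
        .map((segment) => segment.charAt(0).toUpperCase() + segment.slice(1))
        .join('');

    // Empty name, or a name made only of separators.
    if (pascal.length === 0) {
        return 'CollectionDocument';
    }

    // Identifiers cannot start with a digit, so prefix an underscore.
    const prefixed = /^[0-9]/.test(pascal) ? `_${pascal}` : pascal;
    return `${prefixed}Document`;
}
```

With these rules, `toInterfaceName('123abc')` yields `'_123abcDocument'`, matching the example in the commit message.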

Add comment explaining why the cast to JSONSchema is safe: our SchemaAnalyzer never produces boolean schema refs. Notes that a typeof guard should be added if the function is ever reused with externally-sourced schemas. Addresses PR #506 review comment from copilot.

…lashes

- Replace SPECIAL_CHARS_PATTERN with JS_IDENTIFIER_PATTERN for proper identifier validity check (catches dashes, brackets, digits, quotes, etc.)
- Escape embedded double quotes and backslashes when quoting insertText
- Add tests for all edge cases (dashes, brackets, digits, quotes, backslashes)
- Mark future-work item #1 as resolved; item #2 (referenceText/$getField) remains open for aggregation completion provider phase

Addresses PR #506 review comment from copilot.
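
A minimal sketch of the quoting behavior described in this commit, assuming an ASCII-only identifier check. `JS_IDENTIFIER_SKETCH` and `toInsertText` are hypothetical names, and the real JS_IDENTIFIER_PATTERN may accept a wider character set.

```typescript
// Assumed pattern: an identifier that can be inserted without quotes.
const JS_IDENTIFIER_SKETCH = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// Hypothetical helper: returns the text a completion item would insert for a field name.
function toInsertText(fieldName: string): string {
    if (JS_IDENTIFIER_SKETCH.test(fieldName)) {
        return fieldName;
    }
    // Backslashes first, so the backslash added in front of each quote
    // is not escaped a second time.
    const escaped = fieldName.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
    return `"${escaped}"`;
}
```

Names with dashes, brackets, or a leading digit fail the identifier check and are therefore quoted, which is the class of input the older special-characters check reportedly missed.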

… and completions

- Introduced new test files for operator reference verification, parsing, and structural invariants.
- Implemented tests for `getFilteredCompletions` and `docLinks` to ensure correct URL generation.
- Enhanced Jest configuration to limit workers and updated test match patterns.
- Added `parseOperatorReference` utility to structure operator data from markdown dumps.
- Verified that all operators in the implementation match the documented reference and that no unsupported operators are present.
- Ensured all entries have required fields, valid meta tags, and non-empty descriptions.

- Replaced dynamic documentation links with direct URLs for operators in accumulators, expression operators, query operators, update operators, window operators, and stages.
- Updated metadata references in docLinks to reflect accurate categories for boolean and comparison expressions.
- Enhanced test coverage for operator reference verification, ensuring merged dump and overrides are consistent with implementation.
- Removed unused getDocLink imports from system variables.

- Updated the paths for scraped operator data and overrides in the generate-from-reference script.
- Introduced snippet templates for operators, enhancing the generation process.
- Adjusted the parsing logic to accommodate new snippet structures.
- Modified the output files to reflect the new resource paths.
- Updated tests and documentation to align with the new file structure.

…stants

- Reordered import statements for consistency across files in `stages.ts`, `systemVariables.ts`, `updateOperators.ts`, and `windowOperators.ts`.
- Consolidated multiline descriptions into single lines for clarity and consistency.
- Updated registration of operators to use spread syntax for improved readability.
- Ensured all links are marked as inferred from another category for better documentation.

…istency

- Updated descriptions in query operators to ensure consistent formatting and clarity.
- Reorganized imports in query operators, stages, system variables, update operators, and window operators for better structure.
- Enhanced documentation links for various operators to provide clearer guidance.
- Removed unnecessary whitespace and ensured consistent snippet formatting across all operator files.

…root

Aligns with Approach A (monorepo delegation): dev tooling (jest, ts-jest, prettier, ts-node, @types/jest) is provided by the root package.json. Eliminates the Jest 29 vs 30 version mismatch. Added root-install note to README.

- Root prettier check now covers packages/ (was missing, only prettier-fix had it)
- schema-analyzer jest.config.js: add maxWorkers '50%' to match documentdb-constants
- schema-analyzer README: add monorepo root-install note (matching documentdb-constants)
- No .nvmrc or prettier configs needed per-package (inherited from root)

tnaum-ms and others added 27 commits February 16, 2026 20:16

Initial plan

633f0b4

refactor: remove debug console.log statements from tests

74eeeac

Co-authored-by: tnaum-ms <171359267+tnaum-ms@users.noreply.github.com>

refactor: remove debug console.log statements from SchemaAnalyzer tes…

eb71916

…ts (#507)

test: add comprehensive tests for SchemaAnalyzer versioning and cachi…

d8d0709

…ng behavior

refactor: remove console.log statements from test files for cleaner o…

ebdde30

…utput

refactor: enhance handling of special characters in field names for T…

c23b604

…ypeScript definitions and completion items

docs: add README and bump schema-analyzer to v1.0.0

2fec69d

build: add prebuild and prejesttest scripts for workspace package builds

a667c35

chore: bump schema-analyzer version to 1.0.0 in package-lock.json

cbaa573

Refactor code structure for improved readability and maintainability

f1d006d

refactor: streamline TypeScript definition tests for improved readabi…

35a13a1

…lity

docs: add terminology guidelines for DocumentDB and MongoDB API usage

1cb9e5b

refactor: replace 'console' assert with 'node:assert/strict' for impr…

43915a5

…oved consistency

refactor: update documentation to consistently reference DocumentDB A…

75536e9

…PI alongside MongoDB API

refactor: SchemaAnalyzer class + enhanced FieldEntry + new schema tra…

5094ca6

…nsformers (#506)

tnaum-ms linked an issue Feb 17, 2026 that may be closed by this pull request

Improve Scrapbook Experience (shell integration) 🚀 #66

Open

tnaum-ms removed a link to an issue Feb 17, 2026

Improve Scrapbook Experience (shell integration) 🚀 #66

Open

tnaum-ms added this to the 0.8.0 - February 2026 milestone Feb 17, 2026

tnaum-ms added 24 commits February 20, 2026 12:33

refactor: enhance findOverride function to support cross-category fal…

eaf0a9b

…lback logic

fix: update Jest test path for operator reference validation in README

ab68d89

docs: fix JSDoc — 'frozen array' → 'readonly array' in getFilteredCom…

9290f70

…pletions

refactor: derive MetaTag from ALL_META_TAGS union with string escape …

b13745d

…hatch

MetaTag is now (typeof ALL_META_TAGS)[number] | (string & {}), giving IntelliSense for known tags while preserving extensibility for runtime-injected tags like field:identifier.
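
The type pattern named in this commit can be shown in isolation. The tag list below is illustrative only: `ALL_META_TAGS_SKETCH`, `MetaTagSketch`, and `isKnownTag` are hypothetical names, and the real ALL_META_TAGS array in documentdb-constants is assumed to be longer.

```typescript
// Illustrative subset; `as const` keeps the literal types.
const ALL_META_TAGS_SKETCH = ['expr:bool', 'expr:comparison'] as const;

// Union of the known literals plus any other string. Writing `string & {}`
// instead of plain `string` stops the compiler from collapsing the union,
// so editors can still suggest the known literals.
type MetaTagSketch = (typeof ALL_META_TAGS_SKETCH)[number] | (string & {});

function isKnownTag(tag: MetaTagSketch): boolean {
    return (ALL_META_TAGS_SKETCH as readonly string[]).indexOf(tag) >= 0;
}

const knownTag: MetaTagSketch = 'expr:bool'; // offered by autocomplete
const injectedTag: MetaTagSketch = 'field:identifier'; // accepted as a plain string
```

The trade-off is that typos in known tags are no longer compile errors, so a runtime check such as `isKnownTag` remains useful where strictness matters.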

chore(documentdb-constants): add tsconfig.scripts.json for CI script …

6c7e614

…type checking

refactor(documentdb-constants): replace side-effect imports with expl…

0ece75a

…icit loadOperators() factory

fix(documentdb-constants): correct expr:bool and expr:comparison doc …

5b99454

…dir mappings in getDocLink

fix(documentdb-constants): correct expr:bool and expr:comparison doc …

1e74364

…dir mappings, update generator template

fix(documentdb-constants): add idempotent registration guard to opera…

e8a13f4

…tor registry

fix(documentdb-constants): log ambiguous and unambiguous override fal…

df45bfb

…lbacks in applyOverrides

fix(documentdb-constants): integrate script typecheck into build; rev…

cee31fa

…ert generator inferred-link override (scraper ground truth takes precedence)

fix(documentdb-constants): return shallow copy from getAllCompletions…

1052626

… to prevent registry mutation
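
A sketch of how a shallow copy protects a registry, combined with an idempotent registration guard like the one mentioned a few commits earlier. The class, the entry shape, and the batch-id guard are all assumptions; the real registry's guard may be keyed differently.

```typescript
// Hypothetical, reduced entry shape.
interface OperatorEntrySketch {
    value: string;
    meta: string;
}

class OperatorRegistrySketch {
    private entries: OperatorEntrySketch[] = [];
    private registeredBatches = new Set<string>();

    // Idempotent guard (assumed design): registering the same batch twice is a no-op.
    register(batchId: string, batch: readonly OperatorEntrySketch[]): void {
        if (this.registeredBatches.has(batchId)) {
            return;
        }
        this.registeredBatches.add(batchId);
        this.entries.push(...batch);
    }

    // Shallow copy: callers may sort or truncate the result without
    // changing the registry. The entry objects themselves are still shared.
    getAllCompletions(): OperatorEntrySketch[] {
        return this.entries.slice();
    }
}
```

Because the copy is shallow, mutating a property of a returned entry would still affect the registry; only the array structure is protected.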

docs(documentdb-constants): expand returnType JSDoc — experimental fi…

cadb1f6

…eld with planned values and usage context

fix(documentdb-constants): exit 1 on network failures in scraper — 40…

782eb2d

…4s are tolerated, connectivity errors are not

docs(documentdb-constants): note that getFilteredCompletions already …

812702c

…returns a safe copy via Array.filter

feat: add documentdb-constants package — operator metadata for autoco…

d700375

…mplete (#513)

Merge branch 'next' into feature/shell-integration

3d0674f

github-advanced-security bot found potential problems Feb 23, 2026

packages/documentdb-constants/scripts/scrape-operator-docs.ts (fixed)

tnaum-ms added 3 commits February 23, 2026 18:29

chore: package-lock.json update

bc5adc3

fix: prettier-fix

23dd1eb

fix: escape backslashes before pipes in escapeTableCell

a985c16

Resolves CodeQL 'Incomplete string escaping or encoding' alert by escaping backslash characters before escaping pipe characters.
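
The ordering fix can be sketched as below. The function name comes from the commit title, but the body is an assumption about what the scraper does, not a copy of it.

```typescript
// Escapes text for use inside a Markdown table cell.
function escapeTableCell(text: string): string {
    // Backslashes are escaped first. If pipes were escaped first, the
    // backslash added in front of each pipe would be doubled by the
    // backslash pass and the pipe would end up unescaped again.
    return text.replace(/\\/g, '\\\\').replace(/\|/g, '\\|');
}
```

For an input such as `a\|b`, this order produces a doubled backslash followed by an escaped pipe, so the cell renders as the original text.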

tnaum-ms changed the title ~~feature: shell integration~~ feature: shell integration 💻 Feb 23, 2026

tnaum-ms changed the title ~~feature: shell integration 💻~~ [WIP] feature: shell integration 💻 Feb 25, 2026

tnaum-ms wants to merge 59 commits into next from feature/shell-integration