Add YAML input support (-I yaml) #168

@vmvarela

Description

Add YAML as a supported input format. Parse YAML files containing lists of maps as tabular data.

# users.yaml
- id: 1
  name: Alice
  balance: 250.50
- id: 2
  name: Bob
  balance: 150.75

sql-pipe -I yaml users.yaml 'SELECT SUM(balance) FROM t'
# Auto-detect from .yaml/.yml extension:
sql-pipe users.yaml 'SELECT * FROM t WHERE id > 1'
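
The format selection shown above could be sketched roughly as follows. This is an illustration in Python, not sql-pipe's code: `detect_input_format` is a hypothetical name, matching the extension case-insensitively is an assumption, and the function returns `None` when nothing matches because the issue does not say what the fallback format is.

```python
import os

def detect_input_format(path, explicit=None):
    """Pick an input format: an explicit -I value wins, else use the extension.

    Hypothetical helper; returns None when the extension is not recognised.
    """
    if explicit is not None:
        # -I yaml overrides detection (stdin, non-standard extensions).
        return explicit
    ext = os.path.splitext(path)[1].lower()  # assumption: case-insensitive match
    if ext in (".yaml", ".yml"):
        return "yaml"
    return None
```

With this sketch, `detect_input_format("users.yml")` selects YAML, and `detect_input_format("data.txt", explicit="yaml")` honours the override.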

Motivation

YAML is ubiquitous in CI/CD (GitHub Actions, GitLab CI), configuration files (Docker Compose, Kubernetes, Ansible), and infrastructure-as-code. It is the biggest format gap in sql-pipe's input support.

Acceptance Criteria

  • Add yaml to the InputFormat enum
  • Auto-detect YAML from .yaml and .yml file extensions
  • Parse YAML list-of-maps as rows with map keys as column names
  • Support nested YAML structures (flatten them the same way the JSON input does)
  • Type inference works on YAML values (int, float, bool, null, string)
  • -I yaml explicit override works for stdin and non-standard extensions
  • Integration tests with various YAML structures
  • Help text updated
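
To make the type-inference criterion concrete, here is a minimal Python sketch of how a plain (unquoted) YAML scalar might be classified. It loosely follows the YAML 1.2 core schema and deliberately skips hex, octal, `.inf` and `.nan`. The name `infer_scalar` is hypothetical, and the rules sql-pipe actually applies may differ.

```python
import re

# Simplified patterns, loosely based on the YAML 1.2 core schema.
_INT = re.compile(r"[-+]?[0-9]+")
_FLOAT = re.compile(r"[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?")

def infer_scalar(text):
    """Map a plain YAML scalar to None, bool, int, float, or str."""
    if text in ("", "~", "null", "Null", "NULL"):
        return None
    if text in ("true", "True", "TRUE"):
        return True
    if text in ("false", "False", "FALSE"):
        return False
    if _INT.fullmatch(text):
        return int(text)
    if _FLOAT.fullmatch(text):
        return float(text)
    return text  # anything else stays a string
```

Applied to the `users.yaml` example, `id` values would be inferred as integers, `balance` as floats, and `name` as strings.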

Implementation Notes

  • Requires a YAML parser. Options: bundle libyaml (a C library, like the SQLite amalgamation) or implement a minimal Zig parser
  • YAML is complex (anchors, aliases, multi-line strings, flow vs. block style), so bundling libyaml is the pragmatic choice
  • The existing JSON flattening logic can be reused for nested structures
  • Estimated effort: 1-2 days
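
As a rough illustration of the flattening step, the Python sketch below turns an already-parsed list of maps into columns and rows. It is not the project's existing JSON logic. The `.` separator, the index-based naming of list elements, and the names `flatten` and `to_table` are all assumptions made for the example.

```python
def flatten(value, prefix="", sep="."):
    """Flatten nested dicts/lists into a single-level dict of column -> scalar."""
    out = {}
    if isinstance(value, dict):
        for key, child in value.items():
            name = f"{prefix}{sep}{key}" if prefix else str(key)
            out.update(flatten(child, name, sep))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            name = f"{prefix}{sep}{index}" if prefix else str(index)
            out.update(flatten(child, name, sep))
    else:
        out[prefix] = value
    return out

def to_table(records):
    """Turn a list of maps into (columns, rows); missing keys become None."""
    rows = [flatten(record) for record in records]
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)  # keep first-seen column order
    return columns, [[row.get(c) for c in columns] for row in rows]
```

Under these assumptions, a record with a nested `address` map containing `city` would produce a column named `address.city`, and records lacking that key would get a NULL in it.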

Metadata

    Labels

      • priority:high (must be in the next sprint)
      • size:l (large, 1 to 2 days)
      • status:ready (refined and ready for sprint selection)
      • type:feature (new functionality)
