feat: add Microsoft Fabric and SQL Server support to elementary CLI (#2140)
Conversation
- Add dbt-fabric as optional dependency with fabric extra in pyproject.toml
- Update packages.yml to point to Fabric-enabled dbt-data-reliability branch
- Add fabric__get_adapter_unique_id macro (uses target.server)
- Add Fabric transient error patterns to transient_errors.py
- Add Fabric to data type mappings in test schema.yml
- Update package-lock.yml for new dbt-data-reliability revision

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Add local Docker SQL Server profile for Fabric adapter testing. Uses same local Docker test credentials pattern as other adapters (e.g. postgres admin/admin, dremio dremio123). Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
- Add dbt-sqlserver as optional dependency with sqlserver extra in pyproject.toml
- Add sqlserver__get_adapter_unique_id macro (uses target.server)
- Add SQL Server transient error patterns to transient_errors.py
- Add sqlserver to data type mappings in test schema.yml

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Add local Docker SQL Server profile for sqlserver adapter testing, alongside the existing fabric profile. Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
📝 Walkthrough

Adds Microsoft Fabric and SQL Server support across transient-error patterns, adapter-identity macros, optional dbt packages/extras, CI/workflows, docker-compose for e2e, test profiles, model type mappings, and several SQL macro concatenation normalizations.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CI as CI Workflow
    participant Docker as Docker Host
    participant SQLSvc as SQL Server container
    participant Runner as Runner Host
    participant DBT as dbt (macros & profiles)
    participant Tests as Test Suite
    CI->>Docker: start sqlserver service
    Docker-->>SQLSvc: container runs + healthcheck
    SQLSvc-->>CI: healthcheck OK
    CI->>Runner: install ODBC driver (msodbcsql18) for sqlserver/fabric
    CI->>DBT: invoke dbt with profile (uses target.server via new macros)
    DBT->>Tests: macros return adapter id
    Tests->>DBT: run tests (transient error patterns considered per adapter)
    DBT-->>CI: test results
    CI->>Tests: trigger e2e workflows / report
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
- Add fabric and sqlserver to warehouse-type matrix in test-all-warehouses.yml
- Add fabric and sqlserver to workflow_dispatch options in test-warehouse.yml
- Add SQL Server Docker service to docker-compose.yml
- Add SQL Server startup step and ODBC driver install to CI workflow
- Mark fabric/sqlserver as Docker-based adapters for schema naming

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Actionable comments posted: 3
🧹 Nitpick comments (1)
elementary/clients/dbt/transient_errors.py (1)

112-127: LGTM! The transient error patterns for `fabric` and `sqlserver` are appropriate for T-SQL/ODBC connections. The patterns cover common connectivity issues like timeouts, connection failures, and SSL errors.

The patterns are identical between both adapters—consider extracting them into a shared tuple (similar to `_DATABRICKS_PATTERNS`) if future maintenance is a concern, though keeping them separate is also valid for independent evolution.

♻️ Optional: Extract shared patterns

```diff
+_MSSQL_PATTERNS: Tuple[str, ...] = (
+    "connection timed out",
+    "could not connect to the server",
+    "ssl syscall error",
+    "communication link failure",
+    "tcp provider: an existing connection was forcibly closed",
+    "login timeout expired",
+)
+
 _ADAPTER_PATTERNS: Dict[str, Tuple[str, ...]] = {
     # ... existing entries ...
-    "fabric": (
-        "connection timed out",
-        "could not connect to the server",
-        "ssl syscall error",
-        "communication link failure",
-        "tcp provider: an existing connection was forcibly closed",
-        "login timeout expired",
-    ),
-    "sqlserver": (
-        "connection timed out",
-        "could not connect to the server",
-        "ssl syscall error",
-        "communication link failure",
-        "tcp provider: an existing connection was forcibly closed",
-        "login timeout expired",
-    ),
+    "fabric": _MSSQL_PATTERNS,
+    "sqlserver": _MSSQL_PATTERNS,
 }
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `elementary/clients/dbt/transient_errors.py` around lines 112-127, the two identical pattern tuples for "fabric" and "sqlserver" should be extracted to a shared constant to avoid duplication and ease maintenance: create a shared tuple (e.g. _TSQL_PATTERNS) containing the repeated strings and replace the current per-adapter tuples for the "fabric" and "sqlserver" entries with a reference to that constant; update any import/usage comments to mirror the existing style used for _DATABRICKS_PATTERNS so the semantics remain unchanged.
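As a sketch of the suggested extraction (the matching function `is_transient_error` is hypothetical; the real module's lookup logic may differ):

```python
from typing import Dict, Tuple

# Shared tuple, as the review suggests (name follows the prompt's _TSQL_PATTERNS).
_TSQL_PATTERNS: Tuple[str, ...] = (
    "connection timed out",
    "could not connect to the server",
    "ssl syscall error",
    "communication link failure",
    "tcp provider: an existing connection was forcibly closed",
    "login timeout expired",
)

_ADAPTER_PATTERNS: Dict[str, Tuple[str, ...]] = {
    "fabric": _TSQL_PATTERNS,
    "sqlserver": _TSQL_PATTERNS,
}

def is_transient_error(adapter: str, message: str) -> bool:
    """Case-insensitive substring match against the adapter's patterns."""
    lowered = message.lower()
    return any(pattern in lowered for pattern in _ADAPTER_PATTERNS.get(adapter, ()))
```

Both adapter keys reference the same tuple object, so any future pattern change applies to both.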
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@elementary/monitor/dbt_project/macros/get_adapter_type_and_unique_id.sql`:
- Around line 24-31: The two adapter macros fabric__get_adapter_unique_id and
sqlserver__get_adapter_unique_id currently read target.server directly which
breaks for dbt-sqlserver where the canonical field is host; update both macros
to return a tolerant lookup using target.get('host') or target.get('server')
(i.e., return target.get('host') or target.get('server')) so the macros work for
configurations that use either host or server.
In `@elementary/monitor/dbt_project/packages.yml`:
- Line 5: The packages.yml currently pins the dependency using a temporary
branch name in the revision field (revision: devin/ELE-5282-1772640713); update
that revision to a stable identifier by replacing the branch reference with the
final commit hash or release tag for the merged ELE-5282 change. Locate the
revision entry in packages.yml and substitute the branch string with the
concrete commit SHA or version (e.g., <commit-hash> or vX.Y.Z) so builds are
reproducible, then re-run dependency resolution to verify the lockfile resolves
correctly.
In `@pyproject.toml`:
- Around line 59-60: The pyproject.toml entries for dbt-fabric and dbt-sqlserver
currently allow a broad range (dbt-fabric and dbt-sqlserver keys) which can
permit adapter versions incompatible with the dbt-core version; update those two
dependency constraints to tightly track the dbt-core major.minor you support
(e.g., restrict each adapter to the same major.minor as your dbt-core range or
use an exact major.minor-compatible constraint) so adapters must be the same
dbt-core minor series as the project's dbt-core.
---
Nitpick comments:
In `@elementary/clients/dbt/transient_errors.py`:
- Around line 112-127: The two identical pattern tuples for "fabric" and
"sqlserver" should be extracted to a shared constant to avoid duplication and
ease maintenance: create a shared tuple (e.g. _TSQL_PATTERNS) containing the
repeated strings and replace the current per-adapter tuples for the "fabric" and
"sqlserver" entries with a reference to that constant; update any import/usage
comments to mirror the existing style used for _DATABRICKS_PATTERNS so the
semantics remain unchanged.
📒 Files selected for processing (7)
- elementary/clients/dbt/transient_errors.py
- elementary/monitor/dbt_project/macros/get_adapter_type_and_unique_id.sql
- elementary/monitor/dbt_project/package-lock.yml
- elementary/monitor/dbt_project/packages.yml
- pyproject.toml
- tests/e2e_dbt_project/models/schema.yml
- tests/profiles/profiles.yml.j2
… for dbt-data-reliability Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…rver as Docker target dbt-fabric only supports Azure AD authentication methods and cannot connect to a local SQL Server Docker container with SQL auth. Move fabric from Docker targets to cloud targets using ServicePrincipal authentication (matching the dbt-data-reliability configuration). Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
SQL Server / T-SQL does not support the || operator for string concatenation. Replace all occurrences with the cross-database concat() function which is supported by all target databases. Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@elementary/monitor/dbt_project/macros/alerts/population/test_alerts.sql`:
- Around line 131-137: The handling of failed_tests.column_name is inconsistent:
change uses of coalesce(failed_tests.column_name, 'None') in the expressions
producing alert_class_id and elementary_unique_id so that empty string values
are treated the same as NULL; specifically replace
coalesce(failed_tests.column_name, 'None') with
coalesce(nullif(failed_tests.column_name, ''), 'None') in the concat for
alert_class_id and in the else branch used for elementary_unique_id (while
keeping the special cases for schema_change and tests.short_name =
'dimension_anomalies' intact) so alert_class_id and elementary_unique_id are
consistent with current_tests_run_results_query.sql and get_test_results.sql.
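The mismatch the reviewer describes is easy to see by emulating the two SQL functions in Python (with `None` standing in for SQL NULL; helper names are illustrative):

```python
def sql_nullif(value, match):
    # SQL NULLIF(a, b): NULL (None) when a equals b, otherwise a.
    return None if value == match else value

def sql_coalesce(*values):
    # SQL COALESCE: first non-NULL argument, or NULL if all are NULL.
    return next((v for v in values if v is not None), None)

# Plain coalesce lets an empty string through, so '' and NULL
# produce different id components:
assert sql_coalesce("", "None") == ""
assert sql_coalesce(None, "None") == "None"

# Wrapping in nullif first normalizes '' to NULL, so both cases
# collapse to 'None', matching the other two files:
assert sql_coalesce(sql_nullif("", ""), "None") == "None"
assert sql_coalesce(sql_nullif(None, ""), "None") == "None"
```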
📒 Files selected for processing (3)
- elementary/monitor/dbt_project/macros/alerts/population/test_alerts.sql
- elementary/monitor/dbt_project/macros/base_queries/current_tests_run_results_query.sql
- elementary/monitor/dbt_project/macros/get_test_results.sql
```diff
+            concat(coalesce(failed_tests.test_unique_id, 'None'), '.', coalesce(failed_tests.column_name, 'None'), '.', coalesce(failed_tests.sub_type, 'None')) as alert_class_id,
             case
                 when failed_tests.test_type = 'schema_change' then failed_tests.test_unique_id
                 {# In old versions of elementary, elementary_test_results doesn't contain test_short_name, so we use dbt_test short_name. #}
                 when tests.short_name = 'dimension_anomalies' then failed_tests.test_unique_id
-                else coalesce(failed_tests.test_unique_id, 'None') || '.' || coalesce(failed_tests.column_name, 'None') || '.' || coalesce(failed_tests.sub_type, 'None')
+                else concat(coalesce(failed_tests.test_unique_id, 'None'), '.', coalesce(failed_tests.column_name, 'None'), '.', coalesce(failed_tests.sub_type, 'None'))
             end as elementary_unique_id,
```
Inconsistent empty string handling for column_name.

The column_name handling differs from other files in this PR:

- Here: `coalesce(failed_tests.column_name, 'None')` — an empty string `''` remains `''`
- In `current_tests_run_results_query.sql` and `get_test_results.sql`: `coalesce(nullif(column_name, ''), 'None')` — an empty string becomes `'None'`

This could produce mismatched alert_class_id / elementary_unique_id values for the same test when column_name is an empty string vs NULL.
🔧 Proposed fix to add nullif for consistency

```diff
-            concat(coalesce(failed_tests.test_unique_id, 'None'), '.', coalesce(failed_tests.column_name, 'None'), '.', coalesce(failed_tests.sub_type, 'None')) as alert_class_id,
+            concat(coalesce(failed_tests.test_unique_id, 'None'), '.', coalesce(nullif(failed_tests.column_name, ''), 'None'), '.', coalesce(failed_tests.sub_type, 'None')) as alert_class_id,
             case
                 when failed_tests.test_type = 'schema_change' then failed_tests.test_unique_id
                 {# In old versions of elementary, elementary_test_results doesn't contain test_short_name, so we use dbt_test short_name. #}
                 when tests.short_name = 'dimension_anomalies' then failed_tests.test_unique_id
-                else concat(coalesce(failed_tests.test_unique_id, 'None'), '.', coalesce(failed_tests.column_name, 'None'), '.', coalesce(failed_tests.sub_type, 'None'))
+                else concat(coalesce(failed_tests.test_unique_id, 'None'), '.', coalesce(nullif(failed_tests.column_name, ''), 'None'), '.', coalesce(failed_tests.sub_type, 'None'))
             end as elementary_unique_id,
```
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
T-SQL does not allow nested CTEs (WITH inside WITH). The default get_test_results macro wraps current_tests_run_results_query (which starts with WITH) inside another WITH clause, creating nested CTEs. The fabric__ dispatch materialises the inner query into a temp table first, then builds ordered_test_results on top without nesting. sqlserver inherits from fabric automatically, so no sqlserver__ override is needed. Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…lity T-SQL forbids ORDER BY in views/subqueries/CTEs unless TOP or OFFSET is also specified. The fabric__create_temp_table uses a VIEW workaround, so ORDER BY must be in the final SELECT, not in the materialised query. Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…atibility Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…ble-quotes Double-quoted identifiers break Spark SQL. Use adapter dispatch with fabric__ variants that quote 'schema' and 'database' using T-SQL square brackets, while the default keeps the original unquoted aliases. Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
… compatibility Redshift's concat() only accepts 2 arguments. The dbt.concat() macro generates the correct SQL per adapter (|| on Postgres/Redshift/Snowflake, concat() on others). Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
```sql
{% macro fabric__get_model_resources(exclude_elementary=true) %}
    {# T-SQL: 'schema' and 'database' are reserved keywords — quote with square brackets #}
```
Should we have a generic macro for quoting instead of duplicating?
Maybe edr_quote already exists? Not sure.
Good point. I checked — edr_quote in dbt-data-reliability is for quoting string values (wraps in single quotes), not identifiers. edr_quote_column uses adapter.quote() which does adapter-native identifier quoting ([col] on T-SQL, "col" on Postgres).
We could create a small helper like:

```sql
{% macro edr_quote_alias(alias_name) %}
    {% if target.type in ('fabric', 'sqlserver') %}
        {{ return('[' ~ alias_name ~ ']') }}
    {% else %}
        {{ return(alias_name) }}
    {% endif %}
{% endmacro %}
```

Then the default macros could use `schema_name as {{ edr_quote_alias('schema') }}` without needing full fabric__ duplicates. This would eliminate the duplication for get_models_runs, get_model_resources, and get_source_resources.
Alternatively, adapter.quote() could work but it would add quotes on all adapters (e.g., "schema" on Postgres), which changes case sensitivity behavior and — as we saw — breaks Spark SQL.
Want me to implement the edr_quote_alias approach to remove the duplication?
Done — pushed d566f902. Created edr_quote_identifier(identifier) macro in macros/utils/edr_quote_identifier.sql that returns [identifier] on T-SQL and identifier on all others. Removed the adapter.dispatch + default__ + fabric__ pattern entirely from get_models_runs, get_model_resources, and get_source_resources. Net -93 lines.
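The behaviour of the new macro can be mirrored in a few lines of Python (the function name mirrors `edr_quote_identifier`; the boolean parameter stands in for the `elementary.is_tsql()` check, so this is an illustration, not the macro itself):

```python
def edr_quote_identifier(identifier: str, is_tsql: bool) -> str:
    # T-SQL reserved words such as 'schema' and 'database' need
    # [bracket] quoting; every other adapter keeps the bare alias.
    return "[" + identifier + "]" if is_tsql else identifier

assert edr_quote_identifier("schema", is_tsql=True) == "[schema]"
assert edr_quote_identifier("database", is_tsql=False) == "database"
```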
```sql
{% macro fabric__get_source_resources(exclude_elementary=true) %}
    {# T-SQL: 'schema' and 'database' are reserved keywords — quote with square brackets #}
```
See comment regarding quoting
…ates
Replaces the adapter dispatch + fabric__ duplication pattern with a single
edr_quote_identifier('schema') helper that returns [schema] on T-SQL and
schema on all other adapters. Applied to get_models_runs, get_model_resources,
and get_source_resources.
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
```diff
     test_invocation as (
-        select distinct invocation_id, detected_at
+        select {% if target.type in ('fabric', 'sqlserver') %}top 1{% endif %} distinct invocation_id, detected_at
```
Use elementary.is_tsql
(Was just added to dbt-data-reliability)
Also anywhere else that does this condition
Done — pushed 40c8044d. Replaced all 4 occurrences of target.type in ('fabric', 'sqlserver') with elementary.is_tsql() across edr_quote_identifier.sql, get_test_last_invocation.sql, get_latest_invocation.sql, and test_conn.sql.
Replace all 'target.type in (fabric, sqlserver)' conditions with elementary.is_tsql() macro from dbt-data-reliability. Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
haritamar left a comment:
Nice work on the Fabric/SQL Server support. The T-SQL compatibility approach is well-structured — dbt.concat(), elementary.is_tsql(), and edr_quote_identifier are the right patterns.

Main feedback:

- CI default for `dbt-data-reliability-ref` leaks the dev branch to all warehouse runs — needs cleanup before/after merge.
- Duplicate transient error tuples — extract a shared `_TSQL_TRANSIENT` tuple.
- `sqlserver__get_adapter_unique_id` is redundant given the automatic fabric fallback.
- `fabric__get_test_results` — the biggest concern. The explicit column list will drift, and the post-processing Jinja loop is copy-pasted. Suggest extracting a shared processing helper and using `select *` in the temp table step to keep the fabric dispatch focused on just the T-SQL materialization strategy.
```diff
     warehouse-type: ${{ matrix.warehouse-type }}
     elementary-ref: ${{ inputs.elementary-ref || ((github.event_name == 'pull_request_target' || github.event_name == 'pull_request') && github.event.pull_request.head.sha) || '' }}
-    dbt-data-reliability-ref: ${{ inputs.dbt-data-reliability-ref }}
+    dbt-data-reliability-ref: ${{ inputs.dbt-data-reliability-ref || 'devin/ELE-5282-1772640713' }}
```
This changes the default dbt-data-reliability-ref for all warehouse CI runs (not just fabric/sqlserver). Once ELE-5282 is merged, this default needs to be reverted to empty — otherwise all CI runs will pin to a stale branch.
Acknowledged — the default will need to be reverted to empty once ELE-5282 is merged. I'll leave a TODO note or we can address it when that branch lands.
```python
    "fabric": (
        "connection timed out",
        "could not connect to the server",
        "ssl syscall error",
        "communication link failure",
        "tcp provider: an existing connection was forcibly closed",
        "login timeout expired",
    ),
    "sqlserver": (
        "connection timed out",
        "could not connect to the server",
        "ssl syscall error",
        "communication link failure",
        "tcp provider: an existing connection was forcibly closed",
        "login timeout expired",
    ),
}

# Pre-computed union of all adapter-specific patterns for the fallback path
```
These two entries are byte-for-byte identical. Consider extracting a shared tuple to reduce maintenance burden:
```python
_TSQL_TRANSIENT = (
    "connection timed out",
    "could not connect to the server",
    "ssl syscall error",
    "communication link failure",
    "tcp provider: an existing connection was forcibly closed",
    "login timeout expired",
)

# Then in the dict:
"fabric": _TSQL_TRANSIENT,
"sqlserver": _TSQL_TRANSIENT,
```
Done in 39a3c8d4 — extracted _TSQL_TRANSIENT shared tuple, both "fabric" and "sqlserver" now reference it.
```sql
{% macro sqlserver__get_adapter_unique_id() %}
    {{ return(target.server) }}
```
Since the PR notes that sqlserver automatically falls back to fabric__ dispatch, this sqlserver__ variant is redundant and can be removed.
Done in 39a3c8d4 — removed the redundant sqlserver__get_adapter_unique_id. sqlserver will now fall back to fabric__get_adapter_unique_id automatically.
…e redundant sqlserver dispatch, deduplicate get_test_results processing Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
```diff
     warehouse-type: ${{ matrix.warehouse-type }}
     elementary-ref: ${{ inputs.elementary-ref || ((github.event_name == 'pull_request_target' || github.event_name == 'pull_request') && github.event.pull_request.head.sha) || '' }}
-    dbt-data-reliability-ref: ${{ inputs.dbt-data-reliability-ref }}
+    dbt-data-reliability-ref: ${{ inputs.dbt-data-reliability-ref || 'devin/ELE-5282-1772640713' }}
```
🔴 CI workflow hardcodes a development branch for dbt-data-reliability-ref
The dbt-data-reliability-ref input in the CI workflow defaults to the development branch devin/ELE-5282-1772640713 instead of an empty string (which would use the repo's default branch). This means every PR-triggered and pull_request_target-triggered CI run will check out this development branch of dbt-data-reliability instead of master/main. This branch may be deleted or force-pushed at any time, breaking all CI runs.
```diff
-    dbt-data-reliability-ref: ${{ inputs.dbt-data-reliability-ref || 'devin/ELE-5282-1772640713' }}
+    dbt-data-reliability-ref: ${{ inputs.dbt-data-reliability-ref }}
```
This is intentional and already documented in the PR description — this PR depends on the devin/ELE-5282-1772640713 branch (ELE-5282) which adds Fabric/SQL Server support to dbt-data-reliability. Both packages.yml and the CI workflow default need to reference this branch until ELE-5282 is merged, at which point they'll be updated to a stable commit hash/removed.
```diff
       version: [">=0.8.0", "<0.9.0"]
   - git: https://github.com/elementary-data/dbt-data-reliability.git
-    revision: ab21363935c42490a60a779557ba99bed96b754c
+    revision: devin/ELE-5282-1772640713
```
🔴 packages.yml pins to a development branch instead of a stable commit hash
The packages.yml shipped with the CLI sets revision: devin/ELE-5282-1772640713 — a development branch name — instead of a pinned commit hash. This means anyone installing the Elementary CLI and running dbt deps will pull from a volatile branch that can be rebased, force-pushed, or deleted. The previous code correctly used a commit hash (ab21363935c42490a60a779557ba99bed96b754c). This should be reverted to a stable commit hash before merging.
Prompt for agents
In elementary/monitor/dbt_project/packages.yml line 5, replace the development branch name `devin/ELE-5282-1772640713` with a stable commit hash from the dbt-data-reliability repository that contains the fabric/sqlserver support changes. Also update the package-lock.yml (elementary/monitor/dbt_project/package-lock.yml line 7) to match the chosen commit hash.
Same as above — this is a known, documented dependency. The PR description explicitly states: "Must be reverted to a commit hash or release once ELE-5282 is merged." Will be updated before final merge.
… branch pin Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
```yaml
  working-directory: ${{ env.E2E_DBT_PROJECT_DIR }}
  run: |
    docker compose up -d sqlserver
    timeout 120 bash -c 'until docker exec sqlserver /opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P "${MSSQL_SA_PASSWORD:-Elementary123!}" -C -Q "SELECT 1" -b 2>/dev/null; do sleep 5; done'
```
Could we add a healthcheck instead in the docker compose, and then wait for the docker to be healthy?
Good call — the docker-compose already had a healthcheck defined, so I've updated the CI step to use docker compose up -d --wait sqlserver which waits for the healthcheck to pass. Removed the manual timeout/polling loop.
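A healthcheck of the kind referenced here might look like this in docker-compose (a sketch only: the service image, sqlcmd path, and password follow the CI snippet in this thread, and the actual values in the repo's docker-compose.yml may differ):

```yaml
services:
  sqlserver:
    image: mcr.microsoft.com/mssql/server:2022-latest
    environment:
      ACCEPT_EULA: "Y"
      MSSQL_SA_PASSWORD: "Elementary123!"
    ports:
      - "1433:1433"
    healthcheck:
      test: ["CMD", "/opt/mssql-tools18/bin/sqlcmd", "-S", "localhost", "-U", "sa", "-P", "Elementary123!", "-C", "-Q", "SELECT 1", "-b"]
      interval: 5s
      timeout: 10s
      retries: 20
```

With a healthcheck defined, `docker compose up -d --wait sqlserver` blocks until the container reports healthy, replacing the manual polling loop.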
…nual polling Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
feat: add Microsoft Fabric and SQL Server support to elementary CLI

Summary

CLI-side counterpart to ELE-5282 (dbt-data-reliability Fabric/SQL Server support). Wires in `dbt-fabric` and `dbt-sqlserver` as optional dependencies, adds adapter-specific overrides needed for `edr` commands, and integrates both adapters into the CI test matrix.

Changes:

- `pyproject.toml` — Added `dbt-fabric` and `dbt-sqlserver` as optional deps with `fabric` and `sqlserver` extras (`pip install "elementary-data[fabric]"` / `pip install "elementary-data[sqlserver]"`)
- `packages.yml` — Points dbt-data-reliability to merged commit `534afc63c75d28b87d7cbd3b222dd3ea9a980f7b` on master (ELE-5282 merged)
- `get_adapter_type_and_unique_id.sql` — Added `fabric__` dispatch macro returning `target.server` (sqlserver inherits via adapter fallback)
- `transient_errors.py` — Added shared `_TSQL_TRANSIENT` constant with T-SQL/ODBC transient error patterns, referenced by both `fabric` and `sqlserver` adapter entries
- `profiles.yml.j2` — Added local Docker test profiles for both adapters (connecting to SQL Server on localhost:1433)
- `schema.yml` — Added `fabric` and `sqlserver` to the `varchar` data type branches in test models
- `docker-compose.yml` — Added SQL Server 2022 Docker service with healthcheck (used by both `fabric` and `sqlserver` targets)
- `test-all-warehouses.yml` — Added `fabric` and `sqlserver` to the CI warehouse-type matrix
- `test-warehouse.yml` — Added SQL Server container startup using `docker compose up -d --wait` (healthcheck-based), ODBC driver installation, and marked both adapters as Docker-based for schema naming

T-SQL compatibility fixes

Several T-SQL syntax incompatibilities were discovered and fixed to get `edr monitor report` working on Fabric/SQL Server:

- `||` → `dbt.concat()` — T-SQL doesn't support `||` for string concatenation. Replaced with `dbt.concat()` (the dbt cross-DB macro) in `test_alerts.sql`, `current_tests_run_results_query.sql`, and `get_test_results.sql`. This generates `||` on Postgres/Redshift/Snowflake and `concat()` on T-SQL adapters, so it works across all warehouses.
- `LIMIT` → `TOP` — T-SQL uses `TOP N` instead of `LIMIT N`. Added conditional branches using `elementary.is_tsql()` in `get_latest_invocation.sql`, `get_test_last_invocation.sql`, and `test_conn.sql`.
- Nested CTEs — T-SQL doesn't allow `WITH` clauses inside subqueries or other CTEs. Added `fabric__get_test_results` macro that materializes intermediate queries as temp tables (`test_results_base`, `ordered_test_results`) and cleans them up after use. `sqlserver` inherits this automatically via adapter fallback.
- `ORDER BY` in views — T-SQL doesn't allow `ORDER BY` in view/subquery definitions without `TOP` or `OFFSET`. Moved `ORDER BY` clauses outside of `create_temp_table` calls to the final `run_query` instead.
- Reserved keywords — `schema` and `database` are T-SQL reserved words. Created `edr_quote_identifier` macro in `macros/utils/edr_quote_identifier.sql` that wraps identifiers in square brackets (`[identifier]`) for T-SQL adapters and returns them bare for all others. Applied to `get_models_runs`, `get_model_resources`, and `get_source_resources` — eliminating the need for full `fabric__` duplicate macros.

CI Status: 11/13 warehouse adapters passing — fabric, sqlserver, spark, snowflake, redshift, databricks, postgres, clickhouse, trino, athena, dremio all green. Two transient failures (duckdb network timeout, bigquery pre-existing test failures) unrelated to this PR's changes.
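The `dbt.concat()` behaviour described above can be sketched as a tiny renderer (an illustration only; dbt's real macro is adapter-dispatched Jinja, and the adapter list here is a simplification of that dispatch):

```python
def cross_db_concat(adapter: str, parts: list) -> str:
    # Postgres/Redshift/Snowflake render the || operator; T-SQL
    # adapters (fabric, sqlserver) get a concat(...) call instead.
    if adapter in ("postgres", "redshift", "snowflake"):
        return " || ".join(parts)
    return "concat(" + ", ".join(parts) + ")"

assert cross_db_concat("postgres", ["a", "'.'", "b"]) == "a || '.' || b"
assert cross_db_concat("fabric", ["a", "'.'", "b"]) == "concat(a, '.', b)"
```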
Updates since last revision

Post-merge cleanup (commits `54614537`, `69898366`):

- Updated `packages.yml` from dev branch reference (`devin/ELE-5282-1772640713`) to merged master commit hash (`534afc63c75d28b87d7cbd3b222dd3ea9a980f7b`)
- Reverted the `dbt-data-reliability-ref` CI default (`test-all-warehouses.yml`)
- CI now uses `docker compose up -d --wait sqlserver` instead of a manual timeout/sqlcmd polling loop — leverages the existing healthcheck for more robust startup detection

Code quality refactoring (commit `39a3c8d4`):

- Extracted `_TSQL_TRANSIENT` shared tuple in `transient_errors.py` to eliminate duplicate error patterns between fabric and sqlserver (net -12 lines)
- Removed redundant `sqlserver__get_adapter_unique_id` dispatch — sqlserver automatically falls back to the `fabric__` implementation (net -4 lines)
- Changed `fabric__get_test_results` to use `select *` in Step 3 instead of 35 explicit columns — eliminates column drift maintenance burden
- Extracted `_process_raw_test_results` helper macro to deduplicate Jinja post-processing logic across default, fabric, and clickhouse dispatches (net -77 lines)

Refactored T-SQL reserved keyword quoting (commit `d566f902`):

- Created `edr_quote_identifier(identifier)` macro to conditionally wrap identifiers in `[...]` for T-SQL adapters
- Removed full `fabric__` duplicate macros for `get_models_runs`, `get_model_resources`, and `get_source_resources`

Replaced inline T-SQL checks with `elementary.is_tsql()` macro (commit `40c8044d`):

- Replaced `target.type in ('fabric', 'sqlserver')` with calls to the `elementary.is_tsql()` macro (from dbt-data-reliability master)
- Applied in `edr_quote_identifier.sql`, `get_test_last_invocation.sql`, `get_latest_invocation.sql`, and `test_conn.sql`

Review & Testing Checklist for Human

- `_process_raw_test_results` helper macro — This new shared helper is used by default, fabric, and clickhouse dispatches. Verify the logic is equivalent to what was in each dispatch previously and that consolidating them doesn't introduce subtle behavioral differences.
- `fabric__get_test_results` Step 3 `select *` change — Changed from 35 explicit columns to `select *`. Verify that this doesn't introduce unexpected columns or break downstream logic. Consider whether there are cases where the upstream temp table might have extra columns that shouldn't be included.
- Running `edr monitor report` on a real Fabric or SQL Server warehouse would provide additional confidence. Verify alerts, test results, and model/source resources are fetched correctly.
- Confirm `534afc63c75d28b87d7cbd3b222dd3ea9a980f7b` is the correct merged commit on master and includes all required `elementary.is_tsql()` and Fabric/SQL Server support.

Notes

- Depends on commit `534afc63c75d28b87d7cbd3b222dd3ea9a980f7b` from dbt-data-reliability
- The `_TSQL_TRANSIENT` constant is now shared between `fabric` and `sqlserver` adapter entries in `_ADAPTER_PATTERNS` — both use identical T-SQL/ODBC error patterns.
- `fabric` and `sqlserver` share the same SQL Server Docker container in CI. The container starts using healthcheck-based startup (`--wait` flag) for robustness.
- When a `sqlserver__` dispatch doesn't exist, it automatically uses `fabric__`. This means only `fabric__` variants are needed for T-SQL compatibility.
- The `edr_quote_identifier` macro now uses `elementary.is_tsql()` (adapter dispatch from dbt-data-reliability) instead of inline `target.type` checks for cleaner maintainability.
- Known unrelated CI failures: duckdb fails on `dbt deps`, bigquery has 60 pre-existing test failures

Summary by CodeRabbit

- Added Microsoft Fabric and SQL Server support with T-SQL compatibility fixes for string concatenation (`dbt.concat()`), LIMIT/TOP, nested CTEs, ORDER BY in views, and reserved keyword quoting.
- Introduced the `edr_quote_identifier` macro to eliminate code duplication for T-SQL reserved keyword handling.
- Replaced inline adapter checks with the `elementary.is_tsql()` macro for cleaner maintainability.
- Extracted the `_TSQL_TRANSIENT` shared tuple to eliminate duplicate transient error patterns.
- Extracted the `_process_raw_test_results` helper macro to deduplicate Jinja post-processing logic.
- CI startup now uses healthcheck-based `docker compose --wait`.
- String concatenation uses the `dbt.concat()` cross-DB macro to ensure compatibility across all warehouse adapters (Postgres, Redshift, Snowflake, T-SQL, etc.).
- Removed duplicated quoting logic via the `edr_quote_identifier` macro, reducing code by ~93 lines.
- Replaced `target.type in ('fabric', 'sqlserver')` checks with `elementary.is_tsql()` macro calls across 4 macro files.
- Refactored `fabric__get_test_results` to use `select *` and extracted a shared processing helper, reducing code by ~77 lines.