feat(backend/kernel): wire TSparkParameter through to kernel bind_param#789
Conversation
24e9a5c to
5a6f1f0
Compare
Lifts the NotSupportedError that execute_command currently raises
for parametrized queries. The kernel-side PyO3 binding for
Statement.bind_param landed in databricks-sql-kernel#18; this
commit wires the connector's TSparkParameter shape through to it.
Implementation:
- New `bind_tspark_params(kernel_stmt, parameters)` in
type_mapping.py forwards each TSparkParameter to the kernel as
`(ordinal, value.stringValue, type)`. ordinal is the 1-based
position in the parameters list; the connector's `ordinal: bool`
flag is checked only to reject named bindings (kernel v0 doesn't
accept them on the wire).
- execute_command no longer raises on `parameters=[...]`. The
query_tags branch stays — that's a separate gap.
Tests:
- 6 new unit tests in tests/unit/test_kernel_type_mapping.py for
the mapper:
- positional forwarding preserves ordering and (ordinal, value, type)
- None value forwards as SQL NULL
- VOID passes through verbatim (kernel parser ignores value for VOID)
- named bindings raise NotSupportedError with a pointed message
- missing TSparkParameter.type defaults to STRING (defensive)
- empty parameters list is a no-op
- `test_execute_command_rejects_parameters` (which previously
asserted the NotSupportedError) replaced with
`test_execute_command_forwards_parameters_to_bind_param` — stubs
the kernel statement and verifies bind_param is called once per
TSparkParameter in order with 1-based ordinals, and execute
fires after binding.
- 3 new e2e tests in tests/e2e/test_kernel_backend.py against
dogfood:
- mixed-type round-trip (INT, STRING, BOOLEAN) via the
connector's native IntegerParameter/StringParameter/BooleanParameter
- None parameter (VoidParameter → SQL NULL)
- DECIMAL parameter with precision/scale carried in the SQL type
string (auto-inferred — explicit-arg path has a pre-existing
bug in native.py where format-args are swapped)
106/106 kernel unit tests pass.
Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
aa8bd24 to
19c0bb6
Compare
|
P1 — Important (should fix)
The PR docstring says compound types "route through the kernel parser which currently rejects them." Reality: ArrayParameter.as_tspark_param() and MapParameter.as_tspark_param() build
cur.execute("SELECT array_contains(?, 1)", [ArrayParameter([1,2,3])]) returns NULL with no error from either side. This contradicts the documented behavior. Fix: Detect compound types up front in bind_tspark_params and raise NotSupportedError:
The test builds _mk_param(type="VOID", value="None") → p.value = TSparkParameterValue(stringValue="None"). But VoidParameter._tspark_param_value() (parameters/native.py:337-339) returns Python None, so the Fix: Change _mk_param(type="VOID", value="None") to _mk_param(type="VOID", value=None) and assert [(1, None, "VOID")]. Correct the docstring.
Thrift defaults ordinal=None, name=None. The check if getattr(param, "ordinal", None) is False and getattr(param, "name", None): only triggers when ordinal is exactly False. A TSparkParameter with Fix: P2 — Minor
|
Address review feedback on #789: - ArrayParameter / MapParameter / StructParameter put their payload on TSparkParameter.arguments (not .value); the previous binding path forwarded value=None and silently bound a typed NULL. Reject compound types up front with NotSupportedError so callers see a clear error instead of silent data loss. - Tighten the named-binding guard to fire whenever a name is set and ordinal is not exactly True (Thrift defaults ordinal to None; the prior check only triggered on ordinal=False). - Correct test_bind_tspark_params_void_passes_through: VoidParameter yields value=None on the wire, so the production call is (1, None, "VOID"), not (1, "None", "VOID"). Fix the misleading docstring on _tspark_param_value_str to match. - Remove the misleading lazy-import comment in client.py — pyarrow is already loaded transitively via KernelResultSet. Move bind_tspark_params and NotSupportedError imports to module top. Co-authored-by: Isaac
|
Thanks @gopalldb — addressed in f384f1a. Summary: P1 #1 — compound types silently binding NULL (data loss) You were right, this was a real bug. Added a guard in P1 #2 — VOID test pinned the wrong shape Confirmed: P1 #3 — named-binding discriminator Tightened from P2 cleanups
P2 not changed (with rationale)
Tests
|
Address review feedback on #789: - ArrayParameter / MapParameter / StructParameter put their payload on TSparkParameter.arguments (not .value); the previous binding path forwarded value=None and silently bound a typed NULL. Reject compound types up front with NotSupportedError so callers see a clear error instead of silent data loss. - Tighten the named-binding guard to fire whenever a name is set and ordinal is not exactly True (Thrift defaults ordinal to None; the prior check only triggered on ordinal=False). - Correct test_bind_tspark_params_void_passes_through: VoidParameter yields value=None on the wire, so the production call is (1, None, "VOID"), not (1, "None", "VOID"). Fix the misleading docstring on _tspark_param_value_str to match. - Remove the misleading lazy-import comment in client.py — pyarrow is already loaded transitively via KernelResultSet. Move bind_tspark_params and NotSupportedError imports to module top. Co-authored-by: Isaac Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
f384f1a to
b60b3c9
Compare
Summary
Lifts the
NotSupportedErrorthe kernel backend currently raises for parametrized queries. The kernel-side PyO3 binding (Statement.bind_param) shipped in databricks-sql-kernel#18; this PR wires the connector'sTSparkParametershape through to it.Before:
cursor.execute("SELECT ?", [IntegerParameter(42)])onuse_kernel=TrueraisedNotSupportedError. After: positional parameter binding works end-to-end.What it does
bind_tspark_params(kernel_stmt, parameters)intype_mapping.pyforwards eachTSparkParameterto the kernel as(ordinal, value.stringValue, type):ordinalis the 1-based position in the parameters list (matches kernel's SEA-wire convention).value.stringValueis the connector's already-string-encoded value (orNonefor SQL NULL).typeis the SQL type name the connector produces ("INT","DECIMAL(10,2)", etc.).The connector's
ordinal: boolflag is checked only to reject named bindings — kernel v0 doesn't accept named bindings on the SEA wire, so we surface that at the connector layer with a pointedNotSupportedErrorrather than letting the server reject.What's supported
Every type the connector's native parameter classes emit:
BOOLEAN,TINYINT/SMALLINT/INT/BIGINT,FLOAT/DOUBLESTRING,DATE,TIMESTAMP/TIMESTAMP_NTZ,INTERVALDECIMAL(p,s)(precision/scale carried in the SQL type string; kernel parses them)VOID/ PythonNoneWhat's not supported (known gaps, raises
NotSupportedError)cursor.execute(":foo = ?", [param_named("foo", ...)])) — kernel v0 SEA wire is positional-only.ARRAY/MAP/STRUCT) — kernel parser rejects; surfaces asKernelError(InvalidArgument).BINARY— SEA wire limitation kernel-side; documented workaround is hex-encode +unhex(?)in SQL.Test plan
tests/unit/test_kernel_type_mapping.pyforbind_tspark_params— positional forwarding,None/VOID, named-binding rejection, missing-type defaulting, empty list.test_execute_command_rejects_parameters(which previously asserted theNotSupportedError) withtest_execute_command_forwards_parameters_to_bind_param: stubs the kernel statement, assertsbind_paramis called once perTSparkParameterin order with 1-based ordinals, andexecute()fires after binding.main).tests/e2e/test_kernel_backend.pyagainst dogfood:IntegerParameter,StringParameter,BooleanParameter)NoneparameterDECIMALparameter (precision/scale flows through the SQL type string)databricks-sql-kernelmain).This pull request and its description were written by Isaac.