Skip to content

fix: tighten dill MemoryError downgrade gating#710

Merged
mldangelo merged 6 commits intomainfrom
codex/investigate-joblib/dill-memoryerror-security-issue
Mar 18, 2026
Merged

fix: tighten dill MemoryError downgrade gating#710
mldangelo merged 6 commits intomainfrom
codex/investigate-joblib/dill-memoryerror-security-issue

Conversation

@mldangelo
Copy link
Member

@mldangelo mldangelo commented Mar 16, 2026

Motivation

  • The MemoryError downgrade path treated .dill files (and some serialization cases) as legitimate based primarily on extension or shallow markers, which can be abused by a crafted pickle to reduce severity to INFO and bypass warning-level gates.
  • Prevent weakening scanner guarantees by requiring positive evidence that a serialization file is actually a legitimate joblib/dill artifact before treating parser MemoryError as a benign scanner limitation.

Description

  • Require dill-specific markers in payload for .dill files in _is_legitimate_serialization_file() instead of accepting the extension alone by checking for b"dill", b"_dill", or b"dill._dill" in the first 2KB of the file. (Edited modelaudit/scanners/pickle_scanner.py.)
  • Add detection of parsed dill globals (has_dill_globals) and tighten the serialization global-gate so extension-validated files require positive advanced-global evidence (.joblib -> joblib/sklearn/numpy globals, .dill -> dill globals) before a MemoryError downgrade is allowed. (Edited modelaudit/scanners/pickle_scanner.py.)
  • Add a regression test ensuring a .dill file that triggers MemoryError is not downgraded to INFO when no dill globals are parsed. (Edited tests/scanners/test_pickle_scanner.py.)

Testing

  • Ran uv run ruff format modelaudit/ tests/ successfully.
  • Ran uv run ruff check --fix modelaudit/ tests/ successfully.
  • Ran uv run mypy modelaudit/ with no type issues.
  • Ran full test suite with uv run pytest -n auto -m "not slow and not integration" --maxfail=1 which reported an existing unrelated test failure (tests/utils/helpers/test_secure_hasher.py::TestErrorHandling::test_hash_permission_denied); this failure is pre-existing and unrelated to the changes in this PR.
  • Executed targeted tests uv run pytest tests/scanners/test_pickle_scanner.py -k "dill_memory_error or dangerous_globals_not_downgraded or legitimate_pytorch_bin_memory_error" and the selected pickle-scanner regression tests passed.

Codex Task

Summary by CodeRabbit

  • New Features

    • Added explicit recognition of dill alongside joblib as legitimate serialization signals and improved detection that distinguishes extension-based vs extensionless serialized files.
  • Bug Fixes

    • Refined handling of memory-limit failures so legitimate serialized models are downgraded to informational severity while suspicious inputs remain warnings.
    • Improved robustness against dangerous globals when assessing file legitimacy.
  • Tests

    • Expanded and standardized tests for memory-error paths, dill/joblib variants, and opcode/global-based validation; relaxed one performance timing bound.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 16, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

Reworks pickle scanner global-analysis: adds dill global detection, replaces marker-string heuristics with opcode/globals-based extension signals, introduces extension-based gating (including .dill and .joblib), and adjusts passes_global_gate and depth logic. Tests expanded for MemoryError, downgrade semantics, and joblib/dill sample generation.

Changes

Cohort / File(s) Summary
Pickle scanner logic
modelaudit/scanners/pickle_scanner.py
Adds internal _analyze_sample_globals, detects has_dill_globals, computes has_extension_based_serialization_globals and has_legitimate_serialization_globals, refactors get_depth and passes_global_gate to rely on opcode/globals-based signals instead of marker strings while preserving PyTorch and dangerous-global guards.
Scanner tests (MemoryError & gating)
tests/scanners/test_pickle_scanner.py
Adds many tests covering MemoryError handling, downgrade-to-info semantics, global-extraction budget/timeouts, and combinations of dangerous vs benign advanced globals for dill/joblib/pickle and PyTorch scenarios. Adjusts mocks to accept new scan_start_time parameter.
Joblib/dill test helper & fixtures
tests/test_dill_joblib_enhanced.py
Introduces _write_joblib_like_pickle helper, replaces inline joblib/dill sample construction with the helper across tests, and updates expectations for malicious vs legitimate dill cases.
Real-world performance test
tests/test_real_world_dill_joblib.py
Replaces dynamic pickle-writing with a static crafted binary payload and relaxes average validation timing assertion (1ms → 3ms) to reflect opcode scan behavior.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hop through bytes both stiff and thin,
I count the globals tucked within.
Dill and joblib I now can tell,
I gate with care and sniff for smell,
A carrot for safe scans — nibble, grin!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: tightening the MemoryError downgrade gating logic for dill files by requiring positive evidence rather than relying on extension alone.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/investigate-joblib/dill-memoryerror-security-issue
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Line 5666: The check that computes has_dill_globals only matches "dill" and
"dill.*", which misses legitimate dill internals like "_dill" and "dill._dill"
and can cause false positives in the later global gate (the code around
has_dill_globals and the global gate at the block referenced by lines
5711-5717). Update the detection logic that computes has_dill_globals in
pickle_scanner.py to also accept module names that equal "_dill" or start with
"_dill" and likewise handle "dill._dill" (e.g., treat any mod that is "dill",
startswith("dill."), is "_dill", or startswith("_dill.") as dill-related); then
ensure the related global gate checks (the 5711-5717 logic) use this expanded
predicate so benign dill internals are not rejected while preserving
malicious-dill detection.
- Around line 2901-2905: The current check in the pickle scanner returns True
purely based on ext_lower == ".dill" which trusts the filename extension; change
this to require evidence of Dill content before treating as legitimate: in the
function that performs the shallow pickle header check (the block that checks
ext_lower and currently returns True for ".dill"), replace the extension-only
pass with an evidence-based check — e.g., read additional bytes from the stream
and look for Dill-specific markers (module/import references like b"dill" or
Dill-specific persistent-type signatures) or, if available, attempt a
guarded/dry-run dill sniff (import dill and call a non-destructive detection
helper) and only return True when those markers/detection succeed; keep
ext_lower as a secondary hint, not a sole condition.

In `@tests/scanners/test_pickle_scanner.py`:
- Around line 2174-2178: The monkeypatched lambda for
PickleScanner._extract_globals_advanced has the wrong signature and will raise
TypeError when callers pass keyword args like multiple_pickles or
scan_start_time; update the patch to accept the same parameters used in real
calls (e.g., self, file_obj, multiple_pickles=True, scan_start_time=None) or
simply use **kwargs to absorb extras so the test stub matches the real method
signature in PickleScanner.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ba029f34-b929-462b-9ba5-7bbaafcf36bb

📥 Commits

Reviewing files that changed from the base of the PR and between d9fe283 and 62d751c.

📒 Files selected for processing (2)
  • modelaudit/scanners/pickle_scanner.py
  • tests/scanners/test_pickle_scanner.py

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
modelaudit/scanners/pickle_scanner.py (2)

5666-5666: ⚠️ Potential issue | 🟠 Major

Broaden dill-global detection to include _dill internals.

Line 5666 misses _dill/_dill.*, so legitimate dill payloads can fail the dill-global gate.

Suggested fix
-            has_dill_globals = any(mod == "dill" or mod.startswith("dill.") for mod, _func, _opcode in advanced_globals)
+            has_dill_globals = any(
+                mod in {"dill", "_dill"}
+                or mod.startswith(("dill.", "_dill.", "dill._dill"))
+                for mod, _func, _opcode in advanced_globals
+            )

Based on learnings: "Preserve or strengthen security detections; test both benign and malicious samples when adding scanner/feature changes."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelaudit/scanners/pickle_scanner.py` at line 5666, The current dill
detection only checks for "dill" and "dill.*" in the generator that sets
has_dill_globals; update that check to also include "_dill" and "_dill.*" so
internal dill modules are detected. Modify the comprehension that computes
has_dill_globals (which iterates over advanced_globals) to treat mod equal to
"_dill" or mod.startswith("_dill.") as a match in addition to the existing
"dill" checks, and add unit tests covering benign and malicious payloads that
use both "dill" and "_dill" module names.

2901-2905: ⚠️ Potential issue | 🟠 Major

Require evidence-based .dill legitimacy checks, not extension-only acceptance.

Line 2904-2905 still returns True for any .dill file after a shallow header check, which keeps downgrade gating extension-trusting.

Suggested fix
-            # Dill can serialize plain pickle-compatible objects without
-            # embedding obvious dill markers near the front of the stream, so
-            # the extension remains the legitimacy signal here.
             if ext_lower == ".dill":
-                return True
+                f.seek(0)
+                sample = f.read(2048)
+                dill_indicators = [b"dill", b"_dill", b"dill._dill"]
+                return any(marker in sample for marker in dill_indicators)

Based on learnings: "Preserve or strengthen security detections; test both benign and malicious samples when adding scanner/feature changes."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelaudit/scanners/pickle_scanner.py` around lines 2901 - 2905, The current
acceptance of ".dill" files is extension-only; change the logic around the
ext_lower == ".dill" branch in the function handling pickle legitimacy (the
block referencing ext_lower and returning True) to perform evidence-based checks
instead of an unconditional return: read the file header and/or a small prefix
of bytes and verify dill-specific markers (or attempt a controlled, sandboxed
dill.loads that is tightly exception-caught), reject if deserialization yields
code objects or REDUCE/BUILD opcodes, and only return True when those deeper
checks pass; also add unit tests covering benign and malicious .dill samples to
ensure the new checks prevent downgrade gating.
tests/scanners/test_pickle_scanner.py (1)

2174-2178: ⚠️ Potential issue | 🟠 Major

Fix monkeypatch signature mismatch for _extract_globals_advanced.

Line 2177 patches _extract_globals_advanced as lambda self, file_obj: set(), but the scanner calls this method with keyword args (e.g., multiple_pickles, scan_start_time). This can raise TypeError and invalidate this regression scenario.

🐛 Proposed fix
     monkeypatch.setattr(
         PickleScanner,
         "_extract_globals_advanced",
-        lambda self, file_obj: set(),
+        lambda self, file_obj, multiple_pickles=True, scan_start_time=None: set(),
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/scanners/test_pickle_scanner.py` around lines 2174 - 2178, The
monkeypatch for PickleScanner._extract_globals_advanced uses a lambda with
signature (self, file_obj) but the real scanner calls it with keyword arguments
like multiple_pickles and scan_start_time; update the monkeypatch to accept the
extra parameters (e.g., change the replacement to a callable matching
PickleScanner._extract_globals_advanced by adding *args/**kwargs or explicit
keyword params) so the patched function signature matches calls and returns
set() as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Around line 5712-5718: The current logic sets
has_extension_based_serialization_globals true for file_ext == ".joblib" without
requiring parsed joblib/sklearn/numpy globals; change the predicate so .joblib
only qualifies when advanced_globals are present and has_joblib_globals is true
(e.g. replace the .joblib branch with file_ext == ".joblib" and
bool(advanced_globals) and has_joblib_globals), keep the .dill branch gated by
has_dill_globals, and then leave has_legitimate_serialization_globals to combine
that result with the existing not has_dangerous_advanced_global check.

---

Duplicate comments:
In `@modelaudit/scanners/pickle_scanner.py`:
- Line 5666: The current dill detection only checks for "dill" and "dill.*" in
the generator that sets has_dill_globals; update that check to also include
"_dill" and "_dill.*" so internal dill modules are detected. Modify the
comprehension that computes has_dill_globals (which iterates over
advanced_globals) to treat mod equal to "_dill" or mod.startswith("_dill.") as a
match in addition to the existing "dill" checks, and add unit tests covering
benign and malicious payloads that use both "dill" and "_dill" module names.
- Around line 2901-2905: The current acceptance of ".dill" files is
extension-only; change the logic around the ext_lower == ".dill" branch in the
function handling pickle legitimacy (the block referencing ext_lower and
returning True) to perform evidence-based checks instead of an unconditional
return: read the file header and/or a small prefix of bytes and verify
dill-specific markers (or attempt a controlled, sandboxed dill.loads that is
tightly exception-caught), reject if deserialization yields code objects or
REDUCE/BUILD opcodes, and only return True when those deeper checks pass; also
add unit tests covering benign and malicious .dill samples to ensure the new
checks prevent downgrade gating.

In `@tests/scanners/test_pickle_scanner.py`:
- Around line 2174-2178: The monkeypatch for
PickleScanner._extract_globals_advanced uses a lambda with signature (self,
file_obj) but the real scanner calls it with keyword arguments like
multiple_pickles and scan_start_time; update the monkeypatch to accept the extra
parameters (e.g., change the replacement to a callable matching
PickleScanner._extract_globals_advanced by adding *args/**kwargs or explicit
keyword params) so the patched function signature matches calls and returns
set() as before.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f51653da-8d4c-404c-85d8-377580683782

📥 Commits

Reviewing files that changed from the base of the PR and between 62d751c and f57c796.

📒 Files selected for processing (2)
  • modelaudit/scanners/pickle_scanner.py
  • tests/scanners/test_pickle_scanner.py

@mldangelo mldangelo merged commit 5eefa15 into main Mar 18, 2026
25 of 26 checks passed
@mldangelo mldangelo deleted the codex/investigate-joblib/dill-memoryerror-security-issue branch March 18, 2026 06:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant