fix: cover mixed types in stat sampling#5569
Conversation
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
📝 WalkthroughWalkthroughTwo new helper functions ( ChangesMixed-type atom-type coverage in stat collection
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
deepmd/utils/model_stat.py (1)
112-116: 💤 Low valueRedundant type cast on line 116.
Line 112 already casts
natoms_vectonp.int32, so the conditional cast on line 116 is redundant.♻️ Remove redundant cast
extra_batch["natoms_vec"] = data.natoms_vec[sys_idx].astype(np.int32) extra_batch["default_mesh"] = data.default_mesh[sys_idx] for key, value in extra_batch.items(): - if key == "natoms_vec": - value = value.astype(np.int32) if ( key not in {"natoms_vec", "default_mesh"}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/utils/model_stat.py` around lines 112 - 116, The conditional block within the loop that iterates through extra_batch.items() contains a redundant type cast. Since natoms_vec is already cast to np.int32 when assigned to extra_batch on line 112, the additional astype(np.int32) call in the conditional block checking for key == "natoms_vec" is unnecessary. Remove the redundant conditional cast block (the if statement checking for natoms_vec and the associated value assignment) from the loop.deepmd/pt/utils/stat.py (1)
147-159: 💤 Low valueLoop terminates early if first missing type's frame is already used.
When
frame_idx in used_frames, the loop breaks immediately instead of trying another missing type. This is safe but could leave some types uncovered if a single frame covers multiple types. Since this is a conservative approach and the fix addresses the main issue (ensuring at least one representative frame per type is attempted), this is acceptable.♻️ Alternative: continue to next missing type instead of breaking
while len(missing_types) > 0: frame_idx = first_frame_for_type[int(missing_types[0])] if frame_idx < 0 or frame_idx in used_frames: - break + # This type's first frame is invalid or already fetched; remove from missing + # and continue. The frame may have covered multiple types, so recompute. + missing_types = missing_types[1:] + continue used_frames.add(frame_idx)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/pt/utils/stat.py` around lines 147 - 159, The loop that iterates over missing_types breaks prematurely when the frame_idx for the first missing type is already in used_frames, which prevents searching for representative frames for other missing types. Instead of breaking when this condition occurs, the loop should continue to the next iteration to attempt finding an unused frame for other missing types. This ensures all type coverage is attempted rather than giving up at the first already-used frame.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@deepmd/pt/utils/stat.py`:
- Around line 147-159: The loop that iterates over missing_types breaks
prematurely when the frame_idx for the first missing type is already in
used_frames, which prevents searching for representative frames for other
missing types. Instead of breaking when this condition occurs, the loop should
continue to the next iteration to attempt finding an unused frame for other
missing types. This ensures all type coverage is attempted rather than giving up
at the first already-used frame.
In `@deepmd/utils/model_stat.py`:
- Around line 112-116: The conditional block within the loop that iterates
through extra_batch.items() contains a redundant type cast. Since natoms_vec is
already cast to np.int32 when assigned to extra_batch on line 112, the
additional astype(np.int32) call in the conditional block checking for key ==
"natoms_vec" is unnecessary. Remove the redundant conditional cast block (the if
statement checking for natoms_vec and the associated value assignment) from the
loop.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 5ac212f0-e4d6-4d6c-be5a-e2b78e3b8e5b
📒 Files selected for processing (4)
deepmd/pt/utils/stat.pydeepmd/utils/model_stat.pysource/tests/common/test_model_stat.pysource/tests/pt/test_observed_type.py
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5569 +/- ##
==========================================
- Coverage 82.18% 82.17% -0.02%
==========================================
Files 898 898
Lines 103576 103676 +100
Branches 4434 4434
==========================================
+ Hits 85129 85191 +62
- Misses 17056 17090 +34
- Partials 1391 1395 +4 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Summary
Fixes #4424.
Tests
uvx ruff check deepmd/pt/utils/stat.py deepmd/utils/model_stat.py source/tests/pt/test_observed_type.py source/tests/common/test_model_stat.pyuvx ruff format deepmd/pt/utils/stat.py deepmd/utils/model_stat.py source/tests/pt/test_observed_type.py source/tests/common/test_model_stat.py --checkpython3 -m py_compile deepmd/pt/utils/stat.py deepmd/utils/model_stat.py source/tests/pt/test_observed_type.py source/tests/common/test_model_stat.pygit diff --checkAuthored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
Summary by CodeRabbit
Improvements
Tests