feat: periodic soak test mode for RAC fuzz framework#22
feat: periodic soak test mode for RAC fuzz framework#22
Conversation
Phase A - Storage cleanup (works with existing finite runs): - Add created_at column to all FUZZ_* tables for TTL-based purge - Add FUZZ_WKL.cleanup() procedure + DBMS_SCHEDULER job (every 30min) - Add archive log cleanup loop in fuzz-test.sh up (hourly) - Add seq dict pruning in kafka-consumer.py (every 10min, 24h TTL) - Add SQLite event purge in validator.py after each validation cycle Phase B - Continuous soak operation: - Add FUZZ_WKL.run_forever() with rate limiting via DBMS_SESSION.SLEEP - Add SOAK_MODE to validator.py: continuous validate-purge cycles - Add stall detection (exit if no new events for 5min) - Add fuzz-test.sh soak subcommand with health monitoring - Add SQLite busy_timeout to handle concurrent writer contention Tested: 2-min finite run (11,238 events, 0 mismatches) and 5-min soak run (2,407 events validated in cycle 1, 0 mismatches) on RAC VM.
Phase B (continuous soak) is being redesigned as periodic fuzz runs; remove the streaming implementation: - Drop run_forever() in FUZZ_WKL package - Drop SOAK_MODE path in validator.py (stall detection, continuous loop) - Drop soak subcommand and action_soak from fuzz-test.sh - Remove SOAK-TEST.md design doc Replace host-side archive cleanup loop with a docker-compose service: - New archive-cleanup/ directory (Dockerfile + crontab) - Alpine + openssh-client + supercronic v0.2.44 (SHA1 pinned) - Hourly find -mtime +1 -print -delete, deletions logged to stdout - Lifecycle tied to docker-compose up/down, visible via docker logs Phase A cleanup mechanisms retained: - created_at columns + FUZZ_WKL.cleanup() + DBMS_SCHEDULER job - Consumer seq dict pruning (10min interval, 24h TTL) - validator.py purge_old_events() called after each one-shot cycle - SQLite busy_timeout=30000 on both consumer and validator
- Move FUZZ_* table cleanup into workload PL/SQL (every 5min) so DELETEs flow through CDC and get validated like normal DML. - Remove DBMS_SCHEDULER FUZZ_CLEANUP job from up/down (replaced by in-workload cleanup). - Fix grep -q + pipefail SIGPIPE false-failure in OLR/Debezium readiness waits.
…ator - fuzz-workload.sql: make FUZZ_WKL.run() idempotent across cycles. Repopulate per-node tracked-ID arrays from existing rows (parity- filtered), continue g_next_id from MAX(id) preserving parity, and continue g_event_seq from MAX numeric tail of existing event_ids so cycle N+1 event_ids never collide with cycle N. Seed INSERTs only run on cold start (empty tables). - validator.py: accept START_CURSOR env and emit '[validator] final_cursor=...' (safe frontier only). Lets a soak loop resume past already-validated events without re-scanning. - fuzz-test.sh: forward START_CURSOR to the validator container; dump sqlplus output when FUZZ_DONE summary is missing (diagnostic). Verified: 3 back-to-back cycles with no teardown, 32k events, 0 mismatches, cursor advances each cycle.
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThe PR enhances the RAC fuzz testing framework with archive cleanup infrastructure, resumable cursor support, TTL-based event purging, database resilience improvements, and a soak test orchestration script to enable continuous testing with cycle recovery. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 37 minutes and 38 seconds.Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
tests/dbz-twin/rac/archive-cleanup/Dockerfile (1)
1-14: Consider adding a non-root user for security best practices.The container runs as root by default. While this is acceptable for test infrastructure, adding a non-root user would align with container security best practices.
🛡️ Optional: Add non-root user
RUN apk add --no-cache openssh-client curl \ && curl -fsSL -o /usr/local/bin/supercronic "${SUPERCRONIC_URL}" \ && echo "${SUPERCRONIC_SHA} /usr/local/bin/supercronic" | sha1sum -c - \ && chmod +x /usr/local/bin/supercronic \ - && apk del curl + && apk del curl \ + && adduser -D -u 1000 cleanup +USER cleanup ENTRYPOINT ["/usr/local/bin/supercronic", "-passthrough-logs"]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/dbz-twin/rac/archive-cleanup/Dockerfile` around lines 1 - 14, Add a non-root user and switch to it at the end of the Dockerfile: create a group/user (e.g., addgroup -S super && adduser -S super -G super), change ownership of /usr/local/bin/supercronic (and any other runtime paths like /etc/crontab if needed) to that user, and add a USER super line before ENTRYPOINT/CMD; ensure permissions allow execution of /usr/local/bin/supercronic and that ENTRYPOINT ("/usr/local/bin/supercronic", "-passthrough-logs") continues to work under the non-root user.tests/dbz-twin/rac/fuzz-test.sh (1)
463-463: Quote$_SSH_OPTSto prevent word splitting.Shellcheck SC2086 correctly identifies that
$_SSH_OPTSshould be quoted. While it likely works in practice, quoting is safer.🔧 Suggested fix
- ssh $_SSH_OPTS "${VM_USER}@${VM_HOST}" \ + ssh "$_SSH_OPTS" "${VM_USER}@${VM_HOST}" \Note: If
$_SSH_OPTScontains multiple space-separated options, you may need to use an array instead:ssh "${_SSH_OPTS[@]}" ...🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/dbz-twin/rac/fuzz-test.sh` at line 463, The ssh invocation in fuzz-test.sh uses an unquoted variable $_SSH_OPTS which can undergo word splitting (SC2086); update the ssh call to quote the variable (use "${_SSH_OPTS}" or, if _SSH_OPTS is intended as an array, use "${_SSH_OPTS[@]}") so options are preserved correctly when invoking ssh "${_SSH_OPTS}" "${VM_USER}@${VM_HOST}" ...; ensure references to _SSH_OPTS, VM_USER, VM_HOST and the ssh command are updated accordingly.tests/dbz-twin/rac/validator.py (2)
434-434: Minor: Unused variablenfin unpacking.Prefix with underscore to indicate it's intentionally unused.
🧹 Fix unused variable warning
- (v, m, mm, mo, ml, to_, tl, lmc, oc, nf) = result + (v, m, mm, mo, ml, to_, tl, lmc, oc, _nf) = result🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/dbz-twin/rac/validator.py` at line 434, The tuple unpacking currently binds an unused variable named `nf`; change the unpack target `nf` to `_nf` (or `_`) in the unpack expression `(v, m, mm, mo, ml, to_, tl, lmc, oc, nf) = result` so the unused value is clearly marked and silences the warning while leaving the rest of the bindings (`v, m, mm, mo, ml, to_, tl, lmc, oc`) unchanged.
358-361: Minor: Remove unnecessary f-string prefixes.Lines 359 and 382 use f-strings without placeholders.
🧹 Fix linting warnings
- print(f"\n{'='*60}", flush=True) - print(f" Fuzz Test Validation Summary", flush=True) - print(f"{'='*60}", flush=True) + print(f"\n{'='*60}", flush=True) + print(" Fuzz Test Validation Summary", flush=True) + print(f"{'='*60}", flush=True)And at line 382:
- print(f"Validator starting", flush=True) + print("Validator starting", flush=True)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/dbz-twin/rac/validator.py` around lines 358 - 361, Several print statements use f-strings without placeholders (for example the header prints using {'='*60} and the print that outputs total_validated), which is unnecessary; update those prints to plain string literals by removing the leading f (e.g., change print(f"\n{'='*60}", ...) and print(f" Total validated: {total_validated}", ...) to use non-f-prefixed strings where appropriate and only keep f-strings for prints that actually interpolate variables like total_validated). Ensure you only remove the f-prefix on lines that have no Python expression interpolation so formatting remains identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/dbz-twin/rac/soak.sh`:
- Line 6: The hardcoded absolute path in the source command inside
tests/dbz-twin/rac/soak.sh should be replaced with a path computed relative to
the script location (the source invocation in soak.sh), e.g., compute the script
directory via $0 or ${BASH_SOURCE[0]} and source the vm-env.sh from
../environments/rac/vm-env.sh relative to that directory so the script is
portable across machines.
---
Nitpick comments:
In `@tests/dbz-twin/rac/archive-cleanup/Dockerfile`:
- Around line 1-14: Add a non-root user and switch to it at the end of the
Dockerfile: create a group/user (e.g., addgroup -S super && adduser -S super -G
super), change ownership of /usr/local/bin/supercronic (and any other runtime
paths like /etc/crontab if needed) to that user, and add a USER super line
before ENTRYPOINT/CMD; ensure permissions allow execution of
/usr/local/bin/supercronic and that ENTRYPOINT ("/usr/local/bin/supercronic",
"-passthrough-logs") continues to work under the non-root user.
In `@tests/dbz-twin/rac/fuzz-test.sh`:
- Line 463: The ssh invocation in fuzz-test.sh uses an unquoted variable
$_SSH_OPTS which can undergo word splitting (SC2086); update the ssh call to
quote the variable (use "${_SSH_OPTS}" or, if _SSH_OPTS is intended as an array,
use "${_SSH_OPTS[@]}") so options are preserved correctly when invoking ssh
"${_SSH_OPTS}" "${VM_USER}@${VM_HOST}" ...; ensure references to _SSH_OPTS,
VM_USER, VM_HOST and the ssh command are updated accordingly.
In `@tests/dbz-twin/rac/validator.py`:
- Line 434: The tuple unpacking currently binds an unused variable named `nf`;
change the unpack target `nf` to `_nf` (or `_`) in the unpack expression `(v, m,
mm, mo, ml, to_, tl, lmc, oc, nf) = result` so the unused value is clearly
marked and silences the warning while leaving the rest of the bindings (`v, m,
mm, mo, ml, to_, tl, lmc, oc`) unchanged.
- Around line 358-361: Several print statements use f-strings without
placeholders (for example the header prints using {'='*60} and the print that
outputs total_validated), which is unnecessary; update those prints to plain
string literals by removing the leading f (e.g., change print(f"\n{'='*60}",
...) and print(f" Total validated: {total_validated}", ...) to use
non-f-prefixed strings where appropriate and only keep f-strings for prints that
actually interpolate variables like total_validated). Ensure you only remove the
f-prefix on lines that have no Python expression interpolation so formatting
remains identical.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ab34e29b-3e2d-4124-a533-7cdad113d68b
📒 Files selected for processing (8)
tests/dbz-twin/rac/archive-cleanup/Dockerfiletests/dbz-twin/rac/archive-cleanup/crontabtests/dbz-twin/rac/docker-compose-fuzz.yamltests/dbz-twin/rac/fuzz-test.shtests/dbz-twin/rac/kafka-consumer.pytests/dbz-twin/rac/perf/fuzz-workload.sqltests/dbz-twin/rac/soak.shtests/dbz-twin/rac/validator.py
Summary
Test plan
run N+validate) still worksSummary by CodeRabbit
Release Notes
New Features
Improvements
Tests