Skip to content

Add import history cleanup task and max_import_history setting fixes …#14521

Open
tejas0077 wants to merge 2 commits intoDefectDojo:devfrom
tejas0077:feature/import-history-cleanup
Open

Add import history cleanup task and max_import_history setting fixes …#14521
tejas0077 wants to merge 2 commits intoDefectDojo:devfrom
tejas0077:feature/import-history-cleanup

Conversation

@tejas0077
Copy link
Contributor

Description

When running frequent reimports, the dojo_test_import_finding_action table
grows infinitely causing significant database bloat (reported cases of 19GB+).

This PR adds a configurable max_import_history setting similar to the existing
max_dupes feature to automatically clean up old import history records.

Changes made:

  • Added max_import_history field to System_Settings model
  • Added DD_IMPORT_HISTORY_MAX_PER_OBJECT setting to settings.dist.py
  • Added async_import_history_cleanup celery task in tasks.py
  • Added migration 0262_system_settings_max_import_history.py

Closes #13776

Test results

Manually verified the new field appears correctly in the System_Settings model.
The cleanup task follows the same pattern as the existing async_dupe_delete task.

Documentation

No documentation changes needed.

Checklist

  • Make sure to rebase your PR against the very latest dev.
  • Features/Changes should be submitted against the dev branch.
  • Model changes include the necessary migrations in dojo/db_migrations folder.
  • Added the proper labels to categorize this PR.

@github-actions github-actions bot added New Migration Adding a new migration file. Take care when merging. settings_changes Needs changes to settings.py based on changes in settings.dist.py included in this PR labels Mar 14, 2026
@dryrunsecurity
Copy link

dryrunsecurity bot commented Mar 14, 2026

DryRun Security

🔴 Risk threshold exceeded.

This pull request modifies several sensitive files (dojo/db_migrations/0262_system_settings_max_import_history.py, dojo/models.py, and dojo/tasks.py) that are flagged by the configured codepaths scanner; these edits may require review or changes to .dryrunsecurity.yaml to permit the authors or paths. Please review the changes carefully for security impact and update allowed paths/authors if the edits are intended.

🔴 Configured Codepaths Edit in dojo/db_migrations/0262_system_settings_max_import_history.py (drs_feb15ef0)
Vulnerability Configured Codepaths Edit
Description Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml.
🔴 Configured Codepaths Edit in dojo/models.py (drs_30105223)
Vulnerability Configured Codepaths Edit
Description Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml.
🔴 Configured Codepaths Edit in dojo/tasks.py (drs_fc47c623)
Vulnerability Configured Codepaths Edit
Description Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml.
🔴 Configured Codepaths Edit in dojo/tasks.py (drs_514a6cfc)
Vulnerability Configured Codepaths Edit
Description Sensitive edits detected for this file. Sensitive file paths and allowed authors can be configured in .dryrunsecurity.yaml.

We've notified @mtesauro.


Comment to provide feedback on these findings.

Report false positive: @dryrunsecurity fp [FINDING ID] [FEEDBACK]
Report low-impact: @dryrunsecurity nit [FINDING ID] [FEEDBACK]

Example: @dryrunsecurity fp drs_90eda195 This code is not user-facing

All finding details can be found in the DryRun Security Dashboard.

Comment on lines +223 to +224
except System_Settings.DoesNotExist:
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please log something or just let the exception bobble up

Copy link
Member

@valentijnscholten valentijnscholten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you test this as there is no trigger configured for this job to run?
Can you add a test case to make sure the right entries are deleted (oldest first, no more than needed, ...)

@tejas0077
Copy link
Contributor Author

Hi @valentijnscholten, thank you for the review!

I have addressed the first point by adding proper logging in the System_Settings.DoesNotExist exception handler.

Regarding the trigger for the job — you are correct that there is no periodic task configured yet. Should I add a Celery beat schedule similar to the existing async_dupe_delete task? Looking at the codebase, async_dupe_delete is triggered via the system settings UI. Should max_import_history follow the same pattern?

Regarding test cases — I will add unit tests to verify:

  • Oldest entries are deleted first
  • No more than the configured limit are deleted per run
  • Entries within the limit are not deleted

Could you point me to an existing test for async_dupe_delete so I can follow the same pattern?

Copy link
Contributor

@Maffooch Maffooch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import histories serve as an audit log for what occurred during an import/reimport. Rather than just dropped after a certain number of events, It would be better to make this cleanup time based (i.e. delete records older than one year)

@tejas0077
Copy link
Contributor Author

Thanks @Maffooch! That makes sense — treating import history as an audit log with time-based retention is a better approach. I'll rework the cleanup task to delete records older than a configurable number of days (defaulting to 365 days / 1 year) instead of using a count-based limit. Should the setting be called import_history_retention_days or do you have a preferred name for it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

New Migration Adding a new migration file. Take care when merging. settings_changes Needs changes to settings.py based on changes in settings.dist.py included in this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants