Skip to content

[Feature Request]: SQL-type Library backup should use SQLite's built-in backup features, rather than a simple file copy #1332

@LikeLakers2

Description

@LikeLakers2

Checklist

  • I am using an up-to-date version.
  • I have read the documentation.
  • I have searched existing issues.

Description

The "Save Library Backup" option in the "File" menu currently performs a simple file copy of the current library database:

def save_library_backup_to_disk(self) -> Path:
assert isinstance(self.library_dir, Path)
makedirs(str(self.library_dir / TS_FOLDER_NAME / BACKUP_FOLDER_NAME), exist_ok=True)
filename = f"ts_library_backup_{datetime.now(UTC).strftime('%Y_%m_%d_%H%M%S')}.sqlite"
target_path = self.library_dir / TS_FOLDER_NAME / BACKUP_FOLDER_NAME / filename
shutil.copy2(
self.library_dir / TS_FOLDER_NAME / SQL_FILENAME,
target_path,
)
logger.info("Library backup saved to disk.", path=target_path)
return target_path

However, because the database file is not locked during this operation, there is the possibility for the backup to be a corrupt copy of the database. For example, if TagStudio or other programs write to the database during the backup, the backup has the potential to only copy some of the changes.

Admittedly, this chance is extremely small - but not nonexistent. That, combined with some of the options being (seemingly - I don't know too much about what TS uses for database stuff) easy to implement, leads me to suggest the following:

Solution

TagStudio should use SQLite's built-in backup features.

In particular, I would recommend using the VACUUM INTO <file> statement. This has the additional benefit of making the resulting backups smaller, due to VACUUM repacking the database to not include any deleted content. However, the database is locked for the entire operation - delaying any writes that TS and other programs might try to make.

Another option would be SQLite's Online Backup API. This is slightly more complicated than the VACUUM solution, and doesn't repack the database to a minimal size, but only blocks write operations for small periods during the backup - allowing TS and other programs to continue writing to the database.

Both of these options effectively zero the chance of the backup being corrupted, as they ensure that file will be locked in the appropriate ways during the backup. They also allow SQLite to verify that the backup was completed successfully.

Alternatives

We keep using the simple file copy - but this is undesirable due to the possibility for the backup to be corrupted.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions