
fix: stream database dumps for large databases (fixes #59) #127

Open
armorbreak001 wants to merge 1 commit into outerbase:main from armorbreak001:fix/large-db-dump-streaming

Conversation

@armorbreak001

Summary

Fixes #59: Database dumps do not work on large databases

The Problem

The original dump.ts loaded all data from all tables into memory as a single string before creating the Blob response. For databases approaching 1GB+ (the Durable Objects limit):

  • ❌ Out-of-memory errors (128MB DO limit)
  • ❌ DO blocked for entire export duration
  • ❌ Failed dumps for any database that doesn't fit in memory

The Solution

Rewrote src/export/dump.ts to use streaming architecture:

| Feature | Before | After |
| --- | --- | --- |
| Memory usage | O(total DB size) | O(batch size), ~256KB |
| Row fetching | `SELECT * FROM table` (all rows) | `LIMIT 1000 OFFSET N` (batched) |
| Response | Build string → Blob → Response | `ReadableStream` (streaming) |
| Concurrency | Blocks DO until done | Yields between batches |
| NULL handling | Not handled | Properly escaped |
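The batched, streaming shape of the "After" column can be sketched as an async generator feeding a `ReadableStream`. This is a minimal illustration, not the actual code in `src/export/dump.ts` — the `SqlExec` type, `escapeValue` helper, and exact query text are assumptions:

```typescript
// Sketch of the batched, streaming dump; names and the synchronous SqlExec
// interface are illustrative, not the PR's actual implementation.
type Row = Record<string, unknown>;
type SqlExec = (query: string) => Row[];

const BATCH_SIZE = 1000;

// Simplified value escaping for the sketch (the PR's escapeSqlValue
// handles more cases).
function escapeValue(v: unknown): string {
  if (v === null || v === undefined) return "NULL";
  if (typeof v === "number" || typeof v === "bigint") return v.toString();
  return `'${String(v).replace(/'/g, "''")}'`;
}

// Fetch rows one batch at a time instead of SELECT * in a single call.
async function* fetchRowsBatches(sql: SqlExec, table: string): AsyncGenerator<Row[]> {
  for (let offset = 0; ; offset += BATCH_SIZE) {
    const rows = sql(`SELECT * FROM "${table}" LIMIT ${BATCH_SIZE} OFFSET ${offset}`);
    if (rows.length === 0) return;
    yield rows;
  }
}

// Emit INSERT statements as they are generated; memory stays O(batch size).
function streamDump(sql: SqlExec, tables: string[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      for (const table of tables) {
        for await (const batch of fetchRowsBatches(sql, table)) {
          for (const row of batch) {
            const values = Object.values(row).map(escapeValue).join(", ");
            controller.enqueue(encoder.encode(`INSERT INTO "${table}" VALUES (${values});\n`));
          }
          // Yield between batches so other requests on the DO can run.
          await new Promise((resolve) => setTimeout(resolve, 0));
        }
      }
      controller.close();
    },
  });
}
```

Because the `ReadableStream` is handed straight to the `Response`, the client starts receiving SQL as soon as the first batch is fetched.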

Key Changes

  1. ReadableStream output — data streams to client as it's generated, never held in memory all at once
  2. fetchRowsBatches() async generator — fetches 1,000 rows at a time via LIMIT/OFFSET
  3. cooperativeYield() — yields control every ~256KB so other requests on this DO can be processed
  4. Improved escapeSqlValue() — handles NULL, undefined, bigint (original only handled strings and numbers)
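The cooperative yield in (3) can be sketched as a byte counter that awaits a macrotask boundary once a threshold is crossed. The ~256KB figure comes from the description above; the helper below is an assumed shape, not the PR's actual `cooperativeYield()`:

```typescript
// Illustrative byte-counting cooperative yield; the ~256KB threshold is
// from the PR description, the implementation details are assumptions.
const YIELD_THRESHOLD = 256 * 1024;

function makeCooperativeYield(): (bytes: number) => Promise<boolean> {
  let pending = 0;
  return async (bytes: number): Promise<boolean> => {
    pending += bytes;
    if (pending < YIELD_THRESHOLD) return false; // keep writing
    pending = 0;
    // setTimeout(..., 0) creates a macrotask boundary, letting other
    // requests queued on the same Durable Object be processed first.
    await new Promise<void>((resolve) => setTimeout(resolve, 0));
    return true; // yielded
  };
}
```

Returning a boolean is purely for observability in this sketch; the point is that the dump never monopolizes the Durable Object's event loop for the whole export.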

Testing

  • ✅ All 23 existing export tests pass (5 dump-specific + 18 others)
  • ✅ Backward compatible — same SQL output format, same API response shape
  • ✅ Handles edge cases: empty DBs, tables with no data, special characters in values
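The escaping edge cases above (NULL, undefined, bigint, special characters) can be illustrated with a sketch of `escapeSqlValue()`. This is an assumed implementation based on the description, not the verbatim code in `dump.ts`:

```typescript
// Sketch of the improved escapeSqlValue per the PR description (assumed,
// not the exact dump.ts implementation).
function escapeSqlValue(value: unknown): string {
  if (value === null || value === undefined) return "NULL"; // SQL NULL, unquoted
  if (typeof value === "bigint") return value.toString();   // no quotes, no precision loss
  if (typeof value === "number") return value.toString();
  // Strings (and anything else, stringified): double embedded single
  // quotes per standard SQL string-literal escaping.
  return `'${String(value).replace(/'/g, "''")}'`;
}
```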

Files Changed

  • src/export/dump.ts — Complete rewrite of dump logic (160 insertions, 44 deletions)

Commit message:

Problem: The original dump implementation loaded ALL data from ALL tables
into memory as a single string before creating the Blob response. For large
databases (>1GB), this causes:
- Out-of-memory errors in Durable Objects (128MB limit)
- Blocked Durable Object during long exports
- Failed dumps for any database that doesn't fit in memory

Solution: Rewrite dump.ts to use streaming:
1. ReadableStream output — memory stays O(batch_size) not O(total_db_size)
2. Batched row fetching via LIMIT/OFFSET (1000 rows/batch)
3. Cooperative multitasking — yields control between batches so other
   requests on the same Durable Object can be processed
4. Proper SQL value escaping (NULL, bigint, undefined handling)

Fixes outerbase#59