Skip to content

encryption: implement DEK caching#4

Open
patrislav wants to merge 1 commit intomasterfrom
dek-caching
Open

encryption: implement DEK caching#4
patrislav wants to merge 1 commit intomasterfrom
dek-caching

Conversation

@patrislav
Copy link
Copy Markdown
Member

Adds an optional in-memory LRU cache for decrypted data encryption keys (DEKs), eliminating KMS round-trips on cache hits. Each decrypt/encrypt operation currently requires 2-3 KMS calls to recover the DEK via Shamir secret sharing — the cache reduces this to zero on hit.

Design

  • LRU + TTL eviction with configurable max size and TTL, using sync.Mutex + container/list
  • Singleflight dedup prevents concurrent cache misses for the same key from stampeding KMS
  • Key material zeroed on every eviction path (TTL, LRU, delete, clear)
  • Opt-in via WithCache() — existing callers are unaffected (no cache by default)
  • Encrypt path: still hits DynamoDB (to resolve key index → key ref), but skips KMS on cache hit
  • Decrypt path: skips DynamoDB, verification, and KMS entirely on cache hit

Security

  • DEK ↔ key ref mapping is immutable — cached DEKs cannot become stale
  • Key material is copied on store and on retrieval (caller and cache are fully independent)
  • RotateKey explicitly invalidates the cache entry
  • No migration handling needed — generation changes require restart, which clears the cache

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds an opt-in, in-process decrypted DEK cache to the encryption.Pool to avoid repeated DynamoDB/KMS/Shamir work on repeated encrypt/decrypt operations for the same keyRef.

Changes:

  • Introduces CacheConfig, an in-memory LRU+TTL dekCache, and singleflight-style miss deduplication.
  • Updates Pool construction to accept PoolOptions (notably WithCache) and threads cache usage through Encrypt/Decrypt/RotateKey.
  • Adds unit tests covering cache hit/miss behavior, cache population, invalidation, and singleflight behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
encryption/pool.go Adds cache plumbing to Pool plus a new fetchDEK helper to centralize the miss path and deduplicate concurrent fetches.
encryption/pool_test.go Adds integration-style tests proving cache behavior across Encrypt/Decrypt/RotateKey and concurrent decrypts.
encryption/cache.go Implements the LRU+TTL DEK cache with key zeroing on eviction and singleflight coordination.
encryption/cache_test.go Adds unit tests for cache correctness (copy semantics, TTL/LRU eviction, delete/clear, singleflight).
Comments suppressed due to low confidence (1)

encryption/pool.go:235

  • Decrypt no longer annotates the trace span with the resolved key generation / key index (these annotations existed before the refactor). Since fetchDEK has the key object, consider re-adding these annotations (e.g., by setting them on the active span in ctx or returning key metadata) so cache misses still emit the same observability signals.
	key, found, err := p.keysTable.GetLatestByKeyRef(ctx, keyRef, false)
	if err != nil {
		return nil, fmt.Errorf("get latest key: %w", err)
	}
	if !found {
		return nil, fmt.Errorf("key not found")
	}
	if err := p.VerifyKey(ctx, att, key); err != nil {
		return nil, fmt.Errorf("verify key: %w", err)
	}

	config, err := p.getConfig(key.Generation)
	if err != nil {
		return nil, fmt.Errorf("get config: %w", err)
	}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread encryption/pool.go
Comment on lines 161 to +183
@@ -139,6 +174,50 @@ func (p *Pool) Decrypt(ctx context.Context, att *enclave.Attestation, keyRef str
return nil, fmt.Errorf("decode ciphertext: %w", err)
}

// Try cache.
var privateKey []byte
if p.cache != nil {
if dek, ok := p.cache.get(keyRef); ok {
privateKey = dek
}
}
Comment thread encryption/pool.go
Comment on lines 163 to +164
// The key is verified against the attestation and migrated to the current generation if needed.
// If a DEK cache is configured, cached keys bypass DynamoDB and KMS on hit.
Comment thread encryption/pool.go
Comment on lines +245 to 255
if p.cache != nil {
p.cache.put(keyRef, privateKey)
}

// Trigger migration if needed. Migration is synchronous but non-fatal:
// failure is logged and does not affect the returned DEK.
if p.keyNeedsMigration(key) {
err := p.migrateKey(ctx, att, key, privateKey)
if err != nil {
// We don't want to fail the decryption if migration fails, log the error and continue
if err := p.migrateKey(ctx, att, key, privateKey); err != nil {
p.logger.ErrorContext(ctx, "migrating key failed", "error", err, "key_ref", key.KeyRef, "generation", key.Generation, "key_index", key.KeyIndex)
}
}
Comment thread encryption/cache_test.go
Comment on lines +59 to +65
// Grab internal slice before expiry.
c.mu.Lock()
internalSlice := c.entries["ref1"].dek
c.mu.Unlock()

time.Sleep(5 * time.Millisecond)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants