
Fix unbounded TypeAdapter cache causing memory leak in multi-threaded usage #2873

Open
veeceey wants to merge 1 commit into openai:main from veeceey:fix/issue-2672-typeadapter-memory-leak

Conversation

veeceey commented Feb 15, 2026

Fixes #2672

The lru_cache wrapping pydantic.TypeAdapter in _models.py was configured with maxsize=None (unbounded). In multi-threaded environments — like a typical webserver — pydantic regenerates parameterized generic types (e.g. ParsedResponseOutputMessage[MyClass]) with different object identities on each call. Since each new identity misses the cache, the cache grows without limit, leaking memory proportional to request volume.
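
For illustration, here is a self-contained reproduction of that failure mode. build_adapter is a hypothetical stand-in for the cached factory in _models.py, and the dynamically created class stands in for a regenerated parameterized generic:

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded, as in the current _models.py
def build_adapter(tp: type) -> str:
    # cheap stand-in for pydantic.TypeAdapter(tp)
    return f"adapter for {tp.__name__}"


# lru_cache keys on hash/equality, and classes hash by identity, so a
# freshly generated class object always misses, even if it represents
# "the same" parameterized generic as the previous one. Worse, the cache
# holds a strong reference to every key, so none of these classes can
# ever be garbage collected.
for _ in range(1000):
    fresh = type("ParsedResponseOutputMessage", (), {})
    build_adapter(fresh)

info = build_adapter.cache_info()
print(info.hits, info.currsize)  # 0 hits, 1000 entries and counting
```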

This changes the cache to maxsize=128, which bounds memory usage while still providing effective caching for the most commonly used types. 128 is the standard default for functools.lru_cache and should cover the vast majority of real-world type usage patterns.
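
For concreteness, the change amounts to something like the following sketch (the factory name here is approximate, not the exact symbol in _models.py):

```python
from functools import lru_cache

import pydantic


# before: @lru_cache(maxsize=None) -- every adapter was kept forever
@lru_cache(maxsize=128)  # least-recently-used adapters are now evicted
def _cached_type_adapter(tp: type) -> pydantic.TypeAdapter:
    return pydantic.TypeAdapter(tp)
```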

I verified this with a multi-threaded test — after spawning 200 threads that each call TypeAdapter, the cache stays bounded at maxsize=128 instead of growing indefinitely. Added a regression test that asserts the cache maxsize is not None.
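
Roughly the shape of that check, as a self-contained sketch (build_adapter again stands in for the real factory, and pydantic.create_model mimics the per-request regeneration of parameterized generics):

```python
import threading
from functools import lru_cache

from pydantic import TypeAdapter, create_model


@lru_cache(maxsize=128)  # the bounded cache from this PR
def build_adapter(tp: type) -> TypeAdapter:
    return TypeAdapter(tp)


def worker() -> None:
    # create_model returns a class with a fresh identity on every call,
    # mimicking regeneration of e.g. ParsedResponseOutputMessage[MyClass]
    fresh = create_model("ParsedMessage", x=(int, ...))
    build_adapter(fresh)


threads = [threading.Thread(target=worker) for _ in range(200)]
for t in threads:
    t.start()
for t in threads:
    t.join()

info = build_adapter.cache_info()
assert info.currsize <= 128  # with maxsize=None this would reach 200
print(info)
```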

Fix unbounded TypeAdapter cache causing memory leak in multi-threaded usage

The lru_cache wrapping pydantic.TypeAdapter was set to maxsize=None
(unbounded). In multi-threaded contexts, pydantic regenerates
parameterized generic types with different identities on each call,
so the cache grows without bound. This is especially problematic in
webserver environments using responses.parse.

Setting maxsize=128 bounds the cache and prevents the memory leak
while still providing effective caching for the most recently used types.

Fixes openai#2672
veeceey requested a review from a team as a code owner February 15, 2026 09:50
rona-sh commented Feb 17, 2026

Hi,
Thank you for the fix.
Since maxsize's default is 128, it is actually unnecessary to pass it as an argument.
Also, please note that (as I wrote in my referenced issue) this makes the caching mechanism useless - in a multi-threaded environment every item will be used once, cached, and never used again.
The real solution to this problem should be smarter caching.

veeceey (Author) commented Feb 18, 2026

Hi @rona-sh, thank you for the feedback! You make a great point: passing maxsize=128 explicitly is indeed redundant since that's already the default, so I'll clean that up.

Regarding the caching concern, you're absolutely right that an LRU cache whose keys get a fresh identity on every call effectively becomes a write-only cache in high-concurrency scenarios. A smarter approach would be needed: for example, keying the cache on the type itself rather than on per-call identity, using a thread-local cache, or even a simple dict without eviction, since the number of genuinely distinct TypeAdapter types is typically bounded.
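
To make the first option concrete, here is a rough sketch of keying on the type itself, using pydantic v2's __pydantic_generic_metadata__ to derive a stable key for regenerated parameterizations (get_adapter and _adapters are hypothetical names, and this assumes regenerations with the same origin and args are interchangeable):

```python
from typing import Any

from pydantic import TypeAdapter

_adapters: dict[Any, TypeAdapter] = {}


def _stable_key(tp: Any) -> Any:
    # Regenerated pydantic generics differ in identity but share the same
    # (origin, args) metadata, which makes a stable, hashable cache key.
    meta = getattr(tp, "__pydantic_generic_metadata__", None)
    if meta and meta["origin"] is not None:
        return (meta["origin"], meta["args"])
    return tp


def get_adapter(tp: Any) -> TypeAdapter:
    key = _stable_key(tp)
    adapter = _adapters.get(key)
    if adapter is None:
        # setdefault is atomic under the GIL: concurrent threads may build
        # a duplicate adapter once, but they all agree on a single winner
        adapter = _adapters.setdefault(key, TypeAdapter(tp))
    return adapter
```

Since the key collapses every regeneration of the same parameterization into one entry, the dict stays bounded by the number of genuinely distinct types and needs no eviction.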

I'd love to hear your thoughts on which approach you'd prefer. I can update the PR accordingly, or if you'd rather tackle the smarter caching in a separate PR, I'm happy to scope this one down to just the immediate fix (removing the unbounded growth) as a stopgap. Let me know!



Development

Successfully merging this pull request may close these issues.

Unrestricted caching keyed by generated types causes memory leak in multi-threaded regimes
