
Fix unbounded TypeAdapter cache causing memory leak in multi-threaded usage #2873

Open
veeceey wants to merge 1 commit into openai:main from veeceey:fix/issue-2672-typeadapter-memory-leak

Conversation

veeceey commented Feb 15, 2026

Fixes #2672

The lru_cache wrapping pydantic.TypeAdapter in _models.py was configured with maxsize=None (unbounded). In multi-threaded environments — like a typical webserver — pydantic regenerates parameterized generic types (e.g. ParsedResponseOutputMessage[MyClass]) with different object identities on each call. Since each new identity misses the cache, the cache grows without limit, leaking memory proportional to request volume.
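
For illustration, here is a self-contained reproduction of that failure mode. build_adapter is a hypothetical stand-in for the cached factory in _models.py, and the dynamically created class stands in for a regenerated parameterized generic:

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded, as in the current _models.py
def build_adapter(tp: type) -> str:
    # cheap stand-in for pydantic.TypeAdapter(tp)
    return f"adapter for {tp.__name__}"


# lru_cache keys on hash/equality, and classes hash by identity, so a
# freshly generated class object always misses, even if it represents
# "the same" parameterized generic as the previous one. Worse, the cache
# holds a strong reference to every key, so none of these classes can
# ever be garbage collected.
for _ in range(1000):
    fresh = type("ParsedResponseOutputMessage", (), {})
    build_adapter(fresh)

info = build_adapter.cache_info()
print(info.hits, info.currsize)  # 0 hits, 1000 entries and counting
```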

This changes the cache to maxsize=128, which bounds memory usage while still providing effective caching for the most commonly used types. 128 is the standard default for functools.lru_cache and should cover the vast majority of real-world type usage patterns.
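
For concreteness, the change amounts to something like the following sketch (the factory name here is approximate, not the exact symbol in _models.py):

```python
from functools import lru_cache

import pydantic


# before: @lru_cache(maxsize=None) -- every adapter was kept forever
@lru_cache(maxsize=128)  # least-recently-used adapters are now evicted
def _cached_type_adapter(tp: type) -> pydantic.TypeAdapter:
    return pydantic.TypeAdapter(tp)
```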

I verified this with a multi-threaded test — after spawning 200 threads that each call TypeAdapter, the cache stays bounded at maxsize=128 instead of growing indefinitely. Added a regression test that asserts the cache maxsize is not None.
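
Roughly the shape of that check, as a self-contained sketch (build_adapter again stands in for the real factory, and pydantic.create_model mimics the per-request regeneration of parameterized generics):

```python
import threading
from functools import lru_cache

from pydantic import TypeAdapter, create_model


@lru_cache(maxsize=128)  # the bounded cache from this PR
def build_adapter(tp: type) -> TypeAdapter:
    return TypeAdapter(tp)


def worker() -> None:
    # create_model returns a class with a fresh identity on every call,
    # mimicking regeneration of e.g. ParsedResponseOutputMessage[MyClass]
    fresh = create_model("ParsedMessage", x=(int, ...))
    build_adapter(fresh)


threads = [threading.Thread(target=worker) for _ in range(200)]
for t in threads:
    t.start()
for t in threads:
    t.join()

info = build_adapter.cache_info()
assert info.currsize <= 128  # with maxsize=None this would reach 200
print(info)
```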

Fix unbounded TypeAdapter cache causing memory leak in multi-threaded usage

The lru_cache wrapping pydantic.TypeAdapter was set to maxsize=None
(unbounded). In multi-threaded contexts, pydantic regenerates
parameterized generic types with different identities on each call,
so the cache grows without bound. This is especially problematic in
webserver environments using responses.parse.

Setting maxsize=128 bounds the cache and prevents the memory leak
while still providing effective caching for the most recently used types.

Fixes openai#2672
veeceey requested a review from a team as a code owner February 15, 2026 09:50
rona-sh commented Feb 17, 2026

Hi,
Thank you for the fix.
Since maxsize's default is 128, it is actually unnecessary to pass it as an argument.
Also, please note that (as I wrote in my referenced issue) this makes the caching mechanism useless - in a multi-threaded environment every item will be used once, cached, and never used again.
The real solution to this problem should be smarter caching.

veeceey (Author) commented Feb 18, 2026

Hi @rona-sh, thank you for the feedback! You make a great point: passing maxsize=128 explicitly is indeed redundant since that's already the default, so I'll clean that up.

Regarding the caching concern, you're absolutely right that an LRU cache whose keys get a fresh identity on every call effectively becomes a write-only cache in high-concurrency scenarios. A smarter approach would be needed: for example, keying the cache on the type itself rather than on per-call identity, using a thread-local cache, or even a simple dict without eviction, since the number of genuinely distinct TypeAdapter types is typically bounded.
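
To make the first option concrete, here is a rough sketch of keying on the type itself, using pydantic v2's __pydantic_generic_metadata__ to derive a stable key for regenerated parameterizations (get_adapter and _adapters are hypothetical names, and this assumes regenerations with the same origin and args are interchangeable):

```python
from typing import Any

from pydantic import TypeAdapter

_adapters: dict[Any, TypeAdapter] = {}


def _stable_key(tp: Any) -> Any:
    # Regenerated pydantic generics differ in identity but share the same
    # (origin, args) metadata, which makes a stable, hashable cache key.
    meta = getattr(tp, "__pydantic_generic_metadata__", None)
    if meta and meta["origin"] is not None:
        return (meta["origin"], meta["args"])
    return tp


def get_adapter(tp: Any) -> TypeAdapter:
    key = _stable_key(tp)
    adapter = _adapters.get(key)
    if adapter is None:
        # setdefault is atomic under the GIL: concurrent threads may build
        # a duplicate adapter once, but they all agree on a single winner
        adapter = _adapters.setdefault(key, TypeAdapter(tp))
    return adapter
```

Since the key collapses every regeneration of the same parameterization into one entry, the dict stays bounded by the number of genuinely distinct types and needs no eviction.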

I'd love to hear your thoughts on which approach you'd prefer. I can update the PR accordingly, or if you'd rather tackle the smarter caching in a separate PR, I'm happy to scope this one down to just the immediate fix (removing the unbounded growth) as a stopgap. Let me know!



Development

Successfully merging this pull request may close these issues.

Unrestricted caching keyed by generated types causes memory leak in multi-threaded regimes
