Fix unbounded TypeAdapter cache causing memory leak in multi-threaded usage #2873
veeceey wants to merge 1 commit into openai:main from
Conversation
… usage

The `lru_cache` wrapping `pydantic.TypeAdapter` was set to `maxsize=None` (unbounded). In multi-threaded contexts, pydantic regenerates parameterized generic types with different identities on each call, so the cache grows without bound. This is especially problematic in webserver environments using `responses.parse`. Setting `maxsize=128` bounds the cache and prevents the memory leak while still providing effective caching for the most recently used types.

Fixes openai#2672
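To make the failure mode concrete, here is a minimal self-contained sketch; `_build_adapter` and `_FreshGeneric` are hypothetical stand-ins for the SDK's cached factory and pydantic's regenerated generics, not actual library code:

```python
from functools import lru_cache
from typing import Any


class _FreshGeneric:
    """Stand-in for a generic whose parameterizations get a new identity per call."""

    def __class_getitem__(cls, item: type) -> type:
        # A brand-new class object every time, so each result hashes differently.
        return type(f"_FreshGeneric[{item.__name__}]", (cls,), {})


@lru_cache(maxsize=None)  # the pre-fix configuration: unbounded
def _build_adapter(type_: Any) -> object:
    return object()  # imagine pydantic.TypeAdapter(type_) here


for _ in range(1000):
    _build_adapter(_FreshGeneric[int])  # every call is a cache miss

print(_build_adapter.cache_info().currsize)  # 1000, and it grows with call volume
```

The point is only that `lru_cache` keys on argument hash and equality, so a stream of never-repeating keys turns an unbounded cache into a leak.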
Hi,
Hi @rona-sh, thank you for the feedback! You make a great point: setting `maxsize=128` explicitly is indeed redundant since that's already the default, so I'll clean that up.

Regarding the caching concern, you're absolutely right that an LRU cache with unique-per-thread keys effectively becomes a write-only cache in high-concurrency scenarios. A smarter approach would be needed: for example, keying the cache on the type itself rather than on per-call identity, using a thread-local cache, or even a simple dict without eviction (sketched below), since the number of distinct `TypeAdapter` types is typically bounded. I'd love to hear which approach you'd prefer. I can update the PR accordingly, or if you'd rather tackle the smarter caching in a separate PR, I'm happy to scope this one down to just the immediate fix (removing the unbounded growth) as a stopgap. Let me know!
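For concreteness, a minimal sketch of the "simple dict without eviction" option, assuming hashable, stable type keys; `get_type_adapter` and the locking scheme here are illustrative, not the SDK's actual internals:

```python
from threading import Lock
from typing import Any

import pydantic

# Hypothetical module-level cache; names are illustrative only.
_adapters: dict[Any, pydantic.TypeAdapter] = {}
_adapters_lock = Lock()


def get_type_adapter(type_: Any) -> pydantic.TypeAdapter:
    adapter = _adapters.get(type_)
    if adapter is None:
        with _adapters_lock:
            # Re-check under the lock so racing threads build the adapter only once.
            adapter = _adapters.get(type_)
            if adapter is None:
                adapter = _adapters[type_] = pydantic.TypeAdapter(type_)
    return adapter
```

Note this only helps if the keys are stable across calls; with unique-per-call identities it would grow just like the unbounded `lru_cache`, which is why normalizing the key to the underlying type is the other half of the idea.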
Fixes #2672
The `lru_cache` wrapping `pydantic.TypeAdapter` in `_models.py` was configured with `maxsize=None` (unbounded). In multi-threaded environments — like a typical webserver — pydantic regenerates parameterized generic types (e.g. `ParsedResponseOutputMessage[MyClass]`) with different object identities on each call. Since each new identity misses the cache, the cache grows without limit, leaking memory proportional to request volume.

This changes the cache to `maxsize=128`, which bounds memory usage while still providing effective caching for the most commonly used types. 128 is the standard default for `functools.lru_cache` and should cover the vast majority of real-world type usage patterns.

I verified this with a multi-threaded test — after spawning 200 threads that each call `TypeAdapter`, the cache stays bounded at `maxsize=128` instead of growing indefinitely. Added a regression test that asserts the cache maxsize is not None; a sketch of both follows below.
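A sketch of what these tests could look like, with `build_adapter` standing in for the SDK's cached factory in `_models.py` (the real function name isn't asserted here):

```python
import threading
from functools import lru_cache
from typing import Any

import pydantic


# Stand-in for the cached factory in _models.py, wired up with the
# post-fix configuration so the sketch is self-contained.
@lru_cache(maxsize=128)
def build_adapter(type_: Any) -> pydantic.TypeAdapter:
    return pydantic.TypeAdapter(type_)


def test_cache_is_bounded() -> None:
    # Regression guard: the cache must never be configured as unbounded.
    assert build_adapter.cache_info().maxsize is not None


def test_cache_stays_bounded_across_threads() -> None:
    def worker(i: int) -> None:
        # A distinct model per thread mimics regenerated generics with
        # unique object identities.
        build_adapter(pydantic.create_model(f"Model{i}"))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(200)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    assert build_adapter.cache_info().currsize <= 128
```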