Memory leak: httpx.Response objects never garbage collected due to reference cycle in BoundSyncStream #395

@grafke

Related to #267

Description:

Long-running worker processes that poll Conductor continuously experience unbounded memory growth. We traced this to three issues in the SDK's HTTP layer, the most critical being a reference cycle that prevents httpx.Response objects from being garbage collected.

Impact

With 7 @worker_task decorators (= 7 OS subprocesses), we observed ~450 KB/min growth per subprocess, totaling ~12 MB/hour for the container. Over 24 hours that's ~288 MB of leaked memory.

Root cause 1: httpx.Response ↔ BoundSyncStream reference cycle (primary)

The SDK uses httpx.Client internally. When httpx processes a response, _send_single_request() wraps the stream:

httpx/_client.py:1019-1021

response.stream = BoundSyncStream(response.stream, response=response, start=start)

This creates a reference cycle:
response.stream → BoundSyncStream._response → response

BoundSyncStream.close() sets response.elapsed and closes the inner stream, but never clears self._response, so the cycle persists after close. CPython's cyclic GC should eventually collect these, but at typical poll rates (~1-2 req/sec per subprocess) the allocation rate outpaces collection.
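The effect of the back-reference can be reproduced without httpx. The following is a minimal sketch using hypothetical stand-in classes (FakeResponse and FakeBoundStream are not httpx types); it only illustrates that an object in a cycle outlives its last name until the cyclic collector runs, while clearing the back-reference lets reference counting free it immediately.

```python
import gc
import weakref


class FakeResponse:
    """Hypothetical stand-in for httpx.Response."""

    def __init__(self):
        self.stream = None


class FakeBoundStream:
    """Hypothetical stand-in for BoundSyncStream: keeps a back-reference."""

    def __init__(self, response):
        self._response = response

    def close(self, break_cycle=False):
        if break_cycle:
            self._response = None  # drop the back-reference on close


def survives_without_gc(break_cycle):
    """Return True if the response is still alive after its last name is dropped."""
    response = FakeResponse()
    response.stream = FakeBoundStream(response)  # response -> stream -> response
    response.stream.close(break_cycle=break_cycle)
    probe = weakref.ref(response)
    del response
    return probe() is not None


gc.disable()  # keep the cyclic collector from running during the demo
try:
    leaked_with_cycle = survives_without_gc(break_cycle=False)
    leaked_without_cycle = survives_without_gc(break_cycle=True)
finally:
    gc.enable()
gc.collect()  # reclaim the cycle left behind by the first call

print(leaked_with_cycle, leaked_without_cycle)
```

On CPython the first case leaves the object alive until gc.collect() runs, and the second frees it as soon as the local name goes away.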

Using tracemalloc in worker subprocesses, we measured the following growth per 60-second window:

┌──────────────────────────────────┬────────────┬─────────────┐
│ Allocation site                  │ Growth/min │ Objects/min │
├──────────────────────────────────┼────────────┼─────────────┤
│ httpx/_models.py:162 (Headers)   │ +145 KiB   │ +2566       │
├──────────────────────────────────┼────────────┼─────────────┤
│ threading.py:307                 │ +79.4 KiB  │ +214        │
├──────────────────────────────────┼────────────┼─────────────┤
│ httpcore/_models.py:72           │ +46.8 KiB  │ +749        │
├──────────────────────────────────┼────────────┼─────────────┤
│ h11/_util.py:92                  │ +43.3 KiB  │ +962        │
├──────────────────────────────────┼────────────┼─────────────┤
│ httpx/_transports/default.py:254 │ +23.6 KiB  │ +109        │
└──────────────────────────────────┴────────────┴─────────────┘
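For anyone who wants to take a comparable measurement, a per-window growth table like this can be produced with two tracemalloc snapshots. This is a rough sketch, not the exact script we ran: top_growth is a hypothetical helper, and the list of byte strings stands in for a 60-second polling window.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the allocation sites that grew the most between two snapshots."""
    stats = after.compare_to(before, "lineno")
    return [s for s in stats if s.size_diff > 0][:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()

# Hypothetical workload standing in for one polling window.
retained = [bytes(1024) for _ in range(1000)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

growth = top_growth(before, after)
for stat in growth:
    frame = stat.traceback[0]
    print(f"{frame.filename}:{frame.lineno}  "
          f"+{stat.size_diff / 1024:.1f} KiB  +{stat.count_diff} objects")
```

In a real worker the two snapshots would be taken 60 seconds apart inside the subprocess, since tracemalloc only sees allocations made by the process it runs in.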

After applying a workaround that breaks the cycle (response.stream = None; response._request = None after reading the body), all httpx allocations dropped to noise levels (~500 bytes/min).

Root cause 2: RESTResponse(io.IOBase) retains full httpx.Response

RESTResponse in conductor/client/http/rest.py inherits from io.IOBase, which defines a __del__ finalizer that every instance inherits. This adds finalization overhead and GC pressure for each response.
More critically, self.resp = resp retains the entire httpx.Response object (with connection buffers, h2/hpack state, internal streams), when all downstream consumers only need the body text:

  • api_client.py:269 — response.resp.json()
  • api_client.py:271 — response.resp.text
  • rest.py:293-301,335 — http_resp.resp.text (in ApiException/AuthorizationException)
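The finalizer point can be checked in isolation. The sketch below uses hypothetical classes (IOBackedResponse and PlainResponse are illustrations, not SDK types) to show that a subclass of io.IOBase inherits a __del__ that calls close() on destruction, whereas a plain class carries no finalizer at all. The behaviour on del relies on CPython's reference counting.

```python
import io

close_calls = []


class IOBackedResponse(io.IOBase):
    """Hypothetical wrapper that subclasses io.IOBase, like RESTResponse today."""

    def close(self):
        close_calls.append("close")
        super().close()


class PlainResponse:
    """Hypothetical lightweight alternative that keeps only what callers need."""

    def __init__(self, status, text):
        self.status = status
        self.data = text


iobase_has_finalizer = hasattr(io.IOBase, "__del__")
plain_has_finalizer = hasattr(PlainResponse, "__del__")

wrapper = IOBackedResponse()
del wrapper  # the inherited finalizer runs and calls close()

print(iobase_has_finalizer, plain_has_finalizer, close_calls)
```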

Root cause 3: self.last_response is write-only retention

ApiClient assigns self.last_response = response_data on every API call (api_client.py:176), but this field is never read anywhere in the codebase. It retains the last RESTResponse (and through it, the full httpx.Response) for the lifetime of the ApiClient.
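The retention is easy to see with a weak reference. This is a small sketch with hypothetical stand-ins (Payload for RESTResponse, Client for ApiClient); it assumes CPython, where an unreferenced object is freed as soon as its last reference goes away.

```python
import weakref


class Payload:
    """Hypothetical stand-in for a RESTResponse holding a full httpx.Response."""


class Client:
    """Hypothetical stand-in for ApiClient."""

    def __init__(self, keep_last):
        self.keep_last = keep_last
        self.last_response = None

    def call(self):
        response = Payload()
        if self.keep_last:
            self.last_response = response  # the write that nobody reads
        return weakref.ref(response)  # hand back only a weak probe


retaining = Client(keep_last=True)
non_retaining = Client(keep_last=False)

retained_alive = retaining.call()() is not None
dropped_alive = non_retaining.call()() is not None

print(retained_alive, dropped_alive)
```

With the assignment in place, the most recent response stays alive for as long as the client does, even though no caller holds it.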

Suggested fix

In RESTResponse — drop io.IOBase, eagerly read body, break the reference cycle:

import json


class RESTResponse:  # no io.IOBase
    def __init__(self, resp):
        self.status = resp.status_code
        self.reason = getattr(resp, "reason_phrase", "")
        self.data = resp.text  # eagerly cache the decoded body text
        self.headers = resp.headers
        # Break Response ↔ BoundSyncStream reference cycle
        resp.stream = None
        resp._request = None

    def json(self):
        return json.loads(self.data)

    def getheaders(self):
        return self.headers

In ApiException/AuthorizationException — change http_resp.resp.text → http_resp.data.

In api_client.py deserialize() — change response.resp.json() → response.json() and response.resp.text → response.data.

In api_client.py:176 — delete self.last_response = response_data (dead write).

Same changes in async_rest.py and async_api_client.py.

Our workaround

We monkey-patch RESTResponse at import time with a version that caches the body text in a lightweight shim (so existing .resp.text / .resp.json() callsites keep working without changes to ApiException or deserialize), breaks the reference cycle, and neuters last_response with a no-op descriptor.
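The last_response part of the workaround boils down to a data descriptor that swallows writes. The following is a simplified sketch rather than our exact patch: DiscardWrites and ApiClientStandIn are hypothetical names, and the real patch would target the SDK's ApiClient class instead of the stand-in.

```python
class DiscardWrites:
    """Data descriptor that ignores assignments and always reads back None."""

    def __get__(self, instance, owner=None):
        return None

    def __set__(self, instance, value):
        pass  # drop the value instead of storing it on the instance


class ApiClientStandIn:
    """Hypothetical stand-in for the SDK's ApiClient."""

    def __init__(self):
        self.last_response = None

    def call_api(self):
        response_data = object()
        self.last_response = response_data  # same write pattern as the SDK
        return response_data


# Patch at import time: the descriptor takes precedence over instance attributes.
ApiClientStandIn.last_response = DiscardWrites()

client = ApiClientStandIn()
result = client.call_api()

print(client.last_response, "last_response" in vars(client))
```

Because the descriptor defines __set__, the assignment inside call_api never reaches the instance dictionary, so nothing is retained between calls.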

Environment

  • conductor-python 1.3.9
  • Python 3.14
  • httpx (HTTP/1.1 transport via h11; also reproducible with HTTP/2 via h2)
  • Linux (Docker), 7 worker subprocesses via multiprocessing with fork
