
fix: graceful recovery from malformed LLM tool call output #1458

Open
gdeyoung wants to merge 1 commit into agent0ai:main from gdeyoung:fix/validation-recovery-malformed-llm-v17

Conversation

@gdeyoung (Contributor) commented Apr 6, 2026

TL;DR — Agent crashes on malformed LLM output instead of recovering gracefully

When validate_tool_request() raises a ValueError (e.g., malformed JSON from non-OpenAI models like GLM, MiniMax, Qwen), the exception propagates uncaught and crashes the entire agent. A simple try/except wrapper routes the output to the existing misformat warning path, allowing the agent to self-correct and continue.

One line of defensive wrapping prevents a crash loop that requires manual agent restart.


Problem

In agent.py, the validate_tool_request() call at line ~878 is unguarded:

if tool_request is not None:
    await self.validate_tool_request(tool_request)  # Can raise ValueError

When the LLM produces malformed JSON (common with GLM-5.1, MiniMax, Qwen), validate_tool_request() raises ValueError. This exception is not caught, causing the agent to crash instead of using the existing misformat recovery path.
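An illustrative reproduction of the failure mode (the malformed string here is made up; `json.loads` stands in for the parsing step that ultimately raises):

```python
import json

# Truncated tool-call JSON of the kind weaker models sometimes emit
raw = '{"tool_name": "search", "tool_args": '

try:
    json.loads(raw)
    crashed = False
except ValueError:
    # json.JSONDecodeError subclasses ValueError; without a guard,
    # this exception propagates and takes down the agent loop
    crashed = True

assert crashed
```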

Solution

Wrap the call in try/except ValueError:

if tool_request is not None:
    try:
        await self.validate_tool_request(tool_request)
    except ValueError:
        # Malformed JSON from LLM — treat as misformat, not fatal crash
        tool_request = None

When tool_request is set to None, the existing misformat warning logic handles it — prompting the agent to self-correct on the next turn.
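End to end, the guarded flow looks roughly like this (a minimal sketch; the validator body and the misformat-warning wording are hypothetical, not the repo's actual code):

```python
import asyncio

async def validate_tool_request(tool_request):
    # Stand-in for the real validator: reject requests whose
    # tool_args is not a dict (hypothetical rule for illustration)
    if not isinstance(tool_request.get("tool_args"), dict):
        raise ValueError("malformed tool call")

async def process_turn(tool_request):
    if tool_request is not None:
        try:
            await validate_tool_request(tool_request)
        except ValueError:
            # Malformed JSON from LLM -- treat as misformat, not fatal crash
            tool_request = None
    if tool_request is None:
        # Existing misformat path: warn and let the model retry next turn
        return "warning: malformed tool call, please retry"
    return "ok"

print(asyncio.run(process_turn({"tool_name": "search", "tool_args": "oops"})))
# -> warning: malformed tool call, please retry
```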

Impact

One defensive wrapper turns a hard crash (which previously required a manual agent restart) into a recoverable misformat warning. Well-formed tool calls are unaffected.

Changes

agent.py: Wrap validate_tool_request() in try/except ValueError

Testing

  • 2 regression tests verify ValueError is caught and tool_request is set to None
  • 21/21 tests passing across all critical patches
  • Running in production for 3+ days
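The regression tests might be sketched like so (assumed structure with a stub agent; the repo's actual tests differ):

```python
import asyncio

class StubAgent:
    # Hypothetical stub: validation always raises, mimicking malformed output
    async def validate_tool_request(self, tool_request):
        raise ValueError("malformed tool call JSON")

    async def guard(self, tool_request):
        if tool_request is not None:
            try:
                await self.validate_tool_request(tool_request)
            except ValueError:
                tool_request = None  # routed to misformat path
        return tool_request

# Regression check 1: ValueError is caught, not propagated
result = asyncio.run(StubAgent().guard({"tool_name": "x"}))
# Regression check 2: tool_request is reset to None for recovery
assert result is None
```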


When validate_tool_request() raises ValueError on malformed JSON from the LLM, the agent crashes instead of recovering. Wrapping in try/except ValueError routes the output to the misformat warning path, allowing the agent to self-correct and continue operating.
gdeyoung added a commit to gdeyoung/agent-zero that referenced this pull request Apr 9, 2026
When LLMs output valid JSON with tool_name but missing/malformed tool_args, json_parse_dirty() returned the incomplete dict as-is, causing ValueError crashes in validate_tool_request().

Root cause fix: normalize tool requests in json_parse_dirty() before returning them:

- Missing tool_args -> defaults to {}

- tool_args as string -> converts to dict

- tool_args as non-dict -> defaults to {}

- tool key -> normalized to tool_name

Non-tool requests are left untouched.

Related: agent0ai#1458, agent0ai#1466
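The normalization rules above can be sketched as follows (a minimal sketch; the key names mirror json_parse_dirty()'s output but this is not the repo's implementation):

```python
import json

def normalize_tool_request(data):
    """Normalize a parsed tool request per the rules described above."""
    if not isinstance(data, dict):
        return data
    # 'tool' key -> normalized to 'tool_name'
    if "tool" in data and "tool_name" not in data:
        data["tool_name"] = data.pop("tool")
    if "tool_name" not in data:
        return data  # non-tool requests are left untouched
    args = data.get("tool_args")
    if isinstance(args, str):
        # tool_args as string -> try to convert to a dict
        try:
            parsed = json.loads(args)
            args = parsed if isinstance(parsed, dict) else {}
        except ValueError:
            args = {}
    elif not isinstance(args, dict):
        # missing or non-dict tool_args -> defaults to {}
        args = {}
    data["tool_args"] = args
    return data
```

With this in place, validate_tool_request() only ever sees a dict-shaped tool_args, so the ValueError path becomes a last-resort safety net rather than a routine occurrence.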
