fix(benchmarks): guard against empty choices and message=None in LLM eval calls by qizwiz · Pull Request #219 · EverMind-AI/EverOS

qizwiz · 2026-05-18T07:44:56Z

What

Add guards at three LLM evaluation call sites in the EvoAgentBench domain evaluators before accessing choices[0].message.content.

Why

client.chat.completions.create() can return two empty-response shapes:

- choices = []: on content-policy rejections, rate-limit errors, or provider failures
- choices[0].message = None: e.g. Gemini 2.5 Flash (via an OpenAI-compatible endpoint) returns HTTP 200 with finish_reason: PROHIBITED_CONTENT and message=None

Unguarded access to choices[0].message.content raises IndexError in the first case and AttributeError in the second. The existing try/except blocks catch both and report a generic message such as "LLM evaluation failed: list index out of range", which hides the root cause and makes benchmark runs hard to diagnose.
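To make the two failure shapes concrete, here is a minimal sketch that reproduces them without calling any provider. The response objects are `types.SimpleNamespace` stand-ins, not real client objects, and `unguarded_content` and `failure_type` are hypothetical names used only for illustration.

```python
from types import SimpleNamespace


def unguarded_content(resp):
    # Same access pattern as the call sites before this PR.
    return resp.choices[0].message.content


# Stand-in for a response whose choices list is empty.
empty_choices = SimpleNamespace(choices=[])

# Stand-in for a filtered response: one choice, but message is None.
filtered = SimpleNamespace(
    choices=[SimpleNamespace(message=None, finish_reason="PROHIBITED_CONTENT")]
)


def failure_type(resp):
    # Return the exception class name and message raised by the unguarded access.
    try:
        unguarded_content(resp)
    except (IndexError, AttributeError) as exc:
        return type(exc).__name__, str(exc)
    return None, None


print(failure_type(empty_choices))
print(failure_type(filtered))
```

Neither exception message mentions filtering or an empty response, which is why the wrapped error is hard to trace back to its cause.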

Files changed

| File | Fix |
| --- | --- |
| `benchmarks/EvoAgentBench/src/domains/information_retrieval/judge.py` | Guard before `resp.choices[0].message.content or ""` |
| `benchmarks/EvoAgentBench/src/domains/knowledge_work/evaluate.py` | Guard before `resp.choices[0].message.content` |
| `benchmarks/EvoAgentBench/src/domains/reasoning/evaluate.py` | Guard before `response.choices[0].message.content` |

# Before
eval_text = resp.choices[0].message.content

# After
if not resp.choices or resp.choices[0].message is None:
    raise ValueError("LLM returned empty or filtered response")
eval_text = resp.choices[0].message.content
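Since the same guard is repeated at three call sites, one possible variation is to factor it into a shared helper that also reports which shape occurred. This is a sketch under stated assumptions, not the code in this PR: `extract_content` is a hypothetical name, and reading `finish_reason` from the choice assumes the provider populates that field.

```python
from types import SimpleNamespace


def extract_content(resp):
    """Hypothetical helper: return the first choice's content or raise ValueError."""
    if not resp.choices:
        raise ValueError("LLM returned empty response: choices is empty")
    choice = resp.choices[0]
    if choice.message is None:
        # Assumption: finish_reason is present on the choice; fall back to None if not.
        reason = getattr(choice, "finish_reason", None)
        raise ValueError(
            f"LLM returned filtered response: message is None (finish_reason={reason!r})"
        )
    return choice.message.content


# Usage with a stand-in response object (not a real client response).
ok = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(content="PASS"))])
eval_text = extract_content(ok)
```

Because the call sites already wrap the evaluation in try/except, the ValueError text would be the message that reaches the benchmark log in place of the generic index error.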

Corpus context

Detected by pact (llm_response_unguarded mode), a Z3-verified static analyzer for LLM crash vectors. This pattern was found across 13.8k violations in 800+ repos.

…eval calls

client.chat.completions.create() can return choices=[] on content-policy rejections or provider errors, and choices[0].message=None on filtered responses (e.g. Gemini PROHIBITED_CONTENT via OpenAI-compatible endpoint). Both crash with IndexError/AttributeError. The existing try/except blocks catch these as generic 'LLM evaluation failed' errors, making them hard to diagnose. Explicit guards surface the root cause clearly.

github-actions Bot mentioned this pull request May 18, 2026

[watch] Overnight fork patrol: 2026-05-18 Fearvox/EverOS#34

Open

qizwiz wants to merge 1 commit into EverMind-AI:main from qizwiz:fix/guard-empty-llm-response
