Novelty check runs after evaluation, wasting compute on rejected programs #439

@lesshaste

I was reading the code, and as far as I can tell the novelty check is done in the wrong place whenever the evolved code is likely to take a long time to run.

Problem Description:
In the current implementation, the novelty check (which uses embedding similarity) runs AFTER the potentially expensive evaluation step. This means programs that would be rejected for being too similar to existing programs still get fully evaluated, wasting significant compute resources.

The novelty check has TWO components:

  • Embedding similarity check (cheap - generates embedding and compares cosine similarity)
  • LLM novelty judgment (expensive - calls LLM to judge if code is truly novel)
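
To illustrate why the first component is cheap, here is a minimal sketch of an embedding similarity comparison. The function names are my own and are not taken from openevolve/database.py; the sketch only assumes embeddings are plain lists of floats and does not decide what happens at the threshold.

```python
import math
from typing import Iterable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def max_similarity(candidate: Sequence[float],
                   existing: Iterable[Sequence[float]]) -> float:
    """Highest similarity between the candidate and any stored embedding.

    Returns -inf when there is nothing to compare against.
    """
    return max(
        (cosine_similarity(candidate, other) for other in existing),
        default=float("-inf"),
    )
```

The cost is one embedding call plus a handful of dot products, which is negligible next to running the evolved program itself.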

Current flow:

  1. Generate child program
  2. Evaluate program (expensive - runs the actual code)
  3. Add to database → Novelty check (embedding + optional LLM call)
  4. If too similar → reject

Expected flow:

  1. Generate child program
  2. Novelty check (embedding + optional LLM)
  3. If passes → Evaluate program (expensive)
  4. Add to database
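
A rough sketch of the expected flow, to make the ordering concrete. Everything here is hypothetical: `process_child`, `is_novel`, and `evaluate` are stand-ins I made up for illustration, not the actual OpenEvolve worker or database API, and the database is modelled as a plain list.

```python
from typing import Callable, Dict, List, Optional


def process_child(
    code: str,
    is_novel: Callable[[str], bool],             # hypothetical: embedding + optional LLM check
    evaluate: Callable[[str], Dict[str, float]],  # hypothetical: the expensive step
    database: List[dict],
) -> Optional[dict]:
    """Run the novelty gate first; evaluate and store only programs that pass."""
    if not is_novel(code):
        # Rejected before any evaluation compute is spent.
        return None
    metrics = evaluate(code)
    record = {"code": code, "metrics": metrics}
    database.append(record)
    return record
```

With this ordering, a child that fails the novelty check never reaches `evaluate`, so the only cost of a rejection is the check itself.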

Location of issue:

  • openevolve/process_parallel.py - Worker returns evaluated result
  • openevolve/database.py:300 - Novelty check called in database.add() AFTER evaluation is complete
  • openevolve/database.py:_is_novel() - Performs embedding similarity check
  • openevolve/database.py:_llm_judge_novelty() - Performs LLM-based novelty judgment

Evidence:
The database.add() method calls _is_novel(), which performs both the embedding comparison and the LLM-based novelty judgment, but only after the program has already been evaluated.

database.py - _is_novel() includes:

  1. Generate embedding for new program
  2. Compare with embeddings of existing programs (cosine similarity)
  3. If similarity < threshold: call _llm_judge_novelty() - calls LLM!

But all of this happens AFTER evaluation in process_parallel.py.

Metadata

Labels: bug (Something isn't working), good first issue (Good for newcomers)
