Description
I was reading the code, and as far as I can tell, the novelty checks are done in the wrong place when the evolved code is likely to take a long time to run.
Problem Description:
In the current implementation, the novelty check (which uses embedding similarity) runs AFTER the potentially expensive evaluation step. This means programs that would be rejected for being too similar to existing programs still get fully evaluated, wasting significant compute resources.
The novelty check has TWO components:
- Embedding similarity check (cheap - generates an embedding and compares cosine similarity against existing programs)
- LLM novelty judgment (expensive - calls an LLM to judge whether the code is truly novel)
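To make the cost asymmetry concrete, here is a minimal sketch of the cheap half of the check: cosine similarity between a candidate's embedding and stored embeddings. The function names and the 0.95 threshold are illustrative, not taken from the openevolve code.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def too_similar(candidate: list[float],
                existing: list[list[float]],
                threshold: float = 0.95) -> bool:
    """Reject if the candidate is near-duplicate of any stored embedding."""
    return any(cosine_similarity(candidate, e) >= threshold for e in existing)
```

This gate is pure arithmetic over vectors that already exist, so running it before evaluation costs essentially nothing compared to executing the evolved program.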
Current flow:
- Generate child program
- Evaluate program (expensive - runs the actual code)
- Add to database → Novelty check (embedding + optional LLM call)
- If too similar → reject
Expected flow:
- Generate child program
- Novelty check (embedding + optional LLM)
- If passes → Evaluate program (expensive)
- Add to database
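The two orderings above can be contrasted in a short sketch. Everything here (`evaluate`, `is_novel`, the `add_*` functions, the `calls` log) is an illustrative stand-in, not the actual openevolve API:

```python
calls = []  # records which steps ran, to illustrate wasted compute

def evaluate(program: str) -> float:
    calls.append("evaluate")        # expensive: runs the evolved code
    return 1.0

def is_novel(program: str) -> bool:
    calls.append("novelty_check")   # cheap embedding check (+ optional LLM)
    return program != "duplicate"

def add_current(program: str, db: list) -> bool:
    score = evaluate(program)       # current flow: evaluate first...
    if not is_novel(program):       # ...novelty check only inside database.add()
        return False                # compute already spent on a rejected program
    db.append((program, score))
    return True

def add_proposed(program: str, db: list) -> bool:
    if not is_novel(program):       # proposed flow: gate on novelty first
        return False                # rejected before any evaluation cost
    db.append((program, evaluate(program)))
    return True
```

For a rejected duplicate, `add_current` still pays for a full evaluation, while `add_proposed` bails out after the cheap check.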
Location of issue:
- openevolve/process_parallel.py - Worker returns evaluated result
- openevolve/database.py:300 - Novelty check called in database.add() AFTER evaluation is complete
- openevolve/database.py:_is_novel() - Performs embedding similarity check
- openevolve/database.py:_llm_judge_novelty() - Performs LLM-based novelty judgment
Evidence:
The database.add() method calls _is_novel() which includes both embedding comparison AND the LLM-based novelty judgment - but this happens only after the program has already been evaluated.
database.py - _is_novel() includes:
- Generate embedding for new program
- Compare with embeddings of existing programs (cosine similarity)
- If similarity < threshold: call _llm_judge_novelty() - an LLM call!
But this all happens AFTER evaluation in process_parallel.py