2 changes: 2 additions & 0 deletions openevolve/database.py
@@ -861,6 +861,7 @@ def _calculate_feature_coords(self, program: Program) -> List[int]:
# Use code length as complexity measure
complexity = len(program.code)
bin_idx = self._calculate_complexity_bin(complexity)
+ program.complexity = bin_idx # Store complexity bin in program
coords.append(bin_idx)
Comment on lines 861 to 865

Copilot AI Feb 9, 2026

Assigning the bin index into Program.complexity/Program.diversity is semantically ambiguous (the dataclass defines these as derived feature values, currently typed as float). Consider either casting to float for consistency, or introducing explicit fields like complexity_bin/diversity_bin to avoid confusing bins with raw feature values.
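One way to act on the second option is sketched below, assuming a simplified stand-in for the `Program` dataclass. The `ProgramSketch` class, the `complexity_bin` and `diversity_bin` fields, and the `store_complexity` helper are hypothetical names used for illustration; none of them appear in this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProgramSketch:
    """Simplified, hypothetical stand-in for the Program dataclass."""

    code: str
    # Raw feature values, typed as float per the review comment
    complexity: float = 0.0
    diversity: float = 0.0
    # Hypothetical explicit fields holding the bin indices
    complexity_bin: Optional[int] = None
    diversity_bin: Optional[int] = None


def store_complexity(program: ProgramSketch, bin_idx: int) -> None:
    """Keep the raw value and the bin index in separate fields."""
    program.complexity = float(len(program.code))  # raw measure: code length
    program.complexity_bin = bin_idx  # discretized coordinate


program = ProgramSketch(code="print('hi')")
store_complexity(program, bin_idx=3)
```

Keeping the two fields apart means a saved program can report both the raw code length and the bin it landed in, without either overwriting the other.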

Author

@copilot open a new pull request to apply changes based on this feedback

elif dim == "diversity":
# Use cached diversity calculation with reference set
@@ -869,6 +870,7 @@ def _calculate_feature_coords(self, program: Program) -> List[int]:
else:
diversity = self._get_cached_diversity(program)
bin_idx = self._calculate_diversity_bin(diversity)
+ program.diversity = bin_idx # Store diversity bin in program
coords.append(bin_idx)
Comment on lines 867 to 874

Copilot AI Feb 9, 2026

In the cold-start branch (len(self.programs) < 2) diversity’s bin_idx is forced to 0 but program.diversity is not updated, so saved programs may still show the default value rather than the computed bin. Set program.diversity in this branch as well for consistency with the complexity handling.
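A minimal sketch of that fix follows, assuming the branch structure implied by the diff. The `ProgramSketch` class, the `diversity_coord` function, and its callable parameters are hypothetical stand-ins for the real method and its helpers (`_get_cached_diversity`, `_calculate_diversity_bin`), not the PR's actual code.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ProgramSketch:
    """Simplified, hypothetical stand-in for the Program dataclass."""

    code: str
    diversity: float = 0.0


def diversity_coord(
    program: ProgramSketch,
    programs: Dict[str, ProgramSketch],
    get_diversity: Callable[[ProgramSketch], float],
    to_bin: Callable[[float], int],
) -> int:
    """Return the diversity bin and record it on the program in every branch."""
    if len(programs) < 2:
        # Cold start: too few programs to compare against, so the bin is 0
        bin_idx = 0
    else:
        bin_idx = to_bin(get_diversity(program))
    # Assign after the branch so the cold-start path is covered too
    program.diversity = bin_idx
    return bin_idx
```

Moving the assignment below the `if`/`else` means neither branch can skip it.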

Author

@copilot open a new pull request to apply changes based on this feedback

elif dim == "score":
# Use average of numeric metrics
5 changes: 3 additions & 2 deletions openevolve/evaluator.py
@@ -208,9 +208,10 @@ async def evaluate_program(
if "combined_score" in eval_result.metrics:
# Original combined_score is just accuracy
accuracy = eval_result.metrics["combined_score"]
- # Combine with LLM average (70% accuracy, 30% LLM quality)
+ # Combine accuracy with LLM average using dynamic weighting:
+ # (1 - llm_feedback_weight) * accuracy + llm_feedback_weight * LLM quality
eval_result.metrics["combined_score"] = (
- accuracy * 0.7 + llm_average * 0.3
+ accuracy * (1-self.config.llm_feedback_weight) + llm_average * self.config.llm_feedback_weight
)
)
Comment on lines 213 to 215

Copilot AI Feb 9, 2026

combined_score now depends on llm_feedback_weight, but there’s no guard ensuring the weight is within [0.0, 1.0]. If a user misconfigures this, the score can become negative or exceed expected bounds; consider clamping or raising a clear config error before using it here.
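The "raise a clear config error" option could look roughly like the sketch below. The `validate_llm_feedback_weight` function is a hypothetical helper, and where it would be called (for example, when the config is loaded) is an assumption rather than something this PR specifies.

```python
def validate_llm_feedback_weight(weight: float) -> float:
    """Return the weight unchanged if it lies in [0.0, 1.0], else raise."""
    # The chained comparison is also False for NaN, so NaN is rejected too
    if not 0.0 <= weight <= 1.0:
        raise ValueError(
            f"llm_feedback_weight must be between 0.0 and 1.0, got {weight!r}"
        )
    return weight


# A valid weight passes through unchanged
weight = validate_llm_feedback_weight(0.3)
```

Raising early surfaces a misconfiguration at startup instead of letting it silently distort `combined_score`; clamping would be the more lenient alternative.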

Comment on lines +214 to 215

Copilot AI Feb 9, 2026

This line exceeds the configured Black line length (100) and is missing spaces around operators (e.g., 1 - weight). Reformatting will improve readability and avoid formatting/lint churn in future diffs.
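For illustration, the same expression is written below with spaces around the operators and within the 100-character limit. The standalone `blend_scores` function is a hypothetical wrapper used only to keep the example self-contained; in the PR the expression lives inline in `evaluate_program`.

```python
def blend_scores(accuracy: float, llm_average: float, llm_feedback_weight: float) -> float:
    """Weighted average of accuracy and LLM quality, matching the formula in the diff."""
    return accuracy * (1 - llm_feedback_weight) + llm_average * llm_feedback_weight
```

Binding `self.config.llm_feedback_weight` to a short local name first is one way to get the inline version under the limit without splitting the expression across lines.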

Author

@copilot open a new pull request to apply changes based on this feedback


# Store artifacts if enabled and present