feat: add language version support across multiple language implement…#1680

Open
HeshamHM28 wants to merge 5 commits into omni-java from feat/java/wire-language-version

Conversation


@HeshamHM28 HeshamHM28 commented Feb 26, 2026

No description provided.

…nguage_version

Make language_version the single source of truth for version info across
all languages. PythonSupport.language_version now returns platform.python_version()
instead of None. All API payloads use language_version as canonical, with
python_version kept only as a backward-compat shim for the backend.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
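A minimal sketch of what the commit describes, assuming the class and payload shapes (the real code lives in codeflash; `build_payload` is a hypothetical helper for illustration):

```python
import platform

class PythonSupport:
    """Sketch of the support class described above."""

    @property
    def language_version(self) -> str:
        # Canonical source of version info: the running interpreter,
        # e.g. "3.12.1", instead of the previous None.
        return platform.python_version()

def build_payload(support: PythonSupport) -> dict:
    # language_version is canonical; python_version is kept only as a
    # backward-compat shim for the backend.
    version = support.language_version
    return {"language_version": version, "python_version": version}
```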
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Ubuntu seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Ubuntu and others added 2 commits February 26, 2026 23:58
The Java Fibonacci E2E test was failing because AI-generated tests called
fibonacci(92)/fibonacci(93) against the naive recursive implementation,
which hangs forever. Since all tests run in a single Maven process, this
caused a 120s timeout that killed ALL tests, including the fast ones,
preventing any baseline from being established.

Fix: inject @Timeout(30) on each @Test method during instrumentation.
Individual hanging tests now get killed by JUnit without blocking others.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The default SAME_THREAD mode uses Thread.interrupt(), which is silently
ignored by CPU-bound code such as naive recursive fibonacci. SEPARATE_THREAD
runs the test in a new thread and fails it with a TimeoutException when the
deadline passes, which actually works for tight computational loops.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
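The JUnit detail aside, the underlying idea translates directly: run the work on a separate thread and let the caller enforce the deadline itself, since CPU-bound code never notices a polite interrupt. A rough Python analogy (not the project's code; `run_with_deadline` and `naive_fib` are made-up names):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def naive_fib(n: int) -> int:
    # CPU-bound recursion that never checks for interruption, analogous to
    # the naive Java fibonacci the generated tests were calling.
    return n if n < 2 else naive_fib(n - 1) + naive_fib(n - 2)

def run_with_deadline(fn, arg, seconds: float):
    # Run the work on a separate thread; when the deadline passes the caller
    # observes a timeout even though the worker itself keeps spinning,
    # mirroring how SEPARATE_THREAD fails the test with TimeoutException.
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(fn, arg).result(timeout=seconds)
    except FutureTimeout:
        return None  # deadline exceeded; the runaway thread is abandoned
    finally:
        pool.shutdown(wait=False)

print(run_with_deadline(naive_fib, 10, 5.0))  # fast case completes: 55
```

A call like `run_with_deadline(naive_fib, 30, 0.05)` returns None almost immediately, even though the worker thread keeps computing in the background.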
Comment on lines +615 to +648
    result_lines: list[str] = []
    import_added = False
    for line in lines:
        result_lines.append(line)
        # Insert after the last JUnit import line
        if not import_added and line.strip().startswith("import org.junit.jupiter.api."):
            # Peek ahead: if the next non-empty line is NOT another import, insert here
            result_lines.append(timeout_import)
            import_added = True
    if not import_added:
        # Fallback: insert before the first import
        result_lines2: list[str] = []
        for line in result_lines:
            if not import_added and line.strip().startswith("import "):
                result_lines2.append(timeout_import)
                import_added = True
            result_lines2.append(line)
        result_lines = result_lines2
    source = "\n".join(result_lines)
    # Deduplicate: the import may appear twice if multiple junit imports existed
    source = source.replace(f"{timeout_import}\n{timeout_import}", timeout_import)

    # Add @Timeout after each @Test annotation (only if not already present)
    lines = source.split("\n")
    result_lines = []
    for i, line in enumerate(lines):
        result_lines.append(line)
        stripped = line.strip()
        if _is_test_annotation(stripped):
            # Check if the next non-blank line is already @Timeout
            next_idx = i + 1
            while next_idx < len(lines) and not lines[next_idx].strip():
                next_idx += 1
            if next_idx >= len(lines) or not lines[next_idx].strip().startswith("@Timeout"):
⚡️Codeflash found 20% (0.20x) speedup for _add_per_test_timeout in codeflash/languages/java/instrumentation.py

⏱️ Runtime: 3.86 milliseconds → 3.22 milliseconds (best of 250 runs)

📝 Explanation and details

Runtime improvement (primary): the optimized version reduces average wall time from 3.86 ms to 3.22 ms — a 19% speedup — by cutting redundant string work and reducing list/appends during the import/annotation pass.

What changed (concrete optimizations)

  • Precompute stripped lines once: computed stripped_lines = [line.strip() for line in lines] and then reuse it for all checks. The original code called line.strip() repeatedly inside the main loop and inside the blank-line skipping loop; the optimized code does that work once.
  • Simplified import insertion: instead of building result_lines and doing a second pass to insert the import, the optimized code finds the insertion index with a single enumerate scan and uses list.insert. This removes multiple temporary lists, extra append calls and condition checks from the common path.
  • Use index-based iteration and reuse length: iterate by index over lines and reuse stripped_lines[i] (and n = len(lines)), which avoids repeated attribute lookups and redundant strip operations while scanning for the next non-blank line.
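The first two techniques above can be sketched in isolation (standalone toy version; `insert_timeout_import` is an invented name, not the module's API):

```python
# Strip each line exactly once up front, then locate the import insertion
# point with a single scan and use list.insert instead of rebuilding the
# list with repeated appends.
def insert_timeout_import(lines: list[str], timeout_import: str) -> list[str]:
    stripped = [ln.strip() for ln in lines]  # one strip per line, reused below
    insert_at = None
    for i, s in enumerate(stripped):
        if s.startswith("import org.junit.jupiter.api."):
            insert_at = i + 1  # slot just after the first JUnit import
            break
    if insert_at is None:
        # Fallback: before the first import of any kind
        for i, s in enumerate(stripped):
            if s.startswith("import "):
                insert_at = i
                break
    if insert_at is not None and timeout_import not in lines:
        lines.insert(insert_at, timeout_import)
    return lines
```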

Why that yields a speedup

  • .strip() and string operations dominate cost when processing many lines. By doing strip once per line instead of many times, we convert many small repeated CPU-bound string operations into one predictable pass (the profiler shows that repeated stripping was a major hotspot).
  • Reducing intermediate list building and append churn (result_lines/result_lines2) lowers Python-level overhead (function calls, list resizing, and per-append work), which is significant when the source has many lines or many @Test annotations.
  • The while-loop that skips blank lines now checks precomputed stripped strings, so skipping is cheaper (no repeated .strip() calls per iteration).

Behavioral/Dependency changes

  • Behavior is preserved: the insertion logic and idempotency checks are the same. The only trade-off is an upfront allocation for stripped_lines (one list of len(lines)), which is negligible for typical file sizes and pays back when the source has many lines.

Trade-offs and when to expect regressions

  • Small files with only a few lines may see tiny regressions in several microbenchmarks because of the small upfront cost of building stripped_lines and the index-finding work; the annotated tests show a few cases that are slower by a few microseconds. This is an acceptable trade-off for the overall runtime improvement because the optimization shines when there are many lines (the large-scale test with 1000 tests shows ~25–28% faster runs).
  • Memory: slight increase (one extra list of stripped strings). This is linear in number of lines and typically small compared to the source text itself.

Who benefits most

  • Hot paths that process large Java sources or files with many @Test annotations (e.g., batch instrumentation of many tests) will benefit the most — shown by the large-scale test where runtime drops from ~1.38 ms to ~1.10 ms on the first pass and stays efficient on the idempotent second pass.
  • If this function is used frequently across many files (file-at-a-time transforms), the per-file reduction accumulates into meaningful CPU/time savings.

Summary

  • Primary win: 19% overall runtime improvement (3.86 ms -> 3.22 ms).
  • Key techniques: eliminate repeated strip() calls by precomputing stripped_lines, and reduce list/appending churn when inserting imports by finding an insertion index and using insert.
  • Result: faster processing on large inputs (best-case improvements visible in the annotated large-scale tests) with only a small upfront allocation and negligible change in behavior.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests: 🔘 None Found
🌀 Generated Regression Tests: 18 Passed
⏪ Replay Tests: 🔘 None Found
🔎 Concolic Coverage Tests: 🔘 None Found
📊 Tests Coverage: 100.0%
🌀 Generated Regression Tests:
import re

import pytest  # used for our unit tests
# Import the function under test from the real module where it is defined.
from codeflash.languages.java.instrumentation import (
    _PER_TEST_TIMEOUT_SECONDS, _add_per_test_timeout)

def test_inserts_import_and_timeout_annotation_after_test():
    # A minimal Java source with a JUnit Test import and a single @Test method.
    source = (
        "package com.example;\n"
        "import org.junit.jupiter.api.Test;\n"
        "\n"
        "public class ExampleTest {\n"
        "    @Test\n"
        "    public void doesSomething() {\n"
        "    }\n"
        "}\n"
    )

    # Run the function under test with default timeout.
    codeflash_output = _add_per_test_timeout(source); result = codeflash_output # 10.00μs -> 10.4μs (4.32% slower)

def test_does_not_duplicate_import_if_already_present():
    # If the source already contains the Timeout import, it should not be duplicated.
    source = (
        "import org.junit.jupiter.api.Timeout;\n"
        "import org.junit.jupiter.api.Test;\n"
        "\n"
        "class T {\n"
        "  @Test\n"
        "  void x() {}\n"
        "}\n"
    )
    codeflash_output = _add_per_test_timeout(source); result = codeflash_output # 6.82μs -> 7.34μs (7.11% slower)

def test_ignores_test_like_annotations_not_exactly_test():
    # Ensure @TestOnly, @TestFactory, @TestTemplate are NOT considered @Test and do not get @Timeout.
    source = (
        "import org.junit.jupiter.api.Test;\n"
        "class C {\n"
        "  @TestOnly\n"
        "  void a() {}\n"
        "\n"
        "  @TestFactory\n"
        "  void b() {}\n"
        "\n"
        "  @TestTemplate\n"
        "  void c() {}\n"
        "\n"
        "  @Test\n"
        "  void realTest() {}\n"
        "}\n"
    )
    codeflash_output = _add_per_test_timeout(source); result = codeflash_output # 12.5μs -> 12.3μs (0.891% faster)

def test_various_forms_of_test_annotation_recognized():
    # Test the different possible syntactic forms of the @Test annotation:
    # exactly '@Test', '@Test(', and '@Test ' (with a space), and variations with parameters.
    source = (
        "import org.junit.jupiter.api.Test;\n"
        "class V {\n"
        "  @Test\n"
        "  void a() {}\n"
        "\n"
        "  @Test()\n"
        "  void b() {}\n"
        "\n"
        "  @Test ( )\n"
        "  void c() {}\n"
        "\n"
        "  @Test(timeout = 5000)\n"
        "  void d() {}\n"
        "}\n"
    )
    codeflash_output = _add_per_test_timeout(source); result = codeflash_output # 13.6μs -> 13.1μs (4.07% faster)

def test_existing_timeout_no_additional_annotation_added():
    # If a @Timeout annotation already follows the @Test (possibly with blank lines in between),
    # the function must not insert another @Timeout.
    source = (
        "import org.junit.jupiter.api.Test;\n"
        "import org.junit.jupiter.api.Timeout;\n"
        "class E {\n"
        "  @Test\n"
        "\n"
        "  @Timeout(value = 5, threadMode = Timeout.ThreadMode.SEPARATE_THREAD)\n"
        "  void alreadyTimedOut() {}\n"
        "}\n"
    )
    codeflash_output = _add_per_test_timeout(source); result = codeflash_output # 6.52μs -> 6.99μs (6.72% slower)

def test_when_no_imports_present_still_adds_annotations_but_no_import():
    # If the file contains no import lines at all, the function will not add the import,
    # but it will still add @Timeout annotations after @Test.
    source = (
        "package p;\n"
        "class N {\n"
        "  @Test\n"
        "  void t() {}\n"
        "}\n"
    )
    codeflash_output = _add_per_test_timeout(source); result = codeflash_output # 9.60μs -> 9.47μs (1.37% faster)

def test_deduplicates_import_added_after_multiple_junit_imports():
    # If multiple junit.jupiter imports exist in a row, the algorithm may try to insert
    # the timeout import multiple times; it should deduplicate adjacent duplicates.
    source = (
        "import org.junit.jupiter.api.Assertions;\n"
        "import org.junit.jupiter.api.Test;\n"
        "import org.junit.jupiter.api.Test;\n"
        "class D {\n"
        "  @Test\n"
        "  void t1() {}\n"
        "}\n"
    )
    codeflash_output = _add_per_test_timeout(source); result = codeflash_output # 8.88μs -> 9.33μs (4.82% slower)

def test_large_scale_many_tests_and_idempotency():
    # Build a large Java source with a single junit Test import and many @Test methods.
    num_tests = 1000  # Use up to 1000 as requested
    header = "package big;\nimport org.junit.jupiter.api.Test;\n\npublic class BigTest {\n"
    methods = []
    for i in range(num_tests):
        # Each method has an @Test annotation and a trivial method body.
        methods.append(f"    @Test\n    public void test{i}() {{}}\n")
    footer = "}\n"
    big_source = header + "\n".join(methods) + footer

    # Apply the transformation once.
    codeflash_output = _add_per_test_timeout(big_source); result = codeflash_output # 1.38ms -> 1.10ms (25.5% faster)

    # Now apply the transformation again to the already-transformed output.
    # The second application should be idempotent: no new @Timeout lines should be added.
    codeflash_output = _add_per_test_timeout(result); result2 = codeflash_output # 1.21ms -> 1.09ms (10.9% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import re  # used for pattern checks in assertions

import pytest  # used for our unit tests
# import the function and constant from the real module under test
from codeflash.languages.java.instrumentation import (
    _PER_TEST_TIMEOUT_SECONDS, _add_per_test_timeout)

# Helper: construct the canonical import and annotation strings used by the implementation.
_TIMEOUT_IMPORT = "import org.junit.jupiter.api.Timeout;"
_TIMEOUT_ANNOTATION = f"@Timeout(value = {_PER_TEST_TIMEOUT_SECONDS}, threadMode = Timeout.ThreadMode.SEPARATE_THREAD)"

def test_adds_import_and_timeout_after_junit_import_and_test_annotation():
    # Simple source with a junit Test import and a single @Test annotation.
    src = "\n".join(
        [
            "package example;",
            "import org.junit.jupiter.api.Test;",
            "public class Example {",
            "    @Test",
            "    public void testOne() {}",
            "}",
        ]
    )
    # Run the function under test.
    codeflash_output = _add_per_test_timeout(src); out = codeflash_output # 8.73μs -> 9.60μs (9.09% slower)

def test_does_not_duplicate_import_if_already_present_and_adds_annotations():
    # Source already contains the timeout import and a @Test. We should not add a second import.
    src = "\n".join(
        [
            "package example;",
            _TIMEOUT_IMPORT,
            "import org.junit.jupiter.api.Test;",
            "public class Example2 {",
            "@Test",
            "void method() {}",
            "}",
        ]
    )

    codeflash_output = _add_per_test_timeout(src); out = codeflash_output # 6.11μs -> 6.82μs (10.4% slower)

def test_ignores_non_test_annotations_like_testonly_and_testfactory():
    # Ensure annotations starting with @Test but not being @Test proper are ignored.
    src = "\n".join(
        [
            "import org.junit.jupiter.api.Test;",
            "public class Example3 {",
            "    @TestOnly",
            "    public void helper() {}",
            "    @TestFactory",
            "    public void factory() {}",
            "    @Test",
            "    public void realTest() {}",
            "}",
        ]
    )

    codeflash_output = _add_per_test_timeout(src); out = codeflash_output # 9.85μs -> 10.2μs (3.71% slower)

def test_recognizes_parameterized_test_declarations_and_preserves_spacing():
    # Tests with parentheses after @Test should be recognized.
    src = "\n".join(
        [
            "import org.junit.jupiter.api.Test;",
            "public class Example4 {",
            "    @Test(expected = RuntimeException.class)",
            "    public void throwsException() {}",
            "    @Test(timeout = 5000)",
            "    public void hasTimeoutAttr() {}",
            "}",
        ]
    )

    codeflash_output = _add_per_test_timeout(src); out = codeflash_output # 9.66μs -> 9.96μs (3.01% slower)

def test_does_not_add_timeout_if_already_present_after_test():
    # If a @Timeout is already present after @Test (possibly separated by blank lines),
    # the function should not insert another @Timeout.
    src = "\n".join(
        [
            "import org.junit.jupiter.api.Test;",
            "import org.junit.jupiter.api.Timeout;",
            "public class Example5 {",
            "    @Test",
            "    @Timeout(value = 10, threadMode = Timeout.ThreadMode.SEPARATE_THREAD)",
            "    public void alreadyTimeout() {}",
            "}",
        ]
    )

    codeflash_output = _add_per_test_timeout(src); out = codeflash_output # 5.69μs -> 6.17μs (7.81% slower)

def test_inserts_import_before_first_import_when_no_junit_imports_present():
    # When there are imports, but none from org.junit.jupiter.api., the timeout import should be inserted before the first import.
    src = "\n".join(
        [
            "package example;",
            "import java.util.List;",
            "import com.example.Foo;",
            "public class Example6 {",
            "    @Test",
            "    public void t() {}",
            "}",
        ]
    )

    codeflash_output = _add_per_test_timeout(src); out = codeflash_output # 9.96μs -> 10.1μs (1.67% slower)

    # The timeout import should be inserted before the first import line.
    first_import_index = out.index("import java.util.List;")
    inserted_index = out.index(_TIMEOUT_IMPORT)

def test_no_import_added_when_no_import_lines_exist_but_annotation_is_added():
    # If the source contains no import lines at all, the function will not add the timeout import,
    # but it should still add the per-test timeout annotation after @Test.
    src = "\n".join(
        [
            "package example;",
            "public class Example7 {",
            "    @Test",
            "    void lonelyTest() {}",
            "}",
        ]
    )

    codeflash_output = _add_per_test_timeout(src); out = codeflash_output # 8.49μs -> 8.86μs (4.18% slower)

def test_preserves_indentation_and_inserts_before_blank_lines():
    # If there are blank lines between @Test and the method, the function should still insert @Timeout
    # immediately after @Test and before the blank lines, preserving indentation.
    src = "\n".join(
        [
            "import org.junit.jupiter.api.Test;",
            "public class Example8 {",
            "        @Test",  # 8 spaces indentation
            "",
            "        public void spacedTest() {}",
            "}",
        ]
    )

    codeflash_output = _add_per_test_timeout(src); out = codeflash_output # 8.44μs -> 8.92μs (5.39% slower)

def test_large_scale_insertion_for_many_tests():
    # Large-scale test: create a source with a single junit import and many @Test annotations to verify performance
    # and correctness at scale. Use 1000 annotations as required by the spec.
    n = 1000
    lines = ["package big;"]
    # include one junit import to anchor insertion
    lines.append("import org.junit.jupiter.api.Test;")
    lines.append("public class BigTests {")
    # add many test methods each with an @Test annotation
    for i in range(n):
        lines.append("    @Test")
        lines.append(f"    public void test{i}() {{}}")
    lines.append("}")
    src = "\n".join(lines)

    codeflash_output = _add_per_test_timeout(src); out = codeflash_output # 1.14ms -> 888μs (27.9% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To test or edit this optimization locally, run: git merge codeflash/optimize-pr1680-2026-02-27T00.19.12

Suggested change:
Original:

    result_lines: list[str] = []
    import_added = False
    for line in lines:
        result_lines.append(line)
        # Insert after the last JUnit import line
        if not import_added and line.strip().startswith("import org.junit.jupiter.api."):
            # Peek ahead: if the next non-empty line is NOT another import, insert here
            result_lines.append(timeout_import)
            import_added = True
    if not import_added:
        # Fallback: insert before the first import
        result_lines2: list[str] = []
        for line in result_lines:
            if not import_added and line.strip().startswith("import "):
                result_lines2.append(timeout_import)
                import_added = True
            result_lines2.append(line)
        result_lines = result_lines2
    source = "\n".join(result_lines)
    # Deduplicate: the import may appear twice if multiple junit imports existed
    source = source.replace(f"{timeout_import}\n{timeout_import}", timeout_import)
    # Add @Timeout after each @Test annotation (only if not already present)
    lines = source.split("\n")
    result_lines = []
    for i, line in enumerate(lines):
        result_lines.append(line)
        stripped = line.strip()
        if _is_test_annotation(stripped):
            # Check if the next non-blank line is already @Timeout
            next_idx = i + 1
            while next_idx < len(lines) and not lines[next_idx].strip():
                next_idx += 1
            if next_idx >= len(lines) or not lines[next_idx].strip().startswith("@Timeout"):

Suggested:

    # Insert after the last JUnit import line
    # (original logic inserts after the first junit import encountered;
    # preserve that behavior: insert after first "import org.junit.jupiter.api." occurrence)
    idx_first_junit = None
    for idx, line in enumerate(lines):
        if line.strip().startswith("import org.junit.jupiter.api."):
            idx_first_junit = idx
            break
    if idx_first_junit is not None:
        lines.insert(idx_first_junit + 1, timeout_import)
    else:
        # Fallback: insert before the first import
        idx_first_import = None
        for idx, line in enumerate(lines):
            if line.strip().startswith("import "):
                idx_first_import = idx
                break
        if idx_first_import is not None:
            lines.insert(idx_first_import, timeout_import)
        # else: no imports present, do not add
    source = "\n".join(lines)
    # Deduplicate: the import may appear twice if multiple junit imports existed
    source = source.replace(f"{timeout_import}\n{timeout_import}", timeout_import)
    # Add @Timeout after each @Test annotation (only if not already present)
    lines = source.split("\n")
    stripped_lines = [line.strip() for line in lines]
    result_lines = []
    n = len(lines)
    for i in range(n):
        line = lines[i]
        result_lines.append(line)
        stripped = stripped_lines[i]
        if _is_test_annotation(stripped):
            # Check if the next non-blank line is already @Timeout
            next_idx = i + 1
            while next_idx < n and not stripped_lines[next_idx]:
                next_idx += 1
            if next_idx >= n or not stripped_lines[next_idx].startswith("@Timeout"):


Behavior-mode instrumentation captured function return values as Object,
causing compilation errors when substituted back into generic contexts
(e.g. List<Long>.add(func())). Fixed by extracting the target function's
return type via tree-sitter and casting the Object variable back to the
correct type when _infer_array_cast_type cannot determine it from context.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
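In miniature, the cast repair can be thought of as a string rewrite driven by the extracted return type. A hypothetical sketch (helper name and shapes invented for illustration, not the actual instrumentation code):

```python
from typing import Optional

def cast_captured_value(var_name: str, return_type: Optional[str]) -> str:
    # Behavior-mode instrumentation captures the return value into an
    # Object-typed variable; when the declared return type is known
    # (e.g. recovered via tree-sitter), cast the variable back so the
    # substituted expression still compiles in generic contexts.
    if return_type is None or return_type == "void":
        return var_name  # no usable type information; leave the Object as-is
    return f"(({return_type}) {var_name})"
```

So a captured `ret0` substituted into `List<Long>.add(...)` would be emitted as `((List<Long>) ret0)` instead of a bare `Object`, avoiding the compilation error.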
@HeshamHM28 force-pushed the feat/java/wire-language-version branch from 2f9d026 to 15e0d16 on February 27, 2026 at 02:15
2 participants