⚡️ Speed up function _is_jsx_component_usage by 83% in PR #1561 (add/support_react) #1653

Open
codeflash-ai[bot] wants to merge 1 commit into add/support_react from
codeflash/optimize-pr1561-2026-02-24T21.52.21

@codeflash-ai codeflash-ai bot commented Feb 24, 2026

⚡️ This pull request contains optimizations for PR #1561

If you approve this dependent PR, these changes will be merged into the original PR branch add/support_react.

This PR will be automatically closed if the original PR is merged.


📄 83% (0.83x) speedup for _is_jsx_component_usage in codeflash/languages/javascript/instrument.py

⏱️ Runtime : 874 microseconds → 478 microseconds (best of 250 runs)

📝 Explanation and details

Runtime improved from 874 μs to 478 μs (about 1.83× the original speed, i.e. an ~83% speedup). The optimized version was accepted for this runtime improvement.
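As a quick check of how the ratio and the percentage relate, the arithmetic can be sketched as follows. The helper name `speedup_metrics` is hypothetical, and the percentage is assumed to follow the convention of the headline figure, (old − new) / new.

```python
from __future__ import annotations


def speedup_metrics(old_us: float, new_us: float) -> tuple[float, float]:
    """Return (ratio, relative speedup) for two runtimes in microseconds."""
    ratio = old_us / new_us                 # how many times faster the new code is
    relative = (old_us - new_us) / new_us   # speedup relative to the new runtime
    return ratio, relative


ratio, relative = speedup_metrics(874.0, 478.0)
print(f"{ratio:.2f}x, {relative:.0%}")  # prints "1.83x, 83%"
```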

What changed

  • Precompiled the render detection regex to a module-level compiled pattern (_RENDER_CALL_RE = re.compile(...)) so we don't recompile the same pattern on every call.
  • Added cheap substring checks ("render" not in code or "<" not in code or func_name not in code) to fast-fail obvious negatives before any regex work.
  • Kept the existing jsx regex (which must include func_name via re.escape) but now it is only executed when the cheap checks pass.
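A minimal sketch of what the three changes above might look like together. This is not the code from instrument.py: the function name `is_jsx_component_usage_sketch` is hypothetical, and the shape of the JSX pattern is an assumption inferred from the comments in the generated tests.

```python
import re

# Compiled once at import time instead of on every call.
_RENDER_CALL_RE = re.compile(r"\brender\s*\(")


def is_jsx_component_usage_sketch(code: str, func_name: str) -> bool:
    """Heuristic: is func_name used as a JSX tag in code that also calls render()?"""
    # Cheap substring checks reject obvious negatives before any regex work.
    if "render" not in code or "<" not in code or func_name not in code:
        return False
    # ASSUMPTION: the real JSX pattern may differ; this one matches "<", optional
    # whitespace, the literal name, then whitespace, ">" or "/".
    jsx_pattern = rf"<\s*{re.escape(func_name)}[\s>/]"
    if not re.search(jsx_pattern, code):
        return False
    return _RENDER_CALL_RE.search(code) is not None


print(is_jsx_component_usage_sketch("render(<MyComp />);", "MyComp"))
print(is_jsx_component_usage_sketch("<MyComp />", "MyComp"))
```

Under these assumptions the first call finds both the tag and the render call, while the second returns early on the `"render" not in code` check without running either regex.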

Why this speeds things up

  • Regex compilation and execution are relatively expensive in Python. The profile of the original code shows two heavy costs: re.search(jsx_pattern, code) consumed ~78% of the time and re.search(r"\brender\s*\(", code) ~20%. Avoiding unnecessary regex calls produces the biggest wins.
  • The substring tests are O(n) scans using optimized C code (very cheap) and frequently rule out the need to run the heavier regexes. In negative/common cases (no render, no "<", or func_name absent) we return quickly with only a tiny C-level cost.
  • Precompiling the render regex removes repeated compilation overhead and makes the final render check slightly cheaper and more predictable.
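The effect of the substring fast-fail can be illustrated with a small timing sketch. The input below is a hypothetical stand-in for a large file with no render call, the pattern shapes are assumptions, and the timings will vary by machine, so no expected numbers are shown.

```python
import re
import timeit

# Hypothetical large input: many JSX-like tags and no occurrence of "render".
code = "\n".join(f"<Tag{i} attr='{i}'/>" for i in range(1000)) + "\n<TargetLarge/>"

jsx_re = re.compile(r"<\s*TargetLarge[\s>/]")  # assumed pattern shape
render_re = re.compile(r"\brender\s*\(")


def regex_only() -> bool:
    # Always scans with the JSX regex first, even when render is absent.
    return bool(jsx_re.search(code)) and bool(render_re.search(code))


def with_fast_fail() -> bool:
    # A plain substring scan rules out the input before any regex runs.
    if "render" not in code:
        return False
    return regex_only()


# Both variants must agree on the answer; only the cost differs.
assert regex_only() is False and with_fast_fail() is False

t_regex = timeit.timeit(regex_only, number=200)
t_fast = timeit.timeit(with_fast_fail, number=200)
print(f"regex only: {t_regex:.4f}s, with fast-fail: {t_fast:.4f}s")
```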

Evidence in profiling & tests

  • Total runtime in the benchmark dropped by about 45% (874 μs → 478 μs).
  • The optimized profiler shows the cheap substring check uses only a small fraction of time while the expensive jsx regex runs less frequently relative to overall runtime.
  • Tests with large inputs that contain many JSX-like tags but no render() call show the largest wins (e.g., test_large_scale_many_tags_no_render_false went from 226 μs to 9.74 μs, and test_large_scale_many_matching_tags_but_no_render from 185 μs to 9.51 μs). This shows the early-exit check is very effective on large inputs.
  • Cases where all three substring checks pass saw small regressions, because the checks add a constant cost before the regex work: small positive cases moved by less than 5% in either direction, the two large positive tests were 5 to 6% slower, and test_name_substring_no_false_positive was 12.1% slower. Overall this trade-off is acceptable because it yields large wins on expensive negative cases and lowers the total benchmark runtime.

Behavioral impact and safety

  • The function’s semantics are unchanged: it still escapes func_name for the JSX check and still requires a render() call (the same word-boundary render pattern is used, but now via a compiled regex).
  • This optimization benefits workloads that call this function many times or process large source strings (hot-paths that analyze files or AST-less heuristic checks). In those scenarios the early-fail and compiled pattern greatly reduce CPU work per call.
  • If you expect many repeated calls with the same func_name and always-positive cases, a further micro-optimization could be to cache compiled jsx patterns per func_name — but that wasn't necessary to get the large runtime improvements seen here.

Summary

  • Primary benefit: substantial runtime reduction (1.83× the original speed / ~83% speedup).
  • Key techniques: cheap substring fast-fail + module-level compiled regex.
  • Trade-offs: negligible extra cost for tiny positive cases, but large wins for common negative and large-input cases. This is a favorable trade-off for runtime-sensitive code paths.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 32 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
from __future__ import annotations

# imports
import re

import pytest  # used for our unit tests
from codeflash.languages.javascript.instrument import _is_jsx_component_usage

def test_basic_jsx_with_render_true():
    # A straightforward JSX self-closing tag and a render() call -> should be True
    code = "import React from 'react'; render(<MyComponent />);"
    codeflash_output = _is_jsx_component_usage(code, "MyComponent") # 6.79μs -> 6.90μs (1.59% slower)

def test_no_jsx_but_render_false():
    # There is a render() call, but no JSX usage of the target component -> should be False
    code = "render(someFunction()); // no JSX tag of MyComponent"
    codeflash_output = _is_jsx_component_usage(code, "MyComponent") # 3.42μs -> 2.26μs (50.9% faster)

def test_jsx_but_no_render_false():
    # There is JSX usage but no real render() invocation. Note that the trailing JS comment
    # contains the text "render()", which a purely textual render check would still match.
    code = "const el = <MyComponent prop='x' />; // no render() present"
    codeflash_output = _is_jsx_component_usage(code, "MyComponent") # 6.41μs -> 6.54μs (2.00% slower)

def test_jsx_with_space_and_attributes_true():
    # JSX with arbitrary spacing before the component name and attributes should match
    code = "/* jsx */ function t(){ return <   FancyComp prop='v'>Hello</FancyComp>; } render(thing);"
    codeflash_output = _is_jsx_component_usage(code, "FancyComp") # 6.70μs -> 6.65μs (0.752% faster)

def test_component_name_with_special_regex_chars_escaped():
    # Component names that include regex-special characters (like + or .) must be treated literally.
    # The implementation uses re.escape, so these should match correctly.
    code_plus = "render(<Comp+X/>);"
    codeflash_output = _is_jsx_component_usage(code_plus, "Comp+X") # 5.71μs -> 5.47μs (4.41% faster)

    code_dot = "render(<Comp.Name></Comp.Name>);"
    codeflash_output = _is_jsx_component_usage(code_dot, "Comp.Name") # 2.78μs -> 2.83μs (1.73% slower)

def test_case_sensitivity_of_component_name():
    # The regex is literal and case-sensitive, so differing case should not match.
    code = "render(<mycomponent />);"
    codeflash_output = _is_jsx_component_usage(code, "MyComponent") # 4.02μs -> 2.33μs (72.8% faster)

def test_render_word_boundary_required():
    # The render() detection requires a word boundary before "render", so "superrender(" should not count.
    code = "superrender(<MyComp/>);"
    codeflash_output = _is_jsx_component_usage(code, "MyComp") # 5.39μs -> 5.55μs (2.88% slower)

    # An underscore before 'render' is a word character so the boundary fails as well.
    code2 = "foo_render(<MyComp/>);"
    codeflash_output = _is_jsx_component_usage(code2, "MyComp") # 2.48μs -> 2.52μs (1.19% slower)

    # But a proper standalone render works.
    code3 = "render(<MyComp/>);"
    codeflash_output = _is_jsx_component_usage(code3, "MyComp") # 2.00μs -> 1.92μs (4.16% faster)

def test_empty_code_returns_false():
    # Empty source code should never indicate JSX usage
    codeflash_output = _is_jsx_component_usage("", "Anything") # 3.23μs -> 1.98μs (62.6% faster)

def test_none_code_raises_type_error():
    # Passing None for code is a type error because re.search expects str/bytes
    with pytest.raises(TypeError):
        _is_jsx_component_usage(None, "X") # 5.33μs -> 3.88μs (37.5% faster)

def test_empty_func_name_matches_generic_jsx_when_render_present():
    # If func_name is empty, the regex reduces to looking for '<' followed by whitespace or '>' or '/',
    # so minimal JSX like "< />" or "<>" will match. We assert the current behavior (True when render present).
    code1 = "render(< />);"  # space then slash
    codeflash_output = _is_jsx_component_usage(code1, "") # 4.41μs -> 4.49μs (1.80% slower)

    code2 = "render(<>);"
    # "<>" is also matched by the pattern (no spaces required because of the character class),
    # so with render present the function returns True.
    codeflash_output = _is_jsx_component_usage(code2, "") # 1.91μs -> 1.82μs (4.88% faster)

def test_large_scale_many_tags_and_single_target_true():
    # Build a large document with many different JSX-like tags and only one occurrence of the target component.
    parts = []
    # Add 900 generic tags to simulate a big file
    for i in range(900):
        parts.append(f"<A{i} />")
    # Insert many benign render-like words to ensure they don't accidentally trigger (but add a real render later)
    for i in range(50):
        parts.append(f"/* comment render-like {i} */")
    # Insert the target JSX somewhere deep in the file
    parts.append("<DeepNamespace.TargetComponent prop='x'/>")
    # Finally add the required render() call at the end
    parts.append("someOtherCode(); render(document.body);")
    big_code = "\n".join(parts)

    # The function should find the escaped component name and the render() and return True
    codeflash_output = _is_jsx_component_usage(big_code, "DeepNamespace.TargetComponent") # 110μs -> 117μs (5.16% slower)

def test_large_scale_many_tags_no_render_false():
    # Similar to the previous large file, but omit render() entirely -> should be False
    parts = []
    for i in range(1000):
        parts.append(f"<Tag{i} attr='{i}'/>")
    # include target component markup but no render()
    parts.append("<TargetLarge/>")
    big_code_no_render = "\n".join(parts)

    # Even though TargetLarge is used as JSX markup, absence of render() means False
    codeflash_output = _is_jsx_component_usage(big_code_no_render, "TargetLarge") # 226μs -> 9.74μs (2222% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import re

# imports
import pytest  # used for our unit tests
from codeflash.languages.javascript.instrument import _is_jsx_component_usage

def test_basic_self_closing_jsx_with_render():
    # Simple positive case: render called with a self-closing JSX tag
    code = "render(<MyComp />);"  # typical usage in tests
    codeflash_output = _is_jsx_component_usage(code, "MyComp") # 5.41μs -> 5.43μs (0.368% slower)

def test_basic_jsx_with_children_and_render():
    # JSX with opening and closing tags plus a render call should be detected
    code = "render(<MyComp>child</MyComp>);"  # component used as JSX with children
    codeflash_output = _is_jsx_component_usage(code, "MyComp") # 5.16μs -> 5.05μs (2.18% faster)

def test_jsx_without_render_returns_false():
    # If there's JSX markup but no render() call, function should return False
    code = "<MyComp />"  # just markup, no render invocation
    codeflash_output = _is_jsx_component_usage(code, "MyComp") # 5.13μs -> 1.85μs (177% faster)

def test_render_call_without_jsx_returns_false():
    # If render() is called but the component is invoked as a function, not JSX, return False
    code = "render(MyComp());"  # calling the component directly
    codeflash_output = _is_jsx_component_usage(code, "MyComp") # 3.12μs -> 1.96μs (58.7% faster)

def test_name_substring_no_false_positive():
    # Ensure we don't get false positive when func_name is a substring of another component
    code = "render(<MyComponent />);"  # contains "My" as prefix but not an exact tag of "My"
    codeflash_output = _is_jsx_component_usage(code, "My") # 3.79μs -> 4.31μs (12.1% slower)

def test_regex_special_characters_in_function_name():
    # Function names with regex metacharacters should be escaped and matched literally
    func_name = "Comp+Name"  # '+' is special in regex, must be escaped
    code = "render(<Comp+Name/>);"  # literal '+' in tag name
    codeflash_output = _is_jsx_component_usage(code, func_name) # 5.85μs -> 5.89μs (0.679% slower)

def test_leading_spaces_between_angle_and_name():
    # Spaces allowed after '<' before the component name should be handled
    code = "render(<   MyComp  />);"  # multiple spaces between '<' and name and between name and '/'
    codeflash_output = _is_jsx_component_usage(code, "MyComp") # 5.14μs -> 5.16μs (0.368% slower)

def test_case_sensitivity_of_component_name():
    # Matching is case-sensitive: different case should not match
    code = "render(<mycomp />);"  # lowercase tag
    codeflash_output = _is_jsx_component_usage(code, "MyComp") # 3.66μs -> 2.08μs (75.5% faster)

def test_render_word_boundary_prevents_partial_matches():
    # Ensure the 'render' detection respects word boundaries (e.g., 'renderer(' should not match)
    code = "renderer(<MyComp />);"  # starts with 'renderer', not 'render'
    codeflash_output = _is_jsx_component_usage(code, "MyComp") # 5.42μs -> 5.54μs (2.17% slower)

def test_empty_code_string():
    # Empty input should safely return False (no JSX and no render)
    codeflash_output = _is_jsx_component_usage("", "Anything") # 3.21μs -> 1.98μs (61.6% faster)

def test_none_raises_type_error():
    # Passing None (invalid type) should raise a TypeError from re.search usage
    with pytest.raises(TypeError):
        _is_jsx_component_usage(None, "Name") # 5.42μs -> 3.97μs (36.6% faster)

def test_component_name_with_dot_and_render():
    # Names with dots should be accepted literally because re.escape is used
    func_name = "Comp.Name"
    code = "render(<Comp.Name />);"  # dot inside component name
    codeflash_output = _is_jsx_component_usage(code, func_name) # 5.76μs -> 6.00μs (4.02% slower)

def test_render_with_whitespace_and_newlines():
    # render can have spaces/newlines between name and '(', pattern allows whitespace before '('
    code = "render   (\n  <MyComp/>  )"  # render with spaces/newline then JSX
    codeflash_output = _is_jsx_component_usage(code, "MyComp") # 5.21μs -> 5.16μs (0.969% faster)

def test_large_scale_many_non_matching_tags_then_one_matching():
    # Build a large code string with many unrelated tags to simulate a big file.
    parts = []
    # Add 1000 different tags that should not match our target
    for i in range(1000):
        parts.append(f"<X{i} prop={{{i}}} />\n")
    # Append a single matching usage at the end to validate detection in large input
    parts.append("/* some comment */\nrender(<TargetComponent someProp={true} />);\n")
    big_code = "".join(parts)
    # The function should find the JSX tag for "TargetComponent" and the render() call
    codeflash_output = _is_jsx_component_usage(big_code, "TargetComponent") # 215μs -> 229μs (6.01% slower)

def test_large_scale_many_matching_tags_but_no_render():
    # Build a large code string with many occurrences of the JSX tag but no render call.
    # Despite many JSX matches, absence of render() should make the function return False.
    parts = []
    for i in range(1000):
        # Repeated occurrences of the target component as JSX
        parts.append(f"<BigComp id={i} />\n")
    big_code = "".join(parts)
    # Many JSX tags, but no render invocation -> should be False
    codeflash_output = _is_jsx_component_usage(big_code, "BigComp") # 185μs -> 9.51μs (1848% faster)

def test_large_scale_render_present_but_component_tag_missing():
    # Large code with many render() calls but no JSX tags for the target component should return False.
    parts = []
    for i in range(1000):
        parts.append(f"render(someOtherComponent({i}));\n")  # many render calls but not with JSX
    big_code = "".join(parts)
    codeflash_output = _is_jsx_component_usage(big_code, "NonExistentComp") # 14.0μs -> 3.21μs (338% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-pr1561-2026-02-24T21.52.21 and push.

@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Feb 24, 2026