⚡️ Speed up function find_react_components by 26% in PR #1561 (add/support_react)#1682

Open
codeflash-ai[bot] wants to merge 1 commit into add/support_react from codeflash/optimize-pr1561-2026-02-27T00.23.56

Conversation


@codeflash-ai codeflash-ai bot commented Feb 27, 2026

⚡️ This pull request contains optimizations for PR #1561

If you approve this dependent PR, these changes will be merged into the original PR branch add/support_react.

This PR will be automatically closed if the original PR is merged.


📄 26% (0.26x) speedup for find_react_components in codeflash/languages/javascript/frameworks/react/discovery.py

⏱️ Runtime: 7.34 milliseconds → 5.82 milliseconds (best of 250 runs)

📝 Explanation and details

Runtime improvement: the optimized code reduces end-to-end runtime from ~7.34 ms to ~5.82 ms — a 26% speedup — by removing Python-level work and repeated allocations in the hot path.

What changed (concrete optimizations)

  • Cached source bytes: added an lru_cache-backed _encode_source(source) so repeated source.encode("utf-8") calls reuse the same bytes object instead of allocating/encoding every time.
  • Faster hook extraction: replaced the Python-level regex iteration + seen-set loop with HOOK_EXTRACT_RE.findall(...) then list(dict.fromkeys(...)) to deduplicate while preserving first-seen order. This shifts most work into C (re.findall and dict construction) and removes per-match Python bookkeeping.
  • Cheap early-exit for memo checks: added a fast substring check ("memo(" and "React.memo") to skip the more expensive AST-parent walk and repeated slice+decode operations when memo is not present in the source.
  • Minor micro-alloc reduction: switched some ephemeral lists to tuples where appropriate (e.g., memo_patterns) and removed duplicated encode calls elsewhere.

Why these changes speed things up

  • Avoiding repeated .encode calls eliminates expensive per-function memory allocations and Python function-call overhead. The original profiler showed significant time in source.encode() sites (e.g., _extract_props_type, _function_returns_jsx). Caching the encoded bytes eliminates these hotspots when the same source string is inspected multiple times (typical when scanning many functions in one file).
  • Using regex.findall and dict.fromkeys moves the heavy lifting into C implementations (re engine and dict internals), cutting Python loop/branch overhead. The line profiler shows _extract_hooks_used time dropped substantially.
  • The substring check for memo presence is O(n) at C speed and avoids the common-case cost of doing tree/parent inspection and repeated byte-slicing/decoding for every function when memo is not used in the file.
  • Together these changes reduce per-function overhead in the main loop of find_react_components, which is where most time is spent for large files.
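The `findall` + `dict.fromkeys` idiom from the hook-extraction bullet looks like this in isolation. The regex below is an illustrative stand-in; the PR does not show the actual `HOOK_EXTRACT_RE` pattern.

```python
import re

# Illustrative stand-in for HOOK_EXTRACT_RE: matches useXxx calls,
# optionally with a type-parameter list before the parentheses.
HOOK_EXTRACT_RE = re.compile(r"\b(use[A-Z]\w*)\s*(?:<[^>]*>)?\s*\(")

src = "useA(); useB<Type>(); useA(); useState(0);"

# findall does the matching loop in C; dict.fromkeys deduplicates
# while preserving first-seen order (dicts keep insertion order).
hooks = list(dict.fromkeys(HOOK_EXTRACT_RE.findall(src)))
print(hooks)  # ['useA', 'useB', 'useState']
```

Compared with a Python-level `for match in finditer(...)` loop with a `seen` set, this keeps both the matching and the deduplication inside C implementations.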

How this affects real workloads / hot paths

  • find_react_components is used during project-wide discovery and in downstream analyzers (see integration tests). When scanning large files with many functions (the realistic hot path), per-function overhead dominates; these changes reduce that overhead, so the largest wins are for big files or many functions in a single source (the annotated large-scale tests show the biggest improvement: ~34% in that test).
  • Small files or single-function files still benefit (microsecond-level wins) but the biggest impact is when the analyzer processes hundreds of functions in one source — exactly the scenario exercised by the large-scale annotated test and the integration flows that call find_react_components.

Which tests / cases benefit most

  • Large-scale detection and deduping tests (thousands of functions, many repeated hook patterns) get the largest absolute wins because of eliminated allocations and cheaper hook extraction.
  • Any test or real workload that repeatedly slices/decodes source bytes for props/memo detection benefits from the cached encoded bytes.
  • Small, early-exit scenarios (files with "use server") are unaffected functionally and still return quickly.

Behavioral/implementation notes and trade-offs

  • Semantics preserved: the changes do not change detection logic; they only change how data is extracted (same regex, same tree checks).
  • Memory trade-off: lru_cache(maxsize=32) will keep recent encoded source bytes alive (small, bounded memory increase). This is an intentional and reasonable trade-off for eliminating repeated encodings in the common case of scanning many functions from the same file.
  • The early substring check is conservative: it only avoids the AST/decoding work when memo-like identifiers are absent; when present, the full checks still run so detection correctness is unchanged.

Summary

  • Primary benefit: 26% runtime reduction (7.34 ms → 5.82 ms) by cutting Python-level loops and repeated allocations in the hot path.
  • Changes are low-risk, preserve behavior, and give the biggest improvements on large files and workloads that scan many functions in the same source (the common case for project analysis).

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 10 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Generated Regression Tests
from pathlib import Path

# imports
import pytest  # used for our unit tests
from codeflash.languages.javascript.frameworks.react import discovery
from codeflash.languages.javascript.frameworks.react.discovery import (
    ComponentType, ReactComponentInfo, find_react_components)
from codeflash.languages.javascript.treesitter import (FunctionNode,
                                                       TreeSitterAnalyzer)

# NB: We will create small helper objects to emulate the minimal parts of a tree-sitter Node
# that the discovery module expects. These are minimal, focused on the attributes/methods used
# by the code under test (child_by_field_name, children, type, parent, start_byte, end_byte,
# next_named_sibling). We keep these helpers extremely small and deterministic so tests remain stable.

class _MiniNode:
    """Minimal node-like object implementing the small API that discovery._node_contains_jsx
    and related helpers expect. This is not a full tree-sitter Node; it's a tiny structural
    shim used only inside tests to exercise logic that inspects node.type and children.
    """
    def __init__(self, type_, children=None, parent=None, start_byte=0, end_byte=0):
        self.type = type_
        # children should be a list of _MiniNode
        self.children = children or []
        self.parent = parent
        self._children_by_field = {}
        self.start_byte = start_byte
        self.end_byte = end_byte
        # next_named_sibling is occasionally used; set by test if needed
        self.next_named_sibling = None

    def child_by_field_name(self, name: str):
        # Support only the "body", "parameters", and "function" lookups used in discovery
        return self._children_by_field.get(name)

    def set_child_field(self, name: str, node):
        self._children_by_field[name] = node
        if node is not None:
            node.parent = self

# Helper to construct a FunctionNode using the real dataclass from the project.
def _make_function_node(
    *,
    name: str,
    is_method: bool = False,
    is_arrow: bool = False,
    source_text: str = "",
    node: _MiniNode | None = None,
    start_line: int = 1,
    end_line: int = 1,
) -> FunctionNode:
    """Create a FunctionNode dataclass instance with sensible defaults."""
    # Fill all required dataclass fields. We keep many flags False for simplicity.
    return FunctionNode(
        name=name,
        node=node or _MiniNode("function_declaration"),
        start_line=start_line,
        end_line=end_line,
        start_col=0,
        end_col=0,
        is_async=False,
        is_method=is_method,
        is_arrow=is_arrow,
        is_generator=False,
        class_name=None,
        parent_function=None,
        source_text=source_text,
        doc_start_line=None,
        is_exported=False,
    )

def test_skips_file_with_use_server_directive():
    # If the file begins with a "use server" directive, find_react_components must return an empty list
    src = "'use server'\n\nexport function MyComp() { return <div/> }"
    analyzer = TreeSitterAnalyzer("javascript")  # real analyzer instance; we don't use its parsing here
    # We don't need to monkeypatch analyzer because the function returns early
    codeflash_output = find_react_components(src, Path("some_file.tsx"), analyzer); result = codeflash_output # 2.84μs -> 2.83μs (0.035% faster)
    assert result == []

def test_finds_hook_functions_without_parsing_jsx():
    # Hooks are recognized purely by name (useXxx) and not by return type, so we can detect them
    src = """
    export function useMyThing() {
        const [s, setS] = useState(0);
        useEffect(() => {});
    }
    """
    analyzer = TreeSitterAnalyzer("javascript")
    # Monkeypatch the analyzer to return a single FunctionNode representing the hook.
    # Using the actual TreeSitterAnalyzer instance (real class) but assigning a custom function
    # as an attribute is a lightweight test-time configuration and keeps downstream logic deterministic.
    fn = _make_function_node(
        name="useMyThing",
        is_method=False,
        is_arrow=False,
        source_text=src,
        start_line=2,
        end_line=6,
    )

    # Replace the find_functions method on the real analyzer instance with a deterministic lambda.
    analyzer.find_functions = lambda source, include_methods=False, include_arrow_functions=True, require_name=True: [fn]

    codeflash_output = find_react_components(src, Path("hooks.tsx"), analyzer); comps = codeflash_output # 18.7μs -> 15.6μs (19.8% faster)
    comp = comps[0]

def test_non_pascal_and_method_functions_are_excluded():
    # Create three functions:
    # - non-pascal function name (should be ignored)
    # - a PascalCase but marked as method (should be ignored)
    # - a valid hook (should be included)
    src = """
    function myfunc() { return <div/> }  // lowercase name - not a component
    class C { MyMethod() { return <span/> } } // method - should be skipped by include_methods=False in analyzer
    function useThing() { useLayoutEffect(() => {}); } // hook - should be included
    """
    analyzer = TreeSitterAnalyzer("javascript")

    non_pascal = _make_function_node(name="myfunc", is_method=False, source_text="function myfunc() { }")
    method_like = _make_function_node(name="MyMethod", is_method=True, source_text="class C { MyMethod(){ } }")
    hook = _make_function_node(name="useThing", is_method=False, source_text="function useThing(){ useLayoutEffect(); }")

    # Ensure analyzer returns all three; find_react_components should filter appropriately
    analyzer.find_functions = lambda source, include_methods=False, include_arrow_functions=True, require_name=True: [
        non_pascal,
        method_like,
        hook,
    ]

    codeflash_output = find_react_components(src, Path("edge.tsx"), analyzer); comps = codeflash_output # 17.7μs -> 14.9μs (19.2% faster)
    assert [c.function_name for c in comps] == ["useThing"]

def test_detects_memo_wrapping_by_source_string_and_jsx_return():
    # This test verifies that memo-detection works both via AST parent call_expression
    # and via textual patterns like "memo(MyComp)".
    # We will rely on the textual pattern because creating full call_expression parents is more involved.
    src_template = """
    import React, { memo } from "react";

    export const MyComp = (props: Props) => <div>{props.children}</div>;

    export default memo(MyComp);
    """
    analyzer = TreeSitterAnalyzer("javascript")

    # Construct a node structure where node.child_by_field_name("body") returns a node
    # which contains a child with a JSX type so _function_returns_jsx returns True.
    jsx_node = _MiniNode("jsx_element", children=[])
    body_node = _MiniNode("parenthesized_expression", children=[jsx_node])
    func_node = _MiniNode("arrow_function")
    func_node.set_child_field("body", body_node)

    fn = _make_function_node(
        name="MyComp",
        is_method=False,
        is_arrow=True,
        source_text='(props: Props) => <div>{props.children}</div>',
        node=func_node,
        start_line=3,
        end_line=3,
    )

    # Analyzer returns our single arrow component
    analyzer.find_functions = lambda source, include_methods=False, include_arrow_functions=True, require_name=True: [fn]

    codeflash_output = find_react_components(src_template, Path("memoed.tsx"), analyzer); comps = codeflash_output # 17.0μs -> 16.0μs (5.99% faster)
    comp = comps[0]

def test_large_scale_many_hooks_and_components():
    # Build a large list of FunctionNode objects to ensure find_react_components handles many items.
    # We'll create 800 hooks (useXxx) and 200 components (PascalCase). Hooks are easier because they don't
    # require JSX detection. For components, we'll mark them arrow functions and provide a simple body node
    # containing a jsx_element type so _function_returns_jsx sees them as returning JSX.
    analyzer = TreeSitterAnalyzer("javascript")

    functions = []

    # Generate 800 hooks named useHook0 .. useHook799
    for i in range(800):
        name = f"useHook{i}"
        src_text = f"function {name}() {{ useState(); }}"
        fn = _make_function_node(name=name, is_method=False, is_arrow=False, source_text=src_text, start_line=1 + i, end_line=1 + i)
        functions.append(fn)

    # Generate 200 PascalCase components named Comp0 .. Comp199
    for i in range(200):
        name = f"Comp{i}"
        # create a node with a body that contains a jsx_element to indicate JSX return
        jsx_node = _MiniNode("jsx_element")
        body_node = _MiniNode("block", children=[jsx_node])
        func_node = _MiniNode("function_declaration")
        func_node.set_child_field("body", body_node)
        src_text = f"function {name}() {{ return <div>{i}</div>; }}"
        fn = _make_function_node(name=name, is_method=False, is_arrow=False, source_text=src_text, node=func_node, start_line=1000 + i, end_line=1000 + i)
        functions.append(fn)

    # Keep insertion order deterministic (hooks first, then components)
    functions = list(functions)

    # Monkeypatch analyzer.find_functions to return our pre-built list.
    analyzer.find_functions = lambda source, include_methods=False, include_arrow_functions=True, require_name=True: functions

    src = "// big file with many functions\n" + "\n".join(f"// filler line {i}" for i in range(10))
    codeflash_output = find_react_components(src, Path("bigfile.tsx"), analyzer); comps = codeflash_output # 4.76ms -> 3.55ms (34.2% faster)

    # Verify some invariants: count of hooks and components
    hook_count = sum(1 for c in comps if c.component_type == ComponentType.HOOK)
    arrow_or_func_count = sum(1 for c in comps if c.component_type in (ComponentType.ARROW, ComponentType.FUNCTION))
    assert hook_count == 800
    assert arrow_or_func_count == 200

    # Spot-check a few entries for correctness
    names = {c.function_name for c in comps}
    assert "useHook0" in names
    assert "Comp199" in names
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from pathlib import Path

# Import the module and symbols under test.
import codeflash.languages.javascript.frameworks.react.discovery as discovery
# imports
import pytest  # used for our unit tests
from codeflash.languages.javascript.frameworks.react.discovery import (
    ComponentType, ReactComponentInfo, find_react_components)
# Import the real analyzer and FunctionNode dataclass from the project's treesitter module.
from codeflash.languages.javascript.treesitter import (FunctionNode,
                                                       TreeSitterAnalyzer)

def test_basic_function_component_detection(monkeypatch):
    """
    Basic positive case:
    - A PascalCase standalone function that 'returns JSX' should be reported as a FUNCTION component.
    - Hook calls inside the function source_text should be extracted (unique, in first-seen order).
    - Props type and memo detection are read via patched extractors (to avoid requiring a real tree-sitter Node).
    """
    # Simple source and fake file path
    source = "/* some file */\nconst x = 1;\n"
    file_path = Path("components/MyComp.tsx")

    # Create a real TreeSitterAnalyzer instance (language string is accepted by ctor).
    analyzer = TreeSitterAnalyzer("javascript")

    # Build a FunctionNode instance using the real dataclass constructor.
    # We deliberately set node to None because we will patch internal helpers that would otherwise
    # attempt to access tree-sitter Node APIs.
    fn = FunctionNode(
        name="MyComp",  # PascalCase => candidate component
        node=None,
        start_line=10,
        end_line=20,
        start_col=0,
        end_col=0,
        is_async=False,
        is_method=False,  # standalone function
        is_arrow=False,
        is_generator=False,
        class_name=None,
        parent_function=None,
        # Include source text containing hook calls; the regex should capture useState and useCustom
        source_text="const [s, setS] = useState(0);\nuseCustom<TypeParam>();\nreturn <div/>;",
    )

    # Patch analyzer.find_functions to return our single FunctionNode.
    monkeypatch.setattr(analyzer, "find_functions", lambda *args, **kwargs: [fn])

    # Patch internal helpers to avoid needing a real tree-sitter Node structure.
    # _function_returns_jsx: treat presence of attribute _returns_jsx as the determinant.
    monkeypatch.setattr(discovery, "_function_returns_jsx", lambda f, s, a: getattr(f, "_returns_jsx", True))
    # _extract_props_type: read from an attribute on the FunctionNode if present.
    monkeypatch.setattr(discovery, "_extract_props_type", lambda f, s, a: getattr(f, "_props_type", None))
    # _is_wrapped_in_memo: read from attribute if present.
    monkeypatch.setattr(discovery, "_is_wrapped_in_memo", lambda f, s: getattr(f, "_is_memo", False))

    # Call the function under test.
    codeflash_output = find_react_components(source, file_path, analyzer); components = codeflash_output # 16.9μs -> 15.3μs (10.6% faster)

    comp = components[0]

def test_hook_is_classified_as_hook_and_not_component(monkeypatch):
    """
    A function named with the 'useXxx' pattern that is not a method should be classified as a HOOK.
    Hooks should be reported with component_type == ComponentType.HOOK and returns_jsx == False.
    """
    source = ""
    file_path = Path("hooks/useThing.ts")
    analyzer = TreeSitterAnalyzer("javascript")

    fn = FunctionNode(
        name="useThing",
        node=None,
        start_line=1,
        end_line=3,
        start_col=0,
        end_col=0,
        is_async=False,
        is_method=False,
        is_arrow=False,
        is_generator=False,
        class_name=None,
        parent_function=None,
        source_text="const val = useState(0); useMemo(()=>{});",
    )

    # Ensure analyzer returns our function
    monkeypatch.setattr(analyzer, "find_functions", lambda *args, **kwargs: [fn])

    # Patch helpers similarly to previous test.
    monkeypatch.setattr(discovery, "_function_returns_jsx", lambda f, s, a: getattr(f, "_returns_jsx", False))
    monkeypatch.setattr(discovery, "_extract_props_type", lambda f, s, a: None)
    monkeypatch.setattr(discovery, "_is_wrapped_in_memo", lambda f, s: False)

    codeflash_output = find_react_components(source, file_path, analyzer); components = codeflash_output # 12.2μs -> 11.2μs (8.39% faster)
    comp = components[0]

def test_skips_server_component_file_without_parsing(monkeypatch):
    """
    Files that begin with a 'use server' directive should be skipped entirely,
    and the analyzer.find_functions should not be called.
    """
    # Source that triggers server directive detection in the first five lines
    source = "'use server';\nimport React from 'react';\nfunction Ignored() { return <div/> }"
    file_path = Path("app/page.tsx")
    analyzer = TreeSitterAnalyzer("javascript")

    # If find_functions gets called, fail the test (it should not be called)
    def fail_if_called(*args, **kwargs):
        pytest.fail("analyzer.find_functions should not be called for server component files")

    monkeypatch.setattr(analyzer, "find_functions", fail_if_called)

    codeflash_output = find_react_components(source, file_path, analyzer); components = codeflash_output # 2.85μs -> 2.79μs (2.15% faster)
    assert components == []

def test_non_pascal_case_and_methods_are_ignored(monkeypatch):
    """
    Ensure functions that are not PascalCase or that are class methods are ignored.
    """
    source = ""
    file_path = Path("misc/file.tsx")
    analyzer = TreeSitterAnalyzer("javascript")

    fn_lowercase = FunctionNode(
        name="notPascal",
        node=None,
        start_line=1,
        end_line=2,
        start_col=0,
        end_col=0,
        is_async=False,
        is_method=False,
        is_arrow=False,
        is_generator=False,
        class_name=None,
        parent_function=None,
        source_text="return <span/>;",
    )

    fn_method = FunctionNode(
        name="MyMethod",
        node=None,
        start_line=3,
        end_line=4,
        start_col=0,
        end_col=0,
        is_async=False,
        is_method=True,  # method inside class -> should be ignored
        is_arrow=False,
        is_generator=False,
        class_name="SomeClass",
        parent_function=None,
        source_text="return <div/>;",
    )

    # Analyzer returns both functions
    monkeypatch.setattr(analyzer, "find_functions", lambda *args, **kwargs: [fn_lowercase, fn_method])

    # Patch helpers to treat both as returning JSX if asked
    monkeypatch.setattr(discovery, "_function_returns_jsx", lambda f, s, a: True)
    monkeypatch.setattr(discovery, "_extract_props_type", lambda f, s, a: None)
    monkeypatch.setattr(discovery, "_is_wrapped_in_memo", lambda f, s: False)

    codeflash_output = find_react_components(source, file_path, analyzer); components = codeflash_output # 4.69μs -> 4.68μs (0.235% faster)
    assert components == []

def test_large_scale_detection_and_hook_deduping(monkeypatch):
    """
    Large-scale test: create up to 1000 FunctionNode entries and verify:
    - Only PascalCase standalone functions that 'return JSX' are reported.
    - Hook extraction deduplicates calls and preserves first-seen order.
    - The collection scales to 1000 functions in a deterministic manner.
    """
    num_total = 1000
    # We'll make half of them valid PascalCase components, half invalid
    num_valid = num_total // 2

    source = "// large file\n"
    file_path = Path("big/Many.tsx")
    analyzer = TreeSitterAnalyzer("javascript")

    functions = []
    for i in range(num_total):
        if i < num_valid:
            # Valid PascalCase component names: Comp0, Comp1, ...
            name = f"Comp{i}"
            is_method = False
            is_arrow = (i % 3 == 0)  # some arrows, some normal functions
            # craft source_text to include hook calls with duplicates and generics: useA, useB, useA
            source_text = "useA(); useB<Type>(); useA(); return <div/>;"
            # Mark attributes that our patched helpers will consult
            fn = FunctionNode(
                name=name,
                node=None,
                start_line=i * 2 + 1,
                end_line=i * 2 + 2,
                start_col=0,
                end_col=0,
                is_async=False,
                is_method=is_method,
                is_arrow=is_arrow,
                is_generator=False,
                class_name=None,
                parent_function=None,
                source_text=source_text,
            )
            # set helper attributes that patched helpers will read
            setattr(fn, "_returns_jsx", True)
            setattr(fn, "_props_type", None)
            setattr(fn, "_is_memo", False)
        else:
            # Invalid names (not PascalCase) or methods
            idx = i - num_valid
            name = f"not_comp_{idx}"
            fn = FunctionNode(
                name=name,
                node=None,
                start_line=i * 2 + 1,
                end_line=i * 2 + 2,
                start_col=0,
                end_col=0,
                is_async=False,
                is_method=(idx % 5 == 0),  # some will be methods
                is_arrow=False,
                is_generator=False,
                class_name=None,
                parent_function=None,
                source_text="useA(); return <div/>;",
            )
            # even if they have _returns_jsx True, they should be ignored by PascalCase check or is_method
            setattr(fn, "_returns_jsx", True)
            setattr(fn, "_props_type", None)
            setattr(fn, "_is_memo", False)

        functions.append(fn)

    # Patch analyzer to return our synthetic functions list
    monkeypatch.setattr(analyzer, "find_functions", lambda *args, **kwargs: functions)

    # Patch internal helpers to use the attributes we set on FunctionNode objects.
    monkeypatch.setattr(discovery, "_function_returns_jsx", lambda f, s, a: getattr(f, "_returns_jsx", False))
    monkeypatch.setattr(discovery, "_extract_props_type", lambda f, s, a: getattr(f, "_props_type", None))
    monkeypatch.setattr(discovery, "_is_wrapped_in_memo", lambda f, s: getattr(f, "_is_memo", False))

    codeflash_output = find_react_components(source, file_path, analyzer); components = codeflash_output # 2.49ms -> 2.18ms (14.0% faster)
    assert len(components) == num_valid

    # Components come back in input order, so the first is Comp0
    first_comp = components[0]
    assert first_comp.function_name == "Comp0"
    # All returned items should be either FUNCTION or ARROW depending on is_arrow flag
    for comp_info in components:
        assert comp_info.component_type in (ComponentType.FUNCTION, ComponentType.ARROW)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-pr1561-2026-02-27T00.23.56 and push.

@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Feb 27, 2026