
⚡️ Speed up method TreeSitterAnalyzer.has_return_statement by 10% in PR #1561 (add/support_react)#1670

Open
codeflash-ai[bot] wants to merge 1 commit into add/support_react from codeflash/optimize-pr1561-2026-02-25T21.27.24

Conversation


@codeflash-ai codeflash-ai bot commented Feb 25, 2026

⚡️ This pull request contains optimizations for PR #1561

If you approve this dependent PR, these changes will be merged into the original PR branch add/support_react.

This PR will be automatically closed if the original PR is merged.


📄 10% (0.10x) speedup for TreeSitterAnalyzer.has_return_statement in codeflash/languages/javascript/treesitter_utils.py

⏱️ Runtime: 1.31 milliseconds → 1.19 milliseconds (best of 250 runs)

📝 Explanation and details

Primary benefit — runtime: The optimized version reduces average wall-clock time by ~10% (1.31 ms -> 1.19 ms). This win comes from reducing per-call allocations and expensive attribute lookups inside the traversal hot-loop, which compounds when traversing large ASTs or calling the routine repeatedly.

What changed (specific optimizations)

  • Pulled the small, constant collection of function-like node types out to a module-level frozenset (_FUNC_LIKE_NODE_TYPES). That avoids re-creating the tuple on every _node_has_return call.
  • Reused local variables for child lists inside the loop (body_children = body_node.children and child_nodes = current.children) to avoid repeated attribute lookups.
  • Kept traversal logic identical; only micro-optimizations that reduce Python overhead were applied.

Why this speeds things up (Python-level rationale)

  • Allocation cost: In the original code func_types = ("function_declaration", ...) creates a new tuple on every call. Even tiny allocations matter when the traversal loops many times. Moving it to a module-level constant removes that allocation.
  • Membership cost: With a frozenset, current.type in _FUNC_LIKE_NODE_TYPES is an O(1) hash lookup instead of a linear scan over a tuple; in a hot loop this helps, and it also avoids the per-call construction overhead.
  • Attribute lookup reduction: Accessing attributes like current.children or body_node.children repeatedly triggers attribute lookups (and possibly C-to-Python boundary work for tree-sitter Node objects). Binding them once to a local name reduces that overhead inside the loop.
  • Lower Python interpreter overhead: Fewer temporary objects and fewer attribute lookups reduce work inside the hot while/stack loop, which is where most time is spent (the annotated tests and profiler show the traversal loop dominates).
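The allocation and membership costs can be probed with a quick micro-benchmark like the one below (illustrative, not from the PR). One caveat worth measuring around: CPython can constant-fold a literal tuple of string constants into the code object, so the per-call allocation cost depends on how the tuple is built; the membership-cost difference between tuple scan and frozenset lookup is the more reliable effect.

```python
import timeit

_TYPES = frozenset({"function_declaration", "function_expression",
                    "arrow_function", "method_definition"})

def per_call_tuple(t="statement_block"):
    # Tuple named inside the function, as in the original code.
    func_types = ("function_declaration", "function_expression",
                  "arrow_function", "method_definition")
    return t in func_types  # linear scan

def module_constant(t="statement_block"):
    return t in _TYPES  # hash lookup against a module-level constant

n = 1_000_000
print(f"per-call tuple:   {timeit.timeit(per_call_tuple, number=n):.3f}s")
print(f"module frozenset: {timeit.timeit(module_constant, number=n):.3f}s")
```

Absolute numbers vary by machine and interpreter version; the point is the relative gap on the miss path ("statement_block" is not in the set, so the tuple version scans all four entries).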

How this affects workloads (when you see the benefit)

  • Best for deep/large ASTs and repeated calls: Annotated tests show the largest wins on the deep chain and large-tree tests (16–20% faster), exactly the scenarios where the traversal iterates many nodes and the per-iteration savings add up.
  • Small, trivial functions: A few microbench tests (very small inputs) show negligible differences and a couple are slightly slower by a few percent — this is expected noise/trade-off from different micro-bench timings and is reasonable given the overall runtime improvement on realistic workloads.
  • Hot path safe: The change is local, preserves behavior, and benefits any code that repeatedly calls has_return_statement or walks big trees.

Trade-offs

  • No behavioral changes were introduced. A small number of microbench cases are a bit slower (annotated tests show a few 3–4% regressions), but those are tiny inputs where the traversal work is minimal. This small regression is an acceptable trade-off for the measurable runtime improvement on realistic workloads.

Summary

  • Primary gain: ~10% overall runtime improvement (1.31ms → 1.19ms).
  • How: remove per-call tuple allocation, use a module-level frozenset, and reduce repeated attribute lookups in the hot traversal loop.
  • Good for: large/deep AST traversal and high-call-rate scenarios; negligible or acceptable differences for tiny inputs.
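The attribute-binding idea from the summary generalizes to any hot traversal loop. A minimal sketch (names are illustrative, not from the PR): bind attributes and bound methods to locals once, outside the loop body, so each iteration does less dictionary lookup work.

```python
def count_nodes(root):
    """Count all nodes in a tree whose nodes expose a .children list."""
    stack = [root]
    count = 0
    stack_pop = stack.pop        # bind bound methods once, outside the loop
    stack_extend = stack.extend
    while stack:
        current = stack_pop()
        count += 1
        child_nodes = current.children  # one attribute lookup per node
        if child_nodes:
            stack_extend(child_nodes)
    return count
```

This is the same trade-off the PR makes: no behavioral change, just fewer per-iteration lookups, which only pays off when the loop runs many times.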

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 1040 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Generated Regression Tests:
import pytest  # used for our unit tests
from codeflash.languages.javascript.treesitter_utils import TreeSitterAnalyzer

# -----------------------------------------------------------------------------
# NOTE:
# The TreeSitterAnalyzer.has_return_statement expects a "FunctionNode"-like
# object whose attributes the implementation uses are:
#   - is_generator (bool)
#   - is_arrow (bool)
#   - node (object exposing .type, .children and .child_by_field_name(name))
#
# For unit testing we construct small, deterministic node-like objects that
# implement the minimal API used by has_return_statement/_node_has_return.
# These are not mocks from unittest.mock; they are tiny concrete helper
# classes defined here purely to exercise the real TreeSitterAnalyzer code.
# -----------------------------------------------------------------------------

class NodeLike:
    """
    Minimal node-like object compatible with the TreeSitterAnalyzer expectations.

    It provides:
      - .type: a string indicating the node type (e.g. "return_statement")
      - .children: a list of child nodes
      - .child_by_field_name(name): returns a specific child node by name
    """

    __slots__ = ("type", "children", "_fields")

    def __init__(self, type: str, children=None, fields=None):
        # Set the node type used by the analyzer
        self.type = type
        # Ensure children is always a list for iteration; preserve order
        self.children = list(children) if children else []
        # A mapping of named fields to node objects (for .child_by_field_name)
        self._fields = dict(fields) if fields else {}

    def child_by_field_name(self, name: str):
        # Return the previously stored field or None if missing
        return self._fields.get(name)

class FunctionNodeLike:
    """
    Minimal function-like wrapper expected by has_return_statement.

    Contains:
      - is_generator: True if generator (immediate True result)
      - is_arrow: True if arrow function (may imply implicit return)
      - node: the NodeLike root node representing the function AST
    """
    __slots__ = ("is_generator", "is_arrow", "node")

    def __init__(self, node, is_generator=False, is_arrow=False):
        self.node = node
        self.is_generator = is_generator
        self.is_arrow = is_arrow

# Helper factory functions to build common node shapes used across tests.
def make_return_node():
    # Create a direct return statement node (leaf)
    return NodeLike("return_statement")

def make_statement_block(children=None):
    # Create a statement block with given children
    return NodeLike("statement_block", children=children or [])

def make_function_node_body(child_nodes):
    # Create a function-like node that has a body field referencing a statement_block
    body = make_statement_block(children=child_nodes)
    # The function node's .child_by_field_name("body") should return the body node
    fn_node = NodeLike("function_declaration", children=[body], fields={"body": body})
    return fn_node

def test_generator_function_always_reports_true():
    # Generator functions should immediately return True regardless of node contents.
    empty_node = NodeLike("function_declaration", children=[])
    fn = FunctionNodeLike(node=empty_node, is_generator=True, is_arrow=False)

    analyzer = TreeSitterAnalyzer("javascript")
    # Because is_generator is True, has_return_statement must return True without
    # inspecting the node structure.
    codeflash_output = analyzer.has_return_statement(fn, source="") # 557ns -> 544ns (2.39% faster)

def test_arrow_function_with_expression_body_implicit_return():
    # Arrow function with a non-statement_block body indicates an implicit return.
    # Create a body node that is an expression (e.g., "binary_expression").
    expr_body = NodeLike("binary_expression")
    fn_node = NodeLike("arrow_function", children=[expr_body], fields={"body": expr_body})

    fn = FunctionNodeLike(node=fn_node, is_generator=False, is_arrow=True)
    analyzer = TreeSitterAnalyzer("javascript")

    # Because the body exists and its type is not "statement_block", this is an
    # implicit return and the method must return True.
    codeflash_output = analyzer.has_return_statement(fn, source="x => x + 1") # 1.61μs -> 1.66μs (3.25% slower)

def test_arrow_with_statement_block_body_delegates_to_node_traversal():
    # Arrow function with a statement_block must be inspected for explicit returns.
    # Build a body block containing a return_statement.
    return_node = make_return_node()
    body = make_statement_block(children=[return_node])
    fn_node = NodeLike("arrow_function", children=[body], fields={"body": body})

    fn = FunctionNodeLike(node=fn_node, is_generator=False, is_arrow=True)
    analyzer = TreeSitterAnalyzer("javascript")

    # There is an explicit return inside the statement block, so the result is True.
    codeflash_output = analyzer.has_return_statement(fn, source="() => { return 1; }") # 2.99μs -> 3.11μs (3.99% slower)

def test_regular_function_with_direct_return_in_body():
    # Non-arrow, non-generator function with a return in its body should return True.
    return_node = make_return_node()
    fn_node = make_function_node_body([return_node])  # function_declaration with body->return
    fn = FunctionNodeLike(node=fn_node, is_generator=False, is_arrow=False)
    analyzer = TreeSitterAnalyzer("javascript")

    codeflash_output = analyzer.has_return_statement(fn, source="function f(){ return 3; }") # 2.36μs -> 2.31μs (2.25% faster)

def test_no_return_statement_returns_false():
    # Function has no return statements anywhere; expect False.
    inner_expr = NodeLike("identifier")  # arbitrary node, not a return
    fn_node = make_function_node_body([inner_expr])
    fn = FunctionNodeLike(node=fn_node, is_generator=False, is_arrow=False)
    analyzer = TreeSitterAnalyzer("javascript")

    codeflash_output = analyzer.has_return_statement(fn, source="function noReturn(){ let x = 1; }") # 2.61μs -> 2.73μs (4.18% slower)

def test_arrow_with_missing_body_falls_back_to_node_traversal():
    # Arrow function whose child_by_field_name("body") returns None should fallback
    # to traversal of the provided node.
    # Build a node tree with a top-level return node (but no explicit body field).
    top_return = make_return_node()
    top_node = NodeLike("arrow_function", children=[top_return], fields={})  # no "body" field
    fn = FunctionNodeLike(node=top_node, is_generator=False, is_arrow=True)
    analyzer = TreeSitterAnalyzer("javascript")

    # There is a return reachable by traversal, so expect True.
    codeflash_output = analyzer.has_return_statement(fn, source="() => (malformed)") # 2.09μs -> 2.00μs (4.76% faster)

def test_empty_function_node_with_no_children_returns_false():
    # A function node with no children and not generator/arrow must return False.
    empty_fn_node = NodeLike("function_declaration", children=[], fields={"body": None})
    fn = FunctionNodeLike(node=empty_fn_node, is_generator=False, is_arrow=False)
    analyzer = TreeSitterAnalyzer("javascript")

    codeflash_output = analyzer.has_return_statement(fn, source="function empty(){}") # 1.63μs -> 1.56μs (4.43% faster)

def test_return_in_nested_inner_function_is_detected():
    # The analyzer traverses function bodies, and so a return inside a nested
    # function body that appears in the outer body will be found.
    # Outer body contains an inner function that contains a return.
    inner_return = make_return_node()
    inner_body = make_statement_block(children=[inner_return])
    inner_fn_node = NodeLike("function_expression", children=[inner_body], fields={"body": inner_body})

    # Outer function's body contains the inner function node
    outer_body = make_statement_block(children=[inner_fn_node])
    outer_fn_node = NodeLike("function_declaration", children=[outer_body], fields={"body": outer_body})

    fn = FunctionNodeLike(node=outer_fn_node, is_generator=False, is_arrow=False)
    analyzer = TreeSitterAnalyzer("javascript")

    # Because the traversal descends into function bodies, this will be reported as True.
    codeflash_output = analyzer.has_return_statement(fn, source="function outer(){ function inner(){ return 1; } }") # 2.85μs -> 2.88μs (0.868% slower)

def test_long_chain_with_final_return_detected():
    # Build a deep linear chain of nodes (1000 elements) with a return at the end.
    depth = 1000
    # Start with the final return node
    node = make_return_node()
    # Wrap it in many generic statement_block layers to create a deep tree
    for _ in range(depth):
        node = NodeLike("statement_block", children=[node])

    # Put that deep block as the function body
    fn_node = NodeLike("function_declaration", children=[node], fields={"body": node})
    fn = FunctionNodeLike(node=fn_node, is_generator=False, is_arrow=False)
    analyzer = TreeSitterAnalyzer("javascript")

    # Even with deep nesting, the iterative traversal should find the return.
    codeflash_output = analyzer.has_return_statement(fn, source="function deep(){ /* lots */ }") # 203μs -> 174μs (16.4% faster)

def test_large_tree_without_return_reports_false():
    # Build a large tree (1000 nodes) none of which are return statements.
    # We'll make a balanced-ish tree by repeatedly nesting two children.
    size = 1000

    leaves = [NodeLike("identifier") for _ in range(10)]  # start with some leaves
    nodes = leaves[:]
    # Grow the tree until it has at least 'size' nodes
    while len(nodes) < size:
        left = nodes[-1]
        right = NodeLike("identifier")
        parent = NodeLike("statement_block", children=[left, right])
        nodes.append(parent)

    root = nodes[-1]
    fn_node = NodeLike("function_declaration", children=[root], fields={"body": root})
    fn = FunctionNodeLike(node=fn_node, is_generator=False, is_arrow=False)
    analyzer = TreeSitterAnalyzer("javascript")

    # No return nodes in this large tree: must be False.
    codeflash_output = analyzer.has_return_statement(fn, source="function big(){ /* lots of non-return nodes */ }") # 282μs -> 234μs (20.4% faster)

def test_repeated_calls_are_deterministic_under_load():
    # Call has_return_statement many times to ensure deterministic behavior and
    # no stateful side effects across calls.
    return_node = make_return_node()
    fn_node = make_function_node_body([return_node])
    fn = FunctionNodeLike(node=fn_node, is_generator=False, is_arrow=False)
    analyzer = TreeSitterAnalyzer("javascript")

    # Call the method 1000 times and ensure it always yields True.
    for _ in range(1000):
        codeflash_output = analyzer.has_return_statement(fn, source="function repeated(){ return 1; }") # 609μs -> 605μs (0.650% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-pr1561-2026-02-25T21.27.24 and push.


codeflash-ai bot added the labels ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) on Feb 25, 2026.