Minimal demo for representing a Python repository as a searchable code graph.
Default target: the current directory (.).
- AST nodes: File, Class, Function, Method, Import
- Call graph edges: CALLS
- CFG edges: CFG_ENTRY, NEXT, TRUE_BRANCH, FALSE_BRANCH, LOOP_BODY, LOOP_BACK, LOOP_EXIT
- DFG nodes/edges: VariableDef, VariableUse, DEFINES, USES, DATA_FLOW
- Query demo: keyword search over nodes plus one-hop incoming/outgoing neighbors
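The query behavior listed above (keyword match plus one-hop neighbors) can be sketched roughly as follows. This is an illustration only, not the code in query.py: the tiny graph, the key names ("id", "type", "name", "source", "target"), and the function name search_with_neighbors are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical in-memory graph. The key names used here are assumptions,
# not the demo's confirmed schema.
NODES = [
    {"id": "f1", "type": "File", "name": "main.py"},
    {"id": "fn1", "type": "Function", "name": "main"},
    {"id": "fn2", "type": "Function", "name": "build_code_graph"},
    {"id": "fn3", "type": "Function", "name": "export_graph"},
]
EDGES = [
    {"source": "fn1", "target": "fn2", "type": "CALLS"},
    {"source": "fn2", "target": "fn3", "type": "CALLS"},
]


def search_with_neighbors(nodes, edges, keyword):
    """Return nodes whose name contains keyword, with their one-hop neighbors."""
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for edge in edges:
        outgoing[edge["source"]].append((edge["type"], edge["target"]))
        incoming[edge["target"]].append((edge["type"], edge["source"]))

    needle = keyword.lower()
    results = []
    for node in nodes:
        # Case-insensitive substring match on the node name.
        if needle in node["name"].lower():
            results.append({
                "node": node,
                "incoming": incoming[node["id"]],
                "outgoing": outgoing[node["id"]],
            })
    return results


if __name__ == "__main__":
    for hit in search_with_neighbors(NODES, EDGES, "build_code_graph"):
        print(hit["node"]["name"], hit["incoming"], hit["outgoing"])
```

Indexing edges by source and target once keeps each neighbor lookup cheap, which matters when the same graph is queried repeatedly.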
Build the base graph:
python3 main.py --src . --out output/code_graph.json
Build the graph with CFG and DFG:
python3 main.py --src . --out output/code_graph_full.json --with-cfg --with-dfg
Query the graph:
python3 query.py --graph output/code_graph_full.json --q build_code_graph
The exported JSON has three top-level fields:
- metadata: source root, language, enabled features, resolved/unresolved/ambiguous call counts
- nodes: code entities such as files, classes, functions, statements, variables
- edges: relationships such as containment, imports, calls, control flow, data flow
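As a rough sketch of how an export with these three fields might be inspected, the example below parses a small hand-written sample and counts nodes and edges by type. Only the top-level field names (metadata, nodes, edges) come from the description above. The nested keys, the CONTAINS edge label, and the helper summarize_graph are assumptions for illustration.

```python
import json
from collections import Counter

# Hypothetical miniature export. Only the three top-level field names are
# taken from the README; every nested key and the CONTAINS label are assumed.
SAMPLE_EXPORT = """
{
  "metadata": {"source_root": ".", "language": "python"},
  "nodes": [
    {"id": "f1", "type": "File"},
    {"id": "fn1", "type": "Function"},
    {"id": "fn2", "type": "Function"}
  ],
  "edges": [
    {"source": "f1", "target": "fn1", "type": "CONTAINS"},
    {"source": "fn1", "target": "fn2", "type": "CALLS"}
  ]
}
"""


def summarize_graph(graph):
    """Count nodes and edges by their (assumed) "type" field."""
    missing = {"metadata", "nodes", "edges"} - graph.keys()
    if missing:
        raise ValueError(f"missing top-level fields: {sorted(missing)}")
    return {
        "nodes": Counter(n.get("type", "unknown") for n in graph["nodes"]),
        "edges": Counter(e.get("type", "unknown") for e in graph["edges"]),
    }


if __name__ == "__main__":
    print(summarize_graph(json.loads(SAMPLE_EXPORT)))
```

To try this against a real export, replace json.loads(SAMPLE_EXPORT) with json.load on the opened output file and adjust the key names to whatever the exporter actually writes.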
Generated JSON files are ignored by git. Regenerate them with the commands above.
python3 -m unittest discover
python3 -m py_compile ast_extractor.py call_graph_builder.py cfg_builder.py dfg_builder.py exporter.py graph_builder.py graph_schema.py main.py query.py repo_scanner.py

- Python only
- Static approximation only
- CALLS, CFG, and DFG are lightweight demos, not compiler-grade analysis
- Does not handle complex dynamic dispatch, aliases, reflection, closures, global/nonlocal scope, or polymorphism
- Attribute calls through variables, such as node.to_dict(), resolve only when a local type hint, constructor assignment, self attribute assignment, or module alias makes the target clear
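To make the last limitation concrete, here is a minimal sketch of one of the listed cases: resolving var.method() through a constructor assignment in the same function. It is not the logic in call_graph_builder.py. The sample source, the class GraphNode, and the helper resolve_attribute_calls are hypothetical, and the approach is flow-insensitive (it ignores assignment order and reassignment).

```python
import ast

# Hypothetical input: `node` is bound by a constructor call, `other` is not.
SOURCE = """
class GraphNode:
    def to_dict(self):
        return {}

def export(items):
    node = GraphNode()
    node.to_dict()
    other.to_dict()
"""


def resolve_attribute_calls(source):
    """Pair each `var.method()` call with `Class.method`, or None if unclear."""
    tree = ast.parse(source)
    # Classes defined in this source, mapped to their method names.
    methods = {
        cls.name: {f.name for f in cls.body if isinstance(f, ast.FunctionDef)}
        for cls in ast.walk(tree) if isinstance(cls, ast.ClassDef)
    }
    results = []
    for func in ast.walk(tree):
        if not isinstance(func, ast.FunctionDef):
            continue
        # Record `name = ClassName(...)` assignments inside this function.
        var_types = {}
        for stmt in ast.walk(func):
            if (isinstance(stmt, ast.Assign)
                    and isinstance(stmt.value, ast.Call)
                    and isinstance(stmt.value.func, ast.Name)
                    and stmt.value.func.id in methods):
                for target in stmt.targets:
                    if isinstance(target, ast.Name):
                        var_types[target.id] = stmt.value.func.id
        # Resolve `var.attr(...)` only when var has a recorded class.
        for call in ast.walk(func):
            if (isinstance(call, ast.Call)
                    and isinstance(call.func, ast.Attribute)
                    and isinstance(call.func.value, ast.Name)):
                var, attr = call.func.value.id, call.func.attr
                cls = var_types.get(var)
                resolved = f"{cls}.{attr}" if cls and attr in methods[cls] else None
                results.append((f"{var}.{attr}", resolved))
    return results


if __name__ == "__main__":
    for call, target in resolve_attribute_calls(SOURCE):
        print(call, "->", target or "unresolved")
```

Here node.to_dict resolves because node is assigned from GraphNode(), while other.to_dict stays unresolved because nothing in the function reveals the type of other.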
- Use tree-sitter for multi-language parsing
- Store/query graphs with NetworkX or Neo4j
- Add embedding-based semantic retrieval
- Expand retrieved context with graph traversal
- Convert retrieved subgraphs into prompts for Graph RAG code generation
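As one possible shape for the graph-traversal item above, the sketch below expands a set of retrieved seed nodes by a bounded number of hops using breadth-first search. Nothing here exists in the demo: the function expand_context, the max_hops parameter, and the sample edge names are all hypothetical, and edges are treated as undirected for simplicity.

```python
from collections import deque

# Hypothetical call edges as (source, target) pairs.
CALL_EDGES = [
    ("main", "build_code_graph"),
    ("build_code_graph", "scan_repo"),
    ("scan_repo", "read_file"),
    ("read_file", "open_path"),
]


def expand_context(seed_ids, edges, max_hops=2):
    """Return {node_id: hop_distance} for nodes within max_hops of any seed."""
    # Build an undirected adjacency map.
    neighbors = {}
    for source, target in edges:
        neighbors.setdefault(source, set()).add(target)
        neighbors.setdefault(target, set()).add(source)

    distance = {node_id: 0 for node_id in seed_ids}
    queue = deque(seed_ids)
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue  # Do not expand beyond the hop limit.
        for nxt in neighbors.get(current, ()):
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return distance


if __name__ == "__main__":
    print(expand_context(["build_code_graph"], CALL_EDGES, max_hops=2))
```

The hop limit bounds how much surrounding code is pulled in, which is the main knob when the expanded subgraph is later turned into a prompt.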