Function Call Analysis

Updated 3 December 2025

Function call analysis is the systematic study of calling relationships between subroutines, foundational for program comprehension and security.
It employs both static and dynamic methodologies, including tools like PyCG, JARVIS, and GNN-augmented techniques, to build precise call graphs.
The analysis supports performance profiling, vulnerability detection, malware classification, and compiler optimizations such as safe inlining.

Function call analysis is the systematic study of calling relationships between subroutines (functions, methods) in software systems. It encompasses the static or dynamic construction of call graphs and uses these graphs for tasks spanning program comprehension, performance profiling, security analysis, and automated tooling. In a call graph, nodes represent functions and directed edges represent potential invocations, i.e., "may-call" relations. Because function-centric composition is universal across languages, function call analysis is foundational to modern software engineering and program analysis.

1. Theoretical Foundations and Importance

A call graph is formally defined as a directed graph $G=(V, E)$ where $V$ is the set of all functions in a program and $E \subseteq V \times V$ contains $(f_i, f_j)$ if $f_i$ may invoke $f_j$ (0803.4025). Self-loops represent recursion, and the graph may be cyclic or acyclic depending on software structure. Precise construction of call graphs is critical for:

Profiling: Informing instrumentation location and cost propagation across call chains.
Vulnerability Propagation: Modeling how taint or defects spread through invocation paths.
Dependency Impact Analysis: Identifying actual use of downstream APIs or libraries for targeted security advisories (Salis et al., 2021).
Software Quality Assessment: Extracting graph-theoretic metrics such as indegree/outdegree distributions, clustering coefficients, and betweenness centrality to prioritize code reviews and refactoring (0803.4025).
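The definition of $G=(V, E)$ above can be made concrete with a minimal sketch that uses Python's standard `ast` module. It assumes a single source string, considers only top-level functions, and resolves only direct calls by bare name; the names `build_call_graph` and `SAMPLE` are illustrative and do not come from any of the cited tools.

```python
import ast


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the top-level functions it may call."""
    tree = ast.parse(source)
    # V: every function defined at module level.
    defined = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
    graph: dict[str, set[str]] = {name: set() for name in defined}
    for node in tree.body:
        if not isinstance(node, ast.FunctionDef):
            continue
        for sub in ast.walk(node):
            # E: only direct calls by bare name, e.g. clean(x).
            # Attribute calls such as x.strip() are ignored in this sketch.
            if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                if sub.func.id in defined:
                    graph[node.name].add(sub.func.id)
    return graph


SAMPLE = '''
def parse(x):
    return clean(x)

def clean(x):
    return x.strip()

def fact(n):
    return 1 if n <= 1 else n * fact(n - 1)
'''

graph = build_call_graph(SAMPLE)
for caller in sorted(graph):
    print(caller, "->", sorted(graph[caller]))
```

Here `fact` receives a self-loop, which is how recursion appears in the graph. Calls through variables, methods, or imports are missed entirely, which is the gap that the analyses in the next section are designed to close.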

2. Methodologies for Static Call Graph Generation

Python: PyCG and JARVIS

PyCG implements a pragmatic, context-insensitive, inter-procedural static analysis. The core assignment graph $\pi$ records possible points-to relations between program identifiers (functions, variables, classes, modules). The approach handles modules, generators, closures, and multiple inheritance, encoding assignment and call-site resolution via IR reduction semantics. The key [call] rule links call expressions to inter-procedural edges in $\pi$ , enabling call graph extraction via fixpoint iteration (Salis et al., 2021).
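The idea of resolving call sites by iterating an assignment graph to a fixpoint can be illustrated with a heavily simplified sketch. This is not PyCG's implementation or API: the function `solve_points_to`, the edge encoding, and the toy program are assumptions made for illustration, and the sketch ignores statement order, scopes, classes, and modules.

```python
def solve_points_to(
    assign_edges: dict[str, set[str]], functions: set[str]
) -> dict[str, set[str]]:
    """Propagate 'may point to' facts along assignment edges until stable."""
    names = set(assign_edges) | functions
    for targets in assign_edges.values():
        names |= targets
    # A function name points to itself; every other identifier starts empty.
    pts = {n: ({n} if n in functions else set()) for n in names}
    changed = True
    while changed:  # fixpoint iteration
        changed = False
        for lhs, rhs_names in assign_edges.items():
            for rhs in rhs_names:
                new = pts[rhs] - pts[lhs]
                if new:
                    pts[lhs] |= new
                    changed = True
    return pts


# Hypothetical program being analyzed:
#   def f(): ...
#   def g(): ...
#   a = f
#   b = a
#   b = g
#   b()
assign_edges = {"a": {"f"}, "b": {"a", "g"}}
pts = solve_points_to(assign_edges, {"f", "g"})
print(sorted(pts["b"]))  # possible callees of the call site b()
```

Because the sketch ignores statement order, the call `b()` resolves to both `f` and `g`, even though only `g` is live at that point in the toy program. A flow-sensitive analysis with strong updates could discard `f`.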

JARVIS advances this methodology through per-function, flow-sensitive type graphs (FTG), enabling strong updates and accurate intra/inter-procedural summaries. Call graph edges are built on-the-fly, modularly, and only for reachable code, improving precision (~0.35), recall (~0.60), and analysis speed (67% faster than PyCG) (Huang et al., 2023).

Enterprise Codebases

For multi-layered C#.NET systems, signature-based extraction traverses class, method, and property signatures across separate code repositories, constructing inter-layer call graphs with up to 78.26% accuracy and significant time savings (Veenendaal et al., 2016).

Higher-Order Languages: Pushdown and Context-Free Analyses

Finite-state analyses (e.g., k-CFA) introduce spurious caller-callee linkages and return-flow pollution. Pushdown analyses such as CFA2 (Vardoulakis et al., 2011) and the state-dependent continuation allocation of "Pushdown Control-Flow Analysis for Free" (Gilray et al., 2015) match calls and returns exactly using context-free summarization, eliminating over-approximation and recovering precise, semantics-preserving graphs. The latter achieves cubic-time complexity via shallowly copied entry-context addresses, outperforming prior approaches in both precision and implementation simplicity.

JavaScript: GNN-Augmented Construction

Graphia frames call graph construction as a multi-edge program graph analyzed via gated graph neural networks. Enriching program graphs with syntactic and semantic identifier edges enables robust link prediction for unresolved call sites, achieving top-5 accuracy of ≥72% in multi-file npm corpus evaluations (Bhuiyan et al., 22 Jun 2025).

3. Dynamic and Hybrid Graph Analyses

Purely static call graphs may over- or under-approximate actual invocation behavior, particularly in dynamic languages. Hybrid approaches merge static and observed dynamic edges, yielding more faithful function-level invocation metrics:

HNII/HNOI Metrics: Hybrid Number of Incoming/Outgoing Invocations computed as the union of static and dynamic edges, weighted by cross-tool confidence. These metrics yield 2–10% improvement in bug prediction F₁ and recall when integrated with standard ML classifiers (Antal et al., 12 May 2024).
Dynamic graphs require instrumentation during representative test suite execution; static graphs ensure coverage but admit false positives.
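As a rough sketch of how static and dynamic edges might be merged, the example below records dynamic edges with the standard `sys.setprofile` hook and counts incoming and outgoing invocations over the union. It is unweighted, so it omits the cross-tool confidence weighting described above. The static edge set, the toy functions, and all helper names are hypothetical.

```python
import sys
from collections import Counter


def helper(x):
    return x + 1


def handler(x):
    return helper(x)


def main():
    callback = handler  # indirect call that a naive static tool may miss
    return callback(1)


def trace_dynamic_edges(entry, known):
    """Run entry() and record (caller, callee) pairs among known functions."""
    edges = set()

    def profiler(frame, event, arg):
        if event == "call" and frame.f_back is not None:
            caller = frame.f_back.f_code.co_name
            callee = frame.f_code.co_name
            if caller in known and callee in known:
                edges.add((caller, callee))

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        entry()
    finally:
        sys.setprofile(previous)  # restore whatever hook was active before
    return edges


known = {"main", "handler", "helper", "legacy"}
# Hypothetical static tool output: it misses main -> handler and reports
# main -> legacy, an edge that the test run never exercises.
static_edges = {("handler", "helper"), ("main", "legacy")}
dynamic_edges = trace_dynamic_edges(main, known)

hybrid_edges = static_edges | dynamic_edges
hnoi = Counter(caller for caller, _ in hybrid_edges)  # outgoing invocations
hnii = Counter(callee for _, callee in hybrid_edges)  # incoming invocations
print(sorted(hybrid_edges))
print(dict(hnoi), dict(hnii))
```

The union keeps the statically reported but unexecuted edge and adds the dynamically observed indirect call, so each function's counts reflect both sources.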

4. Function Call Analysis in Optimization and Compilation

Function call analysis is critical for compiler optimizations, notably function inlining, which must guarantee environmental consonance for safety: the callee's free variables must have the same bindings at the call site as at the point where its closure was created. The graph-reachability-based method (Bergstrom et al., 2013) constructs a unified control-flow and binding graph in which inlining is permitted iff the callee is unique at a call site and no path passes through rebinding nodes for captured free variables. This test achieves safe and scalable inlining, demonstrated in whole-program ML compilers.

Function-call overhead is another quantitative axis. Benchmarks reveal that C and statically typed Cython cost roughly 5–7 ns per call, whereas Python and MATLAB incur ~300 ns, with Octave slower still. The model $T_\mathrm{total}(N) = N(t_\mathrm{func} + t_\mathrm{call})$, where $N$ is the number of calls, $t_\mathrm{func}$ the time spent in the function body, and $t_\mathrm{call}$ the per-call overhead, informs pragmatic workflow choices: code prototyping in high-level languages, selective migration to native routines for hot paths, and judicious vectorization (Gaul, 2012).
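To see what the formula implies in practice, the following sketch plugs the quoted per-call overheads into $T_\mathrm{total}(N)$. The call count, the 20 ns body cost, and the choice of 6 ns as a midpoint of the 5–7 ns range are assumptions chosen for illustration, and the body cost is held equal across languages, which a real workload would not guarantee.

```python
def total_time_s(n_calls: int, t_func_ns: float, t_call_ns: float) -> float:
    """T_total(N) = N * (t_func + t_call), converted from ns to seconds."""
    return n_calls * (t_func_ns + t_call_ns) * 1e-9


N = 10_000_000    # hypothetical number of calls in a hot loop
T_FUNC_NS = 20.0  # hypothetical cost of the function body itself

native_s = total_time_s(N, T_FUNC_NS, 6.0)    # C / typed Cython, midpoint of 5-7 ns
interp_s = total_time_s(N, T_FUNC_NS, 300.0)  # Python / MATLAB, ~300 ns

print(f"native: {native_s:.2f} s, interpreted: {interp_s:.2f} s")
print(f"slowdown: {interp_s / native_s:.1f}x")
```

Under these assumptions the call overhead dominates for a small function body, which is the case where moving a hot path to native code or vectorizing it pays off most.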

5. Applications in Security, Reverse Engineering, and Machine Learning

Security Advisory Logic: Fine-grained call graphs enable package-level risk assessment; for instance, PyCG found that only a fraction of projects actually invoked APIs associated with vulnerabilities, allowing tailored notifications (Salis et al., 2021).
Bypassing Obfuscation: Autonomous Function Call Resolution (AFCR) systematically extracts and resolves hidden calls in canaried JavaScript via AST analysis and targeted harness execution (Oh et al., 22 Jan 2025).
Decompilation Accuracy: Labelled function calls ("tool calls") allow LLMs to retrieve exact literal values from binaries, dramatically improving decompilation fidelity and reconstructive correctness, with SOTA re-executability up to 61.43% on HumanEval-Decompile (Feng et al., 17 Feb 2025).
Malware Classification: Call graph–based node embedding pipelines (using RNN autoencoders and graph kernels) yield malware family detection rates up to 99.41% (Dalton et al., 2020) and, when fused with dynamic process graphs, further strengthen detection F₁ (from ~0.72 to ~0.85–0.94) (Aneja et al., 11 Oct 2025).

6. Graph-Theoretic Properties and Software Quality Assessment

Comprehensive analysis of call graphs uncovers:

Degree Distributions: Indegree follows a power-law; outdegree is exponentially bounded by design for maintainability (0803.4025).
Clustering and Small-World Effects: High clustering coefficients ( $C \gg C_\mathrm{random}$ ) and short average path lengths ( $L \sim O(\log n)$ ) indicate modularity but risk rapid bug propagation.
Scale-Richness: Hubs tend to connect to peripheral nodes, moderating systemic fragility ( $S(G) \approx 0$ , scale-rich rather than scale-free).
Centrality and Assortativity: Few functions dominate betweenness centrality (impact), and the degree correlation reveals architectural layering.
Security Implications: Spectral radius governs epidemic threshold ( $\beta_c = 1 / \lambda_1$ ), identifying codebase fragility regions.
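Some of these quantities can be computed directly from an edge list. The sketch below counts indegree and outdegree and estimates the spectral radius $\lambda_1$ by power iteration, from which $\beta_c = 1/\lambda_1$ follows. It assumes the call graph is treated as undirected and connected, ignores self-loops, and uses a shift by the identity matrix so the iteration also converges on bipartite graphs. The toy edge list and function names are hypothetical.

```python
import math
from collections import Counter


def degree_counts(edges):
    """Return (indegree, outdegree) counters for a directed edge list."""
    indeg = Counter(callee for _, callee in edges)
    outdeg = Counter(caller for caller, _ in edges)
    return indeg, outdeg


def spectral_radius_undirected(edges, iterations=200):
    """Estimate lambda_1 of the symmetrized adjacency matrix A.

    Power iteration runs on A + I, whose largest eigenvalue is lambda_1 + 1.
    """
    nodes = sorted({n for edge in edges for n in edge})
    nbrs = {n: set() for n in nodes}
    for u, v in edges:
        if u != v:  # self-loops (recursion) are ignored in this sketch
            nbrs[u].add(v)
            nbrs[v].add(u)
    x = {n: 1.0 for n in nodes}
    shifted = 0.0
    for _ in range(iterations):
        y = {n: x[n] + sum(x[m] for m in nbrs[n]) for n in nodes}  # (A + I) x
        # Rayleigh quotient x^T (A + I) x / x^T x
        shifted = sum(x[n] * y[n] for n in nodes) / sum(v * v for v in x.values())
        norm = math.sqrt(sum(v * v for v in y.values()))
        x = {n: v / norm for n, v in y.items()}
    return shifted - 1.0


# Hypothetical call graph: main fans out to three workers that all call log.
edges = [
    ("main", "a"), ("main", "b"), ("main", "c"),
    ("a", "log"), ("b", "log"), ("c", "log"),
]
indeg, outdeg = degree_counts(edges)
lam = spectral_radius_undirected(edges)
beta_c = 1.0 / lam
print(dict(indeg), dict(outdeg))
print(f"lambda_1 ~ {lam:.4f}, beta_c ~ {beta_c:.4f}")
```

The toy graph is the complete bipartite graph $K_{2,3}$ once directions are dropped, so the estimate can be checked against the closed form $\lambda_1 = \sqrt{6}$.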

7. Function Call Analysis in LLMs

Function calling within LLMs, both for external tool invocation and structure-steering, fundamentally alters model internal logic. Causal analysis via layer- and token-level interventions confirms a concentrated causal footprint under function calling, leading to hardened decision boundaries and a 135% average boost in malicious-input detection compared to natural-language prompts (Ji et al., 18 Sep 2025). Practical implications include embedding policy as function-call schemas and monitoring middle-layer activations for compliance calibrations.

Function call analysis, via precise call graph construction, context-sensitive and pushdown modeling, hybrid metrics, and targeted tooling, underpins a broad spectrum of program analysis, optimization, security, and machine learning applications. Continuous advances in analysis algorithms and graph-augmented ML ensure that function call analysis remains central to robust, scalable, and secure software development.