Self-Modifying Pushdown Systems (SM-PDS)

Updated 15 February 2026
  • Self-Modifying Pushdown Systems (SM-PDS) are defined by dynamic rule modifications that extend classical pushdown systems to model self-modifying code behavior.
  • Direct saturation algorithms for reachability (Pre* and Post*) efficiently compute configuration sets despite an exponential state space, ensuring practical analysis.
  • SM-PDS techniques enable effective malware detection, outperforming traditional methods by accurately localizing malicious code blocks and improving analysis speed.

Self-Modifying Pushdown Systems (SM-PDS) are an extension of classical pushdown systems (PDS) designed to model programs that can modify their own transition relations during execution. This formalism provides an analyzable, automata-theoretic abstraction for self-modifying code, which is widely used in malware for obfuscation and evasion. The foundational work on SM-PDS establishes their formal semantics, reachability analysis, and LTL model checking, and demonstrates their practical applicability in malware detection (Touili et al., 2019, Touili et al., 2019).

1. Formal Definition and Semantics

An SM-PDS is formally a quadruple $M = (P, \Gamma, \Delta, \Delta_n)$, where:

  • $P$ is a finite set of control locations (or states).
  • $\Gamma$ is a finite stack alphabet.
  • $\Delta \subseteq P \times \Gamma \times P \times \Gamma^*$ is the standard (non-modifying) transition relation; a rule $(p, \gamma) \mapsto (p', w)$ means that at control $p$, with $\gamma$ atop the stack, pop $\gamma$, push $w$, and move to $p'$.
  • $\Delta_n \subseteq P \times \Delta \times \Delta \times P$ is a finite set of self-modifying rules; a rule $p\#(r_1 \rightarrow r_2)p'$ enables replacement of rule $r_1$ with $r_2$ in the active set of rules, at control $p$, moving to $p'$ without accessing the stack.

A configuration is a tuple $c = (\langle p, w \rangle, \Theta)$, where $p \in P$ is the current control, $w \in \Gamma^*$ is the stack, and $\Theta \subseteq \Delta \cup \Delta_n$ is the current “phase” (active rules). The transition relation $\Rightarrow_M$ covers:

  • Standard step: If $(p, \gamma) \mapsto (p', w') \in \Theta \cap \Delta$, then $(\langle p, \gamma w \rangle, \Theta) \Rightarrow_M (\langle p', w'w \rangle, \Theta)$.
  • Self-modifying step: If $p\#(r_1 \rightarrow r_2)p' \in \Theta \cap \Delta_n$ and $r_1 \in \Theta$, then $(\langle p, w \rangle, \Theta) \Rightarrow_M (\langle p', w \rangle, \Theta')$, where $\Theta' = (\Theta \setminus \{r_1\}) \cup \{r_2\}$.

This semantics generalizes the classical PDS by allowing the automaton to alter its operational rules dynamically, capturing code rewriting behaviors directly (Touili et al., 2019, Touili et al., 2019).
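The two step rules above can be sketched as a successor function over configurations. This is a minimal illustration, not the authors' implementation: the class names `Rule` and `ModRule`, the function `successors`, and the encoding of a stack as a tuple with the top symbol first are all assumptions made here for readability.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterator, Tuple, Union


@dataclass(frozen=True)
class Rule:
    """Standard rule (p, gamma) -> (p2, w): pop gamma, push w, move to p2."""
    p: str
    gamma: str
    p2: str
    w: Tuple[str, ...]


@dataclass(frozen=True)
class ModRule:
    """Self-modifying rule p #(r1 -> r2) p2: replace r1 by r2 in the phase."""
    p: str
    r1: Rule
    r2: Rule
    p2: str


Phase = FrozenSet[Union[Rule, ModRule]]
# (control, stack with the top symbol first, phase) -- hypothetical encoding
Config = Tuple[str, Tuple[str, ...], Phase]


def successors(c: Config) -> Iterator[Config]:
    """Yield every configuration reachable from c in one step."""
    p, stack, theta = c
    for r in theta:
        if isinstance(r, Rule):
            # Standard step: needs a non-empty stack with gamma on top.
            if r.p == p and stack and stack[0] == r.gamma:
                yield (r.p2, r.w + stack[1:], theta)
        elif r.p == p and r.r1 in theta:
            # Self-modifying step: the stack is untouched, the phase changes.
            yield (r.p2, stack, (theta - {r.r1}) | {r.r2})
```

Only rules in the current phase are consulted, so a rule that has been swapped out by a self-modifying step can no longer fire until another step swaps it back in.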

2. Reachability and Decidability

Reachability analysis in SM-PDSs extends standard PDS analysis to account for dynamic rule modifications. For a regular set of initial configurations $C_0$ and a target set $C_F$, define:

  • $\mathrm{Post}^*(C_0) = \{ c' \mid \exists c \in C_0,\ c \Rightarrow_M^* c' \}$,
  • $\mathrm{Pre}^*(C_F) = \{ c \mid \exists c' \in C_F,\ c \Rightarrow_M^* c' \}$.

Both forward (Post*) and backward (Pre*) reachability sets are effectively regular and computable, despite the exponential increase in the number of possible “phases.” The reachability problem for SM-PDS is EXPTIME-complete, stemming from the $2^{|\Delta|+|\Delta_n|}$ bound on phases and a reduction from PDS over exponentially large alphabets (Touili et al., 2019).
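As a worked instance of the phase bound, take hypothetical sizes chosen only for illustration: $|\Delta| = 3$, $|\Delta_n| = 2$, and $|P| = 4$. Since a phase is a subset of $\Delta \cup \Delta_n$, the number of phases and the number of (control, phase) pairs are at most:

$$2^{|\Delta| + |\Delta_n|} = 2^{3+2} = 32, \qquad |P| \cdot 2^{|\Delta| + |\Delta_n|} = 4 \cdot 32 = 128.$$

Each additional rule doubles both bounds, which is the source of the exponential term in the complexity.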

3. Direct Saturation Algorithms

Efficient reachability algorithms for SM-PDS exploit direct automata saturation, extending multi-automata constructions from static PDS to the dynamic setting.

3.1 Backward Reachability (Pre*)

A finite automaton whose states are pairs $(p, \Theta)$ (control and phase) and whose transitions are labeled from $\Gamma \cup \{\varepsilon\}$ is saturated by two rules:

  • (α₁) Standard transition: For $r: (p, \gamma) \mapsto (p_1, w)$ with $r \in \Theta$, if a path $(p_1, \Theta) \xrightarrow{w} q$ exists, add $(p, \Theta) \xrightarrow{\gamma} q$.
  • (α₂) Self-modifying transition: For $r: p\#(r_1 \rightarrow r_2) p_1$, if $\{r, r_2\} \subseteq \Theta$, set $\Theta' = (\Theta \setminus \{r_2\}) \cup \{r_1\}$. If $(p_1, \Theta) \xrightarrow{\gamma} q$, add $(p, \Theta') \xrightarrow{\gamma} q$.

Saturation terminates because the state space of pairs $(p, \Theta)$ is finite (at most $|P| \cdot 2^{|\Delta| + |\Delta_n|}$ states), guaranteeing that Pre* can be computed in $O(\mathrm{poly}(|P|, |\Gamma|) \cdot 2^{|\Delta|+|\Delta_n|} \cdot |\Delta|)$ time (Touili et al., 2019).
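The saturation loop can be sketched as follows. This is an illustrative sketch under stated assumptions, not the published algorithm: the names `Rule`, `ModRule`, `read`, `pre_star`, and `accepts` are hypothetical, the automaton is assumed to have no $\varepsilon$-transitions, target configurations are assumed to have non-empty stacks, and phases are explored lazily (only those that actually appear in the automaton). For the self-modifying rule, the sketch tries both candidate predecessor phases and keeps only those that step forward to exactly $\Theta$ under the semantics of Section 1; that validation is an addition made here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rule:            # standard rule (p, gamma) -> (p2, w)
    p: str
    gamma: str
    p2: str
    w: Tuple[str, ...]


@dataclass(frozen=True)
class ModRule:         # self-modifying rule p #(r1 -> r2) p2
    p: str
    r1: Rule
    r2: Rule
    p2: str


def read(trans, start, word):
    """States reached from `start` by reading `word` (no epsilon moves)."""
    cur = {start}
    for a in word:
        cur = {d for (s, l, d) in trans if s in cur and l == a}
    return cur


def pre_star(rules, mod_rules, transitions):
    """Saturate `transitions`, a set of (src, label, dst) triples.

    States of the form (p, theta) are tuples; any other state is a string.
    """
    trans = set(transitions)
    while True:
        phases = {s[1] for (a, _, b) in trans for s in (a, b)
                  if isinstance(s, tuple)}
        new = set()
        # (alpha_1): standard rules, applied only in phases containing them.
        for theta in phases:
            for r in rules:
                if r in theta:
                    for q in read(trans, (r.p2, theta), r.w):
                        new.add(((r.p, theta), r.gamma, q))
        # (alpha_2): undo a self-modifying step across existing transitions.
        for m in mod_rules:
            for (src, label, q) in trans:
                if not (isinstance(src, tuple) and src[0] == m.p2):
                    continue
                theta = src[1]
                for cand in ((theta - {m.r2}) | {m.r1}, theta | {m.r1}):
                    # Added check: cand must step forward to exactly theta.
                    if m in cand and (cand - {m.r1}) | {m.r2} == theta:
                        new.add(((m.p, cand), label, q))
        if new <= trans:          # fixpoint reached
            return trans
        trans |= new


def accepts(trans, finals, control, stack, theta):
    """True if (control, stack, theta) is recognized by the automaton."""
    return bool(read(trans, (control, theta), stack) & set(finals))
```

Starting from an automaton that recognizes the target set, `pre_star` returns the saturated transition set, and `accepts` then answers membership queries for individual configurations. The lazy treatment of phases avoids enumerating all $2^{|\Delta|+|\Delta_n|}$ subsets up front, although the worst case remains exponential.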

3.2 Forward Reachability (Post*)

A similar automaton-based construction applies, where transitions are saturated according to the applicable rule, including provisions for stack manipulation and self-modification. The method introduces “helper” states as needed and likewise recognizes Post* exactly, terminating in time exponential in the system size (Touili et al., 2019).

3.3 Correctness and Regularity

At each iteration, the saturation algorithms maintain an invariant connecting accepted words to configuration reachability and exploit the closure properties of regular languages. Detailed inductive proofs establish the approach’s correctness. The algorithms avoid reliance on well-quasi-ordering; plain finiteness suffices (Touili et al., 2019).

4. LTL Model Checking and SM-BPDS

Temporal logic analysis of SM-PDS proceeds by reduction to the nonemptiness problem for self-modifying Büchi pushdown systems (SM-BPDS). Given an SM-PDS $\mathcal{P}$, a labeling function $\nu: P \to 2^{At}$, and an LTL formula $\varphi$, the construction produces a synchronized product SM-BPDS $\mathcal{BP}_\varphi$. The state space becomes $P \times Q$, where $Q$ is the state set of the Büchi automaton corresponding to $\varphi$. Standard and self-modifying rules are lifted to the product accordingly.

Acceptance (Büchi acceptance) is defined so that a run is accepting if it visits the accepting set $G = P \times F$ infinitely often, where $F$ is the set of accepting states of the Büchi automaton for $\varphi$ (Touili et al., 2019).

4.1 Emptiness via Head-Reachability Graphs

A head is a pair $(\langle p, \gamma \rangle, \Theta)$. The SM-BPDS accepts from $c$ iff there is a repeating head (reachable from $c$ and lying on a cycle with at least one accepting edge) in a finite, labelled directed “head-reachability graph” $\mathcal{G}$. This graph has $O(|P| \cdot |\Gamma| \cdot 2^{|\Delta|+|\Delta_n|})$ nodes (Touili et al., 2019).

Edges are established by saturation and carry Boolean labels indicating whether the acceptance condition is triggered: an edge is labeled $1$ if an accepting state is visited on the corresponding path. The cost of the entire procedure is polynomial in $|P|$, $|\Gamma|$, and the number of phases (which is itself bounded by $2^{|\Delta|+|\Delta_n|}$), and linear in the size of the Büchi automaton.
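Once the head-reachability graph has been built, finding repeating heads is a plain graph problem. The sketch below assumes the graph is already given as a list of `(source, target, label)` edges with labels `0` or `1`; the function name `repeating_heads` and this edge encoding are hypothetical, and the quadratic reachability computation is chosen for brevity rather than efficiency (a strongly connected component decomposition would be the usual choice at scale).

```python
def repeating_heads(edges):
    """Return the heads lying on a cycle that contains a 1-labelled edge.

    `edges` is an iterable of (source, target, label) with label in {0, 1}.
    """
    edges = list(edges)
    succ = {}
    for u, v, _ in edges:
        succ.setdefault(u, set()).add(v)
    nodes = {u for u, _, _ in edges} | {v for _, v, _ in edges}

    def reach(src):
        # All nodes reachable from src in zero or more steps.
        seen, todo = {src}, [src]
        while todo:
            n = todo.pop()
            for nxt in succ.get(n, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
        return seen

    reach_from = {n: reach(n) for n in nodes}
    result = set()
    for x, y, label in edges:
        # The accepting edge x -> y closes a cycle iff y can get back to x.
        if label == 1 and x in reach_from[y]:
            # n is on such a cycle iff y reaches n and n reaches x.
            result |= {n for n in nodes
                       if n in reach_from[y] and x in reach_from[n]}
    return result
```

Deciding acceptance from a given configuration $c$ additionally requires checking that some repeating head is reachable from $c$, which this sketch leaves out.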

5. Implementation and Experimental Analysis

Both (Touili et al., 2019) and (Touili et al., 2019) describe prototype implementations of these algorithms. Key points include:

  • Construction of SM-PDS models directly from binary code, utilizing disassembly (e.g., via Jakstab) and control-flow graph extraction.
  • Tool pipelines include translation of LTL formulae to Büchi automata (using LTL2BA) and direct emptiness-checking or reachability procedures.
  • Performance comparison across three approaches: direct SM-PDS saturation, translation to a static PDS followed by standard (Moped) analysis, and translation to symbolic PDS.
  • On synthetic benchmarks ($|\Delta| + |\Delta_n|$ up to 1,000), direct SM-PDS saturation is $10\times$ to $1{,}000\times$ faster and uses less memory than the PDS translations. PDS methods become infeasible beyond $|\Delta| + |\Delta_n| \approx 2{,}000$, while the direct method completes in seconds (Touili et al., 2019).
  • In (Touili et al., 2019), LTL model checking on large SM-PDSs (up to 20,000 rules) remains practical, with direct methods running in under a minute, compared to hours or intractable resource use for PDS translations.

6. Applications in Malware Detection

SM-PDS models are directly applicable to the static analysis of self-modifying code, especially malware. Representative findings include:

  • Construction of SM-PDS models that simulate mov-based self-modification in real malware, enabling the reachability of malicious code blocks to be proved, which is not possible with PDS-only models (Touili et al., 2019).
  • Direct application to 13 real malware samples confirmed successful localization of the malicious basic blocks.
  • In (Touili et al., 2019), LTL-based model checking with SM-PDS is applied to 892 malware samples (from VirusShare, MalShare, VX-Heaven, and NGVCK) and 19 benign binaries, using four template LTL formulas for malicious behavior. The tool achieved 100% detection of malware and no false positives on benign code. Average checking times ranged from 0.001 s (small unpackers) to 100 s (complex worms).
  • Comparative antivirus evaluation on 205 NGVCK worms showed the SM-PDS-based analyzer achieved 100% detection, while commercial antivirus engines peaked at approximately 82% and dropped as low as 1.4% depending on the product (Touili et al., 2019).

7. Significance and Impact

SM-PDSs provide a mathematically robust, automata-theoretic basis for analyzing programs with self-modifying behavior. The direct saturation algorithms, which are optimal for the state space blow-up inherent in SM-PDSs, outperform translation-based and symbolic PDS approaches both theoretically and empirically. The methodology enables new static analyses and temporal logic model checking previously infeasible for self-modifying code. The combination of pushdown and self-modification analysis supplies unique capabilities, particularly for malware detection, surpassing both standard program analysis frameworks and widely used security products (Touili et al., 2019, Touili et al., 2019).
