Automatic Heap Layout Manipulation for Exploitation (1804.08470v2)

Published 23 Apr 2018 in cs.CR and cs.PL

Abstract: Heap layout manipulation is integral to exploiting heap-based memory corruption vulnerabilities. In this paper we present the first automatic approach to the problem, based on pseudo-random black-box search. Our approach searches for the inputs required to place the source of a heap-based buffer overflow or underflow next to heap-allocated objects that an exploit developer, or automatic exploit generation system, wishes to read or corrupt. We present a framework for benchmarking heap layout manipulation algorithms, and use it to evaluate our approach on several real-world allocators, showing that pseudo-random black box search can be highly effective. We then present SHRIKE, a novel system that can perform automatic heap layout manipulation on the PHP interpreter and can be used in the construction of control-flow hijacking exploits. Starting from PHP's regression tests, SHRIKE discovers fragments of PHP code that interact with the interpreter's heap in useful ways, such as making allocations and deallocations of particular sizes, or allocating objects containing sensitive data, such as pointers. SHRIKE then uses our search algorithm to piece together these fragments into programs, searching for one that achieves a desired heap layout. SHRIKE allows an exploit developer to focus on the higher level concepts in an exploit, and to defer the resolution of heap layout constraints to SHRIKE. We demonstrate this by using SHRIKE in the construction of a control-flow hijacking exploit for the PHP interpreter.

Summary

Automatic Heap Layout Manipulation for Exploitation: An Overview

The paper "Automatic Heap Layout Manipulation for Exploitation" addresses a specific problem in automatic exploit generation (AEG): arranging the heap so that a heap-based buffer overflow or underflow can reach the data an exploit needs to read or corrupt. The authors contribute both a formulation of the problem and working tools that automate a task usually performed by hand by exploit developers.

Heap layout manipulation (HLM) matters for exploitation because it determines where heap-allocated objects sit relative to one another, and therefore which data a given vulnerability can read or corrupt. The authors use pseudo-random black-box search to find inputs that place the source of an overflow or underflow next to the object the exploit targets.
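The idea can be sketched on a toy model. The following is a minimal illustration, not the authors' implementation: `ToyHeap` is a hypothetical first-fit allocator, the fragmented start state, the size choices, and the names `random_ops`, `run`, and `search` are all assumptions made for this example. The search generates random sequences of filler allocations and frees around the two allocations of interest and stops when the destination lands directly after the source.

```python
import random

class ToyHeap:
    """First-fit allocator over a flat address space (a toy, not a real allocator)."""

    def __init__(self, size=4096):
        self.free = [(0, size)]   # sorted list of (start, length) free chunks
        self.live = {}            # object id -> (start, length)

    def alloc(self, oid, length):
        for i, (start, flen) in enumerate(self.free):
            if flen >= length:
                rest = (start + length, flen - length)
                self.free[i:i + 1] = [rest] if rest[1] else []
                self.live[oid] = (start, length)
                return start
        raise MemoryError("toy heap exhausted")

    def release(self, oid):
        self.free.append(self.live.pop(oid))
        self.free.sort()
        merged = []               # coalesce neighbouring free chunks
        for start, length in self.free:
            if merged and merged[-1][0] + merged[-1][1] == start:
                merged[-1] = (merged[-1][0], merged[-1][1] + length)
            else:
                merged.append((start, length))
        self.free = merged

def initial_heap():
    """Assumed start state: four 32-byte holes separated by live objects."""
    heap = ToyHeap()
    for i in range(8):
        heap.alloc(("init", i), 32)
    for i in range(0, 8, 2):
        heap.release(("init", i))
    return heap

def random_ops(rng, max_len, tag):
    """A random sequence of filler allocations and frees of earlier fillers."""
    ops, live = [], []
    for n in range(rng.randint(0, max_len)):
        if live and rng.random() < 0.3:
            ops.append(("free", live.pop(rng.randrange(len(live)))))
        else:
            ops.append(("alloc", (tag, n), rng.choice((16, 32, 64))))
            live.append((tag, n))
    return ops

def run(candidate):
    """Replay a candidate from the start state; return addr(dst) - addr(src)."""
    heap, addr = initial_heap(), {}
    for op in candidate:
        if op[0] == "free":
            heap.release(op[1])
        else:
            addr[op[1]] = heap.alloc(op[1], op[2])
    return addr["dst"] - addr["src"]

def search(seed=0, budget=20000, max_len=8):
    """Pseudo-random black-box search: sample candidates until one gives distance 32."""
    rng = random.Random(seed)
    for attempt in range(1, budget + 1):
        cand = (random_ops(rng, max_len, "a") + [("alloc", "src", 32)]
                + random_ops(rng, max_len, "b") + [("alloc", "dst", 32)])
        if run(cand) == 32:       # dst starts exactly where the 32-byte src ends
            return attempt, cand
    return None
```

Calling `search()` returns the number of candidates tried and the winning operation sequence, or `None` if the budget runs out. The search never inspects the allocator's internals: it only replays candidates and measures the resulting distance, which is what makes it black-box.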

Key Aspects of the Research

Problem Analysis: The paper begins with an analysis of the heap layout manipulation problem, breaking it into its core components. The authors distinguish heap-based from stack-based corruption vulnerabilities, and they examine how allocation order, noise in allocator interactions, and the diversity of allocator implementations affect the difficulty of the HLM problem.
Sieve Framework: The authors introduce Sieve, an open-source framework for benchmarking heap layout manipulation algorithms on different allocators. It runs synthetic benchmarks in which the initial heap state, the allocator implementation, and the manipulation algorithm can each be varied.
Shrike System for PHP: A significant portion of the paper is dedicated to Shrike, a system that automates heap layout manipulation on the PHP interpreter. Starting from PHP's regression tests, Shrike identifies fragments of PHP code that interact with the heap in useful ways, combines these fragments into candidate programs, and searches for a program that achieves the heap layout an exploit requires. The exploit developer states the layout constraints and leaves their resolution to Shrike.
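One way to picture the fragment step is as a small library keyed by the allocator interactions each fragment triggers. The sketch below is an illustration under stated assumptions, not Shrike's actual data model: the `Fragment` type, the deduplication rule (keep the shortest fragment per interaction sequence), the `assemble` helper, and the allocation sizes attached to each PHP snippet are all hypothetical.

```python
import random
from collections import namedtuple

# code: PHP source text; interactions: allocator events the fragment triggers.
Fragment = namedtuple("Fragment", ["code", "interactions"])

def index_fragments(fragments):
    """Keep the shortest fragment for each distinct interaction sequence."""
    best = {}
    for frag in fragments:
        key = tuple(frag.interactions)
        if key not in best or len(frag.code) < len(best[key].code):
            best[key] = frag
    return best

def assemble(index, rng, length):
    """Concatenate randomly chosen fragments into one candidate program.

    Returns the program text and the interaction sequence the sketch expects.
    Variable reuse between fragments is ignored here for brevity.
    """
    pool = sorted(index.values(), key=lambda f: f.code)
    chosen = [rng.choice(pool) for _ in range(length)]
    body = "\n".join(f.code for f in chosen)
    events = [e for f in chosen for e in f.interactions]
    return "<?php\n" + body + "\n", events

# Hypothetical fragments; the sizes are invented, not measured from PHP.
FRAGMENTS = [
    Fragment('$a = str_repeat("A", 100);', [("alloc", 128)]),
    Fragment('$a = str_repeat("AAAA", 25);', [("alloc", 128)]),
    Fragment('$img = imagecreate(8, 8);', [("alloc", 64), ("alloc", 512)]),
    Fragment('unset($a);', [("free", 128)]),
]

index = index_fragments(FRAGMENTS)
program, events = assemble(index, random.Random(0), 4)
```

In a real system each candidate program would be executed under an instrumented interpreter and the resulting layout checked, rather than trusting the recorded interaction sequences as this sketch does.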

Experimental Evaluation

The paper evaluates pseudo-random search for heap layout manipulation on both synthetic benchmarks and real-world scenarios in PHP. The search succeeds most often in noise-free settings; its success rate falls as noise increases and when the allocator uses segregated storage. The authors also use Shrike in the construction of a control-flow hijacking exploit for the PHP interpreter, which demonstrates the practical potential of the approach.
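The effect of noise can be reproduced in a toy experiment. The sketch below rests on simplifying assumptions that are not taken from the paper: the heap is a row of equal-sized slots, every allocation takes the lowest free slot, and noise is modeled as `noise` extra allocations that unavoidably happen just before the destination is allocated. The function names and parameters are hypothetical.

```python
import random

def distance(candidate, noise, slots=64):
    """Replay a candidate on the slot heap; return slot(dst) - slot(src)."""
    used = [False] * slots
    where = {}

    def alloc(oid):
        slot = used.index(False)      # lowest free slot
        used[slot] = True
        where[oid] = slot

    for op, oid in candidate:
        if op == "free":
            used[where.pop(oid)] = False
            continue
        if oid == "dst":
            for i in range(noise):    # noisy allocations that precede dst
                alloc(("noise", i))
        alloc(oid)
    return where["dst"] - where["src"]

def random_candidate(rng, max_len=6):
    """Random filler allocs and frees, with src and dst placed after each phase."""
    cand, live = [], []
    for target in ("src", "dst"):
        for _ in range(rng.randint(0, max_len)):
            if live and rng.random() < 0.5:
                cand.append(("free", live.pop(rng.randrange(len(live)))))
            else:
                live.append(("fill", len(cand)))
                cand.append(("alloc", live[-1]))
        cand.append(("alloc", target))
    return cand

def success_rate(noise, trials=100, budget=100, seed=0):
    """Fraction of trials in which some candidate puts dst directly after src."""
    rng = random.Random(seed)
    solved = 0
    for _ in range(trials):
        solved += any(distance(random_candidate(rng), noise) == 1
                      for _ in range(budget))
    return solved / trials

if __name__ == "__main__":
    for noise in (0, 1, 2, 4):
        print(noise, success_rate(noise))
```

With no noise, many candidates work because the destination simply takes the slot after the source. With noise, a candidate must first free exactly as many fillers below the source as there are noisy allocations, so that the noise is absorbed by those holes; random candidates satisfy this far less often within the same budget.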

Implications and Speculation

The presented techniques assume a deterministic setting with a known initial heap state. The results suggest that the problem becomes harder as conditions become less controlled, for example when allocator behavior is non-deterministic. The research points to a need for new heuristics and refined search strategies that cope with allocator diversity and noise. The work also suggests a role for hybrid human-machine systems in exploit development, in which the developer supplies the high-level design and the tool resolves the layout constraints.

In the broader context of automatic exploit generation, automating heap layout manipulation removes one of the steps that previously required manual work. Future work may include algorithms that adapt to dynamic environments and use feedback from earlier attempts to refine the search.

Conclusion

The paper offers both theoretical insights and empirical evidence that automatic heap layout manipulation is feasible within specific constraints. By providing foundational tools and methods that significantly alleviate the manual burden traditionally associated with exploit generation, it lays the groundwork for future research into more adaptive and complex environments.