
Unleashing the Power of LLM to Infer State Machine from the Protocol Implementation (2405.00393v4)

Published 1 May 2024 in cs.CR

Abstract: State machines are essential for enhancing protocol analysis to identify vulnerabilities. However, inferring state machines from network protocol implementations is challenging due to complex code syntax and semantics. Traditional dynamic analysis methods often miss critical state transitions due to limited coverage, while static analysis faces path explosion issues. To overcome these challenges, we introduce a novel state machine inference approach utilizing LLMs, named ProtocolGPT. This method employs retrieval-augmented generation (RAG) to enhance a pre-trained model with specific knowledge from protocol implementations. Through effective prompt engineering, we accurately identify and infer state machines. To the best of our knowledge, our approach is the first state machine inference technique that leverages the source code of protocol implementations. Our evaluation on six protocol implementations shows that our method achieves a precision of over 90%, outperforming the baselines by more than 30%. Furthermore, integrating our approach with protocol fuzzing improves coverage by more than 20% and uncovers two 0-day vulnerabilities that the baseline methods miss.
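The abstract describes a retrieval-augmented pipeline: relevant portions of a protocol implementation's source code are retrieved and supplied to a pre-trained LLM, which is then prompted to emit the protocol's state machine. The sketch below illustrates that general idea only; it is not the paper's ProtocolGPT implementation. The naive keyword-based retrieval, the `gpt-4o` model name, the prompt wording, and the JSON transition format are all assumptions made for illustration.

```python
# Minimal illustrative sketch of RAG-style state machine inference over a
# protocol implementation's source tree. Retrieval, model, and prompt details
# here are placeholders, not the paper's actual pipeline.
import json
import pathlib

from openai import OpenAI  # assumes the official `openai` Python client

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def retrieve_snippets(repo_root: str, query_terms: list[str], top_k: int = 5) -> list[str]:
    """Naive lexical retrieval: rank C source files by how often state-related
    terms appear and return the top-k files' (truncated) contents."""
    scored = []
    for path in pathlib.Path(repo_root).rglob("*.c"):
        text = path.read_text(errors="ignore")
        score = sum(text.count(term) for term in query_terms)
        if score:
            scored.append((score, text[:4000]))  # truncate to keep the prompt small
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


def infer_transitions(repo_root: str) -> list[dict]:
    """Ask the model for state transitions as JSON, grounded in retrieved code."""
    snippets = retrieve_snippets(repo_root, ["state", "switch", "transition", "handshake"])
    prompt = (
        "The following C code is taken from a network protocol implementation.\n"
        "Identify the protocol state machine it implements. Respond only with a "
        'JSON list of objects of the form {"from": ..., "event": ..., "to": ...}.\n\n'
        + "\n\n".join(snippets)
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
    )
    # A production pipeline would validate the model output before parsing.
    return json.loads(response.choices[0].message.content)


if __name__ == "__main__":
    # Example: point at a local checkout of a protocol implementation such as strongSwan.
    for t in infer_transitions("./strongswan"):
        print(f'{t["from"]} --{t["event"]}--> {t["to"]}')
```

A real system would chunk and embed the code for semantic retrieval rather than count keywords, and would cross-check the emitted transitions against the implementation before feeding them to a stateful fuzzer.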
