Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
121 tokens/sec
GPT-4o
9 tokens/sec
Gemini 2.5 Pro Pro
47 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

A new approach to content-based file type detection (1002.3174v3)

Published 17 Feb 2010 in cs.LG and cs.AI

Abstract: File type identification and file type clustering may be difficult tasks that have an increasingly importance in the field of computer and network security. Classical methods of file type detection including considering file extensions and magic bytes can be easily spoofed. Content-based file type detection is a newer way that is taken into account recently. In this paper, a new content-based method for the purpose of file type detection and file type clustering is proposed that is based on the PCA and neural networks. The proposed method has a good accuracy and is fast enough.

Citations (66)

Summary

We haven't generated a summary for this paper yet.