
Deep Audio Analyzer: a Framework to Industrialize the Research on Audio Forensics (2310.19081v1)

Published 29 Oct 2023 in cs.SD and eess.AS

Abstract: Deep Audio Analyzer is an open source speech framework that aims to simplify the research and development of neural speech processing pipelines, allowing users to conceive, compare, and share results in a fast and reproducible way. This paper describes the core architecture, which is designed to support several tasks of common interest in the audio forensics field, and shows how new tasks can be created to customize the framework. With Deep Audio Analyzer, forensic examiners (e.g., from law enforcement agencies) and researchers will be able to visualize audio features, easily evaluate the performance of pretrained models, and create, export, and share new audio analysis workflows by combining deep neural network models with a few clicks. One advantage of this tool is that it speeds up research and practical experimentation in audio forensics analysis, while also improving experimental reproducibility through the export and sharing of pipelines. All features are developed as modules that the user accesses through a graphical user interface.

Index Terms: Speech Processing, Deep Learning Audio, Deep Learning Audio Pipeline creation, Audio Forensics.
