Development of a Conversation State Prediction System (2107.01462v4)
Abstract: With the evolution of the concept of Speaker diarization using LSTM, it is relatively easier to understand the speaker identities for specific segments of input audio stream data than manually tagging the data. With such a concept, it is highly desirable to consider the possibility of using the identified speaker identities to aid in recognizing the Speaker States in a conversation. In this study, the Markov Chains are used to identify and update the Speaker States for the next conversations between the same set of speakers, to enable identification of their states in the most natural and long conversations. The model is based on several audio samples from natural conversations of three or greater than three speakers in two datasets with overall total error percentages for recognized states being lesser than or equal to 12%. The findings imply that the proposed extension to the Speaker diarization is effective to predict the states for a conversation.
Sponsored by Paperpile, the PDF & BibTeX manager trusted by top AI labs.
Get 30 days freePaper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.