Look Who's Talking: Inferring Speaker Attributes from Personal Longitudinal Dialog (1904.11610v1)

Published 25 Apr 2019 in cs.CL and cs.AI

Abstract: We examine a large dialog corpus obtained from the conversation history of a single individual with 104 conversation partners. The corpus consists of half a million instant messages, across several messaging platforms. We focus our analyses on seven speaker attributes, each of which partitions the set of speakers, namely: gender; relative age; family member; romantic partner; classmate; co-worker; and native to the same country. In addition to the content of the messages, we examine conversational aspects such as the time messages are sent, messaging frequency, psycholinguistic word categories, linguistic mirroring, and graph-based features reflecting how people in the corpus mention each other. We present two sets of experiments predicting each attribute using (1) short context windows; and (2) a larger set of messages. We find that using all features leads to gains of 9-14% over using message text only.

Authors (4)
  1. Charles Welch (19 papers)
  2. Jonathan K. Kummerfeld (38 papers)
  3. Rada Mihalcea (131 papers)
  4. Verónica Pérez-Rosas (15 papers)
Citations (15)