r-HUMO: A Risk-Aware Human-Machine Cooperation Framework for Entity Resolution with Quality Guarantees (1803.05714v3)

Published 15 Mar 2018 in cs.HC and cs.DB

Abstract: Even though many approaches have been proposed for entity resolution (ER), it remains very challenging to find one with quality guarantees. To this end, we propose a risk-aware HUman-Machine cOoperation framework for ER, denoted by r-HUMO. Built on the existing HUMO framework, r-HUMO similarly enforces both precision and recall levels by partitioning an ER workload between the human and the machine. However, r-HUMO is the first solution to optimize the process of human workload selection from a risk perspective. It iteratively selects human workload based on real-time risk analysis of human-labeled results as well as prespecified machine metrics. In this paper, we first introduce the r-HUMO framework and then present the risk analysis technique used to prioritize instances for manual labeling. Finally, we empirically evaluate r-HUMO's performance on real data. Our extensive experiments show that r-HUMO is effective in enforcing quality guarantees, and compared with the state-of-the-art alternatives, it can achieve better quality control with reduced human cost.

Authors (5)
  1. Boyi Hou (5 papers)
  2. Qun Chen (28 papers)
  3. Zhaoqiang Chen (7 papers)
  4. Youcef Nafa (5 papers)
  5. Zhanhuai Li (9 papers)
Citations (11)