QUEACO: Borrowing Treasures from Weakly-labeled Behavior Data for Query Attribute Value Extraction (2108.08468v3)

Published 19 Aug 2021 in cs.CL, cs.AI, cs.IR, and cs.LG

Abstract: We study the problem of query attribute value extraction, which aims to identify named entities in user queries as attribute values with diverse surface forms and then transform them into canonical forms. The problem consists of two phases: named entity recognition (NER) and attribute value normalization (AVN). However, existing work focuses only on the NER phase and neglects the equally important AVN phase. To bridge this gap, this paper proposes QUEACO, a unified query attribute value extraction system for e-commerce search that covers both phases. Moreover, by leveraging large-scale weakly-labeled behavior data, we further improve extraction performance at lower supervision cost. Specifically, for the NER phase, QUEACO adopts a novel teacher-student network, in which a teacher network trained on the strongly-labeled data generates pseudo-labels that refine the weakly-labeled data used to train a student network. Meanwhile, the teacher network is dynamically adapted using feedback from the student's performance on strongly-labeled data, so as to denoise the weak labels as much as possible. For the AVN phase, we also leverage the weakly-labeled query-to-attribute behavior data to normalize surface form attribute values from queries into canonical forms from products. Extensive experiments on a real-world large-scale e-commerce dataset demonstrate the effectiveness of QUEACO.
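
To make the AVN idea concrete, here is a minimal sketch of how weakly-labeled query-to-attribute behavior data could be used to map surface forms to canonical values. It is not QUEACO's method, which the abstract does not specify in detail. It substitutes a simple frequency vote, and the function names (`build_normalizer`, `normalize`), the `min_count` threshold, and the toy pairs are all hypothetical.

```python
from collections import Counter, defaultdict


def build_normalizer(behavior_pairs, min_count=2):
    """Map each surface form to the canonical value it co-occurs with most often.

    behavior_pairs: iterable of (surface_form, canonical_value) tuples, e.g. a
    query span paired with the attribute value of a product the user engaged
    with. This pairing is an assumption made for illustration.
    """
    counts = defaultdict(Counter)
    for surface, canonical in behavior_pairs:
        counts[surface.lower()][canonical] += 1

    normalizer = {}
    for surface, canon_counts in counts.items():
        canonical, n = canon_counts.most_common(1)[0]
        # Weak labels are noisy, so require a minimum amount of evidence.
        if n >= min_count:
            normalizer[surface] = canonical
    return normalizer


def normalize(surface, normalizer):
    """Return the canonical form for a surface form, or None if unknown."""
    return normalizer.get(surface.lower())


# Toy, made-up behavior data (not taken from the paper).
pairs = [
    ("mk", "Michael Kors"), ("mk", "Michael Kors"), ("mk", "MK Tools"),
    ("blk", "Black"), ("blk", "Black"),
    ("navy", "Blue"),
]
normalizer = build_normalizer(pairs)
```

With this toy data, `normalize("MK", normalizer)` resolves to the majority value, while `"navy"` is left unmapped because a single observation falls below the `min_count` threshold.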

Authors (10)
  1. Danqing Zhang (12 papers)
  2. Zheng Li (326 papers)
  3. Tianyu Cao (16 papers)
  4. Chen Luo (77 papers)
  5. Tony Wu (11 papers)
  6. Hanqing Lu (34 papers)
  7. Yiwei Song (9 papers)
  8. Bing Yin (56 papers)
  9. Tuo Zhao (131 papers)
  10. Qiang Yang (202 papers)
Citations (18)
