Making Large Language Models Better Knowledge Miners for Online Marketing with Progressive Prompting Augmentation (2312.05276v2)

Published 8 Dec 2023 in cs.AI and cs.LG

Abstract: Nowadays, the rapid development of the mobile economy has promoted the flourishing of online marketing campaigns, whose success greatly hinges on the efficient matching between user preferences and desired marketing campaigns, where a well-established Marketing-oriented Knowledge Graph (dubbed MoKG) can serve as the critical "bridge" for preference propagation. In this paper, we seek to carefully prompt an LLM with domain-level knowledge as a better marketing-oriented knowledge miner for marketing-oriented knowledge graph construction. This is, however, non-trivial, suffering from several inevitable issues in real-world marketing scenarios, i.e., uncontrollable relation generation of LLMs, insufficient prompting ability of a single prompt, and the unaffordable deployment cost of LLMs. To this end, we propose PAIR, a novel Progressive prompting Augmented mIning fRamework for harvesting a marketing-oriented knowledge graph with LLMs. In particular, we reduce pure relation generation to an LLM-based adaptive relation filtering process through a knowledge-empowered prompting technique. Next, we steer LLMs toward entity expansion with progressive prompting augmentation, followed by reliable aggregation that comprehensively considers both self-consistency and semantic relatedness. In terms of online serving, we specialize a small and white-box PAIR (i.e., LightPAIR), which is fine-tuned with a high-quality corpus provided by a strong teacher LLM. Extensive experiments and practical applications in audience targeting verify the effectiveness of the proposed (Light)PAIR.

References (66)
  1. Badr AlKhamissi and Marjan Ghazvininejad. 2022. A Review on Language Models as Knowledge Bases. arXiv preprint arXiv:2204.06031 (2022).
  2. Ask Me Anything: A simple strategy for prompting language models. In ICLR.
  3. Qwen Technical Report. arXiv preprint arXiv:2309.16609 (2023).
  4. Baichuan. 2023. Baichuan 2: Open Large-scale Language Models. arXiv preprint arXiv:2309.10305 (2023).
  5. Freebase: a collaboratively created graph database for structuring human knowledge. In SIGMOD. 1247–1250.
  6. COMET: Commonsense Transformers for Automatic Knowledge Graph Construction. In ACL. 4762–4779.
  7. Marketing Budget Allocation with Offline Constrained Deep Reinforcement Learning. In WSDM. 186–194.
  8. Zero-shot Approach to Overcome Perturbation Sensitivity of Prompts. In ACL. 5698–5711.
  9. Adversarial Learning for Incentive Optimization in Mobile Payment Marketing. In CIKM. 2940–2944.
  10. Consistent Prototype Learning for Few-Shot Continual Relation Extraction. In ACL. 7409–7422.
  11. Crawling The Internal Knowledge-Base of Language Models. In EACL. 1811–1824.
  12. CONTaiNER: Few-Shot Named Entity Recognition via Contrastive Learning. In ACL. 6338–6353.
  13. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL. 4171–4186.
  14. Stephanie deWet and Jiafan Ou. 2019. Finding Users Who Act Alike: Transfer Learning for Expanding Advertiser Audiences. In KDD. 2251–2259.
  15. Chain-of-Verification Reduces Hallucination in Large Language Models. arXiv preprint arXiv:2309.11495 (2023).
  16. GLM: General Language Model Pretraining with Autoregressive Blank Infilling. In ACL. 320–335.
  17. DISCOS: Bridging the Gap between Discourse Knowledge and Commonsense Knowledge. In WWW. 2648–2659.
  18. Christiane Fellbaum. 1998. WordNet: an electronic lexical database. MIT Press.
  19. Linguistic representations for fewer-shot relation extraction across domains. In ACL. 7502–7514.
  20. BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models. In ACL. 5000–5015.
  21. YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia. Artificial Intelligence 194 (2013), 28–61.
  22. Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes. In ACL. 8003–8017.
  23. LoRA: Low-Rank Adaptation of Large Language Models. In ICLR.
  24. Active Retrieval Augmented Generation. arXiv preprint arXiv:2305.06983 (2023).
  25. Aishwarya Kamath and Rajarshi Das. 2019. A Survey on Semantic Parsing. In AKBC.
  26. Neural Architectures for Named Entity Recognition. In NAACL. 260–270.
  27. Exploring the Secrets Behind the Learning Difficulty of Meaning Representations for Semantic Parsing. In EMNLP. 3616–3625.
  28. Explicit Feature Interaction-aware Uplift Network for Online Marketing. In KDD. 4507–4515.
  29. Two-Stage Audience Expansion for Financial Targeting in Marketing. In CIKM. 2629–2636.
  30. Sources of Transfer in Multilingual Named Entity Recognition. In ACL. 8093–8104.
  31. Crosslingual generalization through multitask finetuning. arXiv preprint arXiv:2211.01786 (2022).
  32. Refined Commonsense Knowledge from Large-Scale Web Contents. arXiv preprint arXiv:2112.04596 (2021).
  33. OpenAI. 2023a. Chatgpt: Optimizing language models for dialogue.
  34. OpenAI. 2023b. GPT-4 Technical Report. arXiv preprint arXiv:2303.08774 (2023).
  35. Language Models as Knowledge Bases? In EMNLP. 2463–2473.
  36. Exploring 360-Degree View of Customers for Lookalike Modeling. In SIGIR. 3400–3404.
  37. Soft Gazetteers for Low-Resource Named Entity Recognition. In ACL. 8118–8123.
  38. Commonsense Properties from Query Logs and Question Answering Forums. In CIKM. 1411–1420.
  39. ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning. In AAAI. 3027–3035.
  40. REPLUG: Retrieval-Augmented Black-Box Language Models. arXiv preprint arXiv:2301.12652 (2023).
  41. AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts. In EMNLP. 4222–4235.
  42. ConceptNet 5.5: An Open Multilingual Graph of General Knowledge. In AAAI. 4444–4451.
  43. Head-to-Tail: How Knowledgeable are Large Language Models (LLM)? A.K.A. Will LLMs Replace Knowledge Graphs? arXiv preprint arXiv:2308.10168 (2023).
  44. LLaMA: Open and Efficient Foundation Language Models. arXiv preprint arXiv:2302.13971 (2023).
  45. Llama 2: Open Foundation and Fine-Tuned Chat Models. arXiv preprint arXiv:2307.09288 (2023).
  46. Enhancing Knowledge Graph Construction Using Large Language Models. arXiv preprint arXiv:2305.04676 (2023).
  47. Language Models are Open Knowledge Graphs. arXiv preprint arXiv:2010.11967 (2020).
  48. A Survey of Diversification Techniques in Search and Recommendation. arXiv preprint arXiv:2212.14464 (2022).
  49. S2ynRE: Two-stage Self-training with Synthetic data for Low-resource Relation Extraction. In ACL. 8186–8207.
  50. Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data. arXiv preprint arXiv:2304.01196 (2023).
  51. Vikas Yadav and Steven Bethard. 2018. A Survey on Recent Advances in Named Entity Recognition from Deep Learning models. In ACL. 2145–2158.
  52. Logistics Audience Expansion via Temporal Knowledge Graph. In CIKM. 4879–4886.
  53. Who Would be Interested in Services? An Entity Graph Learning System for User Targeting. In ICDE. 3248–3254.
  54. KG-BERT: BERT for Knowledge Graph Completion. arXiv preprint arXiv:1909.03193 (2019).
  55. Joint Incentive Optimization of Customer and Merchant in Mobile Payment Marketing. In AAAI. 15000–15007.
  56. Commonsense Knowledge Graph towards Super APP and Its Applications in Alipay. In KDD. 5509–5519.
  57. GLM-130B: An Open Bilingual Pre-trained Model. In ICLR.
  58. TransOMCS: From Linguistic Graphs to Commonsense Knowledge. In IJCAI. 4004–4010.
  59. How Language Model Hallucinations Can Snowball. arXiv preprint arXiv:2305.13534 (2023).
  60. Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning. In ICLR.
  61. Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models. arXiv preprint arXiv:2309.01219 (2023).
  62. XSemPLR: Cross-Lingual Semantic Parsing in Multiple Natural Languages and Meaning Representations. In ACL. 15918–15947.
  63. Graph Neural Networks with Generated Parameters for Relation Extraction. In ACL. 1331–1339.
  64. Learning to Expand Audience via Meta Hybrid Experts and Critics for Recommendation and Advertising. In KDD. 4005–4013.
  65. LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities. arXiv preprint arXiv:2305.13168 (2023).
  66. Hubble: An industrial system for audience expansion in mobile marketing. In KDD. 2455–2463.

Summary

  • The paper introduces PAIR, a novel framework that employs adaptive relation filtering and progressive prompting augmentation to enrich marketing-oriented knowledge graphs.
  • The methodology integrates prior knowledge injection and reliable aggregation, achieving around 90.1% accuracy and up to 43.6% novelty in mined entities.
  • The framework offers a scalable, cost-effective solution with its lightweight model LightPAIR, promising improved online marketing outcomes and broader domain applicability.

Making LLMs Better Knowledge Miners for Online Marketing with Progressive Prompting Augmentation

The rapid advancement of the mobile economy has led to the proliferation of online marketing campaigns, wherein success heavily relies on the precise matching between user preferences and marketing campaigns. Central to this process is the effective construction and utilization of Marketing-oriented Knowledge Graphs (MoKGs). The paper "Making LLMs Better Knowledge Miners for Online Marketing with Progressive Prompting Augmentation" introduces a novel framework called PAIR that leverages LLMs for mining marketing-oriented knowledge graphs, thus improving the alignment between user preferences and marketing content.

Key Challenges and Solutions

The paper identifies several critical challenges in using LLMs for marketing knowledge graph construction, including:

  1. Uncontrollable Relation Generation: LLMs may generate irrelevant or fallacious relations when tasked with knowledge mining in marketing scenarios.
  2. Insufficient Prompting Capability: A single prompt often fails to cover the diverse and dynamic scope of marketing knowledge, resulting in suboptimal entity expansion.
  3. High Deployment Costs: The operational cost of deploying LLMs in real-time applications is prohibitively high.

To address these challenges, the authors propose PAIR (Progressive prompting Augmented mIning fRamework). The solution involves:

  1. Adaptive Relation Filtering: Transitioning from pure relation generation to an LLM-based relation filtering process, making knowledge mining more controlled and reliable.
  2. Progressive Prompting Augmentation: Utilizing multiple well-designed prompts to steer the entity expansion process, combined with a robust aggregation mechanism that ensures both self-consistency and semantic relatedness.
  3. Deployment of a Lightweight Model: Introducing a smaller, fine-tuned version of the framework (LightPAIR) that reduces deployment costs and preserves privacy without significantly sacrificing performance; a distillation sketch follows this list.
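To make the distillation idea concrete, below is a minimal sketch, assuming a teacher LLM wrapped behind a `teacher_mine` callable and a small open student model fine-tuned with LoRA adapters; the student model name, prompt template, and helper are illustrative assumptions rather than the paper's actual pipeline.

```python
# Hypothetical LightPAIR-style distillation sketch; names and prompt format
# are assumptions, not the paper's code.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

def build_distillation_corpus(seed_entities, teacher_mine):
    """Collect (prompt, completion) pairs produced by a strong teacher LLM."""
    corpus = []
    for entity in seed_entities:
        prompt = f"List marketing-relevant target entities for '{entity}':"
        completion = teacher_mine(prompt)  # e.g., a call to a GPT-4-class API
        corpus.append({"prompt": prompt, "completion": completion})
    return corpus

# Fine-tune a small white-box student on the teacher corpus with LoRA adapters.
student_name = "Qwen/Qwen-7B"  # placeholder student model
tokenizer = AutoTokenizer.from_pretrained(student_name)
model = AutoModelForCausalLM.from_pretrained(student_name)
model = get_peft_model(model, LoraConfig(task_type=TaskType.CAUSAL_LM, r=8, lora_alpha=16))
# ...a standard supervised fine-tuning loop over the corpus follows...
```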

Key Components and Methodology

Prior Knowledge Injection

Prior knowledge from existing resources like SupKG and descriptive databases such as Wikipedia is injected into LLMs to guide both the relation filtering and entity expansion processes. This prior knowledge helps LLMs overcome their unfamiliarity with domain-specific entities and relations, ensuring more relevant and accurate knowledge extraction.
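As a minimal sketch of what such knowledge-empowered prompting might look like, assuming SupKG facts are available as triples and Wikipedia supplies a short description (the helper and wording below are illustrative, not the paper's templates):

```python
# Illustrative knowledge-injection prompt builder; the wording is an assumption.
def build_knowledge_prompt(entity, supkg_triples, wiki_description):
    """Inject structural (SupKG) and descriptive (Wikipedia) prior knowledge."""
    facts = "; ".join(f"({h}, {r}, {t})" for h, r, t in supkg_triples)
    return (
        f"Known facts about '{entity}': {facts}\n"
        f"Description: {wiki_description}\n"
        f"Based on the above, suggest marketing-relevant knowledge about '{entity}'."
    )

prompt = build_knowledge_prompt(
    "camping tent",
    [("camping tent", "used_for", "outdoor camping")],
    "A tent is a portable shelter used for recreation and travel.",
)
```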

Relation Filtering

To filter relations effectively, PAIR retrieves a subset of potential relations based on the entity type and uses an LLM to suggest relevant relations along with possible target entities. This significantly narrows down the space of relations and makes the subsequent entity expansion more focused and reliable.
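A hedged sketch of this filtering pattern is shown below: candidate relations come from a hypothetical type-indexed schema, an LLM (abstracted as `llm_complete`) marks the applicable ones, and the output is constrained back to the candidate set so generation stays controllable.

```python
# Hypothetical type-indexed relation schema; the types and relations are
# illustrative assumptions, not the paper's schema.
RELATIONS_BY_TYPE = {
    "product": ["used_for", "target_audience", "related_scene"],
    "scene": ["requires_item", "target_audience"],
}

def filter_relations(entity, entity_type, llm_complete):
    """Ask the LLM to keep only relations that plausibly apply to the entity."""
    candidates = RELATIONS_BY_TYPE.get(entity_type, [])
    prompt = (
        f"Entity: {entity}\n"
        f"Candidate relations: {', '.join(candidates)}\n"
        "Return only the relations that plausibly apply, one per line."
    )
    kept = {line.strip() for line in llm_complete(prompt).splitlines()}
    # Constrain the answer to the known schema to avoid uncontrolled relations.
    return [r for r in candidates if r in kept]
```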

Progressive Prompting Augmentation

The core innovation of PAIR lies in its progressive prompting augmentation, in which multiple prompts exploiting different aspects of prior knowledge are used, ranging from structural and descriptive knowledge to suggestions inherited from the relation filtering step. In this way, PAIR ensures a thorough and diversified expansion process that captures a broad spectrum of relevant marketing knowledge.
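One plausible realization, sketched below, issues several prompts for the same (entity, relation) pair, each augmented with a different slice of prior knowledge; the templates are assumptions and the paper's exact prompts may differ.

```python
# Illustrative progressive prompting; templates are assumptions.
def progressive_prompts(entity, relation, supkg_facts, description, suggestions):
    base = f"For entity '{entity}' and relation '{relation}', list target entities."
    return [
        base,                                           # plain prompt
        f"Known facts: {supkg_facts}\n{base}",          # + structural knowledge
        f"Description: {description}\n{base}",          # + descriptive knowledge
        f"Earlier suggestions: {suggestions}\n{base}",  # + inherited suggestions
    ]

def expand_entity(entity, relation, priors, llm_complete):
    """Run every augmented prompt; the answers are aggregated downstream."""
    return [llm_complete(p) for p in progressive_prompts(entity, relation, *priors)]
```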

Reliable Aggregation

The final step involves aggregating the results from multiple prompts, considering both semantic relatedness and self-consistency. This ensures that the mined entities are not only relevant but also agreed upon by multiple reasoning paths within the LLM, thereby enhancing the overall reliability of the knowledge graph.
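A minimal sketch of such an aggregation score follows, combining the fraction of prompts that produced a candidate (self-consistency) with the cosine similarity of embeddings to the source entity (semantic relatedness); the equal weighting and the `embed` function are assumptions.

```python
from collections import Counter

import numpy as np

def aggregate(candidate_lists, source_entity, embed, alpha=0.5, top_k=10):
    """Rank candidates by self-consistency plus semantic relatedness."""
    counts = Counter(c for cands in candidate_lists for c in set(cands))
    src = embed(source_entity)  # `embed` maps text to a numpy vector (assumed)

    def score(candidate):
        consistency = counts[candidate] / len(candidate_lists)
        vec = embed(candidate)
        relatedness = float(np.dot(src, vec) / (np.linalg.norm(src) * np.linalg.norm(vec)))
        return alpha * consistency + (1 - alpha) * relatedness

    return sorted(counts, key=score, reverse=True)[:top_k]
```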

Empirical Evaluation

The paper reports extensive experiments using real-world marketing scenarios. PAIR demonstrates significant improvements over traditional knowledge graph completion and construction methods, including BERT and TRMP for completion, and COMET and LMCRAWL for construction. Key metrics such as accuracy, novelty, and diversity consistently show PAIR's superiority.

In particular, PAIR achieves an accuracy of roughly 90.1% on the test datasets, a substantial improvement over the benchmarks. Notably, the mined entities exhibit substantial novelty (up to 43.6%) and diversity, affirming the framework's effectiveness in enriching the original knowledge graph and contributing unique marketing-related knowledge.

Implications and Future Work

The implications of this work are significant for online marketing and other applications requiring domain-specific knowledge mining. By leveraging LLMs’ inherent capabilities and augmenting them with domain-specific prompts and filtering, the authors provide a framework that is more adaptive and scalable to the needs of real-world marketing.

Future research directions suggested by the authors include further refining the metapath-oriented entity expansion to enhance controllability and explainability, as well as extending the framework to other domains beyond marketing.

Conclusion

PAIR represents a significant step forward in the application of LLMs for constructing marketing-oriented knowledge graphs. By addressing key challenges through adaptive relation filtering and progressive prompting augmentation, PAIR ensures more accurate, diverse, and novel knowledge extraction. This framework, along with its lightweight counterpart LightPAIR, holds promise for significantly improving the efficacy of online marketing campaigns and potentially other knowledge-intensive applications.
