Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
126 tokens/sec
GPT-4o
47 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Retrieval Augmented Generation using Engineering Design Knowledge (2307.06985v10)

Published 13 Jul 2023 in cs.CL, cs.DB, and cs.IR

Abstract: Aiming to support Retrieval Augmented Generation (RAG) in the design process, we present a method to identify explicit, engineering design facts - {head entity :: relationship :: tail entity} from patented artefact descriptions. Given a sentence with a pair of entities (based on noun phrases) marked in a unique manner, our method extracts the relationship that is explicitly communicated in the sentence. For this task, we create a dataset of 375,084 examples and fine-tune LLMs for relation identification (token classification) and elicitation (sequence-to-sequence). The token classification approach achieves up to 99.7 % accuracy. Upon applying the method to a domain of 4,870 fan system patents, we populate a knowledge base of over 2.93 million facts. Using this knowledge base, we demonstrate how LLMs are guided by explicit facts to synthesise knowledge and generate technical and cohesive responses when sought out for knowledge retrieval tasks in the design process.

Citations (3)

Summary

We haven't generated a summary for this paper yet.