
What can Data-Centric AI Learn from Data and ML Engineering? (2112.06439v1)

Published 13 Dec 2021 in cs.LG and cs.DB

Abstract: Data-centric AI is a new and exciting research topic in the AI community, but many organizations already build and maintain various "data-centric" applications whose goal is to produce high-quality data. These range from traditional business data processing applications (e.g., "how much should we charge each of our customers this month?") to production ML systems such as recommendation engines. The fields of data and ML engineering have arisen in recent years to manage these applications, and both include many interesting novel tools and processes. In this paper, we discuss several lessons from data and ML engineering that could be interesting to apply in data-centric AI, based on our experience building data and ML platforms that serve thousands of applications at a range of organizations.

Authors (2)
  1. Neoklis Polyzotis (14 papers)
  2. Matei Zaharia (101 papers)
Citations (45)
