
Hub Star Modeling 2.0 for Medallion Architecture (2504.08788v2)

Published 6 Apr 2025 in cs.DB

Abstract: Data warehousing enables performant access to high-quality data integrated from dynamic data sources. The medallion architecture, a standard for data warehousing, addresses these goals by organizing data into bronze, silver, and gold layers, representing raw, integrated, and fit-to-purpose data, respectively. In terms of data modeling, the bronze layer retains the structure of the source data and adds metadata. The gold layer follows established modeling approaches such as star schema, snowflake schema, and flattened tables. The silver layer, acting as a canonical form, requires a flexible and scalable model to support continuous changes and incremental development. This paper introduces an enhanced Hub Star modeling approach tailored for the medallion architecture, simplifying silver-layer data modeling by generalizing hub and star concepts. This approach has been demonstrated using Databricks and the retail-org sample dataset, with all modeling and transformation scripts available on GitHub.
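
To make the three layers described in the abstract concrete, here is a minimal plain-Python sketch of generic bronze, silver, and gold processing. It does not implement the paper's Hub Star model, and it does not use Databricks or the retail-org dataset. The sample rows, the function names (`to_bronze`, `to_silver`, `to_gold`), and the metadata column names (`_source`, `_loaded_at`) are all hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone

# Hypothetical raw source records (not taken from the retail-org dataset).
source_rows = [
    {"customer_id": "1", "name": " Alice ", "amount": "10.50"},
    {"customer_id": "1", "name": " Alice ", "amount": "4.50"},
    {"customer_id": "2", "name": "Bob", "amount": "7.00"},
]


def to_bronze(rows, source_name):
    """Bronze: keep the source structure unchanged and append load metadata."""
    loaded_at = datetime.now(timezone.utc).isoformat()
    return [{**row, "_source": source_name, "_loaded_at": loaded_at} for row in rows]


def to_silver(bronze_rows):
    """Silver: integrated form with trimmed strings and proper types."""
    return [
        {
            "customer_id": int(r["customer_id"]),
            "name": r["name"].strip(),
            "amount": float(r["amount"]),
        }
        for r in bronze_rows
    ]


def to_gold(silver_rows):
    """Gold: fit-to-purpose table, here the total amount per customer."""
    totals = {}
    for r in silver_rows:
        key = (r["customer_id"], r["name"])
        totals[key] = totals.get(key, 0.0) + r["amount"]
    return [
        {"customer_id": cid, "name": name, "total_amount": total}
        for (cid, name), total in sorted(totals.items())
    ]


bronze = to_bronze(source_rows, "orders.csv")
silver = to_silver(bronze)
gold = to_gold(silver)
```

In this sketch the bronze rows still carry the original string values plus metadata, the silver rows are typed and cleaned, and the gold rows are aggregated for one specific reporting purpose. A real silver layer would also need to handle keys, history, and schema changes, which is the problem the paper's modeling approach addresses.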

Authors (1)
