3.5 Musketeers to Reshape Data Lake

“Storage is very cheap, so you can store anything; compute is super scalable, so why bother updating or deleting a block in a file? Just make a new file.”

“Schemas and data modeling are unnecessarily rigid; JSON and schemaless are the future of big data.”

“Everybody can directly use the raw data to derive insights, generate features, and train models; big data has democratized data for everybody, so no more hassle with data engineering or data warehouses.”

Databricks’ Delta Lake: Staging -> Conformed -> Aggregate/Feature Tier
