09/15/2018

What is a Data Lake?

A super-simple explanation for anyone

If you’re even tangentially involved with big data, you know how important it is to find storage for the volumes of data being generated every second. To manage that data, data professionals can use either a data warehouse or a data lake as a repository. To determine which is best for your organization, let’s first define what they are and then compare them.

What is a data lake?
Some mistakenly believe that a data lake is just the 2.0 version of a data warehouse. The two are similar, but they are different tools that should be used for different purposes. James Dixon, the CTO of Pentaho, is credited with naming the concept of a data lake. He uses the following analogy:

“If you think of a datamart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.”

Please select this link to read the complete article from LinkedIn.
