Data lake vs. data warehouse is a conversation many companies are having. Data lakes primarily store raw, unprocessed data, while data warehouses store processed and refined data. Because data warehouses only house processed data, all of the data in a data warehouse has already been used for a specific purpose within the organization. A data lake, by contrast, has less organization and less filtering of its data than a warehouse.

Pentaho CTO James Dixon is generally credited with coining the term "data lake". He describes a data mart (a subset of a data warehouse) as akin to a bottle of water…"cleansed, packaged and structured for easy consumption," while a data lake is more like a body of water in its natural state. Data in a data lake often comes from disparate sources and can include a mix of structured, semi-structured, and unstructured formats. A data lake relies on schema-on-read processing: structure is applied when the data is read, not when it is stored.

Data lakes and data warehouses are useful for different users, and organizations often need both. A data warehouse is used to analyze archived, structured data that has been filtered and processed for a specific purpose. Raw, unstructured data usually requires a data scientist and specialized tools to understand it and translate it for any specific business use.

In the transportation industry, especially in supply chain management, the predictive capability that comes from flexible data in a data lake can bring large benefits, namely cost savings realized by examining data from forms within the transport pipeline.

Here are the differences among the three data-related terms (database, data warehouse, and data lake) in the aspects mentioned: Data: Unlike a data lake, a database and a data warehouse are generally designed to store only data that has been structured.
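
To make the schema-on-read idea more concrete, here is a minimal Python sketch. It is an illustration only, not how any particular data lake product works: the sample records, the `SHIPMENT_SCHEMA` mapping, and the `read_with_schema` helper are all hypothetical names invented for this example. The raw records are stored exactly as they arrive, and a schema is applied only when someone reads them for a specific analysis.

```python
import json

# Raw events as they might land in a data lake: nothing is validated or
# reshaped at write time, so records can differ in shape (hypothetical data).
RAW_EVENTS = [
    '{"shipment_id": "S1", "weight_kg": "12.5", "carrier": "Acme"}',
    '{"shipment_id": "S2", "weight_kg": 7, "notes": "fragile"}',
    '{"sensor": "dock-3", "temp_c": 4.1}',
]

# A schema chosen by the reader for one particular analysis: field -> type.
SHIPMENT_SCHEMA = {"shipment_id": str, "weight_kg": float}


def read_with_schema(raw_lines, schema):
    """Parse raw JSON lines and keep only records that fit the schema."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        if not all(field in record for field in schema):
            continue  # not relevant to this analysis; it still stays in the lake
        # Cast each requested field to the type the schema asks for and
        # drop every field the schema does not mention.
        rows.append({field: cast(record[field]) for field, cast in schema.items()})
    return rows


shipments = read_with_schema(RAW_EVENTS, SHIPMENT_SCHEMA)
print(shipments)
```

A warehouse-style, schema-on-write pipeline would instead run this kind of filtering and casting before storing anything, so the sensor record and the extra `carrier` and `notes` fields would never be kept. In the sketch above they remain available for a different reader with a different schema.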