Modern Data Warehouse Concepts

Rating: 4.0 (97 reviews)

Fundamentals, Strategies & Architecture options for implementing a Modern Data Warehouse Environment

Created by Sid Inf

Last updated 6/2020

English

English [Auto]

What you'll learn

Why move from a traditional Data Warehouse to a Modern Data Warehouse?
How does an enterprise benefit from moving to a Modern Data Warehouse?
On what basis should we decide to move from a traditional Data Warehouse to a Modern Data Warehouse?
Which common use cases should we weigh when modernizing a traditional Data Warehouse?

Course content

8 sections • 34 lectures • 2h 55m total length

Are there any prerequisites to this course? What are they? • 4:17

What is a Data Warehouse? • 5:03
A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources.
In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.
What are the key characteristics of a Data Warehouse? • 4:59
A common way of introducing data warehousing is to refer to the characteristics of a data warehouse as set forth by William (Bill) Inmon:
• Subject Oriented
• Integrated
• Nonvolatile
• Time Variant
What are the components of a traditional Data Warehouse Architecture? • 8:21
The data warehouse architecture is based on a relational database management system server that functions as the central repository for informational data. Operational data and processing are completely separated from data warehouse processing. This central information repository is surrounded by a number of key components designed to make the entire environment functional, manageable and accessible by both the operational systems that source data into the warehouse and by end-user query and analysis tools.
Typically, the source data for the warehouse comes from the operational applications. As the data enters the warehouse, it is cleaned up and transformed into an integrated structure and format.
The challenges with traditional Data Warehouses • 11:15
Traditional data warehouses face these challenges:
Expensive and time-consuming to set up and operate
Difficult to scale with data growth
Handling data variety
Not compatible with modern use cases
Compliance and security

Data Ingestion Layer • 1:41
This lecture covers the data ingestion methods:
ETL, ELT, batch processing, bulk load, event data processing, and custom code using Java, Python, Perl, PySpark, etc.
What is ETL? • 5:43
ETL is short for extract, transform, load: three database functions that are combined into one tool to pull data out of one database and place it into another. Extract is the process of reading data from a database. In this stage, the data is collected, often from multiple sources of different types. Transform converts the extracted data into the form the target expects, and load writes it into the target database.
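The three ETL stages can be sketched in a few lines of Python. This is only an illustrative sketch, not the tooling used in the course: it assumes two in-memory SQLite databases standing in for the source and target systems, and the table names (orders, fact_orders), the columns, and the sample rows are all hypothetical.

```python
import sqlite3


def extract(source):
    """Extract: read the raw rows out of the source database."""
    return source.execute(
        "SELECT id, customer, amount_cents FROM orders"
    ).fetchall()


def transform(rows):
    """Transform: clean names and convert cents to currency units,
    in application code, before anything reaches the target."""
    return [
        (order_id, customer.strip().title(), cents / 100.0)
        for order_id, customer, cents in rows
    ]


def load(target, rows):
    """Load: write the already-transformed rows into the target."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    target.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    target.commit()


def run_etl():
    # Hypothetical source system with two messy sample rows.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (id INTEGER, customer TEXT, amount_cents INTEGER)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "  alice ", 1250), (2, "BOB", 399)],
    )

    # Hypothetical target warehouse.
    target = sqlite3.connect(":memory:")
    load(target, transform(extract(source)))
    return target.execute(
        "SELECT id, customer, amount FROM fact_orders ORDER BY id"
    ).fetchall()
```

Calling run_etl() returns the cleaned rows as they sit in the target table. The point to notice is that the transformation happens outside the target, between the read and the write.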
What is ELT and ETLT? • 5:09
ELT (extract, load, transform) takes a different approach to data movement. Instead of transforming the data before it is written, ELT leverages the target system to do the transformation. The data is copied to the target and then transformed in place.
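As a rough sketch of the ELT idea, the raw rows can be copied into the target unchanged and then reshaped by the target's own SQL engine. The example assumes an in-memory SQLite database playing the role of the target; the staging table stg_orders, the result table fact_orders, and the sample rows are hypothetical names chosen for illustration.

```python
import sqlite3


def extract_raw():
    """Stand-in for the extract step: hypothetical raw source rows."""
    return [(1, "  alice ", 1250), (2, "BOB", 399)]


def run_elt():
    target = sqlite3.connect(":memory:")

    # Load: copy the raw rows into a staging table without changing them.
    target.execute(
        "CREATE TABLE stg_orders (id INTEGER, customer TEXT, amount_cents INTEGER)"
    )
    target.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", extract_raw())

    # Transform: the target database does the work, in place, using SQL.
    target.execute(
        """
        CREATE TABLE fact_orders AS
        SELECT id,
               LOWER(TRIM(customer)) AS customer,
               amount_cents / 100.0  AS amount
        FROM stg_orders
        """
    )
    target.commit()
    return target.execute(
        "SELECT id, customer, amount FROM fact_orders ORDER BY id"
    ).fetchall()
```

Here no cleaning code runs before the load. The raw data also remains available in the staging table after the transformation, which is one reason this pattern suits targets with plenty of cheap storage and compute.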
What is the difference between ETL and ELT? • 2:10
ETL vs. ELT – What’s the Big Difference?
What are the different types of Data processing methods? • 4:14
Batch
Real Time
Streaming
What is Batch Processing? • 5:13
Let's understand what batch processing of data is.
What is near Real time or Micro Batch Processing? • 5:36
Micro-batch processing runs batch processes on much smaller accumulations of data, typically less than a minute's worth. These small accumulations are called micro batches.
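A minimal sketch of micro batching, assuming events arrive as (timestamp_in_seconds, value) pairs and are grouped into fixed windows. The function names, the 10-second default window, and the summing step are illustrative assumptions, not something prescribed by the course.

```python
from collections import defaultdict


def micro_batches(events, window_seconds=10):
    """Group (timestamp, value) events into fixed-size time windows.

    Returns a list of (window_start, [values]) pairs ordered by window start.
    """
    batches = defaultdict(list)
    for timestamp, value in events:
        # Every event falls into the window that starts at a multiple
        # of window_seconds at or before its timestamp.
        window_start = int(timestamp // window_seconds) * window_seconds
        batches[window_start].append(value)
    return [(start, batches[start]) for start in sorted(batches)]


def process_batch(values):
    """Hypothetical per-batch job: here it simply totals the values."""
    return sum(values)


def run_micro_batches(events, window_seconds=10):
    """Run the batch job once per micro batch."""
    return [
        (start, process_batch(values))
        for start, values in micro_batches(events, window_seconds)
    ]
```

For example, run_micro_batches([(0.5, 1), (3.0, 2), (12.1, 5)]) runs the job twice, once for the window starting at 0 and once for the window starting at 10. A real system would close each window as time passes rather than grouping a finished list, but the batching logic is the same.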
What is Stream processing? • 3:40
Stream processing is the processing of event data in real time.

What is a Data Lake? • 2:41
A Data Lake is a repository where data is stored in its native format.
Where does a Data Lake fit in, and what are the differences? • 10:47
Understand the Data Lake vs Data Warehouse with an example • 7:46
Schema on Read and Schema on Write • 3:25
Differences between different Data Repositories • 6:35
The Co-existence of Data Warehouse and Data Lakes • 3:11
Disadvantages and remediation options of a Data Lake • 4:17
Data Lake Zones and Data Governance in Data Lakes • 14:26

Advanced Analytics • 1:24
Advanced Analytics is the automated or semi-automated analysis of data or content using sophisticated techniques and tools that typically go beyond what traditional business intelligence (BI) can do. It helps to discover deeper insights, make predictions, or generate recommendations.
Advanced analytics techniques include data/text mining, machine learning, pattern matching, forecasting, visualization, semantic analysis, sentiment analysis, network and cluster analysis, graph analysis, simulation, and complex event processing.
Stay tuned for more awesome content! • 0:26

Requirements

Data Warehouse Concepts - Good to have but not mandatory.
Cloud Data Warehouse Concepts - Good to have but not mandatory.

Description

The objective of this course is to learn the fundamentals of the Modern Data Warehouse and the strategies that can be used to move from a traditional Data Warehouse, in combination with Big Data technologies, Data Lakes and Data Visualization.

A Modern Data Warehouse gives us the flexibility to analyze the data we need, in the format we need it, using familiar tools, technologies and concepts such as Big Data, Hadoop, Cloud Ecosystems, BI tools and SQL/NoSQL. A Modern Data Warehouse can consume and process a variety of data formats, including semi-structured, unstructured and multi-structured data. These data sets come in multiple formats and are generated by non-transactional systems such as machines, sensors, and customer interaction streams. These systems are not only varied, they also produce data at volumes, varieties, and velocities we have never seen before. This kind of data is not new. We have had multi-structured data for a long time, but very few organizations could work with it because it was so expensive to store and so hard to connect or link using traditional data warehouse architectures or models.

Who this course is for:

Anybody interested in learning the basics of Modern Data Warehouse Architectures
Data Warehouse and Business Intelligence Professionals
ETL Developers (Informatica PowerCenter, IBM DataStage, Pentaho, Talend, etc.)
Data Analysts
Business Analysts
Data Architects
Enterprise Architects
Data Science Professionals
Data Engineers
BI/Reporting Professionals
Project Managers
ETL/DWH Testers
Analytics Professionals who want to know how Modern Data Warehouse Architectures work
Individual Contributors in the field of Enterprise Business Intelligence / Enterprise Data Warehouse / Data Lake
Mainframe Developers and Architects who want to switch to the Data Warehouse / Modern Data Warehouse / Business Intelligence world

Course content

Introduction • 1 lecture • 4min

Traditional Data Warehouses • 4 lectures • 30min

The Key Players - Source Layer • 2 lectures • 10min

Data Ingestion Layer • 8 lectures • 33min

Data Lake • 8 lectures • 53min

Modern Data Warehouse Architectures • 6 lectures • 21min

What about Data Quality? • 3 lectures • 22min

Data Definitions / Terminologies • 2 lectures • 2min
