Data Warehouse Concepts: Basic to Advanced concepts
4.2 (254 ratings)
Instead of using a simple lifetime average, Udemy calculates a course's star rating by considering a number of different factors such as the number of ratings, the age of ratings, and the likelihood of fraudulent ratings.
2,476 students enrolled
Wishlisted Wishlist

Please confirm that you want to add Data Warehouse Concepts: Basic to Advanced concepts to your Wishlist.

Add to Wishlist

Data Warehouse Concepts: Basic to Advanced concepts

Data Warehouse Concepts: Learn the in BI/Data Warehouse/BIG DATA Concepts from scratch and become an expert.
Bestselling
4.2 (254 ratings)
Instead of using a simple lifetime average, Udemy calculates a course's star rating by considering a number of different factors such as the number of ratings, the age of ratings, and the likelihood of fraudulent ratings.
2,476 students enrolled
Created by Sid Inf
Last updated 5/2017
English
Current price: $10 Original price: $195 Discount: 95% off
1 day left at this price!
30-Day Money-Back Guarantee
Includes:
  • 7.5 hours on-demand video
  • 1 Supplemental Resource
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • In this course, you would be learning all the concepts and terminologies related to the Datawarehouse , such as the OLTP, OLAP, Dimensions, Facts and much more, along with other concepts related to it such as what is meant by Start Schema, Snow flake Schema, other options available and their differences.
  • It also explains how the data is managed with in the Data Warehouse and explains the process of reading and writing data onto the Warehouse. Later in the course you would also learn the basics of Data Modelling and how to start with it logically and physically. You would also learn all the concepts related to Facts, Dimensions, Aggregations and commonly used techniques of ETL.
  • Upon completion of this course, you would have a clear idea about, all the concepts related to the Data Warehouse, that should be sufficient to help you start off with the next step of becoming an ETL developer or Administering the Data warehouse environment with the help of various tools.
  • All the Best and Happy Learning !
View Curriculum
Requirements
  • Basic understanding of the IT (Software) industry
  • Familiar with the basic concept of Database/RDBMS.
Description

In this course, you will learn all the concepts and terminologies related to the Data Warehouse , such as the OLTP, OLAP, Dimensions, Facts and much more, along with other concepts related to it such as what is meant by Start Schema, Snow flake Schema, other options available and their differences. It also explains how the data is managed with in the Data Warehouse and explains the process of reading and writing data onto the Warehouse. Later in the course you would also learn the basics of Data Modelling and how to start with it logically and physically. You would also learn all the concepts related to Facts, Dimensions, Aggregations and commonly used techniques of ETL. Upon completion of this course, you would have a clear idea about, all the concepts related to the Data Warehouse, that should be sufficient to help you start off with the next step of becoming an ETL developer or Administering the Data warehouse environment with the help of various tools. All the Best and Happy Learning !

Who is the target audience?
  • This Course is intended for freshers who are new to the Data Warehouse world, Application/ETL developers, Mainframe develoeprs, database administrators, system administrators, and database application developers who design, maintain, and use data warehouses.
  • Recent graduates looking to get a job in the IT Industry
  • Finance professionals wanting to learn about Data Warehouses for reporting and analysis purposes, businesses looking to update and improve on data storage, analysis, and reporting processes.
Students Who Viewed This Course Also Viewed
Curriculum For This Course
Expand All 123 Lectures Collapse All 123 Lectures 07:27:02
+
Introduction
1 Lecture 06:20

Thank you for choosing this course. In this lecture you would know how to navigate and get the best out of this course. 


All the Best and Happy Learning!! :)

Preview 06:20
+
Brief about the Data warehouse
4 Lectures 21:22

The original intent of the data warehouse was to segregate analytical operations from mainframe transaction processing in order to avoid slowdowns in transaction response times, and minimize the increased CPU costs accrued by running ad hoc queries and creating and distributing reports. Over time, the enterprise data warehouse became a core component of information architectures, and it's now rare to find a mature business that doesn't employ some form of an EDW or a collection of smaller data marts to support business intelligence, reporting and analytics applications.

In this lecture we see what will be the future of Data warehouse in the age of Big Data. 

Preview 04:25

The concept of data warehousing is not hard to understand. The notion is to create a permanent storage space for the data needed to support reporting, analysis, and other BI functions. In this lecture we understand what are the main reasons behind creating a data warehouse and the benefits of it. 

This long list of benefits is what makes data warehousing an essential management tool for businesses that have reached a certain level of complexity.

Preview 05:26

A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources.

In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.


Preview 05:48

A common way of introducing data warehousing is to refer to the characteristics of a data warehouse as set forth by William (Bill) Inmon:

•Subject Oriented

•Integrated

•Nonvolatile

•Time Variant

Preview 05:43

Test your understanding on what is a Data Warehouse

Test your understanding on what is a Data Warehouse
3 questions
+
Business Intelligence
4 Lectures 23:37

Business intelligence (BI) is a technology-driven process for analyzing data and presenting actionable information to help corporate executives, business managers and other end users make more informed business decisions.

What is Business Intelligence?
05:37

Business intelligence (BI) is the use of computing technologies for the identification, discovery and analysis of business data - like sales revenue, products, costs and incomes.

BI technologies provide current, historical and predictive views of internally structured data for products and departments by establishing more effective decision-making and strategic operational insights through functions like online analytical processing (OLAP), reporting, predictive analytics, data/text mining, bench marking and Business Performance Management (BPM). These technologies and functions are often referred to as information management.

Business Intelligence -Extended Explanation
03:34

  • Business operations reporting
  • Forecasting
  • Dashboard
  • Multidimensional analysis
  • Finding correlation among different factors
Uses of Business Intelligence
08:02

The most common tools used for business intelligence are discussed in this lecture.

Tools used for (in) Business Intelligence
06:24

Test your understanding on Business Intelligence

Test your understanding on Business Intelligence
3 questions
+
Data Warehouse Architectures
8 Lectures 32:12

In this lecture we see how the Centralized architecture is set up, in which there exists only one data warehouse which stores all data necessary for the business analysis

Preview 04:46

In a Federated Architecture the data is logically consolidated but stored in separate physical database, at the same or at different physical sites. The local data marts store only the relevant information for a department. 

The amount of data is reduced in contrast to a central data warehouse. The level of detail is enhanced in this kind of model. 

Federated Architecture
03:05

A Multi Tired architecture is a distributed data approach. This process cannot be done in a one step because many sources have to be integrated into a warehouse.

Multi-Tired Architecture
03:13

Different data warehousing systems have different structures. Some may have an ODS (operational data store), while some may have multiple data marts. Some may have a small number of data sources, while some may have dozens of data sources. In view of this, it is far more reasonable to present the different layers of a data warehouse architecture rather than discussing the specifics of any one system.

In general, all data warehouse systems have the following layers:

  • Data Source Layer
  • Data Extraction Layer
  • Staging Area
  • ETL Layer
  • Data Storage Layer
  • Data Logic Layer
  • Data Presentation Layer
  • Metadata Layer
  • System Operations Layer
Components of a Data Warehouse
03:57

This is where data is stored prior to being scrubbed and transformed into a data warehouse / data mart. Having one common area makes it easier for subsequent data processing / integration. Based on the business architecture and design there can be more than one staging area which can be termed with different naming conventions. 

Purpose of a Staging Area in Data Warehouse Architecture - Part 1
04:49

This is where data is stored prior to being scrubbed and transformed into a data warehouse / data mart. Having one common area makes it easier for subsequent data processing / integration. Based on the business architecture and design there can be more than one staging area which can be termed with different naming conventions. 

Purpose of a Staging Area in Data Warehouse Architecture - Part 2
03:41

Data warehouses tend to have a high query success, as they have complete control over the four main areas of data management systems:

  • Clean data
  • Indexes: multiple types
  • Query processing: multiple options
  • Security: data and access
Advantages of Traditional warehouse
02:33

There are considerable disadvantages involved in moving data from multiple, often highly disparate, data sources to one data warehouse that translate into long implementation time, high cost, lack of flexibility, dated information and limited capabilities.

Other disadvantages are discussed in the video. 

Limitations of Traditional Data Warehouses
06:08

Test your understanding on DWH Architectures

Test your understanding on DWH Architectures
1 question
+
ODS - Operational Data Store
4 Lectures 16:37

An ODS is designed for relatively simple queries on small amounts of data (such as finding the status of a customer order), rather than the complex queries on large amounts of data typical of the data warehouse. 

An ODS is similar to your short term memory in that it stores only very recent information; in comparison, the data warehouse is more like long term memory in that it stores relatively permanent information.

What is ODS?
02:26

The typical definition of an operational data store (ODS) is that it’s a set of logically related data structures within a database. The data within an ODS is integrated, volatile and at a non-historical granular level that is designed to address a set of operational functions for a specific business purpose. The ODS must also be based on the enterprise standards for data management for the organization.

Define ODS
07:40

Features and Benefits of ODS
02:24

Please refer to the additional resources of this section which contains the Info-graphic on the differences between the ODS, DWH, OLTP, OLAP, DSS and DM (Data Mart).

Differences between ODS,DWH, OLTP, OLAP, DSS
04:07

Test your understanding on ODS

Test your understanding on ODS Concepts
3 questions
+
OLAP
7 Lectures 28:15

OLAP (Online Analytical Processing) is the technology behind many Business Intelligence (BI) applications. OLAP is a powerful technology for data discovery, including capabilities for limitless report viewing, complex analytical calculations, and predictive “what if” scenario (budget, forecast) planning.

OLAP Overview
05:17

OLTP (On-line Transaction Processing) is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE). The main emphasis for OLTP systems is put on very fast query processing, maintaining data integrity in multi-access environments and an effectiveness measured by number of transactions per second. In OLTP database there is detailed and current data, and schema used to store transnational databases is the entity model (usually 3NF). 

OLAP (On-line Analytical Processing) is characterized by relatively low volume of transactions. Queries are often very complex and involve aggregations. For OLAP systems a response time is an effectiveness measure. OLAP applications are widely used by Data Mining techniques. In OLAP database there is aggregated, historical data, stored in multi-dimensional schemas (usually star schema).

OLTP Vs OLAP - Part 1_U
04:05

OLTP (On-line Transaction Processing) is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE). The main emphasis for OLTP systems is put on very fast query processing, maintaining data integrity in multi-access environments and an effectiveness measured by number of transactions per second. In OLTP database there is detailed and current data, and schema used to store transnational databases is the entity model (usually 3NF). 

OLAP (On-line Analytical Processing) is characterized by relatively low volume of transactions. Queries are often very complex and involve aggregations. For OLAP systems a response time is an effectiveness measure. OLAP applications are widely used by Data Mining techniques. In OLAP database there is aggregated, historical data, stored in multi-dimensional schemas (usually star schema).

OLTP Vs OLAP - Part 2
05:31

MOLAP (multidimensional online analytical processing) is online analytical processing (OLAP) that indexes directly into a multidimensional database.

OLAP Architecture - MOLAP
05:56

Relational online analytical processing (ROLAP) is a form of online analytical processing (OLAP) that performs dynamic multidimensional analysis of data stored in a relational database rather than in a multidimensional database (which is usually considered the OLAP standard).

ROLAP
03:35

HOLAP (hybrid online analytical processing) is a combination of ROLAP (Relational OLAP) and MOLAP (Multidimensional OLAP) which are other possible implementations of OLAP. HOLAP allows storing part of the data in a MOLAP store and another part of the data in a ROLAP store, allowing a tradeoff of the advantages of each.

HOLAP
02:20

DOLAP (Desktop OLAP or Database OLAP)provide multidimensional analysis locally in the client machine on the data collected from relational or multidimensional database servers.

DOLAP
01:31

Test your understanding on OLAP

Test your understanding on OLAP
4 questions
+
Data Mart
6 Lectures 13:52

The data mart is a subset of the data warehouse that is usually oriented to a specific business line or team. Data marts are small slices of the data warehouse. Whereas data warehouses have an enterprise-wide depth, the information in data marts pertains to a single department.

What is a Data Mart?
01:40

Data Warehouse:

  • Holds multiple subject areas
  • Holds very detailed information
  • Works to integrate all data sources
  • Does not necessarily use a dimensional model but feeds dimensional models.

Data Mart:

  • Often holds only one subject area- for example, Finance, or Sales
  • May hold more summarized data (although many hold full detail)
  • Concentrates on integrating information from a given subject area or set of source systems
  • Is built focused on a dimensional model using a star schema.
Fundamental Difference between DWH and DM
00:40

The primary advantages are:

  • Data Segregation: Each box of information is developed without changing the other ones. This boosts information security and the quality of data.
  • Easier Access to Information: These data structures provide a easiest way of interpret the information stored on the database
  • Faster Response: Derived from the adopted structure
  • Simple queries: Based on the structure and size of the data
  • Subject full detailed data: Might also provide summarization of the information
  • Specific to User Needs: This set of data is focused on the end user needs
  • Easy to Create and Mantain
029 Advantages of a Data Mart_U
02:46

  • Easy access to frequently needed data
  • Creates collective view by a group of users
  • Improves end-user response time
  • Ease of creation
  • Lower cost than implementing a full data warehouse
  • Potential users are more clearly defined than in a full data warehouse
  • Contains only business essential data and is less cluttered.
030 Charecteristics of a Data Mart_U
03:37

Like any other system, data marts have many issues including functionality, data size, scalability, performance, data access, and consolidation. Since data marts can be broken into different departments to focus on their individual needs. This approach makes data access, consolidation, and cleansing very difficult.

For instance, when a company has a data mart for each of its departments including sales, inventory, tracking, shipping, receiving, and production. Combining revenue information from each of these departments into a single data mart can be overwhelming and confusing, due to the volume of data to be analyzed.

Disadvantages of a Data Mart
03:01

This lecture talks about the mistakes and the mis-conceptions one have with regard to the Data warehouse. 

Mistakes and MisConceptions of a Data Mart
02:08

Test your understanding on Data marts

Test your understanding on Data marts
3 questions
+
Metadata
7 Lectures 21:00

Metadata is data that describes other data. Meta is a prefix that in most information technology usages means "an underlying definition or description." Metadata summarizes basic information about data, which can make finding and working with particular instances of data easier.

Overview of Metadata
01:50

Metadata provides a number of very important benefits to the enterprise, including:

Consistency of definitions: Metadata contains information about data that helps reconcile the difference in terminology such as "clients" and "customers," "revenue" and "sales," etc.

Clarity of relationships: Metadata helps resolve ambiguity and inconsistencies when determining the associations between entities stored throughout data environment. 

Clarity of data lineage: Metadata contains information about the origins of a particular data set and can be granular enough to define information at the attribute level; metadata may maintain allowed values for a data attribute, its proper format, location, owner, and steward. 

Benefits of Metadata
01:47

Types of Metadata
05:38

In this lecture, we will see what is to be considered for setting up the projects with Metadata. 

Projects on Metadata
05:28

The standard best practices are discussed in this project. Though these may vary from project to project and enterprise to enterprise, these can be considered as the ground rules. 

Best Practices for Metadata Setup
01:30

The standard best practices are discussed in this project. Though these may vary from project to project and enterprise to enterprise, these can be considered as the ground rules. 

Best Practices for Metadata Setup
01:36

A quick summary on what is discussed on Metadata. 

Summary
03:11

Test your understanding on Metadata

Test your understanding on Metadata
3 questions
+
Data Modeling
2 Lectures 05:53

Data modeling is the formalization and documentation of existing processes and events that occur during application software design and development. 

What is Data Modeling?
02:11

Data modeling techniques and tools capture and translate complex system designs into easily understood representations of the data flows and processes, creating a blueprint for construction and/or re-engineering. 

Data Modeling Techniques
03:42
+
Entity Relational Data Model
14 Lectures 35:46

An entity–relationship model (ER model) is a data model for describing the data or information aspects of a business domain or its process requirements, in an abstract way that lends itself to ultimately being implemented in a database such as a relational database.

ER - (Entity Relation) Data Model
03:37

An entity can be a real-world object, either animate or inanimate, that can be easily identifiable. 

For example, in a university database, students, teachers, classes, and courses offered can be considered as entities. All these entities have some attributes or properties that give them their identity.

An entity set is a collection of similar types of entities. An entity set may contain entities with attribute sharing similar values. 

For example, a Students set may contain all the students of a university; likewise a Teachers set may contain all the teachers of a university from all faculties. Entity sets need not be disjoint.

ER Data Model - What is Entity?
02:01

Entity Types: The following types are discussed in this lecture

  • Independent/Fundamental Entity (###b/b###)
  • Dependent/Attributive Entity (Weak)
  • Associative Entity


ER Data Model - Types of Entities - Part 1
03:57

Entity Types: The following types are discussed in this lecture

  • Super type Entity
  • Sub type Entity
ER Data Model - Types of Entities - Part 2
01:49

Entities are represented by means of their properties, called attributes. All attributes have values. 

ER Data Model - Attributes
01:54

  • Simple attribute
  • Composite attribute
  • Derived attribute
  • Single-value attribute 
  • Multi-value attribute
ER Data Model - Types of Attributes
03:59

Key is an attribute or collection of attributes that uniquely identifies an entity among entity set.

  • Super Key
  • Candidate Key
  • Primary Key
ER Data Model - Entity-Set and Keys
02:42

One or more attribute uniquely identifies an instance of an entity.

ER Data Model - Identifier
01:53

The association among entities is called a relationship. For example, an employee works_at a department, a customer enrolls for an offer. Here, Works_at and Enrolls are called relationships.

ER Data Model - Relationship
01:15

ER-Diagram is a visual representation of data that describes how data is related to each other.

ER Data Model - Notation
02:34

A logical data model describes the data in as much detail as possible, without regard to how they will be physical implemented in the database. Features of a logical data model include:

  • Includes all entities and relationships among them.
  • All attributes for each entity are specified.
  • The primary key for each entity is specified.
  • Foreign keys (keys identifying the relationship between different entities) are specified.
  • Normalization occurs at this level.

The steps for designing the logical data model are as follows:

  1. Specify primary keys for all entities.
  2. Find the relationships between different entities.
  3. Find all attributes for each entity.
  4. Resolve many-to-many relationships.
  5. Normalization.
ER Data Model - Logical Data Model
01:30

ER Data Model - Moving from Logical Data Model to Physical Data Model
02:14

Here we compare these three types of data models. 

We can see that the complexity increases from conceptual to logical to physical. This is why we always first start with the conceptual data model (so we understand at high level what are the different entities in our data and how they relate to one another), then move on to the logical data model (so we understand the details of our data without worrying about how they will actually implemented), and finally the physical data model (so we know exactly how to implement our data model in the database of choice). In a data warehousing project, sometimes the conceptual data model and the logical data model are considered as a single deliverable.

ER Data Model - Differences between CDM, LDM and PDM
03:06

Following are advantages of an E-R Model:

  • Straightforward relation representation
  • Easy conversion for E-R to other data model
  • Graphical representation for better understanding

Disadvantages of E-R Data Model:

  • No industry standard for notation
  • Popular for high-level design
  • Software cannot usefully query a general ER model
  • Non-intuitive 
ER Data Model - Disadvantages
03:15

Test your understanding on ER Models

Test your understanding on ER Models
3 questions
8 More Sections
About the Instructor
Sid Inf
4.3 Average rating
1,050 Reviews
6,157 Students
5 Courses
Data/ETL Architect

Business Intelligence Consultant and Trainer with 14+ years of extensive work experience on various client engagements. Delivered many large data warehousing projects and trained numerous professionals on business intelligence technologies. Extensively worked on all facets of data warehousing including requirement gathering, gap analysis, database design, data integration, data modeling, enterprise reporting, data analytics, data quality, data visualization, OLAP.

Has worked on broad range of business verticals and hold exceptional expertise on  various ETL tools like Informatica Powercenter, SSIS, ODI and IDQ, Data Virtualization, DVO, MDM.