The Star Schema

Paul Scotchford
A free video tutorial from Paul Scotchford
Data Analyst and Qlik Development Specialist @ awari
4.5 instructor rating • 6 courses • 86,745 students

Learn more from the full course

The Complete QlikView Boot Camp

Learn QlikView for Data Analytics - Designer & Developer step by step. Real-World Data Analytics. Real data included.

15:15:50 of on-demand video • Updated April 2020

  • Handle interview questions with ease
  • Start developing solutions for your company as soon as you have finished this course
  • Receive an achievement based certificate. This is based on your test results so you can show the world your skill levels
  • Understand and apply Associative Data Modelling concepts
  • Learn about different Model Schema types
  • Learn to code Joins & understand how they work
  • Learn how the Associative Data Model is stored internally
  • Learn how to load and connect to different data sources (e.g. SQL, Web )
  • Code Scripts for ETL data processing
  • Learn to code calculations using Set Analysis
  • How to code What-If analysis
  • 14 Assignment projects for hands on practice, followed by Quizzes testing your knowledge
  • Learn charting best practices
  • Learn about advanced calculation functions like Aggr() , Rank() , IntervalMatch()
  • Create Charts such as Gauges, Pie, Bar, Combo, Line, Tornado, Heat Maps, Scatter
  • Learn performance tuning for charts and scripting
  • Add actions and conditions to charts and other objects
  • Apply many of your new skills to QlikSense e.g. Modelling, Set Analysis
  • Over 150 Quiz test questions for you to check your learning progress
  • Real world sample data available for download and analysis
  • Install QlikView 12 (and SQLServer express if you want to connect to sample SQL databases)
English [INTRO MUSIC] Report CC errors, thank you. Hi there welcome back. In this lecture I want to teach you about schemas. The first schema were going to look at is the star schema. Typically, when you create a model within your Qlik document you will end up with a star schema or a snowflake schema. Let's focus on the star schema. First of all, the star schema by definition is a schema that consists of at least one fact table and one or more dimension tables. Now let's get the terminology right. First of all, a fact is really a record or a row that relates to some event or some measure you've heard the term measures before in analytics. I presume so a measure is a measure of something. So for example, it could be the measure of distance between where we started our journey and where we finished our journey and what events took place along the way. Or, for example, a sale of a particular item in your shop so a fact is all about recording something that occurred were going to measure those events at some point during analytics, but getting back to a definition typically we'll have one or more dimensions. The dimension itself is an attribute that describes the event. So for example if I sold something in a shop or a retail online business potentially then I have a product that I've sold or a service. So the dimension will define such things as products and the products will have attributes about them, colour, size, location, and so forth. So the dimension is really about defining a description or an attribute that tells us what that product or item is now is not just restricted the products of course dimensions cover a whole raft of subject. For example, calendar is a dimension. For example, a date within a calendar is a dimension value that describes when an event took place at a particular point in our chronology so to look at the definition on this on the whiteboard here. It kind of starts to look like star and I'm not very good at drawing so excuse my terrible diagram here, but this is the fundamental what we are going to do in this lecture is connect to our SQL Server database. Our insurance database and download some tables to create a schema within our QlikView document never start a new QlikView document for this one just so you can preserve your existing work so far. So let's get onto the desktop now and start tinkering. See you in there ... now we're going to download I should say extract data from our SQL Server database, the insurance database insurance database in itself is considered to be a data warehouse as it contains a number of dimension tables as you can see on the screen and a fact table. The notion here is that when we load this rather rigid relational model that I've had to manually create within SQL Server. The QIX engine In other words, the QlikView indexing engine will work out what the associations are going to be and creates the model itself. We don't have to worry about manually creating anything as far as the model is concerned within QlikView. But there are some tips to remember it's very important that the tables contain common names or columns to associate across the individual tables and just remember this. It's absolutely vitally important because the association is based upon the names so member key in the dimension and member key in the fact and enterprise key in the fact to enterprise key in the dimension. This is important now Qlik won't try to ascertain what the content of that data is. That's up to you as a developer to ensure that your data is clean and example would be if you had a fact table and the member key happened to be a text field. In other words you set it up as varchar(x) , for example, and the dimension had it as a member key integer then, of course, whilst the association will be made quite happily from a Qlik point of view the data itself is gonna be suspect. Isn't it just one more point here is this this actual model is quite rigid, in as much as all I created here are primary and foreign key constraints, which means it's only gonna maintain integrity when data is added or removed from this particular model. The SQL side of it is quite restrictive and if I want to query this data, I actually have to form joins across the tables to return a row set that answers a question we don't have to worry about that in QlikView so let's get into QlikView and create a new document and create a document, you just click on the new icon here. Just quickly save it so I don't forget I'm just gonna call this Schemas and just rename the year sheet property here ... call this lecture 10 and immediately we shall now use Control-E or you could use the script button here on the toolbar to actually create our load. What we need to do here is firstly connect to our SQL database. In fact, the first action you should always take is form your connection 1st to wherever that data happens to reside. So if I click the connect button here it's gonna set up a Wizard to ask me where I'm going to go this case. I want the provider for SQL Server and I'm gonna type in my server name and your server name will be your PC name ... use the Windows NT integrated security and select Insurance I'm just going to drop down the box. Might take a sec .. to connect and there is if you want to do a test connection which is redundant. Feel free. Click okay and it'll create a string for you and there's our connection string next I'm gonna create a dimensions load script in a separate tab and again we need to come down to the bottom left hand of the editor here and this time we'll select the SELECT button and what' it's done is made a connection to the SQL database. I specified in my connection string and it's going to ask me what tables do I actually want to bring in. What I'll do here is initially I'll click on dimMember and it's given me the option to either select individual fields or columns, or just select all of them so for this exercise. Selecting all will be fine. Click okay and it writes the SQL code into here and we'll do one more and select the SELECT button again you could hand type these if you like, but for now will just go through the Wizard to get you familiar with it and I'm going to select the dimEnterprise and we'll just take a Select * ; click okay and we're gonna create one more tab for our facts just name Facts for now again we're going to need to come down to the SELECT option and I'm just gonna hunt down the list for my Employer Contributions Select * ; click okay and that's done what we'll do now is, we'll save and then reload by clicking the Reload button and you'll see it's connecting to the database loading our data and it is finished you can close this dynamically by selecting the close when finished button here. I'm gonna do it manually and now we have all of our table fields loaded. But let's look at the model and see what it gave us ... we've created a Star schema dynamically try and open it up a bit for you and it looks very much like what we had on the whiteboard didn't it. It's very close to ... well it's not close to it, it is a star schema! We have a dimension member a factEmployerContributions and a dimEnterprise. Let's have a quick look at the data in there before we wrap this lecture up ; do a preview and there is a data and just remember the preview doesn't give you a list of all the data it just gives you a few hundred rows just so you can do a visual check - alright so that's this lecture done before my voice goes and what we do in the next lecture is cover the Snowflake schema, see you in the next lecture.