
Welcome to Data Mining with SAP HANA. In this lecture we will look at all the content the course will cover.
In this lecture we will have a first look at the PAL library. We will cover the following topics:
In this lecture we will look at ABC Analysis as a simple implementation of a grouping algorithm. We will go through the creation of the ABC Analysis step by step, utilizing the various tools available:
All the HANA resources and data used in this lecture are available in the Downloadable Materials section of this lecture.
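To make the idea behind ABC Analysis concrete before working in HANA, here is a minimal Python sketch. It is not the PAL procedure itself. It assumes the common convention of ranking items by value and assigning classes by cumulative share, and the 70% / 20% / 10% thresholds, the function name and the product names are illustrative assumptions rather than values taken from the lecture.

```python
def abc_analysis(values, a_share=0.7, b_share=0.2):
    """Group items into classes A, B and C by cumulative share of total value.

    values: dict mapping item name -> numeric value (e.g. revenue).
    a_share / b_share: assumed thresholds; class C receives the remainder.
    """
    total = sum(values.values())
    # Rank items from the largest contribution to the smallest.
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)

    labels = {}
    cumulative = 0
    for item, value in ranked:
        cumulative += value
        share = cumulative / total
        if share <= a_share:
            labels[item] = "A"
        elif share <= a_share + b_share:
            labels[item] = "B"
        else:
            labels[item] = "C"
    return labels


# Hypothetical revenue per product.
revenue = {"P1": 60, "P2": 25, "P3": 8, "P4": 4, "P5": 3}
print(abc_analysis(revenue))
```

In this toy data a single product carries 60% of the value and lands in class A, which is the kind of concentration ABC Analysis is meant to expose.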
In this lecture we will look at Single, Double and Triple Exponential Smoothing. We will use smoothing to make predictions using real-world data on sea surface temperatures. This lecture will cover:
All the SQL for the lecture is contained in the downloadable material below.
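As a rough illustration of what Single Exponential Smoothing computes, the following Python sketch applies the standard recurrence s_t = alpha * x_t + (1 - alpha) * s_(t-1), starting from the first observation. It is a sketch under stated assumptions and not the PAL SQL used in the lecture: the function name, the choice of initial value and the sample readings are all hypothetical.

```python
def single_exponential_smoothing(series, alpha):
    """Return the smoothed series for a smoothing factor 0 < alpha <= 1.

    Assumption: the first smoothed value is initialised to the first observation.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]
    for x in series[1:]:
        # New level = weighted mix of the latest observation and the previous level.
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical temperature readings, not the course dataset.
readings = [10.0, 20.0, 30.0]
levels = single_exponential_smoothing(readings, alpha=0.5)
print(levels)       # smoothed levels
print(levels[-1])   # the one-step-ahead forecast is the last smoothed level
```

Double and Triple Exponential Smoothing extend the same recurrence with a trend component and a seasonal component respectively.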
In the second part of Exponential Smoothing we will look at the HANA Analysis Process (AP) in SAP BW. Not only are we going to create the HANA Analysis Process in BW, but we will create a data flow to use the table we created in HANA. In more detail, we will:
Why are our best and most experienced employees leaving prematurely? We will try to predict which valuable employees will leave next.
In this lecture we will do some basic data exploration to get a feel for the dataset and also visualize the data in Lumira.
Generally, before we start the analysis of data, we have to do some data preparation. In this lecture we are going to bin the last performance rating into 5 categories. This will greatly reduce the complexity of the decision trees we are going to build in the next lecture.
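The binning step can be pictured with a short Python sketch. It assumes equal-width bins, which is only one of several possible binning strategies and may differ from the one used in the lecture. The function name and the sample ratings are hypothetical.

```python
def bin_equal_width(values, n_bins=5):
    """Assign each value to one of n_bins equal-width bins, numbered 1..n_bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into the first bin.
        return [1 for _ in values]
    width = (hi - lo) / n_bins
    bins = []
    for v in values:
        index = int((v - lo) / width)
        # The maximum value would land in bin n_bins + 1, so clamp it.
        bins.append(min(index, n_bins - 1) + 1)
    return bins


# Hypothetical performance ratings on a 0-100 scale.
ratings = [0, 25, 50, 75, 100]
print(bin_equal_width(ratings))
```

Replacing a continuous rating with 5 bin labels means a decision tree only has to consider a handful of candidate splits on that column instead of one per distinct value.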
This lecture discusses how decision trees are constructed. This lecture is optional and can be skipped if you are familiar with the math of decision tree construction.
Now that we have prepared our data, we will look at the various decision trees available to us. In particular we will look at:
Once we have completed the theory, we will run a small SQL file to verify the first-level split of both types of decision trees.
Note: Only the construction of the first level of the tree is covered in this lecture. The rest of the levels are left as an exercise and the answers are in the downloadable material for this lecture.
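For readers who want to check a first-level split by hand, the Python sketch below evaluates candidate split columns using two standard criteria: the reduction in Gini impurity, as used by CART, and the gain ratio, as used by C4.5. It is a sketch under stated assumptions and not the SQL from the lecture: the column names and rows are hypothetical, and the features are binary so that the partition matches the binary splits CART makes.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def partition(rows, feature, target):
    """Group the target labels by the value of the given feature."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    return list(groups.values())


def gini_gain(rows, feature, target):
    """Reduction in Gini impurity from splitting on the feature (CART-style)."""
    parent = [row[target] for row in rows]
    groups = partition(rows, feature, target)
    weighted = sum(len(g) / len(parent) * gini(g) for g in groups)
    return gini(parent) - weighted


def gain_ratio(rows, feature, target):
    """Information gain divided by split information (C4.5-style)."""
    parent = [row[target] for row in rows]
    groups = partition(rows, feature, target)
    weighted = sum(len(g) / len(parent) * entropy(g) for g in groups)
    info_gain = entropy(parent) - weighted
    split_info = entropy([row[feature] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical employee rows; "left" is the class we want to predict.
rows = [
    {"promoted": "yes", "salary": "high", "left": "no"},
    {"promoted": "yes", "salary": "low", "left": "no"},
    {"promoted": "no", "salary": "high", "left": "yes"},
    {"promoted": "no", "salary": "low", "left": "yes"},
]
for feature in ("promoted", "salary"):
    print(feature, gini_gain(rows, feature, "left"), gain_ratio(rows, feature, "left"))
```

Both criteria pick the column that separates the classes best. In this toy data that is "promoted", which splits the rows into two pure groups, while "salary" gives no improvement at all.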
Now that we have prepared our data, we can use the full dataset to construct the decision trees. We will use the CART tree in this lecture to see which of our high-performing employees are at risk of leaving. As usual, the SQL can be found in the downloadable material. The scripts for both the CART tree and the C4.5 tree are attached, so you can run both and contrast the results of the two procedures.
This lecture will contrast the results of the decision tree when a column is defined as continuous versus discrete (categorical). In the data preparation lecture we binned the last evaluation column into 5 discrete bins. In contrast, what would the effect be if we incorrectly classified Hours Worked as categorical?
*** This course requires access to an SAP HANA system. ***
This entry-level to intermediate SAP HANA Predictive Analytics course will help you master many important techniques to start creating sophisticated predictive analytics applications that utilize the power of SAP HANA and Business Intelligence.
The course is designed so that you can master all the techniques gradually, starting from basic and relatively simple techniques before moving on to the more demanding techniques that Business Intelligence Professionals use to create predictive analytics applications for their customers.
The course will take you step by step through the process of creating the required HANA objects, such as tables, views and predictive analytics SQL scripts. In particular, from this course you will learn:
Fundamentals of the Predictive Analytics Library,
The structures involved, such as HANA Tables, Views, PAL SQL procedures and more,
A comparison of the raw PAL SQL code with the HANA Analysis Process available in SAP BW, by creating the comparable HANA Analysis Process in BW,
Integrating Predictive Analytics into SAP BW and SAP Lumira
Prerequisites:
This course assumes no knowledge of the HANA Predictive Analytics Library.
BW and HANA experience would be helpful.
What this course is not:
This course does not cover every single Predictive Analytics algorithm. It covers enough of the algorithms for you to get comfortable with using them and to apply the techniques to any other functions. Covering all algorithms would result in a high level of repetition without any real value.
What sets this course apart from anything available on other platforms is the fact that it covers the integration and application of the Predictive Analytics Library with the various other SAP BW and visualization platforms.
This course will keep expanding, so check back regularly for updates and more content, for example, integration of PAL into SAP BPC Embedded, more case studies for Regression Algorithms, Text Analytics and more!