
Introduction to the course and an overview of its content
Description of the KNIME Analytics Platform environment and an explanation of how to use each of its sections.
If you are new to KNIME, please switch the interface to the classic one (see the resources section).
We will guide you through the installation of the KNIME Analytics Platform.
When installing a new version of the KNIME Analytics Platform, please also see the enclosed PDF document, which describes the differences between the standard (older, 3.x and 4.x) versions and the new version 5.
The 5.x versions have been released and brought significant changes.
But do not worry: the enclosed PDF explains how to switch to the classic environment. A few nodes also have slightly different names (for example, Pivot instead of Pivoting).
1. If you are new to KNIME, please switch the interface to the classic one (see the resources section).
2. Because KNIME is upgraded often, node names sometimes change, e.g.:
- File Reader (for CSV reading): use CSV Reader in newer versions
- Excel Reader (XLS): use Excel Reader in newer versions
After this lecture you will be able to read data (XLS and CSV) with KNIME.
In this tutorial we will learn how to use the KNIME nodes.
We will merge the data that we read in the previous lecture.
In this lecture we will learn how to get information about data frames and how to transpose a table (switch columns and rows).
After this lecture, you will be able to split your data into datasets and filter your data according to selected values.
After this lecture you will be able to partition your data, group it and pivot it, similar to pivoting and grouping in MS Excel.
Because KNIME is upgraded often, node names sometimes change, e.g.:
- Column Rename: use Column Renamer in newer versions
We will use numeric binners to divide our numeric data into groups according to boundaries that we set.
After this lecture you will be able to convert data types, rename columns and add a constant value.
In this lecture we will perform basic calculations using expressions in the math formula node.
We will filter our data set using the column filter node and the missing value column node, and so learn how to filter out certain columns.
We will split our columns into several columns using various splitting nodes.
How do you handle missing values? Use the missing values node and explore the options it offers.
After this lecture you will be able to change data types to a date format and calculate the difference between two dates.
During this lecture you will see how easy it is to extract different pieces of information from the date and time format, e.g. year, month and day.
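Outside KNIME, the same idea can be sketched in a few lines of plain Python. This is only an illustration of binning by boundaries, not how the KNIME node is implemented; the function name assign_bin, the boundary values and the labels are assumptions made up for this example.

```python
from bisect import bisect_right


def assign_bin(value, boundaries, labels):
    """Return the label of the bin that value falls into.

    boundaries must be sorted in ascending order, and labels must have
    exactly one more entry than boundaries. A value equal to a boundary
    goes into the upper bin.
    """
    if len(labels) != len(boundaries) + 1:
        raise ValueError("need exactly one more label than boundaries")
    return labels[bisect_right(boundaries, value)]


# Hypothetical boundaries and labels, chosen only for illustration.
boundaries = [30, 50]
labels = ["low", "medium", "high"]

ages = [22, 30, 47, 50, 63]
binned = [assign_bin(age, boundaries, labels) for age in ages]
print(binned)
```

With these assumed boundaries, values below 30 land in "low", values from 30 up to but not including 50 land in "medium", and everything from 50 upwards lands in "high".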
Histograms and column charts make it possible to compare the values of different classes in columns.
A line plot is useful for showing how the values of a variable develop over time.
Check whether the size of a certain class is large in comparison with the total size.
Using a scatter plot, we check the relationship between two variables.
Learn about the value distribution.
Please download these files if you continue with this old part (videos available on YouTube).
Welcome to the Machine Learning section
To get the whole picture, we should understand AI, Data Science, Machine Learning and Big Data
Welcome to the Machine Learning part
Here we prepare the folder for our hands-on part. Please download all files.
Please be aware that KNIME 3.x has a problem loading the Heart Disease file, which in some cases (7 rows) contains the sign ? instead of numbers.
If you get an error when reading the Heart Disease file with the Excel reader node, use the file HeartDisease_noQM.xlsx instead; you then won't need the row filters shown in the next video, in the preprocessing part.
Learn about different ML classification techniques
Let's collect data for our first workflow.
Please be aware that KNIME 3.x has a problem loading the Heart Disease file, which in some cases (7 rows) contains the sign ? instead of numbers.
If you get an error when reading the Heart Disease file with the Excel reader node, use the file HeartDisease_noQM.xlsx instead; you then won't need the row filters shown in the video with the preprocessing part.
To understand the data, we need to use different exploration techniques
Before applying machine learning, we need to do data processing.
Please be aware that KNIME 3.x has a problem loading the Heart Disease file, which in some cases (7 rows) contains the sign ? instead of numbers.
If you get an error when reading the Heart Disease file with the Excel reader node, use the file HeartDisease_noQM.xlsx instead; you then won't need the row filters shown in this video.
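For readers who like to see the idea in code, the effect of filtering out the rows that contain the sign ? can be sketched in plain Python. This is not the KNIME workflow from the video, only a rough analogue; the sample rows, the column names and the helper functions are all hypothetical and do not come from the real Heart Disease file.

```python
# Hypothetical sample rows; the real Heart Disease file has other columns and values.
rows = [
    {"age": "63", "ca": "0", "thal": "6"},
    {"age": "67", "ca": "?", "thal": "3"},
    {"age": "41", "ca": "1", "thal": "?"},
    {"age": "56", "ca": "2", "thal": "7"},
]


def has_placeholder(row, placeholder="?"):
    """Return True if any cell in the row holds the placeholder sign."""
    return any(value.strip() == placeholder for value in row.values())


def drop_placeholder_rows(rows, placeholder="?"):
    """Keep only the rows in which every cell is a real value."""
    return [row for row in rows if not has_placeholder(row, placeholder)]


clean_rows = drop_placeholder_rows(rows)

# Once the placeholder rows are gone, every remaining cell converts to a number.
numeric_rows = [
    {name: float(value) for name, value in row.items()} for row in clean_rows
]
print(len(rows) - len(clean_rows), "rows removed")
```

Dropping the affected rows is only one option; the missing values could also be replaced, which is a separate decision that depends on the data.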
Before applying machine learning, we need to do data processing
We apply our first classification algorithm: the decision tree
Let's try other, more powerful methods
Let's try other methods
In the second part we will work on a regression problem
First, we load our data into KNIME
To understand our data, we need to use exploration nodes
Before applying machine learning techniques we preprocess our data
Finally, we can apply all the machine learning techniques
We apply a decision tree to a churn model classification problem
We predict churn (yes / no) for a new customer
Learn about two very useful features: Metanodes and Components
Please see the PDF document
All downloadable documents are available at: https://1drv.ms/u/s!AolTGH3TVJWGhRtpLJYUFSVBeL-e?e=SVNWdr
The goal of this course is to learn how to use the open source KNIME Analytics Platform for data analysis and for building machine learning predictive models on real data sets.
The course has two main sections:
1. PRE-PROCESSING DATA: TRANSFORMING AND VISUALIZING DATA FRAMES
In this part we will cover the operations used to model, transform and prepare data frames and to visualize them, mainly:
table transformation (merging data, table information, transpose, group by, pivoting etc.)
row operations (e.g. filter)
column operations (filtering, splitting, adding, date information, missing values, adding binners, changing data types, basic math operations etc.)
data visualization (column chart, line plot, pie chart, scatter plot, box plot)
2. MACHINE LEARNING - REGRESSION AND CLASSIFICATION: We will create machine learning models following the standard machine learning process, which consists of:
data collection with reading nodes into the KNIME software (the data frames are available in this course for download)
pre-processing and transforming data to get a well-prepared data frame for prediction
visualizing data with KNIME visual nodes (we will create basic plots and charts to get a clear picture of our data)
understanding what machine learning is and why it is important
creating machine learning predictive models and evaluating them:
Simple and Multiple Linear Regression
Polynomial Regression
Decision Tree Classification
Decision Tree Regression
Random Forest Regression
Random Forest Classification
Naive Bayes
SVM
Gradient Boosting
I will also explain the KNIME Analytics Platform environment, guide you through the installation, and show you where to find help and hints.
One lecture is focused on working with Metanodes and Components.