Udemy
WEKA - Data Mining with Open Source Machine Learning Tool
Rating: 4.4 out of 5 (374 ratings)
15,144 students

WEKA tool for data preparation, classification, regression, clustering, association rules mining, and visualization
Last updated 3/2019
English

What you'll learn

  • Use the WEKA tool for data pre-processing, classification, regression, clustering, association rules, and visualization

Course content

1 section • 5 lectures • 3h 29m total length
  • Waikato Environment for Knowledge Analysis (WEKA) (44:25)

    Waikato Environment for Knowledge Analysis (WEKA):

    • WEKA is a data mining / machine learning tool developed by the Department of Computer Science at the University of Waikato, New Zealand.

    • Weka is a collection of machine learning algorithms for data mining tasks.

    • The algorithms can either be applied directly to a dataset or called from your own Java code.

    • Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization.

    • It is also well-suited for developing new machine learning schemes.

    • Found only on the islands of New Zealand, the Weka is a flightless bird with an inquisitive nature.

    • The tool is named after this bird.

    • There are two versions of Weka: Weka 3.8 is the latest stable version, and Weka 3.9 is the development version.


    Main Features:

    1. 49 data pre-processing tools

    2. 76 classification / regression algorithms

    3. 8 clustering algorithms

    4. 3 algorithms for finding association rules

    5. 15 attribute/subset evaluators + 10 search algorithms for feature selection


    Installation of Weka:

    • You can download Weka from the official website http://www.cs.waikato.ac.nz/ml/weka/


    Weka application interfaces:

    • Weka provides five application interfaces.

    • Opening Weka starts the Weka GUI Chooser screen, from which any of the application interfaces can be launched:

    1. Explorer

    2. Experimenter

    3. KnowledgeFlow

    4. Workbench

    5. Simple CLI


    Weka Data Formats:

    • By default, Weka uses the Attribute-Relation File Format (ARFF) for data analysis.

    • Weka can import data from the following sources:

    1. CSV

    2. ARFF

    3. Database (via JDBC)


    Attribute Relation File Format (ARFF):

    This has two parts:

    1. The header section defines the relation (data set) name and the name and type of each attribute.

    2. The data section lists the data instances.

    An ARFF file requires the declaration of the relation, attribute and data.

    @relation: This is the first declaration in any ARFF file, written in the header section and followed by the relation (data set) name. The relation name must be a string; if it contains spaces, it must be enclosed in quotes.


    @attribute: Each attribute is declared in the header section with its name and its type or range of values. Weka supports the following attribute types:

    • numeric

    • <nominal-specification>

    • string

    • date


    @data: Marks the start of the data section and is followed by the data instances, one per line.
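
The three declarations above can be seen together in a small file. The sketch below is only an illustration: the "weather sample" relation and its attributes are made-up example data, and `parse_arff` is a hypothetical helper that handles a tiny subset of ARFF (unquoted attribute names, dense comma-separated rows). It is not Weka's own loader.

```python
# Hypothetical ARFF content, invented for illustration (not from the course).
ARFF_TEXT = """\
@relation 'weather sample'

@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute play {yes, no}

@data
sunny,85,no
overcast,83,yes
rainy,70,yes
"""


def parse_arff(text):
    """Parse a small subset of ARFF: header declarations plus dense data rows."""
    relation = None
    attributes = []  # (name, type) pairs in declaration order
    rows = []        # each row is a list of raw string values
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and % comments
            continue
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
            continue
        keyword, _, rest = line.partition(" ")
        keyword = keyword.lower()
        if keyword == "@relation":
            # Quotes are needed only when the name contains spaces.
            relation = rest.strip().strip("'\"")
        elif keyword == "@attribute":
            name, _, attr_type = rest.strip().partition(" ")
            attributes.append((name, attr_type.strip()))
        elif keyword == "@data":
            in_data = True  # everything after this line is a data instance
    return relation, attributes, rows


relation, attributes, rows = parse_arff(ARFF_TEXT)
print(relation)
print(attributes)
print(rows)
```

In practice Weka reads ARFF files itself; the parser is here only to make the header/data split visible.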

  • Analysis & Prediction using WEKA Machine Learning Toolkit (44:43)
  • Python Libraries for Data Science (22:32)
  • Introduction to Data Science (53:16)

    INTRODUCTION TO DATA SCIENCE:

    • What is Data Science?

    • Who is a Data Scientist?

    • Who can be a Data Scientist?

    • Data Science Process

    • Modern Data Scientist

    • Data Science Workflow

    • Technologies used in Data Science


    What is Data Science:

    • Data science is a "concept to unify statistics, data analysis, machine learning and their related methods" in order to "understand and analyze actual phenomena" with data.

    • Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining.

    • The data scientist role has been called "The Sexiest Job of the 21st Century".


    DATA ANALYSIS:

    • Data analysis is the process of extracting information from data. It involves multiple stages including establishing a data set, preparing the data for processing, applying models, identifying key findings and creating reports.

    • The goal of data analysis is to find actionable insights that can inform decision making.

    • Data analysis can involve data mining, descriptive and predictive analysis, statistical analysis, business analytics and big data analytics.


    Who is a Data Scientist:

    • Statistician + Software Engineer

    • A data scientist is a person who is better at statistics than any software engineer and better at software engineering than any statistician.


    Who can be a Data Scientist:

    • Anyone who combines computing skills, knowledge of mathematics, probability and statistics, and domain expertise can be a data scientist.


    Data Science Process:

    Real World   ->  Raw data collected  ->  Data is processed  -> Clean Data set  ->  Exploratory Data Analysis  ->  Models & Algorithms  ->  Communicate visual report (Making Decisions) ->  Data Product  ->  Real World


    Modern Data Scientist:

    • Math & Statistics

    • Programming & Database

    • Domain Knowledge & Soft Skills

    • Communication & Visualization


    Data Science Workflow:

    • Problem definition

    • Data Collection & Preparing

    • Model Development

    • Model Deployment

    • Performance Improvement


    Technologies used in Data Science:

    • R

    • Python

    • Weka, etc.

  • Introduction to Machine Learning (44:38)

    Machine Learning:

    • It is similar to human learning.

    • Machine learning is the sub-field of computer science that, according to Arthur Samuel, gives "computers the ability to learn without being explicitly programmed."

    • Samuel, an American pioneer in the field of computer gaming and artificial intelligence, coined the term "machine learning" in 1959 while at IBM.

    • Machine learning is a field of computer science that uses statistical techniques to give computer systems the ability to "learn" (e.g., progressively improve performance on a specific task) with data, without being explicitly programmed.


    Traditional Programming vs Machine Learning:

    • In traditional programming, we give the computer inputs + a program, and the computer produces the output.

    • In machine learning, we give the computer inputs + outputs, and the computer produces the program (a predictive model).


    Example 1:  Here "a" and "b" are inputs and "c" is output

    a b c

    1 2 3

    2 3 5

    3 4 7

    4 5 9

    9 10 ?

    What is the output of c?
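
The contrast between the two approaches can be sketched on the Example 1 table. In the sketch below, `traditional_program` assumes a person already knows the rule c = a + b, while `fit_two_weights` (a hypothetical helper, not a Weka function) recovers weights from the four rows by least squares, assuming a model of the form c = w1*a + w2*b with no intercept. Other rules, such as c = 2a + 1, also match the table and give the same answer for the last row.

```python
# Training data from Example 1: inputs (a, b) and the observed output c.
samples = [(1, 2, 3), (2, 3, 5), (3, 4, 7), (4, 5, 9)]


def traditional_program(a, b):
    """Traditional programming: a person writes the rule by hand."""
    return a + b


def fit_two_weights(data):
    """Least-squares fit of c = w1*a + w2*b using the 2x2 normal equations."""
    s_aa = sum(a * a for a, b, c in data)
    s_ab = sum(a * b for a, b, c in data)
    s_bb = sum(b * b for a, b, c in data)
    s_ac = sum(a * c for a, b, c in data)
    s_bc = sum(b * c for a, b, c in data)
    det = s_aa * s_bb - s_ab * s_ab  # determinant of the 2x2 system
    w1 = (s_bb * s_ac - s_ab * s_bc) / det
    w2 = (s_aa * s_bc - s_ab * s_ac) / det
    return w1, w2


# Machine learning: the "program" (the weights) is produced from inputs + outputs.
w1, w2 = fit_two_weights(samples)
prediction = w1 * 9 + w2 * 10

print("learned weights:", w1, w2)
print("learned model predicts:", prediction)
print("hand-written rule gives:", traditional_program(9, 10))
```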


    Example 2: Here "x" is input and "y" is output

    x y

    1 10

    2 20

    3 30

    4 40

    5 ?

    500 ?

    y ~ x :     y=10x
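
As a minimal sketch, the slope in Example 2 can be estimated from the four known rows, assuming the line passes through the origin (y = m*x). The variable names are illustrative only.

```python
# Known rows from Example 2.
xs = [1, 2, 3, 4]
ys = [10, 20, 30, 40]

# Least-squares slope for a line through the origin: m = sum(x*y) / sum(x*x).
m = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Apply the learned slope to the two unknown rows.
predictions = {x_new: m * x_new for x_new in (5, 500)}
print("slope:", m)
print("predictions:", predictions)
```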


    Example 3: Here "x" is input and "y" is output

    x y

    1 14

    2 18

    3 22

    4 26

    5 ?

    500 ?

    Here the relationship between x and y is linear, so it can be modelled with linear regression:

    y ~ x :     y = mx + c, where m is the slope and c is the intercept (constant)

                  y = 4x + 10
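
The slope and intercept for Example 3 can be checked with ordinary least squares. The sketch below uses the standard closed-form formulas for simple linear regression; `fit_line` is a hypothetical helper written for this illustration, not part of Weka.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + c with a single input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Known rows from Example 3.
xs = [1, 2, 3, 4]
ys = [14, 18, 22, 26]

slope, intercept = fit_line(xs, ys)
predictions = {x_new: slope * x_new + intercept for x_new in (5, 500)}

print("slope:", slope, "intercept:", intercept)
print("predictions:", predictions)
```

The same fit could be produced with Weka's own regression tools; plain Python is used here only to keep the arithmetic visible.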

Requirements

  • Basic mathematics is enough

Description

Weka is a collection of machine learning algorithms for data mining tasks. It contains tools for data preparation, classification, regression, clustering, association rules mining, and visualization.

Found only on the islands of New Zealand, the Weka is a flightless bird with an inquisitive nature.

Weka is open source software issued under the GNU General Public License.

We have put together several free online courses that teach machine learning and data mining using R Programming, Python Programming, Weka Toolkit and SQL.

Yes, it is possible to apply Weka to process big data and perform deep learning!

Who this course is for:

  • Graduates or students currently pursuing a BTech