
Installation of MYSQL database
Refer to the next session.
Installation of MYSQL Workbench for writing SQL queries.
Refer to the next session.
For oracle refer to the video and the following:
Installation of Oracle ( XE Edition )
Oracle has provided the community edition : Express Edition ( XE)
Min system requirement
1) Microsoft Windows 7
2) RAM 512 MB
3) Disk space 2 GB
Reference docsfor installation Guide https://docs.oracle.com/cd/E17781_01/install.112/e18803/toc.htm#XEINW119
Download Link for XE edition
https://www.oracle.com/database/technologies/xe-downloads.html
Installation of SQL Developer Client
SQL developer is the freeware software for writing SQL from simplest to the most complex.
Other developer client which are used in Industries are :
Toad , DB Visualiser , SQL Workbench , PLSQL Developer
Reference link for download SQL developer
https://www.oracle.com/tools/downloads/sqldev-v192-downloads.html
The commands are broadly categorised as follows:
Data Definition Language (DDL)
Data Manipulation Language (DML)
As the name suggests, the DDL is used to create a new schema as well as to modify the existing schema. The typical commands in DDL are — CREATE, ALTER and DROP. As a data analyst, the majority of your work will be focused on insight generation, and you will be working with DML commands, specifically the SELECT command.
In this lecture, you learned the basic constructs in the SELECT query. The session will also cover the creation of schema for the database that will be used throughout the session. You will specifically learn the following:
SELECT clause
FROM Clause
WHERE Clause
Basic Sorting and Filtering in SQL
Pattern Matching using LIKE function
Pattern matching is an important concept in the string or text-based processing. In SQL, certain characters are reserved as wildcards that can match any number of preceding or trailing characters
Sorting
In SQL, sorting is done using the clauses 'asc' and 'desc' for ascending and descending order respectively. You will also learn to use the IN, NOT IN and IS NULL clauses.
In this lecture, you also learned to use the following clauses:
IN
NOT IN
IS NULL
Asc
desc
Summary of Learning Till now
Till now you learn the basics of Database and SQL. Database was invented to store the data in a more consistent manner and to access with ease. Such databases are called as RDBMS.
You then learnt that in an RDBMS, the data is organised in tables inside a database and SQL is the language to access and manipulate data in an RDBMS. There are two major categories of SQL commands:
Data Definition Language i.e. DDL
Data Manipulation Language i.e. DML
The DDL commands are typically used to change the structure of schema by creating new tables or adding new columns in existing tables or dropping tables etc. Such activities are typically done by DBA .
As a data analyst you will be frequently using the DML commands.
In this session you learned the basics of SQL , select commands, where command, filter conditions and order by clause.
In the next session you are going to learn aggregate functions , and advance SQL queries which you will be using more frequently in your day to day projects.
Introduction to advance SQL
Previously, you learnt the basics of DBMS, RDBMS, and the data retrieval language, SQL. You now know that a database is a collection of related information, and as such, the data is generally arranged using the relational model, i.e. in rows and columns.
The relational model forms the base of RDBMSs. In RDBMS, the data is organised using various tables, which are made up of a number of rows and columns. The columns necessarily represent the attributes associated with the data. These attributes are also known as fields. A database can have multiple attributes. When a particular entity is referred to using all such attributes, we get a record. The record is necessarily a row in the table.
A table can have thousands of records. If you wish to identify a particular record from this collection, you would need some field which can uniquely identify the record. This unique identification attribute is known as the ‘Primary Key’. Further, you also learnt about connecting tables with each other. The concept of ‘Foreign Key’ is used to create relationships between tables. Further, you were also introduced to referential integrity, which helps enforce data consistency within the database.
You now know two types of commands, namely:
Data Definition Language
Data Manipulation Language
The Data Definition Language (DDL) is used to create and modify the schema of the database. Commands like CREATE, ALTER and DROP are part of this language.
As a data analyst, you would always be actively involved in data retrieval activities. Here, the Data Manipulation Language (DML) commands would come in handy, e.g. the DML command SELECT, its purpose, various clauses and filtering operations.
We will cover :
Learning the following technique is the integral part of Data Analyst ; and we will cover in the following sessions
Order by clause
Group by clause
Grouped aggregations
Having clause
Joins
Nested and subqueries
As a data analyst, you would frequently prepare reports which present an overall picture of the data in hand. This task usually includes calculating sums, averages, finding highest and lowest, counting the qualifying records, etc.
In other words, you will often need to find aggregate values of certain variables like the average age, total salary of employees, the number of males or females etc. You know how to do all these things in R.
Wondering if you can perform the same operations using SQL? Of-course you can. SQL provides various built-in functions for these things. The functions used to generate collected reports are known as ‘aggregate functions’.
Many times as an analyst, you would have to generate reports related to specific departments. In such scenarios, you would collect information on departments, products, assembly lines, vendors, etc. SQL provides a special clause called ‘Group by’ for collecting facts about certain categories. In this lecture, we talked about the group by clause in detail.
In this lecture, you learnt how to use the group by clause. To summarise, you use group by when you need to find aggregate values of a column C1 'grouped by' a certain column C2. The general structure of the query is:
select column_to_be_grouped_by, f(col_to_be_aggregated) from table where some_col = x group by column_to_be_grouped_by;
Suppose your manager asks you to count all the employees whose salary is more than the average salary in that particular department. Now, intuitively, you know that two aggregate functions would be used here — count() and avg(). You decide to apply the where condition on the average salary of the department, but to your surprise, the query fails. In fact, you should try writing this query before moving ahead.
How do you generate the answer? Is it even possible to get answers to such queries in SQL? In this lecture, you learned the concept of ‘Having Clause’, which can be used as a filtering condition on the aggregated output.
The having clause is typically used when you have to apply a filter condition on an 'aggregated value'. This is because the 'where' clause is applied before the aggregation takes place, and thus it is not useful when you want to apply a filter on an aggregated value.
In other words, the having clause is equivalent to a where clause after the group by has been executed but before the select part is executed.
This is important to understand to avoid getting confused between the 'having' and 'where' clauses. For example, if you want to display the list of all employees having a salary >= 30,000, you can use the where clause since there is no aggregation happening in this query. But if you want to display the list of all employees having a salary <= the average salary, where avg() is the aggregation function, you'll have to use the having clause.
*Lifetime access to course materials . Udemy offers a 30-day refund guarantee for all courses*
*Taught by instructor with 15+ years of Data Science and Big Data Experience*
The course is packed with real life projects examples and has all the contents to make you Data Literate.
Get Transformed from Beginner to Expert .
Become expert in SQL, Excel and R programming.
Start using SQL queries in Oracle , MySQL and apply learning in any kind of database
Start doing the extrapolatory data analysis ( EDA) on any kind of data and start making the meaningful business decisions.
Start writing simple to the most advanced SQL queries.
Integrate R and Python with Database and execute SQL command on them for data analysis and Visualizations.
Start making visualizations charts - bar chart , box plots which will give the meaningful insights
Learn the art of Data Analysis , Visualizations for Data Science Projects
Learn to play with SQL on R and Python Console.
Integrate RDBMS database with R and Python
Create own database in your laptop/Desktop - Oracle and MySQL
Import and export data from and to external files.
Real world Case Studies Include the analysis from the following datasets
1. Bank Marketing datasets ( R )
2. Identify which customers are eligible for credit card issuance ( R)
3. Root Cause Analysis of Uber Demand Supply Gap ( R)
4. Investment Case Studies: To identify the top 3 countries and investment type to help the Asset Management Company to understand the global trends ( EXCEL)
5. Acquisition Analytics on the Telemarketing datasets : Find out which customers are most likely to buy future bank products using tele-channel. ( EXCEL) )
6. Market fact data.( SQL)