
Installation of MYSQL database
Refer to the next session.
Installation of MYSQL Workbench for writing SQL queries.
Refer to the next session.
For oracle refer to the video and the following:
Installation of Oracle ( XE Edition )
Oracle has provided the community edition : Express Edition ( XE)
Min system requirement
1) Microsoft Windows 7
2) RAM 512 MB
3) Disk space 2 GB
Reference docsfor installation Guide https://docs.oracle.com/cd/E17781_01/install.112/e18803/toc.htm#XEINW119
Download Link for XE edition
https://www.oracle.com/database/technologies/xe-downloads.html
Installation of SQL Developer Client
SQL developer is the freeware software for writing SQL from simplest to the most complex.
Other developer client which are used in Industries are :
Toad , DB Visualiser , SQL Workbench , PLSQL Developer
Reference link for download SQL developer
https://www.oracle.com/tools/downloads/sqldev-v192-downloads.html
The commands are broadly categorised as follows:
Data Definition Language (DDL)
Data Manipulation Language (DML)
As the name suggests, the DDL is used to create a new schema as well as to modify the existing schema. The typical commands in DDL are — CREATE, ALTER and DROP. As a data analyst, the majority of your work will be focused on insight generation, and you will be working with DML commands, specifically the SELECT command.
In this lecture, you learned the basic constructs in the SELECT query. The session will also cover the creation of schema for the database that will be used throughout the session. You will specifically learn the following:
SELECT clause
FROM Clause
WHERE Clause
Basic Sorting and Filtering in SQL
Pattern Matching using LIKE function
Pattern matching is an important concept in the string or text-based processing. In SQL, certain characters are reserved as wildcards that can match any number of preceding or trailing characters
Sorting
In SQL, sorting is done using the clauses 'asc' and 'desc' for ascending and descending order respectively. You will also learn to use the IN, NOT IN and IS NULL clauses.
In this lecture, you also learned to use the following clauses:
IN
NOT IN
IS NULL
Asc
desc
Summary of Learning Till now
Till now you learn the basics of Database and SQL. Database was invented to store the data in a more consistent manner and to access with ease. Such databases are called as RDBMS.
You then learnt that in an RDBMS, the data is organised in tables inside a database and SQL is the language to access and manipulate data in an RDBMS. There are two major categories of SQL commands:
Data Definition Language i.e. DDL
Data Manipulation Language i.e. DML
The DDL commands are typically used to change the structure of schema by creating new tables or adding new columns in existing tables or dropping tables etc. Such activities are typically done by DBA .
As a data analyst you will be frequently using the DML commands.
In this session you learned the basics of SQL , select commands, where command, filter conditions and order by clause.
In the next session you are going to learn aggregate functions , and advance SQL queries which you will be using more frequently in your day to day projects.
Introduction to advance SQL
Previously, you learnt the basics of DBMS, RDBMS, and the data retrieval language, SQL. You now know that a database is a collection of related information, and as such, the data is generally arranged using the relational model, i.e. in rows and columns.
The relational model forms the base of RDBMSs. In RDBMS, the data is organised using various tables, which are made up of a number of rows and columns. The columns necessarily represent the attributes associated with the data. These attributes are also known as fields. A database can have multiple attributes. When a particular entity is referred to using all such attributes, we get a record. The record is necessarily a row in the table.
A table can have thousands of records. If you wish to identify a particular record from this collection, you would need some field which can uniquely identify the record. This unique identification attribute is known as the ‘Primary Key’. Further, you also learnt about connecting tables with each other. The concept of ‘Foreign Key’ is used to create relationships between tables. Further, you were also introduced to referential integrity, which helps enforce data consistency within the database.
You now know two types of commands, namely:
Data Definition Language
Data Manipulation Language
The Data Definition Language (DDL) is used to create and modify the schema of the database. Commands like CREATE, ALTER and DROP are part of this language.
As a data analyst, you would always be actively involved in data retrieval activities. Here, the Data Manipulation Language (DML) commands would come in handy, e.g. the DML command SELECT, its purpose, various clauses and filtering operations.
We will cover :
Learning the following technique is the integral part of Data Analyst ; and we will cover in the following sessions
Order by clause
Group by clause
Grouped aggregations
Having clause
Joins
Nested and subqueries
As a data analyst, you would frequently prepare reports which present an overall picture of the data in hand. This task usually includes calculating sums, averages, finding highest and lowest, counting the qualifying records, etc.
In other words, you will often need to find aggregate values of certain variables like the average age, total salary of employees, the number of males or females etc. You know how to do all these things in R.
Wondering if you can perform the same operations using SQL? Of-course you can. SQL provides various built-in functions for these things. The functions used to generate collected reports are known as ‘aggregate functions’.
Many times as an analyst, you would have to generate reports related to specific departments. In such scenarios, you would collect information on departments, products, assembly lines, vendors, etc. SQL provides a special clause called ‘Group by’ for collecting facts about certain categories. In this lecture, we talked about the group by clause in detail.
In this lecture, you learnt how to use the group by clause. To summarise, you use group by when you need to find aggregate values of a column C1 'grouped by' a certain column C2. The general structure of the query is:
select column_to_be_grouped_by, f(col_to_be_aggregated) from table where some_col = x group by column_to_be_grouped_by;
Suppose your manager asks you to count all the employees whose salary is more than the average salary in that particular department. Now, intuitively, you know that two aggregate functions would be used here — count() and avg(). You decide to apply the where condition on the average salary of the department, but to your surprise, the query fails. In fact, you should try writing this query before moving ahead.
How do you generate the answer? Is it even possible to get answers to such queries in SQL? In this lecture, you learned the concept of ‘Having Clause’, which can be used as a filtering condition on the aggregated output.
The having clause is typically used when you have to apply a filter condition on an 'aggregated value'. This is because the 'where' clause is applied before the aggregation takes place, and thus it is not useful when you want to apply a filter on an aggregated value.
In other words, the having clause is equivalent to a where clause after the group by has been executed but before the select part is executed.
This is important to understand to avoid getting confused between the 'having' and 'where' clauses. For example, if you want to display the list of all employees having a salary >= 30,000, you can use the where clause since there is no aggregation happening in this query. But if you want to display the list of all employees having a salary <= the average salary, where avg() is the aggregation function, you'll have to use the having clause.
You know that a database is a collection of multiple related tables. While generating insights from the data, you may need to refer to these multiple tables in a query. There are two ways to deal with such types of queries:
Joins
Nested queries/Subqueries
To summarise, you learnt nested sub-queries which are typically used when you have to select columns from one table based on filter conditions from another table. In such cases, you put a sub-query inside the 'where' clause instead of a certain value. In other words, if you want to select columns 'a' and 'b' from table_1 and the condition is to be applied on table_2, then the general structure of the query will is:
select a, b from table_1 where c = (select d from table_2 where e = x);
The other typical use case of nested queries is when you have to apply a filter condition on an aggregated value, for example, display two columns a, b from table_1 such that a is greater than the average value of c:
select a, b from table_1 where c > (select avg(c) from table_1);
Previously, you have learnt nested queries which are used to retrieve data from multiple tables. However, as you must have noticed, a nested query refers to only one table at a time. What if you want to refer to multiple tables in a single query?
In the previous lecture, we mentioned ‘Joins’. In this lecture, you learned the concept of joins.
To summarise, join is a way to retrieve data from multiple tables. The most common join is the inner join, which selects only those rows from two tables where the common column has the same value. The general structure of an inner join statement is as follows:
select * from table_1 inner join table_2 on table_1.column_x = table_2.column_y;
You can join multiple tables using the common attributes between pairs of tables. This is possible because the result of a join is also a table which you can join further to another table (with a common attribute).
For example, you wanted to retrieve columns from the tables employee and project, but there is no common attribute in these tables. Thus, you join the employee table with works_on using the common attribute 'ssn'. The result of this join is then further joined to the project table.
You also learnt that the schema can be useful to understand the links between tables, which is useful to write multi-way join queries.
Outer join is used when you want to display the rows in one table even if they do not have a corresponding entry in the other table. For example, if you inner join the employee table with the dependent table, you will get only those employees who have at least one dependent. To avoid this 'loss of information' you can use an outer join.
Also, the outer join is of two types - left outer join and right outer join. It doesn't really matter which table you treat as left and right, i.e. you can choose the one you are more comfortable with
The Only Data Analytics Course You’ll Ever Need to Start Your Career.
Are you tired of jumping between disconnected tutorials? Most beginners get stuck because they learn SQL in a vacuum and Python in another. In the real world, these two work together.
This course was designed to bridge that gap. SQL and Python: Complete Data Analytics Course for Beginners is a project-based journey that takes you from the very first line of code to building a professional data portfolio.
Why this course?
Data is the new oil, but only if you know how to refine it. Organizations are desperate for individuals who can bridge the gap between "raw data" and "business decisions." By mastering both SQL (for data retrieval) and Python (for data analysis), you become a "double-threat" candidate in the job market.
What’s inside?
SQL Fundamentals: We start with the basics of SELECT statements and move rapidly into advanced Joins, Aggregations, and Subqueries.
Python Foundations: No prior coding experience? No problem. We teach you Python from scratch, focusing only on the parts you need for data analytics.
Data Analysis with Pandas: Learn the "Excel on Steroids" library. We’ll show you how to handle millions of rows of data with ease.
Visual Storytelling: Data is useless if people can't understand it. Learn to create stunning charts and graphs that tell a story.
Capstone Projects: You won't just watch me code. You will analyze real-world datasets—from retail sales to social media trends—to build a portfolio that stands out.
Who is this for?
Total Beginners: People with zero coding or technical background.
Career Switchers: Professionals looking to move into Data Analyst, Business Analyst, or Marketing Analyst roles.
Students: Anyone looking to add high-value technical skills to their resume.
Excel Users: Those who feel limited by spreadsheets and want to upgrade to more powerful tools.
Stop scrolling and start building. Enroll today and take the first step toward your future in Data Analytics!