SQL for Data Analysts
What is SQL
Setting up the Environment
Files to Download
Installation
Dataset
Structure of an SQL environment
Servers
Databases
Schemas
Tables
Different Database Types
SQL Data Types
Basic Querying
Obtaining Data
SELECT
LIMIT
DISTINCT
WHERE
ORDER BY
Aliasing
Comments
Aggregations
GROUP BY
AVG
COUNT
COUNT (DISTINCT)
SUM
MIN
MAX
HAVING
Combining Data
Joins and Unions
SQL for Data Analysts 1
, UNION
JOIN (LEFT, INNER, OUTER, RIGHT)
Advanced Querying
Subquerying
Advanced Filtering
LIKE
Wildcard Operators
BETWEEN
IN
Advanced Aggregation
Window Functions
OVER and PARTITION BY
ROW_NUMBER()
RANK()
DENSE_RANK()
NTILE
LAG
LEAD
Manipulating Tables
Basics
CREATE
DROP
ALTER
Views
Tips
Further Research
References
What is SQL
SQL (pronounced S.Q.L. or Sequel, no one really cares) stands for Structured Querying
Language and is a language that is used to query, form, and manipulate databases.
Setting up the Environment
SQL for Data Analysts 2
, Files to Download
SQL Course Public Downloads Folder - Google Drive
https://drive.google.com/drive/folders/1t1l7mcnhxN0rvYH-WTS
SSuwocyTujXAl?usp=sharing
Installation
For this tutorial we're going to be using SQLite which is a database engine that works in
a single application. In order to access it we'll be using DBeaver. DBeaver is an open-
source database manager that connects to almost all database engines.
NOTE: Use DBeaver Community, it's free
DBeaver
Free multi-platform database tool for developers, database administrators, analysts and all people who need to
work with databases. Supports all popular databases: MySQL, PostgreSQL, SQLite, Oracle, DB2, SQL Server,
Sybase, MS Access, Teradata, Firebird, Apache Hive, Phoenix, Presto, etc. This website uses cookies to
https://dbeaver.io/
Dataset
The dataset we'll be using is from the University of California Irvine's Machine Learning
Repository
Abstract: This data studies whether a person will accept the coupon recommended to him in different driving
scenarios Source: Tong Wang, tong-wang '@' uiowa.edu, University of Iowa Cynthia Rudin, cynthia '@'
cs.duke.edu, Duke University Data Set Information: This data was collected via a survey on Amazon
https://archive.ics.uci.edu/ml/datasets/in-vehicle+coupon+recommendation
Structure of an SQL environment
Servers
SQL for Data Analysts 3
, The computer that contains your database or databases. This is typically your highest
level of "container" for databases.
Databases
A database is a collection of Schemas and Tables that we can interact with.
Schemas
Schemas are the highest level of organization in many Databases and can contain
multiple tables.
Tables
Tables can be thought of in the same way you think about sheets in Excel. It's a
collection of columns and rows that we'll be interacting with.
Different Database Types
I'll be talking about SQL like it's a standardized language and although it technically is,
in practice SQL databases you'll most likely be interacting with are going to created and
run by major corporations who will sprinkle their own flavor of SQL on top of or in lieu of
standard SQL. The commands we'll be focusing in this course are pretty standard and
shouldn't be drastically different from that which you'll be using in most other databases.
Some of the biggest databases are listed below:
MySQL - Open Source - Created by Oracle
SQL Server - Created by Microsoft
PostgreSQL - Open Source
Oracle Database - Created by Oracle
SQL Data Types
Depending on the database type that you use, the datatypes that you'll be working with
will be different. Here are the basic types and what they might be referred to by in your
database.
Text types: Used to store text, can even store numbers as text
CHAR
SQL for Data Analysts 4
What is SQL
Setting up the Environment
Files to Download
Installation
Dataset
Structure of an SQL environment
Servers
Databases
Schemas
Tables
Different Database Types
SQL Data Types
Basic Querying
Obtaining Data
SELECT
LIMIT
DISTINCT
WHERE
ORDER BY
Aliasing
Comments
Aggregations
GROUP BY
AVG
COUNT
COUNT (DISTINCT)
SUM
MIN
MAX
HAVING
Combining Data
Joins and Unions
SQL for Data Analysts 1
, UNION
JOIN (LEFT, INNER, OUTER, RIGHT)
Advanced Querying
Subquerying
Advanced Filtering
LIKE
Wildcard Operators
BETWEEN
IN
Advanced Aggregation
Window Functions
OVER and PARTITION BY
ROW_NUMBER()
RANK()
DENSE_RANK()
NTILE
LAG
LEAD
Manipulating Tables
Basics
CREATE
DROP
ALTER
Views
Tips
Further Research
References
What is SQL
SQL (pronounced S.Q.L. or Sequel, no one really cares) stands for Structured Querying
Language and is a language that is used to query, form, and manipulate databases.
Setting up the Environment
SQL for Data Analysts 2
, Files to Download
SQL Course Public Downloads Folder - Google Drive
https://drive.google.com/drive/folders/1t1l7mcnhxN0rvYH-WTS
SSuwocyTujXAl?usp=sharing
Installation
For this tutorial we're going to be using SQLite which is a database engine that works in
a single application. In order to access it we'll be using DBeaver. DBeaver is an open-
source database manager that connects to almost all database engines.
NOTE: Use DBeaver Community, it's free
DBeaver
Free multi-platform database tool for developers, database administrators, analysts and all people who need to
work with databases. Supports all popular databases: MySQL, PostgreSQL, SQLite, Oracle, DB2, SQL Server,
Sybase, MS Access, Teradata, Firebird, Apache Hive, Phoenix, Presto, etc. This website uses cookies to
https://dbeaver.io/
Dataset
The dataset we'll be using is from the University of California Irvine's Machine Learning
Repository
Abstract: This data studies whether a person will accept the coupon recommended to him in different driving
scenarios Source: Tong Wang, tong-wang '@' uiowa.edu, University of Iowa Cynthia Rudin, cynthia '@'
cs.duke.edu, Duke University Data Set Information: This data was collected via a survey on Amazon
https://archive.ics.uci.edu/ml/datasets/in-vehicle+coupon+recommendation
Structure of an SQL environment
Servers
SQL for Data Analysts 3
, The computer that contains your database or databases. This is typically your highest
level of "container" for databases.
Databases
A database is a collection of Schemas and Tables that we can interact with.
Schemas
Schemas are the highest level of organization in many Databases and can contain
multiple tables.
Tables
Tables can be thought of in the same way you think about sheets in Excel. It's a
collection of columns and rows that we'll be interacting with.
Different Database Types
I'll be talking about SQL like it's a standardized language and although it technically is,
in practice SQL databases you'll most likely be interacting with are going to created and
run by major corporations who will sprinkle their own flavor of SQL on top of or in lieu of
standard SQL. The commands we'll be focusing in this course are pretty standard and
shouldn't be drastically different from that which you'll be using in most other databases.
Some of the biggest databases are listed below:
MySQL - Open Source - Created by Oracle
SQL Server - Created by Microsoft
PostgreSQL - Open Source
Oracle Database - Created by Oracle
SQL Data Types
Depending on the database type that you use, the datatypes that you'll be working with
will be different. Here are the basic types and what they might be referred to by in your
database.
Text types: Used to store text, can even store numbers as text
CHAR
SQL for Data Analysts 4