What is SQL?
Structured Query Language (SQL) is a standardised programming language that is used to manage relational databases and perform various operations on the data in them.
What is it used for?
SQL queries and other operations take the form of commands written as statements and are aggregated into programs that enable users to:
- modify database tables and index structures;
- add, update and delete rows of data;
- retrieve subsets of information from within relational database management systems (RDBMSes) - this information can be used for transaction processing, analytics applications and other applications that require communicating with a relational database.
Most of the actions you perform on a database are done with SQL statements.
How to use SQL
The order of operations in SQL is important - the basic order for getting it right is:
SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY
SQL is not case sensitive i.e. select is the same as SELECT, however, it is good practice to write all SQL keywords in UPPER CASE. In Snowflake, field names and values are case sensitive, but not all databases are.
SQL requires single quotes around text values and double quotes around field names. Numeric fields should not be enclosed in quotes.
SQL SELECT Statement
The SELECT statement is used to select data from a database.
Syntax:
SELECT "column_name1" ,"column_name2", ”column_name3”
FROM ”table_name”
Example 1
Select all records from the Superstore table.
SELECT *
FROM "Superstore_Orders"
Example 2
Select all product names from Superstore table.
SELECT “Product Name”
FROM "Superstore_Orders"
SQL SELECT DISTINCT Statement
The SELECT DISTINCT statement returns only distinct (different) values in the result set.
Syntax:
SELECT DISTINCT “column_name1”, “column_name2”
FROM ”table_name”
Example
Return a unique list of product names from Superstore table.
SELECT DISTINCT “Product Name”
FROM ”Superstore_Orders”
SQL WHERE Clause
The WHERE clause filters a result set to include only records that fulfill a specified condition.
Syntax:
SELECT ”column_name1”, “column_name2”
FROM ”table_name”
WHERE “column_name1” operator value
Operators in the WHERE clause:
Example 1 - SQL equal to
Get all unique States and Products in the Central region.
SELECT DISTINCT “State”, “Product Name”
FROM ”Superstore_Orders”
WHERE “Region”='Central'
Example 2 - SQL not equal to
Get a unique list of Regions and States from all regions except Central.
SELECT DISTINCT “Region”, ”State”
FROM ”Superstore_Orders”
WHERE “Region” <> 'Central'
SQL ORDER BY Keyword
The ORDER BY command is used to sort the result set in ascending or descending order. This command sorts the result set in ascending order by default. To sort the records in descending order, use the DESC keyword.
Syntax:
SELECT "column_name1", "column_name2"
FROM "table_name"
ORDER BY "column_name" ASC|DESC, "column_name" ASC|DESC
Example 1
Get a unique list of states ordered in reverse alphabetical order.
SELECT DISTINCT “State”
FROM ”Superstore_Orders”
ORDER BY “State” DESC
Example 2
Get a unique list of states and cities ordered alphabetically by State then ordered in reverse alphabetical order by city.
SELECT DISTINCT "State", "City"
FROM "Superstore"
ORDER BY "State" ASC, "City" DESC
SQL GROUP BY Statement
The GROUP BY command is used to group the result set (used with aggregate functions: COUNT, MAX, MIN, SUM, AVG).
Syntax:
SELECT "column_name1", aggregate_function("column_name2")
FROM "table_name"
WHERE "column_name1" operator value
GROUP BY "column_name"
Example 1
Find the average profit by ship mode and region in the Central or West region.
SELECT "Ship Mode", "Region",
AVG("Profit") AS 'Avg Profit'
FROM "Superstore"
WHERE "Region" IN ('Central', 'West')
GROUP BY "Ship Mode", "Region"
SQL HAVING Clause
The HAVING command is used instead of WHERE with aggregate functions.
Syntax:
SELECT "column_name", aggregate_function("column_name")
FROM "table_name"
WHERE "column_name" operator value
GROUP BY "column_name"
HAVING aggregate_function("column_name") operator value
Example
Which days had an average profit greater than 500? Order them from largest to smallest.
SELECT ("Order Date"), AVG("Profit") AS 'Avg Profit'
FROM "Superstore"
GROUP BY ("Order Date")
HAVING AVG("Profit") > 500
ORDER BY AVG("Profit") DESC