Databases

SQL Fundamentals

3 min read
Focus: DATABASES

TL;DR — Quick Summary

  • SQL is the universal language for querying relational databases — SELECT, INSERT, UPDATE, DELETE are the foundation.
  • WHERE filters rows before grouping; HAVING filters grouped results after GROUP BY.
  • Always use parameterized queries in application code — never concatenate user input into SQL.
  • Use IS NULL / IS NOT NULL for null checks — = NULL never works.

Lesson Overview

🗄️ SQL: The Language of Data

SQL (Structured Query Language) is the universal language for communicating with relational databases. Whether you're building a simple blog or a high-traffic e-commerce platform, SQL is how your application reads, writes, and manages data.

SQL was created in 1974 at IBM and became an ANSI standard in 1986. Today it's supported by every major relational database: PostgreSQL, MySQL, SQLite, SQL Server, and Oracle.

⚡ The Four Core Operations (CRUD)

  • CREATE data → INSERT INTO
  • READ data → SELECT
  • UPDATE data → UPDATE ... SET
  • DELETE data → DELETE FROM

🔍 Filtering and Sorting

Raw data is only useful when you can filter it to what you need. SQL gives you powerful clauses:

  • WHERE — filter rows by condition
  • ORDER BY — sort results ascending/descending
  • LIMIT — cap the number of rows returned
  • DISTINCT — remove duplicate rows
  • LIKE — pattern matching with wildcards (%)

SQL is declarative — you describe what data you want, not how to get it. The database engine figures out the most efficient execution plan.

Conceptual Deep Dive

Think of a database table like a spreadsheet. Each row is a record (one user, one order, one product). Each column is a field (name, email, price). SQL is how you ask questions about that spreadsheet — but with the power to handle billions of rows in milliseconds. The WHERE clause is your filter. Every condition evaluates to true or false for each row — only rows where the condition is true are returned. You can chain conditions with AND / OR and group them with parentheses, exactly like boolean logic in programming.

Architecture & Data Flow

SQL Query Execution Order
Rendering diagram…

Implementation Lab

Core CRUD Operations
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
-- CREATE: Insert a new user
INSERT INTO users (name, email, age)
VALUES ('Alice Johnson'Johnson', 'alice@example.com', 28);
 
-- READ: Select users over 18, newest first
SELECT id, name, email
FROM users
WHERE age > 18
ORDER BY created_at DESC
LIMIT 10;
 
-- UPDATE: Change a specific user's email
UPDATE users
SET email = 'alice.new@example.com', updated_at = NOW()
WHERE id = 42;
 
-- DELETE: Remove a specific user
DELETE FROM users
WHERE id = 42;
 
-- SAFE DELETE: Always check how many rows first!
SELECT COUNT(*) FROM users WHERE id = 42;  -- verify before deleting
Filtering, Sorting, and Pattern Matching
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- Multiple WHERE conditions
SELECT name, email, age
FROM users
WHERE age BETWEEN 18 AND 35
  AND email LIKE '%@gmail.com'
  AND status = 'active'
ORDER BY name ASC;
 
-- IN operator (cleaner than multiple OR)
SELECT * FROM products
WHERE category IN ('electronics', 'phones', 'laptops');
 
-- NULL checks (never use = NULL, use IS NULL)
SELECT * FROM users
WHERE last_login IS NULL;  -- users who never logged in
 
-- DISTINCT removes duplicates
SELECT DISTINCT country FROM users
ORDER BY country;
Aggregate Functions — Summarizing Data
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
-- Count, sum, average, min, max
SELECT
  COUNT(*) AS total_users,
  COUNT(last_login) AS users_who_logged_in,  -- ignores NULLs
  AVG(age) AS average_age,
  MIN(created_at) AS first_signup,
  MAX(created_at) AS latest_signup
FROM users
WHERE status = 'active';
 
-- GROUP BY: count users per country
SELECT country, COUNT(*) AS user_count
FROM users
GROUP BY country
HAVING COUNT(*) > 100  -- HAVING filters AFTER grouping (WHERE filters before)
ORDER BY user_count DESC;

Best Practices — Interactive Comparison

Never concatenate user input into SQL — always use parameterized queries

javascript
1
2
3
4
5
6
7
// DANGEROUS — user input injected directly
const query = `SELECT * FROM users
  WHERE email = '${req.body.email}'}'`;
 
// Attacker inputs: ' OR '1'='1' OR '1'='1
// Becomes: WHERE email = '' OR '1'='1'WHERE email = '' OR '1'='1'
// Returns ALL users — authentication bypassed!ALL users — authentication bypassed!

Always specify columns — never use SELECT * in production

sql
1
2
3
4
5
6
7
-- Fetches ALL columns including large blobs,
-- hashed passwords, internal fields
-- Wastes memory and network bandwidth
SELECT * FROM users
WHERE status = 'active';
 
-- Also breaks silently if a column is added/removed

Pro Tips — Senior Dev Insights

1

Use EXPLAIN (or EXPLAIN ANALYZE in PostgreSQL) before any query on a large table — it shows exactly how the database plans to execute it and whether indexes are being used.

2

Wrap dangerous operations in a transaction: BEGIN; DELETE FROM users WHERE ...; SELECT COUNT(*); ROLLBACK; — check the count, only COMMIT if it looks right.

3

Use COALESCE(column, default_value) to handle NULLs gracefully in your SELECT output.

4

In PostgreSQL, RETURNING * after INSERT/UPDATE/DELETE returns the affected rows — saves an extra SELECT round trip.

⚖️ WHERE vs HAVING — When to Use Each

FeatureWHEREHAVING
FiltersIndividual rowsGrouped results
RunsBefore GROUP BYAfter GROUP BY
Can use aggregates?
Can use column aliases?
ExampleWHERE age > 18HAVING COUNT(*) > 5

Common Developer Pitfalls

!
Using SELECT * in production queries — always specify exactly the columns you need for performance and clarity.
!
Writing WHERE column = NULL instead of WHERE column IS NULL — the first always returns zero rows.
!
Forgetting a WHERE clause on UPDATE or DELETE — one typo can wipe your entire table.
!
Confusing WHERE and HAVING — using aggregate functions in WHERE causes a syntax error.

Interview Mastery

WHERE filters individual rows before any grouping or aggregation happens — you can't use aggregate functions like COUNT() in a WHERE clause. HAVING filters grouped results after GROUP BY has been applied — it's designed for filtering aggregated data. Example: WHERE age > 18 filters rows; HAVING COUNT(*) > 5 filters groups.

In SQL, NULL represents an unknown value — not zero, not empty string, not false. Any comparison with NULL using = returns NULL (not TRUE or FALSE), so the row is never returned. Always use IS NULL or IS NOT NULL. This is one of SQL's most common gotchas.

SQL injection is an attack where malicious SQL code is inserted into an input field, manipulating the query. Example: entering ' OR '1'='1 as a password could bypass authentication. Prevention: always use parameterized queries (prepared statements) — never concatenate user input into SQL strings. Modern ORMs handle this automatically, but raw SQL must be parameterized manually.

Real-World Blueprint

"Every time you search for a product on Amazon, an SQL query runs behind the scenes — filtering by category, price range, ratings, and availability across hundreds of millions of rows. The results come back sorted by relevance in under 100 milliseconds thanks to carefully crafted SQL queries and indexes. The same fundamentals you're learning here power platforms serving billions of queries per day."

Hands-on Lab Exercises

1
Create a products table and insert 10 rows. Write a SELECT that returns only products under $50, sorted by price ascending.
2
Write a query using GROUP BY to count how many products are in each category, showing only categories with more than 3 products.
3
Practice the difference: write one query using WHERE to filter rows, and another using HAVING to filter grouped results on the same table.
4

Write an UPDATE that changes the price of all products in the 'electronics' category by adding a 10% tax. Verify with a SELECT before running it.

Real-World Practice Scenarios

Your app has a users table with 5 million rows. A query SELECT * FROM users is being called on every page load. What problems does this cause and how do you fix the query?
A developer wrote: WHERE email = NULL and is confused why no results come back even though NULL emails exist. Explain the issue and fix.

You need to find all users who signed up in the last 30 days AND have placed at least one order. How do you structure this query?

An intern ran DELETE FROM orders without a WHERE clause. What just happened, and how do you prevent this in future?