Database System
1 SQL Part Ⅰ
1.1 SQL Introduction
SQL = Structured Query Language
Although over 40 years old, it keeps re-emerging as the standard.
Features:
Declarative!
Specify what you want, not how to get it.
Implemented widely
With varying levels of efficiency, completeness.
Constrained
Not targeted at Turing-complete tasks.
General-purpose and feature-rich
Many years of added features.
Extensible: Callouts to other languages, databases.
1.2 Relational Terminology
Definitions:
Database: Set of named relations.
Relation (aka Table):
Schema: description ( "metadata").
Instance: set of data satisfying the schema.
Attribute (aka Column, Field)
Tuple (aka Record, Row)
Cardinality: Number of rows in a table.

Properties:
Schema is fixed, unique attribute names, atomic (aka primitive) types.
Tables are NOT ordered, they are sets or multisets (bags).
Tables are FLAT, no nested attributes.
Tables DO NOT prescribe how they are implemented/stored on disk, which is called physical data independence.
1.3 SQL Language
SQL Language:
Two Sublanguages:
DDL (Data Definition Language): Define and modify schema.
DML (Data Manipulation Language): Queries can be written intuitively.
Relational Database Management System (RDBMS): responsible for efficient evaluation, choose and run algorithms for declarative queries (choice of algorithm must not affect query answer).
SQL DDL Examples:
Sailors
sid | sname | rating | age |
---|---|---|---|
1 | Fred | 7 | 22 |
2 | Jim | 2 | 39 |
3 | Nancy | 8 | 27 |
Boats
bid | bname | color |
---|---|---|
101 | Nina | red |
102 | Pinta | blue |
103 | Santa Maria | red |
Reserves
sid | bid | day |
---|---|---|
1 | 102 | 9/12 |
2 | 102 | 9/13 |
1.4 SQL Queries
Basic Single Table Queries:
Produce all tuples in the table that satisfy the predicate.
Output the expressions in the SELECT list,
Expression can be a column reference, or an arithmetic expression over column refs.
Example:
SELECT, FROM, WHERE lines:
Return all unique (name, GPA) pairs from students.
DISTINCT specifies removal of duplicate rows before output.
Can refer to the students table as S, this is called an alias.
ORDER line:
ORDER BY clause specifies output to be sorted.
Sorts the output by GPA in descending order, then by name in ascending order.
Also computes the value of age*2 and names it a2.
Obviously must refer to columns in the output.
Note the AS clause for naming output columns!
LIMIT line:
Only produces the first 3 output rows.
Typically used with ORDER BY.
Otherwise the output is non- deterministic.
Not a "pure" declarative construct in that case – output set depends on algorithm for query processing.
Aggregates:
Before producing output, compute a summary (aka an aggregate) of some arithmetic expression.
Produces 1 row of output, with one column of average in this case.
Other aggregates: SUM, COUNT, MAX, MIN (and others).
Group By:
Partition table into groups with same GROUP BY column values, can group by a list of columns.
Produce an aggregate result per group, cardinality (rows) of output = number of distinct group values.
Note: can put grouping columns in SELECT list (in this case, average GPA + department).
Having:
The HAVING predicate filters groups
HAVING is applied after grouping and aggregation
Hence can contain anything that could go in the SELECT list
i.e., aggs or GROUP BY columns
HAVING can only be used in aggregate queries
It's an optional clause

2 SQL Part Ⅱ
2.1 Join Queries
Join: Form cross product of the tables, output all tuples and select specific columns.
Example:
Sailors
sid | sname | rating | age |
---|---|---|---|
1 | Popeye | 10 | 22 |
2 | OliveOyl | 11 | 39 |
3 | Garfield | 1 | 27 |
4 | Bob | 5 | 19 |
Reserves
sid | bid | day |
---|---|---|
1 | 102 | 9/12 |
2 | 102 | 9/13 |
1 | 101 | 10/01 |
Output

sid | sname | bid |
---|---|---|
1 | Popeye | 102 |
2 | OliveOyl | 102 |
Self-Joins
Output
sname1 | age1 | sname2 | age2 |
---|---|---|---|
Popeye | 22 | Bob | 19 |
OliveOyl | 39 | Popeye | 22 |
OliveOyl | 39 | Garfield | 27 |
OliveOyl | 39 | Bob | 19 |
Garfield | 27 | Popeye | 22 |
Garfield | 27 | Bob | 19 |
Inner Join
2.2 Select & Where Advanced
Arithmetic Expressions
SQL Calculator
three | two | one | sanity |
---|---|---|---|
3.0 | 2.0 | 1.0 | 1 |
String Comparison
Boolean Logic vs. Set Operators
Reserve a red or a green boat
Reserve a red and a green boat
Set Semantics
Default (Think of each letter as being a tuple in a relation):
R = {A, A, A, A, B, B, C, D}
S = {A, A, B, B, B, C, E}
UNION: {A, B, C, D, E}
INTERSECT: {A, B, C}
EXCEPT: {D}
"ALL": Multiset Semantics
R = {A, A, A, A, B, B, C, D} = {A(4), B(2), C(1), D(1)}
S = {A, A, B, B, B, C, E} = {A(2), B(3), C(1), E(1)}
UNION ALL (sum of all cardinalities): {A(6), B(5), C(2), D(1), E(1)}
INTERSECT ALL (min of cardinalities): {A(min(4,2)), B(min(2,3)), C(min(1,1)), D(min(1,0)), E(min(0,1))} = {A, A, B, B, C}
EXCEPT ALL (subtract cardinalities): {A(4-2), B(2-3), C(1-1), D(1-0), E(0-1)} = {A, A, D}
2.3 Nested Queries
In
Example
SELECT S.sname FROM Sailors S WHERE S.sid IN (SELECT R.sid FROM Reserves R WHERE R.bid = 102); -- SubqueryNot In
Example
SELECT S.sname FROM Sailors S WHERE S.sid NOT IN (SELECT R.sid FROM Reserves R WHERE R.bid = 103); -- SubqueryExists
Example 1 (bit odd, but legal)
SELECT S.sname FROM Sailors S WHERE EXISTS (SELECT * FROM Reserves R WHERE R.bid = 103); /* The query will output all the sailors if there is at least one sailor who reserves boat 103 or return nothing if there isn't */Example 2 (Nested Queries with Correlation)
SELECT S.sname FROM Sailors S WHERE EXISTS (SELECT * FROM Reserves R WHERE R.bid = 103 AND R.sid = S.sid); /* The query will output all the sailors who reserve boat 103. * In this example, the subquery will take every tuple of the sailors and run it, * tuple by tuple. */
Example for Division
ARGMAX
Example: Sailor with the Highest Rating
3 Disks & Drives
3.1 DBMS Architectures

Features
Organized in layers
Each layer abstracts the layer below: Manage complexity (easy to use) & Performance assumptions (assume the perfomrance of lower layers)
Two cross-cutting issues related to storage and memory management: Concurrency control & Recovery
3.2 Disks
Memory Hierarchy

Disk Components

Platters spin together (around 15000 rpm). Arm assembly can move in or out to position a head on a desired track, but only one head reads/writes at any one time.
Disk Operation & Access Time
Disks read and write data in sector-size blocks with access time of three components.
Seek Time: Moving arms to position disk head on track that contains that contains the target sector.
Rotational Delay: Waiting for block to rotate the first bit of the target sector to pass under the head.
Transfer Time: Moving data to/from disk surface (Read/Write the contents of the sector).
Disk Capacity
Disk capacity is determined by the following technology factors.
Recording Density (bits/in):# bits that can be squeezed into a 1-inch segment of a track.
Track density (tracks/in):# tracks that can be squeezed into a 1-inch segment of the radius extending from the center of the platter.
Areal density (bits/in2): The product of the recording density and the track density.