View on GitHub

Notes

reference notes

Important Relational Model Terms

Entity​

Relation​

Functional dependency​

Determinant​

Candidate key​

Composite key​

Primary key​

Surrogate key​

Foreign key​

Referential integrity constraint

Entity

An entity is some identifiable thing that users want to track (tangible or intangible):​

Relation

Relational DBMS products store data about entities in relations, which are a special type of table.

A relation is a two-dimensional table that has the following characteristics:​

Tables That Are Not Relations:

A Relation with Values​ of Varying Length:

Relation with Values of Varying Length

This table is a relation even though the length of the Comment column varies from row to row. It is a relation because, even though the comments have different lengths, there is only one comment per cell.

Alternative Terminology:

Equivalent Terms

Functional Dependency

We begin with a short excursion into the world of algebra. Suppose you are buying boxes of cookies and someone tells you that each box costs $5.00. With this fact, you can compute the cost of several boxes with the following formula:

CookieCost = NumberOfBoxes * $5

A more general way to express the relationship between CookieCost and NumberOfBoxes is to say that CookieCost depends on NumberOfBoxes. Such a statement tells us the character of the relationship between CookieCost and NumberOfBoxes, even though it doesn’t give us the formula. More formally, we can say that CookieCost is functionally dependent on NumberOfBoxes. Such a statement can be written as:

NumberOfBoxes -> CookieCost

This expression can be read as “NumberOfBoxes determines CookieCost.” The variable on the left, here NumberOfBoxes, is called the determinant. Using another formula, we can compute the extended price of a part order by multiplying the quantity of the item by its unit price, or:

ExtendedPrice = Quantity * UnitPrice

In this case, we say that ExtendedPrice is functionally dependent on Quantity and UnitPrice, or:

(Quantity, UnitPrice) -> ExtendedPrice

Here the determinant is the composite (Quantity, UnitPrice).


Composite Determinants

Composite determinant = a determinant of a functional dependency that consists of more than one attribute​

(StudentID, CourseID) -> (Grade)

Finding Functional Dependencies

Functional Dependencies in the SKU_DATA Table

To find functional dependencies in a table, we must ask “Does any column determine the value of another column?” For example, consider the values of the SKU_DATA table in this Figure:

SKU_DATA Table

Consider the last two columns. If we know the value of Department, can we determine a unique value of Buyer? No, we cannot, because a Department may have more than one Buyer. In these sample data, ‘Water Sports’ is associated with Pete Hansen and Nancy Meyers. Therefore, Department does not functionally determine Buyer.

What about the reverse? Does Buyer determine Department? In every row, for a given value of Buyer, do we find the same value of Department? Every time Jerry Martin appears, for example, is he paired with the same department? The answer is yes. Further, every time Cindy Lo appears, she is paired with the same department. The same is true for the other buyers. Therefore, assuming that these data are representative, Buyer does determine Department, and we can write:

Buyer -> Department

Does Buyer determine any other column? If we know the value of Buyer, do we know the value of SKU? No, we do not, because a given buyer has many SKUs assigned to him or her. Does Buyer determine SKU_Description? No, because a given value of Buyer occurs with many values of SKU_Description.

What about the other columns? It turns out that if we know the value of SKU, we also know the values of all of the other columns. In other words:

SKU -> SKU_Description

because a given value of SKU will have just one value of SKU_Description. Next,

SKU -> Department

because a given value of SKU will have just one value of Department. And, finally,

SKU -> Buyer

because a given value of SKU will have just one value of Buyer. We can combine these three statements as:

SKU -> (SKU_Description, Department, Buyer)

For the same reasons, SKU_Description determines all of the other columns, and we can write:

SKU_Description -> (SKU, Department, Buyer)

In summary, the functional dependencies in the SKU_DATA table are:

SKU -> (SKU_Description, Department, Buyer)​
SKU_Description -> (SKU, Department, Buyer)​
Buyer -> Department

Functional Dependencies in the ORDER_ITEM Table

Now consider the ORDER_ITEM table in this Figure. For convenience, here is a copy of the data in that table:

ORDER_ITEM Table

What are the functional dependencies in the ORDER_ITEM table? Start on the left. Does OrderNumber determine another column? It does not determine SKU because several SKUs are associated with a given order. For the same reasons, it does not determine Quantity, Price, or ExtendedPrice.

What about SKU? SKU does not determine OrderNumber because several OrderNumbers are associated with a given SKU. It does not determine Quantity or ExtendedPrice for the same reason.

What about SKU and Price? From this data, it does appear that

SKU -> Price

but that might not be true in general. In fact, we know that prices can change after an order has been processed. Further, an order might have special pricing due to a sale or promotion. To keep an accurate record of what the customer actually paid, we need to associate a particular SKU price with a particular order. Thus:

(OrderNumber, SKU) -> Price

Considering the other columns, Quantity, Price, and ExtendedPrice do not determine anything else. You can decide this by looking at the sample data. You can reinforce this conclusion by thinking about the nature of sales. Would a Quantity of 2 ever determine an OrderNumber or an SKU? This makes no sense. At the grocery store, if I tell you I bought two of something, you have no reason to conclude that my OrderNumber was 1010022203466 or that I bought carrots. Quantity does not determine OrderNumber or SKU. Similarly, if I tell you that the price of an item was $3.99, there is no logical way to conclude what my OrderNumber was or that I bought a jar of green olives. Thus, Price does not determine OrderNumber or SKU. Similar comments pertain to ExtendedPrice. It turns out that no single column is a determinant in the ORDER_ITEM table.

What about pairs of columns? We already know that

(OrderNumber, SKU) -> Price

Examining the data, (OrderNumber, SKU) determines the other two columns as well. Thus:

(OrderNumber, SKU) -> (Quantity, Price, ExtendedPrice)

This functional dependency makes sense. It means that given a particular order and a particular item on that order, there is only one quantity, one price, and one extended price. Notice, too, that because ExtendedPrice is computed from the formula ExtendedPrice = (Quantity * Price) we have:

(Quantity, Price) -> ExtendedPrice

In summary, the functional dependencies in ORDER_ITEM are:

(OrderNumber, SKU) -> (Quantity, Price, ExtendedPrice)
(Quantity, Price) -> ExtendedPrice

Keys

A key is a combination of one or more columns that is used to identify rows in a relation.​

A composite key is a key that consists of two or more columns.

Candidate Keys

A candidate key is a key that determines all of the other columns in a relation. (just like in the SKU_DATA table. the SKU and SKU_Description columns are candidate keys because they determine all of the other columns in the table.)​

Candidate keys identify a unique row in a relation. Given the value of a candidate key, we can find one and only one row in the relation that has that value. For example, given the SKU value of 100100, we can find one and only one row in SKU_DATA. Similarly, given the OrderNumber and SKU values (2000, 101100), we can find one and only one row in ORDER_ITEM.

Primary Keys

When designing a database, one of the candidate keys is selected to be the primary key.

This term is used because this key will be defined to the database management system (DBMS), and the DBMS will use it as its primary means for finding rows in a table. A table has only one primary key. The primary key can have one column, or it can be a composite.

in short: A primary key is a candidate key selected as the primary means of identifying rows in a relation.

Surrogate Keys

A surrogate key is an artificial column added to a relation to serve as a primary key.​

Foreign Keys

A foreign key is the primary key of one relation that is placed in another relation to form a link between the relations.​

The Referential Integrity Constraint

A referential integrity constraint is a statement that limits the values of the foreign key to those already existing as primary key values in the corresponding relation.

Foreign Key with a​ Referential Integrity Constraint:

NOTE: The primary key of the relation is bolded. The foreign key is in italics.

SKU_DATA (SKU, SKU_Description, Department, Buyer)​

ORDER_ITEM (OrderNumber, SKU , Quantity, Price, ExtendedPrice)​

Where ORDER_ITEM.SKU must exist in SKU_DATA.SKU