PostgreSQL Tutorial – Referring to Other Tables

In the previous article you learned how to create a single table. One table does not make for a very interesting database. Since we looked at how theactor table was created, let’s take a look at the most logical follow up: thefilm table.

The film table

Without going into all of the details of the film table, we do know that the table looks something like this:

CREATE TABLE public.film  ( 
    film_id                 int4 NOT NULL DEFAULT nextval('film_film_id_seq'::regclass),
    title                   varchar(255) NOT NULL,
    description             text NULL,
    release_year            year NULL,
    language_id             int2 NOT NULL,
    original_language_id    int2 NULL,
    rental_duration         int2 NOT NULL DEFAULT 3,
    rental_rate             numeric(4,2) NOT NULL DEFAULT 4.99,
    length                  int2 NULL,
    replacement_cost        numeric(5,2) NOT NULL DEFAULT 19.99,
    rating                  mpaa_rating NULL DEFAULT 'G'::mpaa_rating,
    last_update             timestamp NOT NULL DEFAULT now(),
    special_features        text NULL,
    fulltext                tsvector NOT NULL,
    PRIMARY KEY(film_id)
)

There are a few features that will be covered later – like enumerations and full text searching – but this is a pretty standard table. The one thing that’s missing, though, is a way to reference the actors in each film. We could do this by adding a number of columns (actor1_id, actor2_id, etc.) but that would become cumbersome and is not an effective way to model data.

The alternative is to create what is commonly called a cross reference table.

Cross Reference Tables

A cross reference table is a simple way to create a cross reference between two objects that have a many-to-many relationship. This is typically done using a two column table – one column for the primary key of each table:

CREATE TABLE public.film_actor  ( 
    actor_id    int4 NOT NULL,
    film_id     int4 NOT NULL,
    last_update timestamp NOT NULL DEFAULT now(),
    PRIMARY KEY(actor_id,film_id)
)

Here’s the problem: as it stands, we could insert any values into this table. The whole point of the film_actor table is to map valid actors with valid films. Without any restrictions in place, we can’t make sure that any actor or film id in our table isn’t changed to an invalid value.

Enter Referential Integrity

Referential integrity is a way of saying that every value of a column in one table must exist as a value in a column in another table. In other words: every value in the referenced column or columns must exist in the referenced table. PostgreSQL makes it easy for us to enforce referential integrity through the use of foreign keys.

In order to make sure that we only have valid actors in the film_actor table, we’ll create a foreign key:

ALTER TABLE public.film_actor
ADD CONSTRAINT film_actor_actor_id_fkey
FOREIGN KEY (actor_id)
REFERENCES actor(actor_id)
ON UPDATE CASCADE
ON DELETE RESTRICT

That’s a very shorthand way of saying

The table "public.film_actor"
has a foreign key constraint named "film_actor_actor_id"
on the column "actor_id"
which references the column "actor_id" in the table "actor"
  so that every value of "actor_id" in "public.film_actor"
  must also be present in the "actor_id" column 
  of the table "public.actor"
if the value of "actor_id" is changed in the "public.actor" table
  we should update all values in "public.fim_actor" that are the same
and if the value of "actor_id" is deleted from the "public.actor" table
  we should prevent that delete from happening if there are rows
  in the "public.film_actor" table with the same value

We’ll create a similar foreign key to enforce referential integrity to the filmtable:

ALTER TABLE public.film_actor
ADD CONSTRAINT film_actor_film_id_fkey
FOREIGN KEY (film_id)
REFERENCES film(film_id)
ON UPDATE CASCADE
ON DELETE RESTRICT

This foreign key is almost identical to the first one. The only difference between the two are the names of the table being referenced. If you’ve taken a look at the other tables in the database, or if you’ve looked back at the previous article about creating the schema, you may have noticed something else: all of our foreign keys reference the primary key of another table.

Foreign keys must refer to a unique value, this prevents us from creating a foreign key that points to any column. However, since a primary key requires every value in that column be unique, it uniquely identifies the row, we can safely create a foreign key that points to the primary key column.

What if we want to create a foreign key that points to a different column? We can do that too. PostgreSQL has a feature called a unique constraint. This is an internal database feature that makes it possible to ensure that all values in a column, or set of columns, are unique. A unique constraint is actually implemented as a unique index in PostgreSQL, just as it is in many databases.

CREATE UNIQUE INDEX idx_unq_manager_staff_id
ON public.store(manager_staff_id)

This index wouldn’t be a good candidate for a foreign key, but it does illustrate how you can create a unique index that you can reference with a foreign key.

References

Table Design Patterns: Cross-Reference Validation

Menu