PostgreSQL Tutorial – Referring to Other Tables
In the previous article you learned how to create a single table. One table does not make for a very interesting database. Since we looked at how the
actor table was created, let’s take a look at the most logical follow up: the
Without going into all of the details of the
film table, we do know that the table looks something like this:
CREATE TABLE public.film ( film_id int4 NOT NULL DEFAULT nextval('film_film_id_seq'::regclass), title varchar(255) NOT NULL, description text NULL, release_year year NULL, language_id int2 NOT NULL, original_language_id int2 NULL, rental_duration int2 NOT NULL DEFAULT 3, rental_rate numeric(4,2) NOT NULL DEFAULT 4.99, length int2 NULL, replacement_cost numeric(5,2) NOT NULL DEFAULT 19.99, rating mpaa_rating NULL DEFAULT 'G'::mpaa_rating, last_update timestamp NOT NULL DEFAULT now(), special_features text NULL, fulltext tsvector NOT NULL, PRIMARY KEY(film_id) )
There are a few features that will be covered later – like enumerations and full text searching – but this is a pretty standard table. The one thing that’s missing, though, is a way to reference the actors in each film. We could do this by adding a number of columns (
actor2_id, etc.) but that would become cumbersome and is not an effective way to model data.
The alternative is to create what is commonly called a cross reference table.
Cross Reference Tables
A cross reference table is a simple way to create a cross reference between two objects that have a many-to-many relationship. This is typically done using a two column table – one column for the primary key of each table:
CREATE TABLE public.film_actor ( actor_id int4 NOT NULL, film_id int4 NOT NULL, last_update timestamp NOT NULL DEFAULT now(), PRIMARY KEY(actor_id,film_id) )
Here’s the problem: as it stands, we could insert any values into this table. The whole point of the
film_actor table is to map valid actors with valid films. Without any restrictions in place, we can’t make sure that any actor or film id in our table isn’t changed to an invalid value.
Enter Referential Integrity
Referential integrity is a way of saying that every value of a column in one table must exist as a value in a column in another table. In other words: every value in the referenced column or columns must exist in the referenced table. PostgreSQL makes it easy for us to enforce referential integrity through the use of foreign keys.
In order to make sure that we only have valid actors in the
film_actor table, we’ll create a foreign key:
ALTER TABLE public.film_actor ADD CONSTRAINT film_actor_actor_id_fkey FOREIGN KEY (actor_id) REFERENCES actor(actor_id) ON UPDATE CASCADE ON DELETE RESTRICT
That’s a very shorthand way of saying
The table "public.film_actor" has a foreign key constraint named "film_actor_actor_id" on the column "actor_id" which references the column "actor_id" in the table "actor" so that every value of "actor_id" in "public.film_actor" must also be present in the "actor_id" column of the table "public.actor" if the value of "actor_id" is changed in the "public.actor" table we should update all values in "public.fim_actor" that are the same and if the value of "actor_id" is deleted from the "public.actor" table we should prevent that delete from happening if there are rows in the "public.film_actor" table with the same value
We’ll create a similar foreign key to enforce referential integrity to the
ALTER TABLE public.film_actor ADD CONSTRAINT film_actor_film_id_fkey FOREIGN KEY (film_id) REFERENCES film(film_id) ON UPDATE CASCADE ON DELETE RESTRICT
This foreign key is almost identical to the first one. The only difference between the two are the names of the table being referenced. If you’ve taken a look at the other tables in the database, or if you’ve looked back at the previous article about creating the schema, you may have noticed something else: all of our foreign keys reference the primary key of another table.
Foreign keys must refer to a unique value, this prevents us from creating a foreign key that points to any column. However, since a primary key requires every value in that column be unique, it uniquely identifies the row, we can safely create a foreign key that points to the primary key column.
What if we want to create a foreign key that points to a different column? We can do that too. PostgreSQL has a feature called a unique constraint. This is an internal database feature that makes it possible to ensure that all values in a column, or set of columns, are unique. A unique constraint is actually implemented as a unique index in PostgreSQL, just as it is in many databases.
CREATE UNIQUE INDEX idx_unq_manager_staff_id ON public.store(manager_staff_id)
This index wouldn’t be a good candidate for a foreign key, but it does illustrate how you can create a unique index that you can reference with a foreign key.
This is Jeremiah
I live in Portland, OR. I have two dogs.
I recently received a Master's of Science in Computer Science from Portland State University.
I'm was Microsoft MVP from 2009 - 2018 with a pile of certifications. Somewhere along the way, I wrote a database client for Riak and then handed it off to the community.