Database Administration
postgresql database-design foreign-key composite-types enum
Updated Fri, 10 Jun 2022 17:50:18 GMT

ENUM and ID as a Foreign Key


Using PostgreSQL 11.

I've tried digging through both Stack Overflow and here and was unable to find an answer on best practices.

I'm working on a database design and have arrived at a schema that uses a generic "join table". This join table contains five columns:

CREATE TABLE many_joins_table (
  id PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
  object_id int NOT NULL,
  object_table joins_object_t NOT NULL,
  parent_id int NOT NULL,
  parent_table joins_parent_t NOT NULL);

I've been using this table to represent adjacent and many-to-many relationships between objects in my database. One such example is tags.

CREATE TABLE tag (
  id PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
  name text NOT NULL UNIQUE);
CREATE TABLE comment (
  id PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY);
CREATE TYPE joins_object_t AS ENUM ('tag');
CREATE TYPE joins_parent_t AS ENUM ('comment');

When a tag is added to the comment table, I would insert a new row into this join table with the following fields:

INSERT INTO many_joins_table
VALUES(1, 'tag'::joins_object_t, 1, 'comment'::joins_parent_t);

Besides inflexibility of enums, addressed with PostgreSQL 9.1 https://stackoverflow.com/questions/1771543/adding-a-new-value-to-an-existing-enum-type/7834949#7834949.

Are there any significant disadvantages or advantages of such an approach? I'm worried that I've mistakenly implemented an anti-pattern. Are there any best practices I can apply to improve this implementation (indexing or constraints)?

Thanks!

Note: I'm aware there are better ways to implementing tags, namely using intarrays. I'm just using tags as an example since it's easily understood. https://stackoverflow.com/questions/23508551/integer-array-lookup-using-postgres

Edit: Removed the UUIDs since it may be a distraction to the question.




Solution

First, stay off enums for things like this. Enum values can never be removed, so use them only if you are sure that that will never be necessary, which doesn't seem the case here.

Anyway, I would say that your design is too complicated, and still lacks the crucial feature of referential integrity.

Use a junction table for each pair of objects that can be related. This way, you

  • make clear which objects can be related

  • can have referential integrity

Having many tables is something that a database is good at. If you contend that you have 1000 tables and each object can be related to each other, that would be too many tables. But in that case you should probably go for a model where you have not a table per object type anyway.





Comments (2)

  • +0 – Thanks for the clear answer. If referential integrity can be addressed via before triggers are there any significant disadvantages here? Im going to review my schema based on your and evans feedback to see if there are simplifications that may be made. I just wanted to fully understand the trade offs of this approach. — Dec 11, 2019 at 16:36  
  • +2 – Verifying constraints with triggers is always vulnerable to race conditions unless you use the expensive SERIALIZABLE isolation level. — Dec 11, 2019 at 16:43