Database Administration
postgresql query
Updated Fri, 20 May 2022 19:00:24 GMT

How to address aggregate fanout in the below query


I have a table user. Every user has a one-to-many relationships with topic_likes and topic_dislikes. I want to query for a user and have their one-to-many relationships exist in their own column in the result as arrays of JSON objects.

Let's say the below is the user table:

CREATE TABLE user(
    id UUID NOT NULL PRIMARY KEY,
    email VARCHAR(254) NOT NULL UNIQUE,
    username TEXT NOT NULL,
    creationTimestamp TIMESTAMPTZ NOT NULL
);

topic_like:

CREATE TABLE topic_like(
    user_id UUID NOT NULL REFERENCES user(id) ON UPDATE RESTRICT ON DELETE CASCADE,
    likeable_topic_id INTEGER REFERENCES likeable_topic(id) ON UPDATE RESTRICT ON DELETE RESTRICT,
    index INTEGER NOT NULL,
    UNIQUE(user_id, index),
    UNIQUE(user_id, likeable_topic_id)
);

and topic_dislike:

CREATE TABLE topic_dislike(
    user_id UUID NOT NULL REFERENCES user(id) ON UPDATE RESTRICT ON DELETE CASCADE,
    dislikeable_topic_id INTEGER REFERENCES dislikeable_topic(id) ON UPDATE RESTRICT ON DELETE RESTRICT,
    index INTEGER NOT NULL,
    UNIQUE(user_id, index),
    UNIQUE(user_id, dislikeable_topic_id)
);

What I have now is a query like so:

SELECT user.id, email, username, 
json_agg(json_build_object('likeable_topic_id', likeable_topic_id, 'index', topic_like.index)),
json_agg(json_build_object('dislikeable_topic_id', dislikeable_topic_id, 'index', topic_dislike.index))
FROM user
LEFT JOIN topic_like ON user.id=topic_like.user_id
LEFT JOIN topic_dislike ON user.id=topic_dislike.user_id
WHERE user.id='58b6fe31-f3f6-4781-af06-93e29cb05bca'
GROUP BY user.id;

But when I do this, I get a result where, if there are 5 topic_like records for a given user and 5 topic_dislike records for a given user, the resulting output will duplicate every topic_like record and every topic_dislike record in their JSON arrays five times each. Resulting in a JSON array for each column that contains 25 elements despite there being only 5 records in each corresponding table.

How to restructure this query such that each JSON array only contains as many elements as there are matching records in the corresponding table?




Solution

Use correlated sub-selects rather than left joins

SELECT user.id, email, username, 
    (select json_agg(json_build_object('likeable_topic_id', likeable_topic_id, 'index', topic_like.index)) from topic_like WHERE user.id=topic_like.user_id) liked_json,
    (select json_agg(json_build_object('dislikeable_topic_id', dislikeable_topic_id, 'index', topic_dislike.index)) from topic_dislike WHERE user.id=topic_dislike.user_id) disliked_json
FROM user
WHERE user.id='58b6fe31-f3f6-4781-af06-93e29cb05bca'