Jay Taylor's notes

back to listing index

PostgreSQL multi-column unique constraint and NULL values - Database Administrators Stack Exchange

[web search]
Original source (dba.stackexchange.com)
Tags: database postgres postgresql indexing null dba.stackexchange.com
Clipped on: 2023-03-08

Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It only takes a minute to sign up.

Sign up to join this community
Anybody can ask a question
Anybody can answer
The best answers are voted up and rise to the top
  1. Home
    1. Public
    2. Questions
    3. Tags
    4. Users
    5. Companies
    6. Unanswered
    1. Teams
      Stack Overflow for Teams – Start collaborating and sharing organizational knowledge. Create a free Team Why Teams?
Asked 11 years, 1 month ago
Modified 8 months ago
Viewed 152k times
145

I have a table like the following:

create table my_table (
    id   int8 not null,
    id_A int8 not null,
    id_B int8 not null,
    id_C int8 null,
    constraint pk_my_table primary key (id),
    constraint u_constrainte unique (id_A, id_B, id_C)
);

And I want (id_A, id_B, id_C) to be distinct in any situation. So the following two inserts must result in an error:

INSERT INTO my_table VALUES (1, 1, 2, NULL);
INSERT INTO my_table VALUES (2, 1, 2, NULL);

But it doesn't behave as expected because according to the documentation, two NULL values are not compared to each other, so both inserts pass without error.

How can I guarantee my unique constraint even if id_C can be NULL in this case? Actually, the real question is: can I guarantee this kind of uniqueness in "pure sql" or do I have to implement it on a higher level (java in my case)?

asked Dec 27, 2011 at 9:10
Manuel Leduc
1,56122 gold badges1010 silver badges55 bronze badges
  • So, say you have values (1,2,1) and (1,2,2) in the (A,B,C) columns. Should a (1,2,NULL) be allowed to be added or not? Dec 27, 2011 at 9:27
  • A and B can't be null but C can be null or any positive integer value. So (1,2,3) and (2,4,null) are valid but (null,2,3) or (1,null,4) are invalid. And [(1,2,null), (1,2,3)] does not break unique constraint but [(1,2, null), (1,2,null)] must break it. Dec 27, 2011 at 9:39
  • 2
    Are there any values that will never appear in those columns (like negative values?) Dec 27, 2011 at 10:43
  • 1
    You don't have to label your constraints in pg. It'll automagically generate a name. Just FYI. Dec 2, 2016 at 20:13
  • Replace the null values with a default non-null value fix the unique issue, e.g for varchar use empty string ''.
    – Eric
    Apr 27, 2021 at 2:54

3 Answers

Sorted by:
159

Postgres 15

This works out of the box with NULLS NOT DISTINCT:

ALTER TABLE my_table
  DROP CONSTRAINT IF EXISTS u_constrainte
, ADD CONSTRAINT u_constrainte UNIQUE NULLS NOT DISTINCT (id_A, id_B, id_C);

See:

Postgres 14 or older (original answer)

You can do that in pure SQL. Create a partial unique index in addition to the one you have:

CREATE UNIQUE INDEX ab_c_null_idx ON my_table (id_A, id_B) WHERE id_C IS NULL;

This way you can enter for (id_A, id_B, id_C) in your table:

(1, 2, 1)
(1, 2, 2)
(1, 2, NULL)

But none of these a second time.

Or use two partial UNIQUE indexes and no complete index (or constraint). The best solution depends on the details of your requirements. Compare:

While this is elegant and efficient for a single nullable column in the UNIQUE index, it gets out of hand quickly for more than one. Discussing this - and how to use UPSERT with partial indexes:

Asides

No use for mixed case identifiers without double quotes in PostgreSQL.

You might consider a serial column as primary key or an IDENTITY column in Postgres 10 or later. Related:

So:

CREATE TABLE my_table (
   my_table_id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY  -- for pg 10+
-- my_table_id bigserial PRIMARY KEY  -- for pg 9.6 or older
 , id_a int8 NOT NULL
 , id_b int8 NOT NULL
 , id_c int8
 , CONSTRAINT u_constraint UNIQUE (id_a, id_b, id_c)
);

If you don't expect more than 2 billion rows (> 2147483647) over the lifetime of your table (including waste and deleted rows), consider integer (4 bytes) instead of bigint (8 bytes).

answered Dec 27, 2011 at 10:51
Erwin Brandstetter
162k2222 gold badges401401 silver badges547547 bronze badges
  • 1
    The docs advocate this method, Adding a unique constraint will automatically create a unique B-tree index on the column or group of columns listed in the constraint. A uniqueness restriction covering only some rows cannot be written as a unique constraint, but it is possible to enforce such a restriction by creating a unique partial index. Dec 2, 2016 at 20:21
19

I had the same problem and I found another way to have unique NULL into the table.

CREATE UNIQUE INDEX index_name ON table_name( COALESCE( foreign_key_field, -1) )

In my case, the field foreign_key_field is a positive integer and will never be -1.

So, to answer Manual Leduc, another solution could be

CREATE UNIQUE INDEX  u_constrainte (COALESCE(id_a, -1), COALESCE(id_b,-1),COALESCE(id_c, -1) )

I assume that ids won't be -1.

What is the advantage on creating a partial index ?

In case where you don't have the NOT NULL clause, id_a, id_b and id_c can be NULL together only once.

With a partial index, the 3 fields could be NULL more than once.

answered May 17, 2012 at 20:57
Luc M
57911 gold badge66 silver badges1313 bronze badges
  • 5
    > What is the advantage on creating a partial index ? The way you've done it with COALESCE can be effective in restricting the duplicates, but the index wouldn't be very useful in querying as its an expression index that probably won't match query expressions. That is, unless you SELECT COALESCE(col, -1) ... you wouldn't be hitting the index.
    – Bo Jeanes
    Aug 5, 2016 at 4:58
  • @BoJeanes The index has not been created for a performance issue. It has been created to fullfill the business requirement.
    – Luc M
    Aug 5, 2016 at 19:51
9

A Null can mean that value is not known for that row at the moment but will be added, when known, in the future (example FinishDate for a running Project) or that no value can be applied for that row (example EscapeVelocity for a black hole Star).

In my opinion, it's usually better to normalize the tables by eliminating all Nulls.

In your case, you want to allow NULLs in your column, yet you want only one NULL to be allowed. Why? What kind of relationship is this between the two tables?

Perhaps you can simply change the column to NOT NULL and store, instead of NULL, a special value (like -1) that is known never to appear. This will solve the uniqueness constraint problem (but may have other possibly unwanted side effects. For example, using -1 to mean "not known / does not apply" will skew any sum or average calculations on the column. Or all such calculations will have to take into account the special value and ignore it.)

answered Dec 28, 2011 at 0:09
ypercubeᵀᴹ
95.2k1313 gold badges204204 silver badges300300 bronze badges
  • 2
    In my case NULL is really NULL (id_C is a foreign key to table_c for exemple so it can't have -1 value), it means their is no relationship between "my_table" and "table_c". So it has a functional signification. By the way [(1, 1,1,null), (2, 1,2,null), (3,2,4,null)] is a valid list of inserted data. Dec 28, 2011 at 9:40
  • 1
    It's not really a Null as used in SQL because you want only one in all rows. You could change your database schema either by adding the -1 to table_c or by adding another table (which would be supertype to subtype table_c). Dec 28, 2011 at 9:48
  • 3
    I'd just like to point out to @Manuel that the opinion on nulls in this answer is not universally held, and is much debated. Many, like me, think that null can be used for any purpose you wish (but should only mean one thing for each field and be documented, possibly in the field name or a column comment) Dec 29, 2011 at 7:03
  • 1
    You can't use a dummy value when your column is a FOREIGN KEY.
    – Luc M
    May 17, 2012 at 18:42
  • 1
    +1 I am with you: if we want some combination of columns to be unique, then you need to consider an entity in which this combination of columns is a PK. The OPs' database schema should probably change to a parent table and a child one.
    – A-K
    Nov 11, 2013 at 22:23

Your Answer

Sign up or log in

Sign up using Google
Sign up using Facebook
Sign up using Email and Password

Post as a guest

Name
Email

Required, but never shown

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

Not the answer you're looking for? Browse other questions tagged or ask your own question.

Hot Network Questions