The Importance of Naming Constraints

An important topic worth mentioning is that of constraint naming conventions. As we’ve proceeded here, we’ve talked about adding tables and columns, and we’ve also hinted at lots of other operations listed in Operation Reference such as those which support adding or dropping constraints like foreign keys and unique constraints. The way these constraints are referred to in migration scripts is by name, however these names by default are in most cases generated by the relational database in use, when the constraint is created. For example, if you emitted two CREATE TABLE statements like this on Postgresql:

test=> CREATE TABLE user_account (id INTEGER PRIMARY KEY);
CREATE TABLE
test=> CREATE TABLE user_order (
test(>   id INTEGER PRIMARY KEY,
test(>   user_account_id INTEGER REFERENCES user_account(id));
CREATE TABLE

Suppose we wanted to DROP the REFERENCES that we just applied to the user_order.user_account_id column, how do we do that? At the prompt, we’d use ALTER TABLE <tablename> DROP CONSTRAINT <constraint_name>, or if using Alembic we’d be using Operations.drop_constraint(). But both of those functions need a name - what’s the name of this constraint?

It does have a name, which in this case we can figure out by looking at the Postgresql catalog tables:

test=> SELECT r.conname FROM
test->  pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
test->  JOIN pg_catalog.pg_constraint r  ON c.oid = r.conrelid
test->  WHERE c.relname='user_order' AND r.contype = 'f'
test-> ;
             conname
---------------------------------
 user_order_user_account_id_fkey
(1 row)

The name above is not something that Alembic or SQLAlchemy created; user_order_user_account_id_fkey is a naming scheme used internally by Postgresql to name constraints that are otherwise not named.

This scheme doesn’t seem so complicated, and we might want to just use our knowledge of it so that we know what name to use for our Operations.drop_constraint() call. But is that a good idea? What if for example we needed our code to run on Oracle as well. OK, certainly Oracle uses this same scheme, right? Or if not, something similar. Let’s check:

Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production

SQL> CREATE TABLE user_account (id INTEGER PRIMARY KEY);

Table created.

SQL> CREATE TABLE user_order (
  2     id INTEGER PRIMARY KEY,
  3     user_account_id INTEGER REFERENCES user_account(id));

Table created.

SQL> SELECT constraint_name FROM all_constraints WHERE
  2     table_name='USER_ORDER' AND constraint_type in ('R');

CONSTRAINT_NAME
-----------------------------------------------------
SYS_C0029334

Oh, we can see that is…..much worse. Oracle’s names are entirely unpredictable alphanumeric codes, and this will make being able to write migrations quite tedious, as we’d need to look up all these names.

The solution to having to look up names is to make your own names. This is an easy, though tedious thing to do manually. For example, to create our model in SQLAlchemy ensuring we use names for foreign key constraints would look like:

from sqlalchemy import MetaData, Table, Column, Integer, ForeignKey

meta = MetaData()

user_account = Table('user_account', meta,
                  Column('id', Integer, primary_key=True)
              )

user_order = Table('user_order', meta,
                  Column('id', Integer, primary_key=True),
                  Column('user_order_id', Integer,
                    ForeignKey('user_account.id', name='fk_user_order_id'))
              )

Simple enough, though this has some disadvantages. The first is that it’s tedious; we need to remember to use a name for every ForeignKey object, not to mention every UniqueConstraint, CheckConstraint, Index, and maybe even PrimaryKeyConstraint as well if we wish to be able to alter those too, and beyond all that, all the names have to be globally unique. Even with all that effort, if we have a naming scheme in mind, it’s easy to get it wrong when doing it manually each time.

What’s worse is that manually naming constraints (and indexes) gets even more tedious in that we can no longer use convenience features such as the .unique=True or .index=True flag on Column:

user_account = Table('user_account', meta,
                  Column('id', Integer, primary_key=True),
                  Column('name', String(50), unique=True)
              )

Above, the unique=True flag creates a UniqueConstraint, but again, it’s not named. If we want to name it, manually we have to forego the usage of unique=True and type out the whole constraint:

user_account = Table('user_account', meta,
                  Column('id', Integer, primary_key=True),
                  Column('name', String(50)),
                  UniqueConstraint('name', name='uq_user_account_name')
              )

There’s a solution to all this naming work, which is to use an automated naming convention. For some years, SQLAlchemy has encourgaged the use of DDL Events in order to create naming schemes. The after_parent_attach() event in particular is the best place to intercept when Constraint and Index objects are being associated with a parent Table object, and to assign a .name to the constraint while making use of the name of the table and associated columns.

But there is also a better way to go, which is to make use of a feature new in SQLAlchemy 0.9.2 which makes use of the events behind the scenes known as naming_convention. Here, we can create a new MetaData object while passing a dictionary referring to a naming scheme:

convention = {
  "ix": "ix_%(column_0_label)s",
  "uq": "uq_%(table_name)s_%(column_0_name)s",
  "ck": "ck_%(table_name)s_%(constraint_name)s",
  "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
  "pk": "pk_%(table_name)s"
}

metadata = MetaData(naming_convention=convention)

If we define our models using a MetaData as above, the given naming convention dictionary will be used to provide names for all constraints and indexes.

Integration of Naming Conventions into Operations, Autogenerate

As of Alembic 0.6.4, the naming convention feature is integrated into the Operations object, so that the convention takes effect for any constraint that is otherwise unnamed. The naming convention is passed to Operations using the MigrationsContext.configure.target_metadata parameter in env.py, which is normally configured when autogenerate is used:

# in your application's model:

meta = MetaData(naming_convention={
        "ix": "ix_%(column_0_label)s",
        "uq": "uq_%(table_name)s_%(column_0_name)s",
        "ck": "ck_%(table_name)s_%(constraint_name)s",
        "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
        "pk": "pk_%(table_name)s"
      })
Base = declarative_base(metadata=meta)

# .. in your Alembic env.py:

# add your model's MetaData object here
# for 'autogenerate' support
from myapp import mymodel
target_metadata = mymodel.Base.metadata

# ...

def run_migrations_online():

    # ...

    context.configure(
                connection=connection,
                target_metadata=target_metadata
                )

Above, when we render a directive like the following:

op.add_column('sometable', Column('q', Boolean(name='q_bool')))

The Boolean type will render a CHECK constraint with the name "ck_sometable_q_bool", assuming the backend in use does not support native boolean types.

We can also use op directives with constraints and not give them a name at all, if the naming convention doesn’t require one. The value of None will be converted into a name that follows the appropriate naming conventions:

def upgrade():
    op.create_unique_constraint(None, 'some_table', 'x')

When autogenerate renders constraints in a migration script, it renders them typically with their completed name. If using at least Alembic 0.6.4 as well as SQLAlchemy 0.9.4, these will be rendered with a special directive Operations.f() which denotes that the string has already been tokenized:

def upgrade():
    op.create_unique_constraint(op.f('uq_const_x'), 'some_table', 'x')

For more detail on the naming convention feature, see Configuring Constraint Naming Conventions.