Wednesday, August 8, 2018

Why mandating primary keys is a mistake in defining the RM and in RDBMS design.

The relational model of data was intended as a general-purpose model.  General-purpose in there means : suitable for addressing every conceivable business scenario and/or subject matter and/or problem type.  Stress : _every conceivable_.

Here's a relation schema plus FDs :

(A1, A2) with {A1->A2 , A2->A1}

That's perfectly conceivable.  Seasoned modelers will recognize this as a "bidirectional translation table".  Stress _bidirectional_.  Meaning that there *will* be users wanting to query this relation to find A2 value given an A1 value, and there *will* be users wanting to query this relation to find A1 value given an A2 value.  Perfectly conceivable.

Normalization theory informs us that the set of keys for this relation schema is {{A1} {A2}}.

If the RM wants to meet its stated objective of being "general-purpose", it must sensibly support this use case.

Now, either you believe that some key being "primary" is absolutely foundational and MUST be a part of the data model and therefore MUST be an aspect of any logical database design and therefore any DBMS *must force* the designer to make certain choices about this.  And then the consequence is that those involved in [studying] the business of deriving logical database models from conceptual [business] models MUST [also be able to] provide an "algorithm" and/or list of checkpoints or some such that will allow a designer to at least make this choice on well-founded grounds.  If such well-founded grounds do not exist, then any choice is obviously entirely arbitrary, meaning the choice itself does not carry any real meaning, meaning the choice shouldn't have to be made.  Any and all supporters of the idea of mandating primary keys are invited to state their solution/approach for my (A1, A2) case.

Or you believe that "primality" of a key is meaningless and irrelevant and then, well, simply no one has any problem.  Just declare all the keys there are and use any one that suits your purposes as the identifier needed for the business use case at hand.  Observe that if "primality" of a key is not meaningless, then it must be possible to find some aspect of the behaviour of software/data systems that can be supported by "systems-with-primary-keys" but cannot be supported by systems without.  That is, if "primality" is meaningful, there must be some added value somewhere for the software/data system.  If that added value exists, it can be demonstrated.  I have never seen it.  And until it is demonstrated, the only reasonable/rational option is to treat the "primary" in "primary key" as the mere psychological distraction that it is.

No comments:

Post a Comment