Tuesday, August 14, 2012

"What relations 'mean' " ...

In a previous post, it was established what relations really are (mere mathematical things, i.e. objects that play a role in some mathematical theory).  The relational model of data uses exactly those things as the building blocks for our databases, and those databases convey meaningful information to their users.  A database can contain information such as "An order for five screwdrivers was placed on 2012-08-01 13:45:56 by a person named John Doe for an order price of 7 local currency units, and this order was delivered on 2012-08-14.".  That's certainly a very "meaningful" statement, and it leads many people to believe that relations themselves must have some kind of meaning.  Alas, they are mistaken ...


Let's consider a relation that consists of the single 2-tuple (DAY1:Saturday, DAY2:Sunday).  In tabular form, it could be depicted as

DAY1       DAY2
Saturday   Sunday

and that's about everything that "characterizes" this relation.

But in and of itself, does this "mean" anything?  Well no, it doesn't.  Claiming the opposite would be similar to claiming that "the number one 'means something', in and of itself".  No it doesn't.

Relations (and the tuples they consist of) do not have any meaning of themselves.  If the opposite were true, then any reader would be able to conclude from the foregoing table that some firm statement is true, say, for example, "Saturday is the week-end day that comes before Sunday, which is also considered a week-end day.".  It would also be true that all readers would come to the very same conclusion.  And there would not be any reader who would come to the conclusion that this relation means "On Saturday, He created man, and on Sunday, He took a break to admire His own works thus far." ...

None of that is the case, of course.  If relations carry any meaning, then it is because they appear in a certain given context.  Note the choice of the word 'carry', as distinct from 'have'.  It's an important distinction.  Relations and the tuples they contain are merely carriers of "meaning", and the same relation and/or tuple might carry very different meanings in different contexts.

(For more stuff on why it is a good thing that "relations don't mean a thing", see, e.g., here.)

When it comes to context, two distinct cases apply which correspond to whether a user is updating the database or whether he is querying (inquiring) it.

In the case of updating, "what a relation means" (and/or "what a tuple means") is determined by what database relvar the update is targeting.  And in particular by the external predicate that has been associated with that relvar.

If that predicate happens to be (*) "§DAY1§ is a week-end day that comes immediately before §DAY2§, which is itself also a week-end day.", then a user inserting the 2-tuple (DAY1:Saturday, DAY2:Sunday) in that relvar is in effect making the assertion that "Saturday is a week-end day that comes immediately before Sunday, which is also a week-end day.".

(*) The things enclosed in § marks are obviously intended as placeholders for some value-to-be-filled-in.  Which value that is, is, obviously, the value of the correspondingly named attribute in a tuple that [in this case of updating a relvar] is being inserted into the relvar.  This process of filling in "concrete" attribute values in the places where such placeholders appear in a predicate is commonly called "instantiating the predicate".  Instantiating a predicate (such that all placeholders are effectively replaced) yields a proposition, and inserting a tuple means asserting that that proposition is a true one.

If that predicate happens to be "On §DAY1§, He created man, and on §DAY2§, He took a break to admire His own works thus far.", then a user inserting the 2-tuple (DAY1:Saturday, DAY2:Sunday) in that relvar is in effect making the assertion that "On Saturday, He created man, and on Sunday, He took a break to admire His own works thus far.".

Observe how the inserted tuple itself is the very same in the two cases, but how the "meaning" that the tuple carries in either case is entirely different.
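To make the point concrete, here is a minimal sketch of predicate instantiation.  It is only an illustration: the function name instantiate and the choice to represent a tuple as a Python dict are assumptions of this sketch, not part of any DBMS.  It fills the § placeholders of the two predicates above with the attribute values of one and the same tuple.

```python
import re

def instantiate(predicate: str, tup: dict) -> str:
    """Replace every §NAME§ placeholder by the tuple's value for attribute NAME.

    Hypothetical helper for illustration; a tuple is modelled as a plain dict.
    """
    return re.sub(r"§(\w+)§", lambda m: str(tup[m.group(1)]), predicate)

# The two external predicates discussed in the text.
WEEKEND = ("§DAY1§ is a week-end day that comes immediately before §DAY2§, "
           "which is itself also a week-end day.")
CREATION = ("On §DAY1§, He created man, and on §DAY2§, "
            "He took a break to admire His own works thus far.")

# One and the same tuple ...
t = {"DAY1": "Saturday", "DAY2": "Sunday"}

# ... yields two entirely different propositions, depending on the predicate.
print(instantiate(WEEKEND, t))
print(instantiate(CREATION, t))
```

The tuple is passed in unchanged both times; only the predicate (the context) differs, and so does the proposition that comes out.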

Back to (a relvar with) the predicate "§DAY1§ is a week-end day that comes immediately before §DAY2§, which is itself also a week-end day.".  Now suppose some user wants to record the assertion that "Sunday is a week-end day that comes immediately before Monday.".  Would it be correct for this user to insert a tuple (DAY1:Sunday, DAY2:Monday) in that relvar?  Obviously not, as that would mean that this user would also be asserting the part that says that "Monday is itself also a week-end day" !!!

Documenting the external predicate of each relvar in a database is about the most important aspect there is to database design.  And of course, as always, it is important to be precise!  Even if differences or variations seem extremely subtle, they can ultimately still give rise to major misunderstandings and mistakes!

On to the meaning of relations/tuples when querying a database.  When that tuple (DAY1:Saturday, DAY2:Sunday) was inserted into the relvar, that was taken to be an assertion to the effect that the corresponding proposition was a true one.  That 'meaning' is of course never altered merely by the tuple "staying where it is (in the relvar)".  Thus, if we query this relvar and get back that same tuple in the result, that still tells the inquiring user that "Saturday is a week-end day that comes immediately before Sunday, which is also a week-end day.".

Note once again that if we got back the very same tuple from querying a totally different relvar (one with a totally different external predicate), then this very same tuple would mean something entirely different.  Nothing changes the fact that relations and the tuples within them are mere carriers of meaning.

But when we query a database, we typically use constructs that are much more complex than just "naming a relvar" (and getting back its full contents).  Those constructs are the possible expressions of the relational algebra.  What matters about that in the context of 'meaning' is that each possible expression of the relational algebra also has its very own "external predicate" that "defines the meaning" carried by the tuples appearing in the result of the query.

For example, if we apply a RESTRICTion to some relvar, then that expression has for its external predicate the external predicate of that relvar, ANDed together (logically conjoined) with a rule that expresses the restriction condition applied.  For example, applying to our week-end days relvar the restriction condition 'DAY1 = Sunday' gives us the external predicate "§DAY1§ is a week-end day that comes immediately before §DAY2§, which is itself also a week-end day, AND [it is the case that] §DAY1§ is Sunday.".  The empty result we'd get back from that query is a manifestation of the fact that there simply are no true instantiations of this predicate.
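The way a restriction derives its predicate can be sketched in a few lines.  This is a toy model under stated assumptions, not how any real DBMS works: the names Relation, row and restrict are hypothetical, a relation is modelled as a predicate string plus a set of tuples, and the restriction condition is supplied both as a Python function and as the text to be ANDed onto the predicate.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

# A tuple is modelled as sorted (attribute, value) pairs so that it is hashable.
Row = Tuple[Tuple[str, str], ...]

@dataclass(frozen=True)
class Relation:
    predicate: str          # the external predicate the tuples are read against
    body: FrozenSet[Row]    # the set of tuples

def row(**attrs: str) -> Row:
    """Build a tuple from attribute=value pairs (hypothetical helper)."""
    return tuple(sorted(attrs.items()))

def restrict(rel: Relation,
             condition: Callable[[dict], bool],
             condition_text: str) -> Relation:
    """Keep the tuples satisfying the condition; AND the condition onto the predicate."""
    kept = frozenset(r for r in rel.body if condition(dict(r)))
    return Relation(f"{rel.predicate}, AND {condition_text}", kept)

weekend = Relation(
    "§DAY1§ is a week-end day that comes immediately before §DAY2§, "
    "which is itself also a week-end day",
    frozenset({row(DAY1="Saturday", DAY2="Sunday")}),
)

# Restriction condition 'DAY1 = Sunday'.
result = restrict(weekend, lambda t: t["DAY1"] == "Sunday", "§DAY1§ is Sunday")

print(result.predicate)   # the relvar's predicate with the condition ANDed on
print(len(result.body))   # 0: no tuple satisfies the restricted predicate
```

The design choice worth noting is that restrict returns the derived predicate together with the tuples: the result of a query is only readable against the predicate of the expression that produced it.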

Note that querying, too, shows why it is so extremely important to document the external predicates of our database relvars: without that, there just is no formal way for us to tell what our query results actually mean (and without such a formal way to tell that, all that's left for us to do is to make mere assumptions, but as the saying goes, assumption is the mother of all screw-ups).
