Wednesday, April 18, 2012

Natural Key Unique Record Constraint

The single most important aspect of table design, view design and handling data in general is a real data/natural key unique record constraint/definition (URIX) which defines and enforces data granularity (what exists in the table) . If a natural key unique record constraint hasn't been defined then data aggregated from the table is meaningless. Additionally without a proper unique record constraint the table data could be duplicated or worse and no one might ever recognize duplicate data as what exactly duplicate data was had not been defined. SQL group by statements can be used to determine the natural key unique record constraint for a table. With traditional MS Access applications the only way to implement unique record constraints are through the use of primary keys(PK) or composite primary keys which increases the complexity of designing access applications and is seldom done. Referential integrity is normally setup using auto-number/surrogate keys as the PK which adds some confusion to the purpose of the Primary Key as it no longer serves as a unique record constraint. While using a surrogate key to implement referential integrity may be optional having a natural key unique record constraint is mandatory if you want to ensure data integrity. Fortunately in SQL Server the unique record constraint can be implemented as a unique index or unique composite index and the unique key used to configure referential integrity is the actual primary key. This allows developers to locate a specific record with a where clause specifying a single value and reduces the links between tables in schema diagrams. More complicated data constraints are implemented using table triggers such as allowing only one record in a defined set of records to be marked as the active record or primary record such as primary phone number or primary address.  Unfortunately this can only be done at the form level in MS Access. Implementing unique record constraints in MS Access web applications takes even another approach as the primary key has to be a single surrogate key and unique record constraints are implemented through the use of data macros similar to SQL Server triggers.

As should be obvious by now the main goal of this blog article is to stress the importance of real data/natural key unique record constraints (URIX) . Unfortunately this relatively simple concept is often overlooked and is likely the main cause of garbage data.  The URIX is also actually also the most important aspect of joining tables in SQL statements/views. Since the URIX defines the data contained in a table when two tables are joined (foreign key(FK) to primary key(PK))  to create a view AND the URIX/PK of the second table  is not fully contained in the URIX of the first table then the URIX of the query or view ends up being a combination of the URIX of the first table plus any new columns from the PK/UIRX of the second table. Assuming the second table is joined using a left outer join the record count of the view will now reflect the record count of the first table plus the count of matching records in the second table grouped on the linked columns having COUNT(*) > 1 greater than one. In short URIX is the single most important aspect of both tables and SQL statements and if not understood and used correctly the cause of row duplication in SQL statements.

In short URIX is the single most important aspect of both tables and SQL statements and if not understood and used correctly the cause of row duplication in SQL statements.

In short URIX is the single most important aspect of both tables and SQL statements and if not understood and used correctly the cause of row duplication in SQL statements.
       
The following articles contain related information

No comments:

Post a Comment