admin posted on July 18, 2009 21:16

An example of why referential integrity is a good thing

As I wrap-up my latest gig, I look back to see just how I don’t miss working with the mainframe or AS/400. It has been almost six years since I have had to work exclusively with either, and I haven’t particularly miss them. So I’m querying data, and I get finished with a simple data fetch only to find out that I’m unexpectedly missing rows. WTF?  I’m writing the code correctly, and I am following the little cheat-sheet that I was given. I think to myself “HTH can I be missing rows??  Reality then strikes me as I remember that I’m querying an AS/400, and most likely the data structures, schemas, and objects are probably not created correctly. And this turns out to be the case here. I find that there is no concept of referential integrity in the tables that I am working with! Moreover, working with this data I’ve already found violations of any and all normal forms, records referencing other records within the same table as you might find in old ISAM and VSAM files, entities with multiple primary keys, minimal referential integrity, mainframe low characters (which can oftentimes be very hard to remove), and my favorite - invalid dates. Rule #1 when working with the mainframe or the AS/400: you’re going to find crap, so be ready for it. How could I have forgotten?

Back to my example. After I realize that neither table knows what the other is doing, I think that I can use some OUTER JOINS to find the missing rows. The problem though is that I’d have to do either multiple outer joins or a FULL OUTER JOIN; I don’t particularly like using FULL OUTER JOINS, however, since you have to use ISNULL or COALESCE in order to remove the columns returning NULL values. Therefore, below is an example that you can investigate to find rows that could be missing in more than one table of a join. In my example here, the transactions table acts as a “ledger” table; you would think that a ledger table might have rows for each business day, but this cannot be assumed. The PlantDetail table and the Transaction tables are mutually exclusive – neither can be the “drive” table, if you will, for querying these two joined tables. My solution, then, is to create a list of plants and dates, and use this to drive the query.

IF OBJECT_ID ('Plant') IS NOT NULL 
    DROP TABLE Plant 
GO 
IF OBJECT_ID ('Transactions') IS NOT NULL 
    DROP TABLE Transactions 
GO 
IF OBJECT_ID ('PlantDetail') IS NOT NULL 
    DROP TABLE PlantDetail 
GO 

CREATE TABLE Plant (PlantId int, Name varchar (10)
,
CONSTRAINT Pid PRIMARY KEY (PlantId
)) GO CREATE TABLE Transactions
(PlantId int, TransDt datetime, TransType char (2)
,
CONSTRAINT Tid PRIMARY KEY(PlantId, TransDt
)) GO CREATE TABLE PlantDetail
(PlantId int, DetailDt datetime, Value varchar (30)
,
CONSTRAINT Pdid PRIMARY KEY (PlantId, DetailDt
)) GO INSERT Plant VALUES (1,'North'), (2,'South'), (3, 'East'), (4, 'West'); GO INSERT Transactions VALUES (1, '8/1/09', 'AA') INSERT Transactions VALUES (2, '8/1/09', 'AA') GO INSERT PlantDetail VALUES (3, '8/2/09', 'Cloudy'); GO

--solution
SELECT
P.PlantId, R.dt, dp.Value FROM Plant p
INNER JOIN ( SELECT TransDt as dt, PlantId as PlantId FROM Transactions UNION ALL SELECT DetailDt, PlantId FROM PlantDetail ) r ON P.PlantId = R.PlantId LEFT OUTER JOIN Transactions t ON t.PlantId = r.PlantId AND t.TransDt = r.dt LEFT OUTER JOIN PlantDetail dp ON r.PlantId = dp.PlantId AND r.dt= dp.DetailDt GO


This one is much easier to implement than using FULL OUTER JOINs and then doing and ISNULL or COALESCE for the rows that may or may not be in either the Transactions or PlantDetail table.

Lee

----------------------------------------

 

Wait until you see the blog on using the SSIS task for SAPBW


 

 


Posted in:   Tags:
blog comments powered by Disqus

by Lee Everest, M.S.

Poll

Do you use Azure or cloud in your organization?



Show Results

Ads

Search


Month List

Calendar

«  February 2012  »
MoTuWeThFrSaSu
303112345
6789101112
13141516171819
20212223242526
2728291234
567891011
View posts in large calendar

Tags

Disclaimer
The opinions, code, examples, et.al. expressed herein are my own personal opinions and do not represent my employer's view in any way, shape form, or fashion.  All code for demonstration purposes - no guarantees, either written or implied, are made.

© Copyright 2012 Lee Everest's SQL Server, etc. weblog