Lee posted on February 1, 2008 23:46
Cross Joins are rarely used, eh?

In the TSQL MOC for SQL 2000, Microsoft states that “While cross joins are rarely used on a normalized database, you can use them to generate test data for a database or lists of all possible combinations for checklists or business templates.” Although I’d dispute the “rarely”, I’ve never really thought much about checklists or business templates part until yesterday when I was working on a report for a client. They wanted to find aggregations based on ranges, but wanted to display the domain of ranges regardless of whether values existed or not.
 
I thought of a couple of ways to do it, but decided why not try an old trick that I haven’t used in awhile. The Cross Join portion of the following is an adaptation of a piece of code that I stole from Itzik Ben-Gan without permission, and have been using for years, mostly to create time dimensions for data marts. (Sorry Mr. Ben-Gan, I didn’t dig for the article in my stack of SQL Mags dating back to 1998, sitting in a closet floor at home, and was too lazy to look it up on SQLmag.com).
 
Run the following script and then review the results. Notice that there are gaps in the reporting ranges. The code checks for 10 “bands”, yet only four show in the result set.
 
SET NOCOUNT ON
DROP TABLE #loans
 
CREATE TABLE #loans
     (LoanId int identity (1,1) NOT NULL
     ,CustomerId int
     ,LoanPrincipal money NOT NULL DEFAULT 0
     ,LoanDate smalldatetime NOT NULL DEFAULT current_timestamp
     ,LoanOffice char(1) DEFAULT 'D'
     ,Constraint pk_loans PRIMARY KEY (LoanId, CustomerId)
     )
 
--***Insert data
INSERT INTO #loans VALUES (1,100,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (2,100,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (4,150,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (4,200,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (2,200,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (4,200,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (3,500,DEFAULT,DEFAULT)
 
--***Has 10 or more loans
INSERT INTO #loans VALUES (10,500,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,400,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,500,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,600,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,100,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,100,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,100,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,100,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,100,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,100,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,100,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,100,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,100,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,100,DEFAULT,DEFAULT)
INSERT INTO #loans VALUES (10,100,DEFAULT,DEFAULT)
 
--***Gives you the correct count but doesn't fill in the missing --ranges
SELECT "PaydayLoanFrequency"= CASE WHEN CNT =1 THEN '1 times'
           WHEN CNT =2 THEN '2 times'       
           WHEN CNT =3 THEN '3 times'       
           WHEN CNT =4 THEN '4 times'       
           WHEN CNT =5 THEN '5 times'       
           WHEN CNT =6 THEN '6 times'       
           WHEN CNT =7 THEN '7 times'       
           WHEN CNT =8 THEN '8 times'       
           WHEN CNT =9 THEN '9 times'       
           WHEN CNT >=10 THEN '10 times or more'              
          END,
          COUNT(ID) AS cnt       
FROM (
     SELECT COUNT(d.LoanId) as cnt
          ,CustomerId as ID
     FROM #Loans d with (nolock)
     GROUP BY customerid
     ) X
GROUP BY cnt
 
 
 
That’s result set might be ok sometimes, but not good for the display result that they were looking for. You could do it in they reporting tool of your choice, but I chose to do it in TSQL.
 
 
--*** A cartesian product is just the trick
--*** Notice how I use f.id
SELECT   F.id "Payday Loan Frequency" 
     ,ISNULL(z.cnt,0) "Total Number Of Borrowers"
FROM      (
     SELECT "PaydayLoanFrequency"= CASE
          WHEN CNT =1 THEN '1 time'
           WHEN CNT =2 THEN '2 times'       
           WHEN CNT =3 THEN '3 times'       
           WHEN CNT =4 THEN '4 times'       
           WHEN CNT =5 THEN '5 times'       
           WHEN CNT =6 THEN '6 times'       
           WHEN CNT =7 THEN '7 times'       
           WHEN CNT =8 THEN '8 times'       
           WHEN CNT =9 THEN '9 times'       
           WHEN CNT >=10 THEN '10 times or more'              
          END,
          COUNT(ID) AS cnt       
     FROM (
          SELECT COUNT(d.LoanId) as cnt
              ,CustomerId as ID
          FROM #Loans d with (nolock)
          GROUP BY customerid
          ) X
     GROUP BY cnt
     ) z
 --*** Here is my product
RIGHT OUTER JOIN
     (
     SELECT CASE WHEN CAST((a1.id + a0.id) as varchar (4))
                + ' times' = '10 times' then '10 times or more'
          WHEN CAST((a1.id + a0.id) as varchar (4))
                + ' times' = '1 times' then '1 time'
          ELSE CAST((a1.id + a0.id) as varchar (4))
                + ' times' end "id"
     FROM
          (SELECT 0 id UNION ALL
          SELECT 1 UNION ALL
          SELECT 2 UNION ALL
          SELECTUNION ALL
          SELECTUNION ALL
          SELECTUNION ALL
          SELECTUNION ALL
          SELECTUNION ALL
          SELECTUNION ALL
          SELECT 9
          ) a0
     CROSS JOIN   
          (SELECT 0 id UNION ALL
          SELECT 10 UNION ALL
          SELECT 20 UNION ALL
          SELECT 30 UNION ALL
          SELECT 40 UNION ALL
          SELECT 50 
          ) a1
      
     ) f
          on z.paydayloanfrequency = f.id
WHERE CAST(LEFT(id,2) AS INT) BETWEEN 1 and 10
ORDER BY CAST(LEFT(id,2) AS INT)
 
 
The table aliased “f” is the table to obtain my discrete values. The f.ID attribute, along with the RIGHT OUTER JOIN, guarantees me that I’ll get all of my values regardless of whether the data exists. From here, I created three tables in Microsoft Reporting Services, and set them side-by-side, each with different ranges of values for the report that I was looking for. (RIGHT JOIN can be substituted for LEFT OUTER JOIN, of course).
 
Lee
 
-----------------------
 
Interested in a bridge I have for sale in Arizona???
 

blog comments powered by Disqus

by Lee Everest, M.S.

Poll

Do you use Azure or cloud in your organization?



Show Results

Ads

Search


Month List

Calendar

«  February 2012  »
MoTuWeThFrSaSu
303112345
6789101112
13141516171819
20212223242526
2728291234
567891011
View posts in large calendar

Tags

Disclaimer
The opinions, code, examples, et.al. expressed herein are my own personal opinions and do not represent my employer's view in any way, shape form, or fashion.  All code for demonstration purposes - no guarantees, either written or implied, are made.

© Copyright 2012 Lee Everest's SQL Server, etc. weblog