Showing posts with label SQL Statements. Show all posts
Showing posts with label SQL Statements. Show all posts

Thursday, March 15, 2012

SQL Statement Errors

Row duplication caused by a join to a non-unique index.

Joins on denormalized data vs. joins on unique code lists.

Semi-random updates: This is where there is one record to be updated but the SQL statement specifying the value to be used in the update returns multiple values. Obviously only one value can be used to update one column of one row.

Omission of data due to inner joins.

SELECT *
FROM t1
           INNER JOIN t2 ON t1.fk2 = t2.pk2
           INNER JOIN t3 ON t2.fk3 = t3.pk3

The number of records being returned is first determined by the inner join between t1 and t2 however a join between t1 and t3 may return a different number of records so if we join t3 on t2 we may end up ommiting data.
Instead we should join t3 on t1 as shown in the following example.

SELECT *
FROM t1
           INNER JOIN t2 ON t1.fk2 = t2.pk2
           INNER JOIN t3 ON t1.fk3 = t3.pk3

Constrant of SQL: We must also remember to avoid row duplication caused by the following type of joins.

SELECT *
FROM t1
           INNER JOIN t2 ON t1.fk1 = t2.pk1
           INNER JOIN t3 ON t1.fk1= t3.pk1

As long the joins are one to one we are OK however row duplication will occur if more than one record is linked to more than one other record.
a know constraint with SQL.

Omission of data due to active inactive flags instead of selecting between two dates such as acquired date - disposed date. Selecting an inactive flag in a historica report will omit inactive records so the method will not work to report historical data.

Omission of data by putting where clause criteria on left outer joined table elements in the where clause instead of the join clause.

Omission or potential omission of data due to linking to a view on data coming from a left outer join within the view. In general joins between views should be done from values in the base tables otherwise data will potentially be omitted.

Omission of data by joining outer or inner joined tables to each other instead of a root data set containing all possible keys for both tables.

Summing unrelated percentages.



Aggregating distinct values.

select DISTINCT
       isnull(amount,0) as TotAmt
      , PERIOD   
from table1
GROUP BY PERIOD

Excluding data in calculations due to bad join and where clause criteria.

Summing Percentages

Here is another write-up on summing percentages. Click here to download or view the calculations in Excel.



Example 1: Using the sum of all the numerators as the denominator for all the percentages to be summed yields a valid result.
NumeratorNumerator (Running Total)DenominatorDenominator (Running Total)PercentageRunning Sum of %Running Sum of % divided by the number of percentages summedRunning Percentage
112102100.48%0.48%0.48%0.48%
232104200.95%1.43%0.71%0.71%
362106301.43%2.86%0.95%0.95%
4102108401.90%4.76%1.19%1.19%
51521010502.38%7.14%1.43%1.43%
62121012602.86%10.00%1.67%1.67%
72821014703.33%13.33%1.90%1.90%
83621016803.81%17.14%2.14%2.14%
94521018904.29%21.43%2.38%2.38%
105521021004.76%26.19%2.62%2.62%
116621023105.24%31.43%2.86%2.86%
127821025205.71%37.14%3.10%3.10%
139121027306.19%43.33%3.33%3.33%
1410521029406.67%50.00%3.57%3.57%
1512021031507.14%57.14%3.81%3.81%
1613621033607.62%64.76%4.05%4.05%
1715321035708.10%72.86%4.29%4.29%
1817121037808.57%81.43%4.52%4.52%
1919021039909.05%90.48%4.76%4.76%
2021021042009.52%100.00%5.00%5.00%
Example 2: Using 100 as the denominator for all the percentages to be summed does not yield a valid result.
NumeratorNumerator (Running Total)DenominatorDenominator (Running Total)PercentageRunning Sum of %Running Sum of % divided by the number of percentages summedRunning Percentage
111001001.00%1.00%1.00%1.00%
231002002.00%3.00%1.50%1.50%
361003003.00%6.00%2.00%2.00%
4101004004.00%10.00%2.50%2.50%
5151005005.00%15.00%3.00%3.00%
6211006006.00%21.00%3.50%3.50%
7281007007.00%28.00%4.00%4.00%
8361008008.00%36.00%4.50%4.50%
9451009009.00%45.00%5.00%5.00%
1055100100010.00%55.00%5.50%5.50%
1166100110011.00%66.00%6.00%6.00%
1278100120012.00%78.00%6.50%6.50%
1391100130013.00%91.00%7.00%7.00%
14105100140014.00%105.00%7.50%7.50%
15120100150015.00%120.00%8.00%8.00%
16136100160016.00%136.00%8.50%8.50%
17153100170017.00%153.00%9.00%9.00%
18171100180018.00%171.00%9.50%9.50%
19190100190019.00%190.00%10.00%10.00%
20210100200020.00%210.00%10.50%10.50%
Example 3: summing percentages calculated with different denominators does not yield meaningful data.
NumeratorNumerator (Running Total)DenominatorDenominator (Running Total)PercentageRunning Sum of %Running Sum of % divided by the number of percentages summedRunning Percentage
11101010.00%10.00%10.00%10.00%
2325358.00%18.00%9.00%8.57%
36124725.00%43.00%14.33%12.77%
410156226.67%69.67%17.42%16.13%
515177929.41%99.08%19.82%18.99%
621881676.82%105.90%17.65%12.57%
7283985651.76%107.66%15.38%4.96%
8364961416.33%123.98%15.50%5.86%
9455066418.00%141.98%15.78%6.78%
10556673015.15%157.13%15.71%7.53%
11665735220.00%377.13%34.28%8.98%
12787581016.00%393.13%32.76%9.63%
13914985926.53%419.66%32.28%10.59%
141053088946.67%466.33%33.31%11.81%
151203192048.39%514.72%34.31%13.04%
161363295250.00%564.72%35.29%14.29%
171533398551.52%616.23%36.25%15.53%
1817134101952.94%669.17%37.18%16.78%
1919035105454.29%723.46%38.08%18.03%
2021036109055.56%779.02%38.95%19.27%