Here is another write-up on summing percentages. Example 1 is
the only example that produces a valid result. Unless the sum of all the
numerators is being used as the denominator for all the percentages to be
summed it is necessary to calculate group percentages the same way as the
individual percentages are being calculated using the sum of the numerators
divided by the sum of the denominators. There are additional issues
calculating percentages from reports with options to filter the data as then
the calculations need to be based on the filtered data rendering any data
summed in the view useless. |
|||||||
Example 1: Using the sum of all the numerators as the denominator for all the percentages to be summed yields a valid result. | |||||||
Numerator | Numerator (Running Total) | Denominator | Denominator (Running Total) | Percentage | Running Sum of % | Running Sum of % divided by the number of percentages summed | Running Percentage |
1 | 1 | 210 | 210 | 0.48% | 0.48% | 0.48% | 0.48% |
2 | 3 | 210 | 420 | 0.95% | 1.43% | 0.71% | 0.71% |
3 | 6 | 210 | 630 | 1.43% | 2.86% | 0.95% | 0.95% |
4 | 10 | 210 | 840 | 1.90% | 4.76% | 1.19% | 1.19% |
5 | 15 | 210 | 1050 | 2.38% | 7.14% | 1.43% | 1.43% |
6 | 21 | 210 | 1260 | 2.86% | 10.00% | 1.67% | 1.67% |
7 | 28 | 210 | 1470 | 3.33% | 13.33% | 1.90% | 1.90% |
8 | 36 | 210 | 1680 | 3.81% | 17.14% | 2.14% | 2.14% |
9 | 45 | 210 | 1890 | 4.29% | 21.43% | 2.38% | 2.38% |
10 | 55 | 210 | 2100 | 4.76% | 26.19% | 2.62% | 2.62% |
11 | 66 | 210 | 2310 | 5.24% | 31.43% | 2.86% | 2.86% |
12 | 78 | 210 | 2520 | 5.71% | 37.14% | 3.10% | 3.10% |
13 | 91 | 210 | 2730 | 6.19% | 43.33% | 3.33% | 3.33% |
14 | 105 | 210 | 2940 | 6.67% | 50.00% | 3.57% | 3.57% |
15 | 120 | 210 | 3150 | 7.14% | 57.14% | 3.81% | 3.81% |
16 | 136 | 210 | 3360 | 7.62% | 64.76% | 4.05% | 4.05% |
17 | 153 | 210 | 3570 | 8.10% | 72.86% | 4.29% | 4.29% |
18 | 171 | 210 | 3780 | 8.57% | 81.43% | 4.52% | 4.52% |
19 | 190 | 210 | 3990 | 9.05% | 90.48% | 4.76% | 4.76% |
20 | 210 | 210 | 4200 | 9.52% | 100.00% | 5.00% | 5.00% |
Example 2: Using 100 as the denominator for all the percentages to be summed does not yield a valid result. | |||||||
Numerator | Numerator (Running Total) | Denominator | Denominator (Running Total) | Percentage | Running Sum of % | Running Sum of % divided by the number of percentages summed | Running Percentage |
1 | 1 | 100 | 100 | 1.00% | 1.00% | 1.00% | 1.00% |
2 | 3 | 100 | 200 | 2.00% | 3.00% | 1.50% | 1.50% |
3 | 6 | 100 | 300 | 3.00% | 6.00% | 2.00% | 2.00% |
4 | 10 | 100 | 400 | 4.00% | 10.00% | 2.50% | 2.50% |
5 | 15 | 100 | 500 | 5.00% | 15.00% | 3.00% | 3.00% |
6 | 21 | 100 | 600 | 6.00% | 21.00% | 3.50% | 3.50% |
7 | 28 | 100 | 700 | 7.00% | 28.00% | 4.00% | 4.00% |
8 | 36 | 100 | 800 | 8.00% | 36.00% | 4.50% | 4.50% |
9 | 45 | 100 | 900 | 9.00% | 45.00% | 5.00% | 5.00% |
10 | 55 | 100 | 1000 | 10.00% | 55.00% | 5.50% | 5.50% |
11 | 66 | 100 | 1100 | 11.00% | 66.00% | 6.00% | 6.00% |
12 | 78 | 100 | 1200 | 12.00% | 78.00% | 6.50% | 6.50% |
13 | 91 | 100 | 1300 | 13.00% | 91.00% | 7.00% | 7.00% |
14 | 105 | 100 | 1400 | 14.00% | 105.00% | 7.50% | 7.50% |
15 | 120 | 100 | 1500 | 15.00% | 120.00% | 8.00% | 8.00% |
16 | 136 | 100 | 1600 | 16.00% | 136.00% | 8.50% | 8.50% |
17 | 153 | 100 | 1700 | 17.00% | 153.00% | 9.00% | 9.00% |
18 | 171 | 100 | 1800 | 18.00% | 171.00% | 9.50% | 9.50% |
19 | 190 | 100 | 1900 | 19.00% | 190.00% | 10.00% | 10.00% |
20 | 210 | 100 | 2000 | 20.00% | 210.00% | 10.50% | 10.50% |
Example 3: summing percentages calculated with different denominators does not yield meaningful data. | |||||||
Numerator | Numerator (Running Total) | Denominator | Denominator (Running Total) | Percentage | Running Sum of % | Running Sum of % divided by the number of percentages summed | Running Percentage |
1 | 1 | 10 | 10 | 10.00% | 10.00% | 10.00% | 10.00% |
2 | 3 | 25 | 35 | 8.00% | 18.00% | 9.00% | 8.57% |
3 | 6 | 12 | 47 | 25.00% | 43.00% | 14.33% | 12.77% |
4 | 10 | 15 | 62 | 26.67% | 69.67% | 17.42% | 16.13% |
5 | 15 | 17 | 79 | 29.41% | 99.08% | 19.82% | 18.99% |
6 | 21 | 88 | 167 | 6.82% | 105.90% | 17.65% | 12.57% |
7 | 28 | 398 | 565 | 1.76% | 107.66% | 15.38% | 4.96% |
8 | 36 | 49 | 614 | 16.33% | 123.98% | 15.50% | 5.86% |
9 | 45 | 50 | 664 | 18.00% | 141.98% | 15.78% | 6.78% |
10 | 55 | 66 | 730 | 15.15% | 157.13% | 15.71% | 7.53% |
11 | 66 | 5 | 735 | 220.00% | 377.13% | 34.28% | 8.98% |
12 | 78 | 75 | 810 | 16.00% | 393.13% | 32.76% | 9.63% |
13 | 91 | 49 | 859 | 26.53% | 419.66% | 32.28% | 10.59% |
14 | 105 | 30 | 889 | 46.67% | 466.33% | 33.31% | 11.81% |
15 | 120 | 31 | 920 | 48.39% | 514.72% | 34.31% | 13.04% |
16 | 136 | 32 | 952 | 50.00% | 564.72% | 35.29% | 14.29% |
17 | 153 | 33 | 985 | 51.52% | 616.23% | 36.25% | 15.53% |
18 | 171 | 34 | 1019 | 52.94% | 669.17% | 37.18% | 16.78% |
19 | 190 | 35 | 1054 | 54.29% | 723.46% | 38.08% | 18.03% |
20 | 210 | 36 | 1090 | 55.56% | 779.02% | 38.95% | 19.27% |
Showing posts with label Business Intelligence. Show all posts
Showing posts with label Business Intelligence. Show all posts
Thursday, August 7, 2014
Summing Percentages
Tuesday, April 15, 2014
SQL Server 2012 - BI - Master Data Services,Data Quality Services and Data Mining
To my surprise sometime while I was buried in SQL cleaning data, configuring unique record constraints, primary keys, clustered indexes, and referential integrity Microsoft built Master Data Services and Data Quality Services.
Master Data Warehouse: When Microsoft SQL Server 2008 Enterprises was released it came with a new product named "Master Data Services". From my perspective Master data services is basically a user interface to procedures performing many of the tasks I would handle as a DBA\Developer. It provides an area to bring all your data together into one spot and the means to easily add linking data from different sources, add missing look-up data needed for reporting, track all changes to the data to support historical reporting and more all with a nice web interface.
Data Anomaly Detection: Before Microsoft SQL Server 2012 Enterprise data anomaly detection and cleanup was done by someone like myself writing a lot of T-SQL to find and fix data quality issues so that databases could eventually be configured with the four mandatory table design rules. While these rules prevent numerous data issues they don't address data cleanup issues such as missing or inaccurate loan dates or someone with a drivers licenses with their age specified as 3 years old. Contracts associated with the mortgage and loan collection industries have a myriad of start dates, cutoff dates, interest modification that can only occur after predefined periods of time and basically a multitude of dependencies that are quite easy to make mistakes with. In comes Data Quality Services to save the day! A DBA is still needed to configure the table schema correctly however with some training much of the data cleanup can be handled by SMEs(Subject matter experts) Now SMEs can be data stewards for their own data.
I've posted a few youtube videos below which explain both new products in detail but this one "How to Integrate DQS, MDS and Your Data Warehouse" provides a great big picture of how they all work together.
The best place to start with DQS and MDS appears to be the EIM Tutorial.docx
Enterprise Information Management Tutorials - download
Then download and take a look at the SSIS package Cleanse and Curate Master Data.
It needs more documentation but looks like the automated way to cleanse large amounts of data.
http://msftdbprodsamples.codeplex.com
https://ssisdqsmatching.codeplex.com
http://www.nikoport.com/2014/05/26/useful-links-for-learning-using-in-data-quality-projects-in-sql-server
https://domainvalueimport.codeplex.com
http://analyticssource.codeplex.com
Microsoft® OData Source for Microsoft SQL Server® 2012
Data Mining
The link below shows a simple but amazing demonstration of the power of OLAP and SQL Server BI.
Master Data Services
Efficiently Manage Data with the New Master Data Services Add-inn for Excel
SQL Server Series: Master Data Services in SQL Server 2012
Microsoft SQL Server Analysis Services Dimension Management
SQL Server Series: Master Data Services in SQL Server 2012
Microsoft SQL Server Analysis Services Dimension Management
Data Quality Services
Data Quality Services
Data Quality Services SQL server 2012Understanding SQL Server 2012 Data Quality Services
SSIS-Balanced-Data-Distributor
dqsinstaller.exe
-upgrade
Data Mining Demos
SSIS term extraction components
Resources
SQLPASSTVSQLServerDataMining.com
What's new in SQL Server 2014 feature drilldown