Thursday, August 7, 2014

Summing Percentages

Here is another write-up on summing percentages. Example 1 is the only example that produces a valid result. Unless the sum of all the numerators is being used as the denominator for all the percentages to be summed it is necessary to calculate group percentages the same way as the individual percentages are being calculated using the sum of the numerators divided by the sum of the denominators. There are additional issues calculating percentages from reports with options to filter the data as then the calculations need to be based on the filtered data rendering any data summed in the view useless.

Example 1: Using the sum of all the numerators as the denominator for all the percentages to be summed yields a valid result.
Numerator Numerator (Running Total) Denominator Denominator (Running Total) Percentage Running Sum of % Running Sum of % divided by the number of percentages summed Running Percentage
1 1 210 210 0.48% 0.48% 0.48% 0.48%
2 3 210 420 0.95% 1.43% 0.71% 0.71%
3 6 210 630 1.43% 2.86% 0.95% 0.95%
4 10 210 840 1.90% 4.76% 1.19% 1.19%
5 15 210 1050 2.38% 7.14% 1.43% 1.43%
6 21 210 1260 2.86% 10.00% 1.67% 1.67%
7 28 210 1470 3.33% 13.33% 1.90% 1.90%
8 36 210 1680 3.81% 17.14% 2.14% 2.14%
9 45 210 1890 4.29% 21.43% 2.38% 2.38%
10 55 210 2100 4.76% 26.19% 2.62% 2.62%
11 66 210 2310 5.24% 31.43% 2.86% 2.86%
12 78 210 2520 5.71% 37.14% 3.10% 3.10%
13 91 210 2730 6.19% 43.33% 3.33% 3.33%
14 105 210 2940 6.67% 50.00% 3.57% 3.57%
15 120 210 3150 7.14% 57.14% 3.81% 3.81%
16 136 210 3360 7.62% 64.76% 4.05% 4.05%
17 153 210 3570 8.10% 72.86% 4.29% 4.29%
18 171 210 3780 8.57% 81.43% 4.52% 4.52%
19 190 210 3990 9.05% 90.48% 4.76% 4.76%
20 210 210 4200 9.52% 100.00% 5.00% 5.00%
Example 2: Using 100 as the denominator for all the percentages to be summed does not yield a valid result.
Numerator Numerator (Running Total) Denominator Denominator (Running Total) Percentage Running Sum of % Running Sum of % divided by the number of percentages summed Running Percentage
1 1 100 100 1.00% 1.00% 1.00% 1.00%
2 3 100 200 2.00% 3.00% 1.50% 1.50%
3 6 100 300 3.00% 6.00% 2.00% 2.00%
4 10 100 400 4.00% 10.00% 2.50% 2.50%
5 15 100 500 5.00% 15.00% 3.00% 3.00%
6 21 100 600 6.00% 21.00% 3.50% 3.50%
7 28 100 700 7.00% 28.00% 4.00% 4.00%
8 36 100 800 8.00% 36.00% 4.50% 4.50%
9 45 100 900 9.00% 45.00% 5.00% 5.00%
10 55 100 1000 10.00% 55.00% 5.50% 5.50%
11 66 100 1100 11.00% 66.00% 6.00% 6.00%
12 78 100 1200 12.00% 78.00% 6.50% 6.50%
13 91 100 1300 13.00% 91.00% 7.00% 7.00%
14 105 100 1400 14.00% 105.00% 7.50% 7.50%
15 120 100 1500 15.00% 120.00% 8.00% 8.00%
16 136 100 1600 16.00% 136.00% 8.50% 8.50%
17 153 100 1700 17.00% 153.00% 9.00% 9.00%
18 171 100 1800 18.00% 171.00% 9.50% 9.50%
19 190 100 1900 19.00% 190.00% 10.00% 10.00%
20 210 100 2000 20.00% 210.00% 10.50% 10.50%
Example 3: summing percentages calculated with different denominators does not yield meaningful data.
Numerator Numerator (Running Total) Denominator Denominator (Running Total) Percentage Running Sum of % Running Sum of % divided by the number of percentages summed Running Percentage
1 1 10 10 10.00% 10.00% 10.00% 10.00%
2 3 25 35 8.00% 18.00% 9.00% 8.57%
3 6 12 47 25.00% 43.00% 14.33% 12.77%
4 10 15 62 26.67% 69.67% 17.42% 16.13%
5 15 17 79 29.41% 99.08% 19.82% 18.99%
6 21 88 167 6.82% 105.90% 17.65% 12.57%
7 28 398 565 1.76% 107.66% 15.38% 4.96%
8 36 49 614 16.33% 123.98% 15.50% 5.86%
9 45 50 664 18.00% 141.98% 15.78% 6.78%
10 55 66 730 15.15% 157.13% 15.71% 7.53%
11 66 5 735 220.00% 377.13% 34.28% 8.98%
12 78 75 810 16.00% 393.13% 32.76% 9.63%
13 91 49 859 26.53% 419.66% 32.28% 10.59%
14 105 30 889 46.67% 466.33% 33.31% 11.81%
15 120 31 920 48.39% 514.72% 34.31% 13.04%
16 136 32 952 50.00% 564.72% 35.29% 14.29%
17 153 33 985 51.52% 616.23% 36.25% 15.53%
18 171 34 1019 52.94% 669.17% 37.18% 16.78%
19 190 35 1054 54.29% 723.46% 38.08% 18.03%
20 210 36 1090 55.56% 779.02% 38.95% 19.27%

Tuesday, April 15, 2014

SQL Server 2012 - BI - Master Data Services,Data Quality Services and Data Mining

To my surprise sometime while I was buried in SQL cleaning data, configuring unique record constraints, primary keys, clustered indexes, and referential integrity Microsoft built Master Data Services and Data Quality Services. 

Master Data Warehouse: When Microsoft SQL Server 2008 Enterprises was released it came with a new product named "Master Data Services".  From my perspective Master data services is basically a user interface to procedures performing many of the tasks I would handle as a DBA\Developer.  It provides an area to bring all your data together into one spot and the means to easily add linking data from different sources, add missing look-up data needed for reporting, track all changes to the data to support historical reporting and more all with a nice web interface.

Data Anomaly Detection: Before Microsoft SQL Server 2012 Enterprise data anomaly detection and cleanup was done by someone like myself writing a lot of T-SQL to find and fix data quality issues so that databases could eventually be configured with the four mandatory table design rules. While these rules prevent numerous data issues they don't address data cleanup issues such as missing or inaccurate loan dates or someone with a drivers licenses with their age specified as 3 years old. Contracts associated with the mortgage and loan collection industries have a myriad of start dates, cutoff dates, interest modification that can only occur after predefined periods of time and basically a multitude of dependencies that are quite easy to make mistakes with. In comes Data Quality Services to save the day! A DBA is still needed to configure the table schema correctly however with some training much of the data cleanup can be handled by SMEs(Subject matter experts)  Now SMEs can be data stewards for their own data.  

I've posted a few youtube videos below which explain both new products in detail but this one "How to Integrate DQS, MDS and Your Data Warehouse" provides a great big picture of how they all work together.
Data Mining
The link below shows a simple but amazing demonstration of the power of OLAP and SQL Server BI.




Master Data Services
Data Mining - Adventure Works SESSION: Data Mining in SQL Server Analysis Services (Brian Knight)   
SSIS term extraction components
Resources
SQLPASSTV
SQLServerDataMining.com



What's new in SQL Server 2014 feature drilldown

Friday, November 30, 2012

SSRS - Centralized Formatting

Centralized formatting should really be done via .net DLLs but in many organizations short and long term maintenance of reporting is more often than not ignored and the idea of providing development software to build DLLs to people who merely do reporting discounted.  When this is the case SSRS report developers have a second option of using code in the reports to replace the functionality the .net DLL would have provided.  The formatting will not be centralized as it would in a DLL however it is possible to use search and replace on report RDLs to modify the code thereby simplifying the process of making the same changes to hundreds of reports. This makes using values defined in code the second best choice.  Make sure to make backups of the RDLs first as it is very possible to do something to many reports that will require reverting to the original version. In this situation it is advisable to use code something like show below to format your report titles, tables, fonts, etc.  This code goes in the code section of your report. Examples of calling the code are at the end of this article.  There are two  SQLServerCentral.com articles on creating DLLs for this purpose. This first is SSRS - Custom Code with External Assemblies and the second is Centralising Reporting Services Stylesheets. You can use these links along with the code below to create you own custom centralized formatting DLLs.  This Microsoft link on security may also help you get them working.

Public Const NumberFormat = "###,###,###,###,##0.00"
Public Const PercentageFormat = "##0.00 %"   
 
Function BackgroundColor(ByVal Style As String) As String         
Select Case UCase(Style)
Case "MAINTITLE":Return "White"            
CASE "SUBTITLE1":Return "White"            
CASE "SUBTITLE2":Return "White"            
CASE "SUBTITLE3":Return "White"
Case "DATEGROUP":return "White"
Case "TABLEHEADER":Return "CornflowerBlue"            
CASE "TABLEFOOTER" :Return "CornflowerBlue"
Case "GROUP1HEADER":Return "#8fb3f3"            
CASE "GROUP1FOOTER":Return "#8fb3f3"
Case "GROUP2HEADER":Return "#c7d9f9"            
CASE "GROUP2FOOTER":Return "#c7d9f9"
Case "SUBTOTAL":Return "#8fb3f3"
Case "GRANDTOTAL":Return "White"
Case "DETAIL": Return "White"            
CASE "PAGEFOOTER": Return "White"            
CASE "REPORTFOOTER": Return "White"            
CASE Else: Return "White"        
End Select
End Function
 
Function FontColor(ByVal BackGroundStyle As String) As String
Select CASE UCase(BackGroundStyle )  
CASE "PURPLE":Return "Black"  
CASE "DARKBLUE":Return "Black"  
CASE "WHITE":Return "Black"  
CASE "LIGHTSTEELBLUE":Return "Black"  
CASE "LIGHTGREY":Return "Black"  
CASE "LightGrey":Return "Black"  
CASE "#6e9eca":Return "Black"  
CASE "#e0e0e0":Return "Black"  
CASE Else: Return "Black" 
End Select
 
End Function

Function FontFamily(ByVal Style As String) As String
Select Case UCase(Style)            
CASE "MAINTITLE":Return "Arial"            
CASE "SUBTITLE1":Return "Arial"            
CASE "SUBTITLE2":Return "Arial"            
CASE "SUBTITLE3":Return "Arial"            
CASE "TABLEHEADER":Return "Arial Narrow"            
CASE "TABLEFOOTER" :Return "Arial Narrow"            
CASE "GROUP1HEADER":Return "Arial Narrow"            
CASE "GROUP1FOOTER":Return "Arial Narrow"            
CASE "GROUP2HEADER":Return "Arial Narrow"            
CASE "GROUP2FOOTER":Return "Arial Narrow"             
CASE "DATEGROUP":return "Arial Narrow"            
CASE "SUBTOTAL":Return "Arial Narrow"            
CASE "GRANDTOTAL":Return "Arial Narrow"            
CASE "DETAIL": Return "Arial Narrow"            
CASE "PAGEFOOTER": Return "Arial Narrow"            
CASE "REPORTFOOTER": Return "Arial Narrow"            
CASE Else: Return "Arial Narrow"        
End Select
 
End Function

Function FontSize(ByVal Style As String) As String
Select Case UCase(Style)            
CASE "MAINTITLE":Return "12pt"            
CASE "SUBTITLE1":Return "12pt"            
CASE "SUBTITLE2":Return "12pt"            
CASE "SUBTITLE3":Return "10pt"            
CASE "TABLEHEADER":Return "8pt"            
CASE "TABLEFOOTER" :Return "8pt"            
CASE "GROUP1HEADER":Return "8pt"            
CASE "GROUP1FOOTER":Return "8pt"            
CASE "GROUP2HEADER":Return "8pt"            
CASE "GROUP2FOOTER":Return "8pt"            
CASE "DATEGROUP":return "8pt"            
CASE "SUBTOTAL":Return "8pt"            
CASE "GRANDTOTAL":Return "8pt"            
CASE "DETAIL": Return "8pt"            
CASE "PAGEFOOTER": Return "8pt"            
CASE "REPORTFOOTER": Return "8pt"            
CASE Else: Return "8pt"        
End Select
End Function
 
Function FontWeight(ByVal Style As String) As String
Select  CASE UCase(Style)            
CASE "MAINTITLE":Return "Bold"            
CASE "SUBTITLE1":Return "Bold"            
CASE "SUBTITLE2":Return "Bold"            
CASE "SUBTITLE3":Return "Bold"            
CASE "TABLEHEADER":Return "Bold"            
CASE "TABLEFOOTER" :Return "Bold"            
CASE "GROUP1HEADER":Return "Normal"            
CASE "GROUP1FOOTER":Return "8pt"            
CASE "GROUP2HEADER":Return "8pt"            
CASE "GROUP2FOOTER":Return "Bold"             
CASE "DATEGROUP":return "Bold"            
CASE "SUBTOTAL":Return "Bold"            
CASE "GRANDTOTAL":Return "Bold"            
CASE "DETAIL": Return "Normal"            
CASE "PAGEFOOTER": Return "Normal"            
CASE "REPORTFOOTER": Return "Normal"            
CASE Else: Return "Normal"        
End Select
 
End Function

Function FontStyle(ByVal Style As String) As String 
 Select CASE UCase(Style)            
CASE "MAINTITLE":Return "Normal"            
CASE "SUBTITLE1":Return "Normal"            
CASE "SUBTITLE2":Return "Normal"            
CASE "SUBTITLE3":Return "Normal"            
CASE "TABLEHEADER":Return "Normal"            
CASE "TABLEFOOTER" :Return "Normal"            
CASE "GROUP1HEADER":Return "Normal"            
CASE "GROUP1FOOTER":Return "Normal"            
CASE "GROUP2HEADER":Return "Normal"            
CASE "GROUP2FOOTER":Return "Normal"             
CASE "DATEGROUP":return "Normal"            
CASE "SUBTOTAL":Return "Normal"            
CASE "GRANDTOTAL":Return "Normal"            
CASE "DETAIL": Return "Normal"            
CASE "PAGEFOOTER": Return "Normal"            
CASE "REPORTFOOTER": Return "Normal"            
CASE Else: Return "Normal"        
End Select
 
End Function 
  
----------------------------------------------------------------------------

Examples of using this code in your report.

<Color>=code.FontColor(code.BackgroundColor("SUBTITLE1"))</Color> <BackgroundColor>=code.BackgroundColor("SUBTITLE1")</BackgroundColor> <FontStyle>=code.FontStyle("SUBTITLE1")</FontStyle> <FontFamily>=code.FontFamily("SUBTITLE1")</FontFamily> <FontSize>=code.FontSize("SUBTITLE1")</FontSize> <FontWeight>=code.FontWeight("SUBTITLE1")</FontWeight> 
<Color>=code.FontColor(code.BackgroundColor("SUBTITLE2"))</Color> <BackgroundColor>=code.BackgroundColor("SUBTITLE2")</BackgroundColor> <FontStyle>=code.FontStyle("SUBTITLE2")</FontStyle> <FontFamily>=code.FontFamily("SUBTITLE2")</FontFamily> <FontSize>=code.FontSize("SUBTITLE2")</FontSize> <FontWeight>=code.FontWeight("SUBTITLE2")</FontWeight> 
  


<FontStyle>=code.FontStyle("GROUP1FOOTER")</FontStyle>
<FontFamily>=code.FontFamily("GROUP1FOOTER")</FontFamily>
<FontSize>=code.FontSize("GROUP1FOOTER")</FontSize>
<FontWeight>=code.FontWeight("GROUP1FOOTER")</FontWeight>
<Format>=code.NumberFormat </Format>

Database Maintenance on Remote servers (EXEC or OPENROWSET)

When I write database maintenance procedures I want to be able to execute them on any or all databases on any server so I use the EXEC to run local server commands and create a dynamic OpenRowSet SQL Statement to run remote server commands. I use the procedure below to implement this.


CREATE PROCEDURE [dbo].[usp_srvr_exec]
@cmd            varchar(MAX)
, @DisplayError        bit=0
, @ServerName        varchar(80)=NULL
, @Execute0Display1Both2    tinyint=0
, @UseOpenRowSet    bit=0
, @ReplaceTabsWithSpaces    bit=1
, @cmd_generated        varchar(8000)=NULL OUTPUT
, @DoubleSingleQuotes    bit = 1
, @UserAccount        varchar(50)= NULL
, @Password        varchar(50)= NULL
, @execute_command    bit=1
AS
BEGIN

DECLARE
@Retval        integer
, @cmd2        varchar(MAX)
, @uid        varchar(25)
, @pwd        varchar(50)
, @err        integer
, @pos        bigint

SET @Retval                = 0
SET @ReplaceTabsWithSpaces        = ISNULL(@ReplaceTabsWithSpaces    ,0)
SET @ServerName                = ISNULL(@ServerName, @@SERVERNAME)
SET @DoubleSingleQuotes            = ISNULL(@DoubleSingleQuotes,1)
SET @UID                = COALESCE( @UID, @UserAccount)
SET @PWD                = COALESCE(@PWD , @Password)

IF @ReplaceTabsWithSpaces = 1
    SET @cmd = REPLACE(@cmd, CHAR(9), CHAR(32))

IF @ServerName IS NOT NULL AND @ServerName <> @@SERVERNAME
BEGIN
    SET @ServerName = COALESCE(
(SELECT Server_Name
FROM msdb.dbo.sysmanagement_shared_registered_servers_internal
WHERE NAME = @ServerName)
,@ServerName)
end

IF (@ServerName <> @@SERVERNAME OR ISNULL(@UseOpenRowSet,0) = 1) --AND 1 = 2
BEGIN
---------------------------------------------------------------
-- Format the command to be called by the EXEC command to use
-- OPENROWSET so the command can be executed on a remote server.
---------------------------------------------------------------

---------------------------------------------------------------
-- The following section has been modified to account for
-- a less complicated version of this procedure.
-- The original procedure checked to see if CLR functionality was enabled.
-- If it was not then user ids and encrypted passwords necessary to access
-- remote servers were retrieved.
-- This procedure has been modified to assume CLR functionality is disabled.
-- It is also modified to only look for encrypted passwords if current database
-- is 'DBATools'. Otherwise server communications will only be possible if
-- the SQL service account is running under a domain account trusted on the
-- remote server being accessed.
---------------------------------------------------------------
    IF [dbo].[udf_config_use_trusted]() = 0
    AND (@UID IS NULL OR @PWD IS NULL)
    BEGIN
        SET @UID = [dbo].[udf_AppDefault]( 'SystemUID1','SQL_UID', @ServerName    , 'DBATools')
        SET @PWD = [dbo].[udf_AppDefault2]( 'SystemUID1','SQL_UID', @ServerName, 'DBATools')
        --PRINT @UID + ':' + @PWD
    END

---------------------------------------------------------------
--
---------------------------------------------------------------
    IF ISNULL(@uid,'') = '' OR ISNULL(@pwd,'') = ''
    BEGIN
        SET @cmd2 = CHAR(39) + 'Server=' + @ServerName + ';'
                 + 'Trusted_Connection=yes;'', ''set fmtonly off;'
    END
    ELSE
    BEGIN
        SET @cmd2 = CHAR(39) + @ServerName + CHAR(39) + ';'''
            + @uid + ''';''' + @pwd + ''', ''set fmtonly off; '
    END
---------------------------------------------------------------
-- First layer of double quotes.
-- This is necessary for any EXEC statements.
---------------------------------------------------------------
    IF @DoubleSingleQuotes    = 1
        SET @cmd = REPLACE(@cmd, CHAR(39),CHAR(39) + CHAR(39))

    IF CHARINDEX('USE ', @cmd) > 0
    BEGIN
        SET @cmd = 'EXEC(''' + @cmd + ''')'

---------------------------------------------------------------
-- Second layer of double quotes.
-- This is necessary for any EXEC statements executed on remote servers
-- using OPENROWSET.
---------------------------------------------------------------
        SET @cmd = REPLACE(@cmd, CHAR(39),CHAR(39) + CHAR(39))

    END
    ELSE
    BEGIN
        IF CHARINDEX('EXEC ', @cmd) = 0
        AND CHARINDEX('EXEC(', @cmd) = 0
        AND CHARINDEX('EXECUTE(', @cmd) = 0
        AND CHARINDEX('EXECUTE ', @cmd) = 0
        AND CHARINDEX('SELECT ', @cmd) = 0
        AND CHARINDEX('UPDATE ', @cmd) = 0
        AND CHARINDEX('DELETE ', @cmd) = 0
        AND CHARINDEX('TRUNCATE ', @cmd) = 0
        AND CHARINDEX('INSERT ', @cmd) = 0
        AND CHARINDEX('DBCC ', @cmd) = 0
        BEGIN
            SET @cmd2 = @cmd2 + ' EXEC '
        END
    END

    SET @cmd = 'SELECT * FROM OPENROWSET(''SQLNCLI'',' + @cmd2 + ' ' + @cmd + ''') '
END
SET @cmd_generated = @cmd

---------------------------------------------------------------
-- Execute the command
---------------------------------------------------------------
IF ISNULL(@Execute0Display1Both2,1) >= 1
    PRINT @cmd

IF ISNULL(@execute_command,0) = 1
AND (ISNULL(@Execute0Display1Both2,1) = 0 OR ISNULL(@Execute0Display1Both2,1) = 2)
BEGIN
    BEGIN TRY
        EXEC(@cmd)
        SET @retval    = @@ERROR
    END TRY
    BEGIN CATCH
        IF ISNULL(@DisplayError,1) = 1
            EXECUTE dbo.usp_srvr_error_print;
            --EXECUTE dbo.usp_srvr_error_add;
        SET @retval    = 1
    END CATCH
END

ProcExit:
    RETURN @Retval
ProcError:
    SET @retval = @err
    GOTO ProcExit

/*

DECLARE
@err int
, @cmd varchar(max)

SET @cmd = 'Use [AdventureWorks];SELECT * FROM dbo.sysfiles'

EXEC @err        = [dbo].[usp_srvr_exec]
@cmd            = @cmd
, @DisplayError        = 1
, @ServerName        = @@SERVERNAME
, @Execute0Display1Both2    = 2


*/

END
GO

Row duplication - SQL limitation

-------------------------------------------------------------------------------
-- Here is an example of an SQL limitation.
-- Joining more than one table on the same columns that contains more than on
-- record per join criteria causes row duplication to occur.
-------------------------------------------------------------------------------
SELECT  t1.InvestorID , t2.Period , t3.Period
SELECT NumRecords = COUNT(*)
FROM    dbo.InvestorBase t1 WITH (NOLOCK)
        LEFT JOIN dbo.InvestorDetail1 t2 WITH (NOLOCK)
            ON t1.InvestorID      = t2.InvestorID 
            AND t1.Period         = t2.Period
        LEFT JOIN dbo.InvestorDetail2 t3 WITH (NOLOCK) 
            ON t1.InvestorID      = t3.InvestorID 
            AND t1.Period         = t3.Period

Monday, November 19, 2012

SQL Server - Preparation for an Interview

In short this is some preparation for a phone interview.
Master Data Services
Data Quality Services
Data Mining Demos
Data Mining - Adventure Works SESSION: Data Mining in SQL Server Analysis Services (Brian Knight)   
SSIS term extraction components
Resources
SQLPASSTV
SQLServerDataMining.com


My SQL Server DBATools library function and procedure index.


Brent Ozar
Capacity and performance planning.
Always On ; Always On Code
Use Windows 2012
Patching is manual and self job.
Monitoring and Tuning require union of MSDB databases.