Monday, June 17, 2019

Aggregating strings in SQL Server, using Irish Saints Days

I nowadays enjoy working with front-end developers using JavaScript, especially now that SQL Server has the tools that allow the SQL Developer to accommodate their requirements. There is less tension now, because if the SQL Server developer can deliver exactly what they need, then there is less temptation to allow users access to base tables, and less of an urge for the full-stack developer to ask for it. You can now have a nice simple RESTful interface that provides everything that the application requires via a set of functions.

What has brought about this change? The introduction of String_agg() has made a lot of difference. So, of course, has the JSON support for reading and writing JSON. SQL Server now is a great deal more ‘sympatico’ with the world of web-based development and microservices

You now have a lot more opportunity to deliver to the application precisely what is needed for a display without so much tedious data-pummeling by the front-end developer. I ought to illustrate this a bit from the murky depths of SQL Server.

Let’s choose as sample for our calendar-based information, a list of Irish saints days. The code to create this can be downloaded from here.

There are 128 Irish saints whose saints days are known, of the 190 who are still revered. There are 10,000 in all, but that is discussed elsewhere.

In the actual application this was derived from, the report was a calendar-based list of appointments. You might have all sorts of calendar-based reports that could use this sort of visualisation. You can use anything that has an event followed by a date.

This routine gives you a list of relevant saints for each day in which one or more saint has a day.

/* glue together all saints for a day  */
SELECT
    --the name of the month in the current language 
    DateName(MONTH,DateAdd(MONTH,MonthNumber,N'2018-12-01T00:00:00')), 
        --the list of Irish saints with saints-days in the month
    String_Agg(saints, '. ') WITHIN GROUP( ORDER BY MonthNumber ASC)
  --a simple derived table containing all the valid month numbers
  FROM (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12)) 
  AS Month ( MonthNumber )
    /* cope with a month with no entries with a blank column */
    LEFT OUTER JOIN
      (/* glue together all saints for a day  */
      SELECT 
           Coalesce(
          Convert(VARCHAR(2), DateOfMonth) --this is english abbreviation only. Sorry
           + Substring('stndrdthththththththththththththththththstndrdthththththththst',
             (DateOfMonth * 2) - 1, 2 ) + ': ' + String_Agg(NAME, ', ') 
                   , '' ) AS Saints, DateOfMonth, MonthNo
        FROM
          (/* collect the basic information in a derived table */
          SELECT name, DatePart(DAY, saintsDay) AS DateOfMonth,
            DatePart(MONTH, saintsDay) AS MonthNo
            FROM #IrishSaintsDay
            WHERE saintsDay IS NOT NULL
          ) AS f(NAME, DateOfMonth, MonthNo)
        GROUP BY DateOfMonth, MonthNo
      ) AS g(Saints, DateOfMonth, MonthNo)
      ON g.MonthNo = Month.MonthNumber
  GROUP BY MonthNumber;

Which will give the following table (I used MS Word to display it though it usually ends up in an HTML table. As you see, one or more saint is listed on the days that have a saints day.

Sometimes, you need to concatenate strings in a particular order. You could do it in the bad old days using the XML trick. String-agg has a syntax for it . You’ll see that I use the feature built into String-Agg().

String_Agg(saints, '') WITHIN GROUP( ORDER BY MonthNumber ASC)

T-SQL requires the use of the WITHIN GROUP clause when ordering the result set. This is more complicated than the PostgreSQL String_agg() function and MySQL’s GROUP_CONCAT() function which both make do with just an ORDER BY clause. String_agg(), unlike group_concat(), doesn’t have a DISTINCT option, but at the moment I can’t think of a use for this.

Of course, you might be asked to do a scrolling year view like this…

Which you could do more simply like this.

SELECT Rownumber AS day,
  Max(CASE WHEN ColumnNumber = 1 THEN Saints ELSE '' END) AS January,
  Max(CASE WHEN ColumnNumber = 2 THEN Saints ELSE '' END) AS February,
  Max(CASE WHEN ColumnNumber = 3 THEN Saints ELSE '' END) AS March,
  Max(CASE WHEN ColumnNumber = 4 THEN Saints ELSE '' END) AS April,
  Max(CASE WHEN ColumnNumber = 5 THEN Saints ELSE '' END) AS May,
  Max(CASE WHEN ColumnNumber = 6 THEN Saints ELSE '' END) AS June,
  Max(CASE WHEN ColumnNumber = 7 THEN Saints ELSE '' END) AS July,
  Max(CASE WHEN ColumnNumber = 8 THEN Saints ELSE '' END) AS August,
  Max(CASE WHEN ColumnNumber = 9 THEN Saints ELSE '' END) AS September,
  Max(CASE WHEN ColumnNumber = 10 THEN Saints ELSE '' END) AS october,
  Max(CASE WHEN ColumnNumber = 11 THEN Saints ELSE '' END) AS November,
  Max(CASE WHEN ColumnNumber = 12 THEN Saints ELSE '' END) AS December
  FROM
    (
    SELECT String_Agg(NAME, ', ') AS Saints, RowNumber, ColumnNumber
      FROM
        (
        SELECT name, DatePart(DAY, saintsDay) AS RowNumber,
          DatePart(MONTH, saintsDay) AS ColumnNumber
          FROM #IrishSaintsDay
          WHERE saintsDay IS NOT NULL
        ) AS f(NAME, RowNumber, ColumnNumber)
      GROUP BY RowNumber, ColumnNumber
    ) AS g(Saints, Rownumber, ColumnNumber)
  GROUP BY Rownumber
  ORDER BY Rownumber;

You’ll notice here that you can’t use string_agg() in the same way as you might a SUM(), because of the concatenation delimiter that you specify. You’d end up with lots of empty delimiters. Instead, you need to aggregate the saints for the day separately beforehand, and use something neutral like MIN() or MAX(). It doesn’t matter because it has already been aggregated so there will be only one of each.

You can get exactly the same result this way, using JSON; it has added logic to make sure that even if not every monthday (1-31) or month (1-12) is represented, it will still work.

SELECT *
  FROM
  OpenJson(
    (
    SELECT '[' + String_Agg(TheJSONList, ',') WITHIN GROUP(
      ORDER BY DayNumber ASC) + ']'
      FROM
        (
        SELECT '[' + Convert(VARCHAR(2), DayNumber) + ',"'
               + String_Agg(Coalesce(String_Escape(Saints, 'json'), ''), '","') WITHIN GROUP(
          ORDER BY MonthNumber ASC) + '"]' AS TheJSONList, DayNumber
          FROM (VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11),
(12)
)         AS Month (MonthNumber)
            CROSS JOIN
              (
              VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11),
                (12), (13), (14), (15), (16), (17), (18), (19), (20), (21),
                (22), (23), (24), (25), (26), (27), (28), (29), (30), (31)
              ) AS Day (DayNumber)
            LEFT OUTER JOIN
              (
              SELECT String_Agg(NAME, ', ') AS Saints, RowNumber, ColumnNumber
                FROM
                  (
                  SELECT name, DatePart(DAY, saintsDay) AS RowNumber,
                    DatePart(MONTH, saintsDay) AS ColumnNumber
                    FROM #IrishSaintsDay
                    WHERE saintsDay IS NOT NULL
                  ) AS f(NAME, RowNumber, ColumnNumber)
                GROUP BY RowNumber, ColumnNumber
              ) AS g(Saints, Rownumber, ColumnNumber)
              ON g.Rownumber = Day.DayNumber
             AND g.ColumnNumber = Month.MonthNumber
          GROUP BY DayNumber
        ) AS lines(TheJSONList, DayNumber)
    )
          )
  WITH
    (
    Day INT '$[0]', Jan NVARCHAR(1000) '$[1]', Feb NVARCHAR(1000) '$[2]',
    Mar NVARCHAR(1000) '$[3]', Apr NVARCHAR(1000) '$[4]',
    May NVARCHAR(1000) '$[5]', Jun NVARCHAR(1000) '$[6]',
    Jul NVARCHAR(1000) '$[7]', Aug NVARCHAR(1000) '$[8]',
    Sep NVARCHAR(1000) '$[9]', Oct NVARCHAR(1000) '$[10]',
    Nov NVARCHAR(1000) '$[11]', Dec NVARCHAR(1000) '$[12]'
    );

 

Of course, there is always going to be someone who wants the report in this format:

294-img73.jpg

So here is the code to provide all the saints days for the month of whatever date you place into the variable @date, or the current date if you put a null in it.

DECLARE @Date DATETIME = '01 Jun 2019';
--nail down the start of the week
DECLARE @MonthLength INT; --number of days in the month
DECLARE @MonthStartDW INT; --the day of the week that the month starts on
DECLARE @Month INT; --the month (1.10)
--if no date is specified, then use the current date
SELECT @Date =
  '01 ' + Substring(Convert(CHAR(11), Coalesce(@Date, GetDate()), 113), 4, 8);
--get the number of days in the month and the day of the week that the month starts on
SELECT @MonthLength =
  DateDiff( DAY, Convert(CHAR(11), @Date, 113),
            Convert(CHAR(11), DateAdd(MONTH, 1, @Date), 113)
          ), @MonthStartDW = ((DatePart(dw, @Date) + @@DateFirst - 3) % 7) + 1,
  @Month = DatePart(MONTH, @Date);
 
 SELECT  
  Max(case when day=1 and monthdate between 1 and @MonthLength then convert(varchar(2),monthdate)+': ' else '' end)
    + String_agg(case when day=1 and monthdate between 1 and @MonthLength THEN ISD.name+'. ' ELSE ''END ,'') AS Mon,
  Max(case when day=2 and monthdate between 1 and @MonthLength then convert(Varchar(2),monthdate)+': '  else '' end)
    + String_agg(case when day=2 and monthdate between 1 and @MonthLength THEN ISD.name+'. ' ELSE ''END ,'') AS Tue, 
  Max(case when day=3 and monthdate between 1 and @MonthLength then convert(Varchar(2),monthdate)+': '  else '' end)
    + String_agg(case when day=3 and monthdate between 1 and @MonthLength THEN ISD.name+'. ' ELSE ''END ,'') AS Wed, 
  Max(case when day=4 and monthdate between 1 and @MonthLength then convert(Varchar(2),monthdate)+': '  else '' end)
    + String_agg(case when day=4 and monthdate between 1 and @MonthLength THEN ISD.name+'. ' ELSE ''END ,'') AS Thu,
  Max(case when day=5 and monthdate between 1 and @MonthLength then convert(Varchar(2),monthdate)+': '  else '' end)
    + String_agg(case when day=5 and monthdate between 1 and @MonthLength THEN ISD.name+'. ' ELSE ''END ,'') AS Fri, 
  Max(case when day=6 and monthdate between 1 and @MonthLength then convert(Varchar(2),monthdate)+': '  else '' end)
    + String_agg(case when day=6 and monthdate between 1 and @MonthLength THEN ISD.name+'. ' ELSE ''END ,'') AS Sat, 
  Max(case when day=7 and monthdate between 1 and @MonthLength then convert(Varchar(2),monthdate)+': '  else '' end)
    + String_agg(case when day=7 and monthdate between 1 and @MonthLength THEN ISD.name+'. ' ELSE ''END ,'') AS Sun 
  FROM
    (
    SELECT DayNo.number AS day, Weekno.number AS week,
      (DayNo.number + ((Weekno.number - 1) * 7)) - @MonthStartDW AS monthDate
      FROM (VALUES (1), (2), (3), (4), (5), (6), (7)) AS DayNo (number)
        CROSS JOIN (VALUES (1), (2), (3), (4), (5), (6)) AS Weekno (number)
    ) AS f
    LEFT OUTER JOIN #IrishSaintsDay AS ISD
      ON f.monthDate = DatePart(DAY, saintsDay)
     AND DatePart(MONTH, saintsDay) = @Month
  GROUP BY week --so that each week is on a different row
  HAVING Max(CASE WHEN f.day = 1 AND f.monthDate BETWEEN 1 AND @MonthLength THEN
                    f.monthDate ELSE 0 END
            ) > 0
      OR (f.week = 1 AND Sum(monthDate) > -21);

 

I’m not advocating that you get stuck into calendar displays of Irish Saints, worthy though they are. I would like to make a point about the use of calendar-style reports. I once had a boss who was very keen indeed on having his financial reports based in a calendar form. He could see instantly if revenues tailed off on a Friday and could zoom in on the consequences of various IT problems or off-site team-bonding sessions on revenues. He could pick out all sorts of details that a bare list wouldn’t tell him about. It is also great for alerts and warnings in IT processes because the patterns are more readily apparent. Our brains are used to calendars, and the brains of managers will benefit from any help you can give them.

I also want to make the point that the simple SQL Reports you dish out may look splendid, but are they exactly right for communicating the particular type detailed financial or quantitative information you are being asked for?

The builder for The Irish Saints Day table is here

The post Aggregating strings in SQL Server, using Irish Saints Days appeared first on Simple Talk.



from Simple Talk http://bit.ly/2wZbTgM
via

No comments:

Post a Comment