If you know the structure and contents of a JSON document, then it is possible to turn this into one or more relational tables, but even then I dare to you claim that it is easy to tap in a good OpenJSON SELECT statement to do it. If you don’t know what’s in that JSON file, then you’re faced with sweating over a text editor trying to work it all out. You long to just get the contents into a relational table and take it on from there. Even then, You’ve got several struggles before that table appears in the result pane. You must get the path to the tabular data correct, you have to work out the SQL Datatypes, and you need to list the full panoply of keys. Let’s face it: it is a chore. Hopefully, all that is in the past with these helper functions.
What about being able to do this, for example …
EXECUTE TablesFromJSON @TheJSON= N'[ {"name":"Phil", "email":"PhilipFactor@geeMail.com"}, {"name":"Geoff", "email":"Geoff2435@geeMail.com"}, {"name":"Mo", "email":"MoHussain34@geeMail.com"}, {"name":"Karen", "email":"KarenAlott34@geeMail.com"}, {"name":"Bob", "email":"bob32@geeMail.com"} ]'
…and getting this?
Or if there is more than one table somewhere there…
DECLARE @json NVARCHAR(MAX) = N'{ "id": "0001", "type": "donut", "name": "Cake", "ppu": 0.55, "batters": { "batter": [ { "id": "1001", "type": "Regular" }, { "id": "1002", "type": "Chocolate" }, { "id": "1003", "type": "Blueberry" }, { "id": "1004", "type": "Devil''s Food" } ] }, "topping": [ { "id": "5001", "type": "None" }, { "id": "5002", "type": "Glazed" }, { "id": "5005", "type": "Sugar" }, { "id": "5007", "type": "Powdered Sugar" }, { "id": "5006", "type": "Chocolate with Sprinkles" }, { "id": "5003", "type": "Chocolate" }, { "id": "5004", "type": "Maple" } ] }'; EXECUTE TablesFromJson @JSON
You may not want table results straight out of your JSON: I’m really just showing off, but I’m going to describe some routines that are useful to me for dealing with JSON import. Your requirements may be more subtle.
The first stage is to get a good representation of a json document so you can work on it in SQL. I do this with a multi-statement table-valued function, but you’ll see that it is a mostly pretty simple unwrapping of the json document. Built into it is a way of gauging the most appropriate SQL Datatype for each value. Unless you adopt JSON Schema, there is no onus on being consistent in assigning values to keys, so you have to test every simple value (i.e. everything other than arrays or objects)
CREATE OR alter FUNCTION [dbo].[UnwrapJson] /** summary: > This multi-statement table-valued function talkes a JSON string and unwraps it into a relational hierarchy table that also retains the path to each element in the JSON document, and calculates the best-fit sql datatype fpr every simple value Author: Phil Factor Revision: 1.0 date: 1 Nov 2020 example: - SELECT * FROM UnwrapJson (N'[ {"name":"Phil", "email":"PhilipFactor@gmail.com"}, {"name":"Bob", "email":"bob32@gmail.com"} ]') returns: > id, level, [key], Value, type, SQLDatatype, parent, path **/ ( @JSON NVARCHAR(MAX) ) RETURNS @Unwrapped TABLE ( [id] INT IDENTITY, --just used to get a unique reference to each json item [level] INT, --the hierarchy level [key] NVARCHAR(100), --the key or name of the item [Value] NVARCHAR(MAX),--the value, if it is a null, int,binary,numeric or string type INT, --0 TO 5, the JSON type, null, numeric, string, binary, array or object SQLDatatype sysname, --whatever the datatype can be parsed to parent INT, --the ID of the parent [path] NVARCHAR(4000) --the path as used by OpenJSON ) AS begin INSERT INTO @Unwrapped ([level], [key], Value, type, SQLDatatype, parent, [path]) VALUES (0, --the level NULL, --the key, @json, --the value, CASE WHEN Left(ltrim(@json),1)='[' THEN 4 ELSE 5 END, --the type 'json', --SQLDataType, 0 , --no parent '$' --base path ); DECLARE @ii INT = 0,--the level @Rowcount INT = -1; --the number of rows from the previous iteration WHILE @Rowcount <> 0 --while we are still finding levels BEGIN INSERT INTO @Unwrapped ([level], [key], Value, type, SQLDatatype, parent, [path]) SELECT [level] + 1 AS [level], new.[Key] AS [key], new.[Value] AS [value], new.[Type] AS [type], -- SQL Prompt formatting off /* in order to determine the datatype of a json value, the best approach is to a determine the datatype that can be parsed. It JSON, an array of objects can contain attributes that arent consistent either in their name or value. */ CASE WHEN new.Type = 0 THEN 'bit null' WHEN new.[type] IN (1,2) then COALESCE( CASE WHEN TRY_CONVERT(INT,new.[value]) IS NOT NULL THEN 'int' END, CASE WHEN TRY_CONVERT(NUMERIC(14,4),new.[value]) IS NOT NULL THEN 'numeric' END, CASE WHEN TRY_CONVERT(FLOAT,new.[value]) IS NOT NULL THEN 'float' END, CASE WHEN TRY_CONVERT(MONEY,new.[value]) IS NOT NULL THEN 'money' END, CASE WHEN TRY_CONVERT(DateTime,new.[value],126) IS NOT NULL THEN 'Datetime2' END, CASE WHEN TRY_CONVERT(Datetime,new.[value],127) IS NOT NULL THEN 'Datetime2' END, 'nvarchar') WHEN new.Type = 3 THEN 'bit' WHEN new.Type = 5 THEN 'object' ELSE 'array' END AS SQLDatatype, old.[id], old.[path] + CASE WHEN old.type = 5 THEN '.' + new.[Key] ELSE '[' + new.[Key] COLLATE DATABASE_DEFAULT + ']' END AS path -- SQL Prompt formatting on FROM @Unwrapped old CROSS APPLY OpenJson(old.[Value]) new WHERE old.[level] = @ii AND old.type IN (4, 5); SELECT @Rowcount = @@RowCount; SELECT @ii = @ii + 1; END; return END go
Now, you have the document as a table. You can now unpick this in several ways, but we want to get all the tables that are embedded in the json. These are generally represented as arrays of objects, each of these objects representing a ‘document’ or ‘row’. In JSON, you can, of course, have an object or array as a value for one of the keys within the object; equivalent to storing xml or json in a relational column- but we’re after the simple columns.
Because we are looking for the tables, we can get a quick listing of them. (in reality, you’d probably only want a sample of each if you’re just browsing)
First we place the output of dbo.unwrapJSON into a temporary table. I’ve chosen #jsonObject. Now we can see the rows, if there are any table structures within the JSON.
SELECT parent.path AS [TheTable] ,grandchild.* FROM (SELECT path, id FROM #jsonObject WHERE type = 4) Parent --start with an array INNER JOIN #jsonObject Child ON Child.parent = Parent.id AND child.type IN (4, 5) --either array or object INNER JOIN #jsonObject GrandChild ON GrandChild.parent = Child.id AND GrandChild.type NOT IN (4, 5)
Well, that’s fine as far as it goes, but it doesn’t go far enough. What I want is the OpenJSON query that I can execute to get the actual result.
Here is an inline table function that does just that, using the expression I’ve just shown you.
CREATE OR alter FUNCTION [dbo].[OpenJSONExpressions] /** summary: > This inline table-valued function talkes a JSON string and locates every table structure. Then it creates an OpenJSON Statement that can then be executed to create that table from the original JSON. Author: Phil Factor Revision: 1.0 date: 1 Nov 2020 example: - SELECT * FROM OpenJSONExpressions (N'[ {"name":"Phil", "email":"PhilipFactor@gmail.com"}, {"name":"Bob", "email":"bob32@gmail.com"} ]') returns: > expression **/ ( @JSON NVARCHAR(MAX) ) RETURNS TABLE AS RETURN ( WITH UnwrappedJSON (id, [level], [key], [Value], [type], SQLDatatype, parent, [path] ) AS (SELECT id, [level], [key], [Value], [type], SQLDatatype, parent, [path] FROM dbo.UnwrapJson(@json) ) SELECT 'Select * from openjson(@json,''' + path + ''') WITH (' + String_Agg( [name] + ' ' + datatype + ' ' --the WITH statement -- SQL Prompt formatting off + case when datatype='nvarchar' then '('+length+')' WHEN datatype='numeric' then '(14,4)' ELSE '' end,', ') WITHIN GROUP ( ORDER BY TheOrder ASC ) +')' as expression -- SQL Prompt formatting on FROM ( SELECT Parent.path, GrandChild.[key] AS [name], Min(GrandChild.id) AS TheOrder, Max(GrandChild.SQLDatatype) AS datatype, Convert(NVARCHAR(100), Max(Len(GrandChild.Value))) AS length FROM (SELECT path, id FROM UnwrappedJSON WHERE type = 4) Parent INNER JOIN UnwrappedJSON Child ON Child.parent = Parent.id AND child.type IN (4, 5) INNER JOIN UnwrappedJSON GrandChild ON GrandChild.parent = Child.id AND GrandChild.type NOT IN (4, 5) GROUP BY Parent.path, GrandChild.[key] ) TheFields GROUP BY path ) GO
So we try it out with some JSON that has two table in it.
SELECT * FROM OpenJSONExpressions ( N'{"employees":[ {"name":"Mo", "email":"mojaiswal@gmail.com", "StartDate":"2012-11-07T18:26:20"}, {"name":"Bob", "email":"bob456@gmail.com", "StartDate":"2015-06-20"}, {"name":"Phil", "email":"PhilipFactor@gmail.com", "StartDate":"2015-08-01T08:05:20"}, {"name":"Susan", "email":"Su87@me.com", "StartDate":"2012-13-07"} ],"Customers":[ {"name":"The Kamakaze Laxative Company", "contact":"Karen", "email":"Enquiries@KLCEnterprises.co.uk", "CustomerSince":"2012-11-07"}, {"name":"GreenFence Softwear limited", "contact":"Dick", "email":"Bob@GreenFenceSoftwear.com", "CustomerSince":"2015-06-20"}, {"name":"Grimm and Grivas, Solicitors", "contact":"Jaz", "email":"GrimmGrivas@gmail.com", "CustomerSince":"2015-08-01T08:05:20"} ] } ')
From this we get the result …
Which are the following queries…
Select * from openjson(@json,'$.Customers') WITH (name nvarchar (29), contact nvarchar (5), email nvarchar (30), CustomerSince Datetime2 ) Select * from openjson(@json,'$.employees') WITH (name nvarchar (5), email nvarchar (22), StartDate nvarchar (19))
Well, those look suspiciously-executable; so we’ll do just that, in the following procedure
go CREATE PROCEDURE TablesFromJSON @TheJSON NVARCHAR(MAX) /** summary: > This procedure returns a table for every one found in a JSON string Author: Phil Factor Revision: 1.0 date: 1 Nov 2020 example: - EXECUTE TablesFromJSON @TheJSON= N'[ {"name":"Phil", "email":"PhilipFactor@geeMail.com"}, {"name":"Geoff", "email":"Geoff2435@geeMail.com"}, {"name":"Mo", "email":"MoHussain34@geeMail.com"}, {"name":"Karen", "email":"KarenAlott34@geeMail.com"}, {"name":"Bob", "email":"bob32@geeMail.com"} ]' returns: > expression **/ AS DECLARE @expressions TABLE (id INT IDENTITY, TheExpression NVARCHAR(MAX)); INSERT INTO @expressions (TheExpression) SELECT expression FROM OpenJSONExpressions(@TheJSON); DECLARE @RowCount INT = -1, @ii INT = 1, @expressionToExcecute NVARCHAR(MAX); WHILE @RowCount <> 0 BEGIN SELECT @expressionToExcecute = TheExpression FROM @expressions WHERE id = @ii; SELECT @RowCount = @@RowCount; SELECT @ii = @ii + 1; IF @RowCount > 0 EXECUTE sp_executesql @expressionToExcecute, N'@JSON NVARCHAR(MAX)', @JSON = @TheJSON; END; GO So we try it out… EXECUTE TablesFromJSON N'{"employees":[ {"name":"Mo", "email":"mojaiswal@gmail.com", "StartDate":"2012-11-07T18:26:20"}, {"name":"Bob", "email":"bob456@gmail.com", "StartDate":"2015-06-20"}, {"name":"Phil", "email":"PhilipFactor@gmail.com", "StartDate":"2015-08-01T08:05:20"}, {"name":"Susan", "email":"Su87@me.com", "StartDate":"2012-13-07"} ],"Customers":[ {"name":"The Kamakaze Laxative Company", "contact":"Karen", "email":"Enquiries@KLCEnterprises.co.uk", "CustomerSince":"2012-11-07"}, {"name":"GreenFence Softwear limited", "contact":"Dick", "email":"Bob@GreenFenceSoftwear.com", "CustomerSince":"2015-06-20"}, {"name":"Grimm and Grivas, Solicitors", "contact":"Jaz", "email":"GrimmGrivas@gmail.com", "CustomerSince":"2015-08-01T08:05:20"} ] } '
And voila! Two results.
So there we have it. I have to admit that the TablesFromJSON procedure isn’t quite so practical as I’d like because it is impossible to get more than one result from a stored procedure within SQL (no problem from an application, of course). It turned out to be very useful in testing all the code out, though.
I use all three routines. I hope that they’re useful to you. If you spot a bug, then let me know in the comments.
These are stored with my JSON/SQL Server routines here Phil-Factor/JSONSQLServerRoutines
The post Unwrapping JSON to SQL Server Tables appeared first on Simple Talk.
from Simple Talk https://ift.tt/38iKHx0
via
No comments:
Post a Comment