Wednesday, May 24, 2023

Microsoft Fabric: Data Warehouse Ingestion using T-SQL and more

.In this blog I will illustrate how we can ingest data from a blog storage to a Microsoft Fabric Data Warehouse using T-SQL.

Create a new workspace

  1. On the left ribbon, click Workspaces.
  2. Click the New Workspace button.

Interface gráfica do usuário, Aplicativo Descrição gerada automaticamente

  1. Provide a name for the workspace.

Tabela Descrição gerada automaticamente

  1. On advanced configuration, ensure you choose Premium by Capacity, and you need to select a capacity.

Interface gráfica do usuário, Texto, Aplicativo, Email Descrição gerada automaticamente

  1. Click the Apply button.

Interface gráfica do usuário Descrição gerada automaticamente com confiança média

Creating the Microsoft Fabric Data Warehouse

  1. On the left ribbon, Experiences button, click the Data Warehouse experience.

Interface gráfica do usuário, Aplicativo Descrição gerada automaticamente

  1. Click the Warehouse button to create a new Warehouse

    Interface gráfica do usuário, Texto, Aplicativo Descrição gerada automaticamente

  2. Define the name wwiSample for the Data Warehouse and click Create.

Interface gráfica do usuário, Texto, Aplicativo Descrição gerada automaticamente

Interface gráfica do usuário, Texto, Aplicativo, Email Descrição gerada automaticamente

 

Creating the schema to import the data

  1. Click the New SQL query button.

 –dimension_city
DROP TABLEIF EXISTS [dbo].[dimension_city];

CREATE TABLE [dbo].[dimension_city]
             (
                          [citykey] [INT] NULL,
                          [wwicityid] [INT] NULL,
                          [city] [VARCHAR](8000) NULL,
                          [stateprovince] [VARCHAR](8000) NULL,
                          [country] [VARCHAR](8000) NULL,
                          [continent] [VARCHAR](8000) NULL,
                          [salesterritory] [VARCHAR](8000) NULL,
                          [region] [VARCHAR](8000) NULL,
                          [subregion] [VARCHAR](8000) NULL,
                          [location] [VARCHAR](8000) NULL,
                          [latestrecordedpopulation] [BIGINT] NULL,
                          [validfrom] [DATETIME2](6) NULL,
                          [validto] [DATETIME2](6) NULL,
                          [lineagekey] [INT] NULL
             );

–fact_sale

DROP TABLEIF EXISTS [dbo].[fact_sale];

CREATE TABLE [dbo].[fact_sale]
             (
                          [salekey] [BIGINT] NULL,
                          [citykey] [INT] NULL,
                          [customerkey] [INT] NULL,
                          [billtocustomerkey] [INT] NULL,
                          [stockitemkey] [INT] NULL,
                          [invoicedatekey] [DATETIME2](6) NULL,
                          [deliverydatekey] [DATETIME2](6) NULL,
                          [salespersonkey] [INT] NULL,
                          [wwiinvoiceid] [INT] NULL,
                          [description] [VARCHAR](8000) NULL,
                          [package] [VARCHAR](8000) NULL,
                          [quantity] [INT] NULL,
                          [unitprice] [DECIMAL](18, 2) NULL,
                          [taxrate] [DECIMAL](18, 3) NULL,
                          [totalexcludingtax] [DECIMAL](29, 2) NULL,
                          [taxamount] [DECIMAL](38, 6) NULL,
                          [profit] [DECIMAL](18, 2) NULL,
                          [totalincludingtax] [DECIMAL](38, 6) NULL,
                          [totaldryitems] [INT] NULL,
                          [totalchilleritems] [INT] NULL,
                          [lineagekey] [INT] NULL,
                          [month] [INT] NULL,
                          [year] [INT] NULL,
                          [quarter] [INT] NULL
             );

  1. Click the Run button.

The Data Warehouse has full DDL support to prepare tables for data ingestion or create table aggregations.

  1. On the Explorer, open Schemas->dbo->Tables
  2. Click the three dots on the right and select Refresh.

Interface gráfica do usuário, Texto, Aplicativo, Word Descrição gerada automaticamente

Interface gráfica do usuário Descrição gerada automaticamente

  1. Double click the query in the top of the window and give it a new name, Create Tables

Interface gráfica do usuário, Texto, Aplicativo Descrição gerada automaticamente

  1. You will be able to see your query on the left side, in the Explorer.

Interface gráfica do usuário, Aplicativo Descrição gerada automaticamente com confiança média

The Queries area in the explorer is like the queries in datamarts, but in the Data Warehouse we have the Shared Queries, allowing us to create a distinction between our personal queries and queries shared with other users.

You can move a query to Shared Queries by clicking the 3 dots and selecting the option Move to Shared Queries

Interface gráfica do usuário, Aplicativo Descrição gerada automaticamente

Ingesting the data

We can use the COPY statement to ingest the data from an Azure Blog Storage to the data warehouse.

This is the COPY statement syntax:

Texto Descrição gerada automaticamente

The copy statement is capable of:

  • Load data from Blob Storage or ADLS Gen 2
  • Load Parquet or CSV data
  • Connect to secure resources using Credential

Many features documented for the COPY INTO statement for Synapse also work with Synapse Data Warehouse inside Microsoft Fabric. You can discover more about COPY INTO here

  1. Click the button New SQL Query
  2. Paste the following code in the new query window:
COPY INTO [dbo].[dimension_city] FROM ‘https://ift.tt/y8f76i4; 
WITH (file_type = ‘PARQUET’);
 
COPY INTO [dbo].[fact_sale] FROM ‘https://ift.tt/PMh7IUT;
 WITH (file_type = ‘PARQUET’);

 

This storage is a public storage provided by Microsoft

  1. Click the Run button.

Interface gráfica do usuário, Texto, Aplicativo, Email Descrição gerada automaticamente

The result is 50 million of records loaded in 1min and 26 seconds. Not bad.

  1. Rename the query to Loading Data

Building the Model

The Data Warehouse allows us to create a model which will be shared by all datasets created from the data warehouse. This is similar with the creation of a model in a Power BI Datamart.

  1. On the tabs under the explorer, click the Model tab.

  1. Link the fact_table with the dimension_city by CitKey field, dragging and dropping the field from the fact_table to the dimension city.

Interface gráfica do usuário Descrição gerada automaticamente

  1. The Create Relationship window will be, in general, correct, you only need to click the Confirm button.

Interface gráfica do usuário, Texto, Aplicativo, Email Descrição gerada automaticamente

Uma imagem contendo Diagrama Descrição gerada automaticamente

Creating a Report

The data warehouse is completely integrated with Power BI. All the data is stored in OneLake, as explained in the article https://www.red-gate.com/simple-talk/cloud/azure/data-intelligence-on-light-speed-microsoft-fabric/

Power BI reports access the data in the OneLake using a technology called Direct Lake, which is faster than usual technologies to connect to SQL or Datalakes, either using Import or Direct Query.

  1. Click the button New Report on the top of the window.

Uma imagem contendo Logotipo Descrição gerada automaticamente

The data window on the right side of the screen brings the tables with the relationship already stablished according to the model.

Interface gráfica do usuário, Aplicativo Descrição gerada automaticamente com confiança média

  1. On the visualization pane, select Arcgis Maps for Power BI

Interface gráfica do usuário, Aplicativo, Word Descrição gerada automaticamente

  1. Drag the StateProvince field from the dimension_city to the Location box in the Visualizations bar.

Interface gráfica do usuário, Aplicativo Descrição gerada automaticamente

  1. Move the Profit field from the fact_sale to the Size box in the visualization bar.

Interface gráfica do usuário, Tabela Descrição gerada automaticamente

  1. Resize the Arcgis visual to fill the report area.

Mapa Descrição gerada automaticamente

  1. Click the File -> Save menu.

Interface gráfica do usuário, Aplicativo Descrição gerada automaticamente

  1. Type the name Profit by Region and click the Save button.

Interface gráfica do usuário, Texto, Aplicativo Descrição gerada automaticamente

  1. On the left bar, click DataWarehouseSimple to return to a Workspace view.

Interface gráfica do usuário, Texto, Aplicativo, chat ou mensagem de texto Descrição gerada automaticamente

Besides the data warehouse and the report, you created, the model created in the data warehouse was saved as a dataset.

Interface gráfica do usuário, Texto, Aplicativo, Email Descrição gerada automaticamente

Conclusion

On this blog, you found a walkthrough to create a data warehouse, load from a blob storage using T-SQL, create the modelling and build a report.

 

The post Microsoft Fabric: Data Warehouse Ingestion using T-SQL and more appeared first on Simple Talk.



from Simple Talk https://ift.tt/BwWPl3K
via

No comments:

Post a Comment