Thursday, November 9, 2023

Fabric Data Warehouse News you need to know

Microsoft Fabric, including the Data Warehouse, evolves faster than we can analyze and write about it. Here I’m summarizing the most recent news I looked at.

Clone Table

We have a specific statement to clone a table. But what exactly does it mean?

The Clone Table feature promises to create an image of the table on a specific point in time or with the current information. The documentation is not precise, because at some points it says it’s only a clone of the structure, but we can see the data on the table.

What’s the advantage of this over a simple SELECT INTO statement?

I tested both to discover. The images below illustrate how the clone table executes in only 3 seconds and the SELECT INTO takes 32 seconds for the same task.

A screenshot of a computer Description automatically generated

A screenshot of a computer Description automatically generated

However, this feature is in a very early stage. As a result, it shows some strange behaviours:

  • The cloned table doesn’t appear in the one lake explorer
  • A UI feature is promised and documented but not available yet
  • The cloned table doesn’t appear in the list of tables. A refresh results in an error. We can only access it from a query.

A screenshot of a computer program Description automatically generated

How does the clone happen in only 3 seconds?

Without the access to the files in the onelake, it’s difficult to say. My bet would be something related to the difference between copy files or copy records.

References:

https://learn.microsoft.com/en-us/fabric/data-warehouse/tutorial-clone-table-portal

https://learn.microsoft.com/en-us/sql/t-sql/statements/create-table-as-clone-of-transact-sql?view=fabric&preserve-view=true

sp_rename for tables

This is an inheritance from Azure Synapse, which never supported sp_rename. Problem solved. A small step for the technology, a big step for the data engineer.

Reference:

https://blog.fabric.microsoft.com/en-GB/blog/announcing-fabric-warehouse-support-for-sp_rename/

Data Warehouse Dynamic Data Mask

Another important addition is starting to appear on Synapse Data Warehouse, but it’s still missing some details.

Dynamic Data Mask is a very interesting feature to help protect sensitive data. However, in Synapse there are still some features missing:

Dynamic Data Mask with Alter Table is not working very well

We need to provide the Unmask permission to users and roles. I wrote an article when this feature was finally made available in SQL Server and Azure SQL. However, it’s not available in Synapse yet.

Reference:

https://learn.microsoft.com/en-us/fabric/data-warehouse/dynamic-data-masking

New CSV Parser for the Data Warehouse

I wrote before about the differences between the CSV Parser 1.0 and 2.0 in Synapse Analytics.

Checking the information from the old article, you may notice this is not a small advance, it’s a big deal.

Reference:

https://blog.fabric.microsoft.com/en-GB/blog/announcing-improvements-to-csv-data-ingestion-in-fabric-warehouse/

Save As View/Table available for Visual Queries

The queries feature, either SQL or Visual, are a great data exploration tool. It allows our users to explore the data before building any report, to discover exactly what’s hidden in the data before building reports.

The possibility to create tables and views also on the Visual Queries, as we already had on SQL ones, completes the data exploration scenario.

A screenshot of a computer Description automatically generated

The bad news:

An old bug on Visual Query continues to be alive and kicking, making this feature not so useful. On the image above, there is a join between fact_sale table and dimension_city table. However, either the View SQL button or the Save as view button only show the SQL from fact_table, not the entire query with the join.

The image below shows the Save as view option, unfortunately without the JOIN it should have.

A screenshot of a computer Description automatically generated

Reference:

https://learn.microsoft.com/en-us/fabric/data-warehouse/visual-query-editor#save-as-view

Row Level Security

Row Level Security is one of the biggest differences between Data Warehouse and Lakehouse. Only a Data Warehouse has this feature, making the Data Warehouse a great candidate to be the gold layer in a medallion architecture, for example.

This feature has the same working method as in SQL Server: We create a function to filter the security by user and create a security policy to link the function to a table.

The example on the first reference below uses the USER_NAME() function. However, when an application connects to a database, we may be using an application identity, instead of a user identity. This could break the row level security.

The application has the capability to set a session context value with the name of the real user accessing the data. The security function would check the session context and ensure the row level security.

This architecture comes from SQL Server and it’s completely available in Microsoft Fabric Data Warehouse. It’s explained on the 2nd reference below.

Reference:

https://learn.microsoft.com/en-us/fabric/data-warehouse/row-level-security

https://learn.microsoft.com/en-us/azure/data-api-builder/azure-sql-session-context-rls

Summary

Microsoft Fabric has not been released yet. Documentations and announcements appear before all the feature details are available. However, this information shows us the direction things are evolving and what we should be looking for.

 

The post Fabric Data Warehouse News you need to know appeared first on Simple Talk.



from Simple Talk https://ift.tt/5J0YrTP
via

No comments:

Post a Comment