The series so far:
- Creating Calculated Columns Using DAX
- Creating Measures Using DAX
- Using the DAX Calculate and Values Functions
- Using the FILTER Function in DAX
The FILTER
function works quite differently than the CALCULATE
function explained in the previous article. It returns a table of the filtered rows, and sometimes it is the better approach to take.
I’ll spend most of this article explaining how to create the following measures:
The columns above show, respectively:
- The city name;
- Total sales for the city in question (the filter context);
- Total sales for the city in question for 2018;
- Total sales for the city in question for the USA;
- Total sales for the city in question for the USA for 2018; and
- The percentage each city’s sales contributes to the total.
First, I’ll show you how to set up the example. Then I’ll dive into the syntax of the FILTER
function. I’ll finish by highlighting the differences between the FILTER
and CALCULATE
functions.
A Quick Refresher
To work through the examples in this article, you’ll need to create a simple Power BI report containing a single table and then create and show a series of measures. Here’s a quick run-through of how to get started.
First, create a Power BI report based on the tables used in the previous articles. You can load them either from the SQL Server database given or the Excel workbook. You should now have something like this (if your diagram looks a bit different, you may not have updated your instance of Power BI to include the March 2019 update, which included the new Model View):
Now create a table in Report view to list out the city names. Make sure the Table visualization is selected and click CityName in the Fields list.
Switch to the Home ribbon and select Enter Data. This will add a new table to your report to contain your measures. To understand why you might want to do this, see this previous article in this series:
Give this table a name. Here I’ve called mine All measures:
Click Load. Now add the following measure to your All measures table. You can right-click on the table and choose New measure to do this:
Sales = SUMX( // multiply the price of each transaction by // its quantity, and sum the result Sales, [Price]*[Quantity] )
Choose to display this measure in your table:
You should now be able to see the total value of sales for each city:
What happens if you want to show the sales for American cities only? Or sales taking place in 2018 only?
The FILTER Function
The measure for the sales column shown above, giving total sales for each city, is as follows:
Sales = SUMX( Sales, [Price]*[Quantity] )
What this does (as readers of this series of articles will know) is to iterate down the rows in the Sales table, calculating the price multiplied by the quantity for each and summing the result for each city to get this:
To get the sales in 2008, you could use a CALCULATE
function so that this measure would work:
2018 sales using CALCULATE = CALCULATE( SUMX( Sales, [Price]*[Quantity] ), YEAR(Sales[SalesDate]) = 2018 )
This takes the filter context for each city, and further reduces it to consider only those rows where the sales occurred in 2018 to get this:
Another way to solve the problem, however, is to treat the sales for the current filter context as a table and filter it accordingly. Consider the example of sales for New York. Here’s the underlying data for this city:
The total figure for sales for New York for 2018 is 25.98 (18.98 + 7.00). One way to get to this would be to follow these steps when compiling the data for New York. Firstly, get the data for the filter context:
Secondly, filter this data to include only those sales for 2018, by iterating down each row deciding whether to include it. This will include only the shaded area below:
This leaves this table, which is the one whose sales Power BI will sum:
Here’s the formula to accomplish this:
2018 sales = SUMX( FILTER( Sales, YEAR(Sales[SalesDate])=2018 ), [Price]*[Quantity] )
This will give exactly the same results as the formula using CALCULATE
above. The CALCULATE
function will run more quickly because it doesn’t have to iterate down each row in the table testing a condition. At this point, you may be asking yourself what the point of the FILTER
function is. I’ll return to this later in this article.
Linking tables within the Filter Function
Take a look at how to show total sales for the USA for each city. The Sales, City and Country tables are related as follows:
What’s needed is to iterate down the rows in the sales table, calculating the sales (price times quantity) for each but only where the country name is USA. Here’s the formula to do this:
American sales = SUMX( FILTER( Sales, RELATED(Country[CountryName])="USA" ), [Price]*[Quantity] )
The expression gives these results:
The question is – why use the RELATED
function when the DAX formulae using filter context automatically link tables together? The answer is that within this formula, row context, not filter context, is used. The shaded lines in the formula below iterate over each row in the Sales table returned for the filter context, creating a row context for each:
Because within the shaded bit of the formula DAX has to create a row context for each row in the sales table, it then has to use the RELATED
function to bring in the country name from the Country table.
Combining Filters Without Nesting
It’s now time to look at how to combine criteria: how to show sales which happened in 2018 and which took place in the USA. I’ll show in a bit how to do this by nesting one FILTER
function within another, but for now, I’ll show ways to combine criteria. There are two basic ways to do this in DAX – either by using &&
or the AND
function (or if either of two conditions can be true, using ||
or the OR
function).
Here’s a version of the measure using the AND
function:
2018 American sales = SUMX( FILTER( Sales, AND( RELATED(Country[CountryName])="USA", YEAR(Sales[SalesDate])=2018 ) ), [Price]*[Quantity] )
Here’s the same measure, but using the &&
symbols:
2018 American sales using && = SUMX( FILTER( Sales, RELATED(Country[CountryName])="USA" && YEAR(Sales[SalesDate])=2018 ), [Price]*[Quantity] )
Personally, I’d use the AND
(or OR
) functions any time, as they work in the same way as their Excel counterparts, and it’s easier to indent and comment formulae. However, you should use whichever floats your particular boat.
Even more sales have dropped from the figures:
Combining Conditions by Nesting Functions
The other way to solve this would have been to nest one function within another:
2018 American sales using nesting = SUMX( FILTER( FILTER( Sales, RELATED(Country[CountryName])="USA" ), YEAR(Sales[SalesDate])=2018 ), [Price]*[Quantity] )
Consider what this does for the New York row in the table:
Filter context restricts the data to sales for the current city in question.
The inner FILTER
function iterates over each row in the table of data for the filter context, picking out only rows where the country is in the USA.
The outer FILTER
function then iterates over each row in the table of sales for the USA for the filter context and applies a further constraint that the sales year must be 2018.
Depending on your data, nesting FILTER
functions could speed up processing. If the vast majority of sales were outside the USA, the inner condition could eliminate nearly all rows for each city in the filter context, with the result that Power BI would only need to test the sales date for the few remaining rows.
Using ALL to Remove Any Filters
Every example shown so far has taken the set of rows for the current filter context and applied additional constraints to pick out only certain rows. However, you can use the ALL
function when filtering to work with the entire table, rather than just the data for the current filter context. You could use this, for example, to show the percentage contribution of each city’s sales to the grand total:
Here’s the formula for the above measure:
% of all sales = DIVIDE( // calculate the total sales for the current filter context SUMX( Sales, [Price]*[Quantity] ), // divide this by total sales for all cities SUMX( ALL(Sales), [Price]*[Quantity] ) )
Incidentally, if you’re wondering how to get the nice percentage format, just select the measure you’ve created:
You can then set the formatting in the Modeling tab on the menu:
The Difference Between CALCULATE and FILTER
To see the difference between the way in which CALCULATE
and FILTER
filter data, consider this example:
The first measure applies the filter context (so it only calculates sales for the city in question), and applies an additional constraint that the city should be New York:
New York FILTERED = CALCULATE( // work out total sales // for the filter context SUMX( Sales, [Price]*[Quantity] ), // but whittle the filter context // down to show only those cities // within it called New York FILTER( City, City[CityName]="New York" ) )
The second measure replaces the filter context with a new constraint that the city should be New York, which results in the same figure appearing in every row:
New York CALCULATED = CALCULATE( // work out total sales // for the filter context SUMX( Sales, [Price]*[Quantity] ), // changing the city criteria // so it is New York, not // whatever the filter context // originally said City[CityName]="New York" )
Debugging Using the FILTER Function (Method 1)
To make debugging easier, first add a couple of calculated columns to the Sales table, to give the city name and sales year. The formulae are shown below:
Here’s the formula for the City column. It just looks up the name of each city in which sales took place:
Here’s the formula for the Sales year column. It gives the year in which each sale took place:
These two columns will make it easier to check what’s going on when debugging.
The FILTER
function creates virtual tables which, under normal circumstances, you never see, but you can use a tool like DAX Studio to show the rows these virtual tables contain. I’ve covered how to download and use DAX Studio in a previous article in this series, but here’s a quick refresher. When you run DAX Studio, choose to connect to an open Power BI report:
Type in a DAX query in the top right-hand window and press the F5 key to run this. The results will appear beneath it. For the example below, I’m just listing out the contents of the Sales table:
Incidentally, if you’re wondering what those long date table names are, you’re not the only one. I presume they are created behind the scenes to provide the built-in date hierarchy included in the March 2019 update of Power BI.
You can evaluate any table, including one which is returned from a filter function. A good thing to ask might be: which sales were in the United States? You can do this by copying this part of the measure you created earlier:
Precede this with the word EVALUATE
in DAX Studio, and you’ll get this:
Run this to get the following output:
That’s looking good, so now you can repeat this technique with the outer bit of the FILTER
function:
This gives only 3 rows:
From this, it’s easy to see why you get the figures for this measure.
Debugging Using the FILTER Function (Method 2)
Another way to debug a DAX formula using the FILTER
function (or any other DAX formula, for that matter) is to use variables. I’ve already covered this for scalar variables (ones holding a single value) in the previous article in this series on measures, but did you know a variable can hold an entire table?
Here’s another way to write the nested FILTER
function:
US sales in 2018 = // create a variable to hold the sales in the USA VAR UsaSalesTable = FILTER( Sales, RELATED(Country[CountryName])="USA" ) // create another variable to filter this to show // only sales in 2008 VAR UsaSales2018 = FILTER( UsaSalesTable, YEAR(Sales[SalesDate])=2018 ) // finally, calculates sales for these figures RETURN SUMX( UsaSales2018, [Price]*[Quantity] )
The advantage of breaking the complicated formula down into different parts is that you could then test each in isolation.
Why Would You Use the FILTER Function?
I promised I would return to this question: why would you use the FILTER
function when the CALCULATE
function seems to offer a better alternative? There are at least four advantages:
I’ve already shown that it’s easier to debug DAX expressions that use the FILTER
function.
I think expressions using the FILTER
function are easier to understand than equivalent expressions just using CALCULATE
.
Learning the FILTER
function will help you to understand the EARLIER
function, which will be the subject of the next article in this series.
There are some problems which the CALCULATE
function won’t solve (an example follows).
To illustrate the last point, suppose that you want to create a measure showing total sales for cities having two or more purchases. Here are the figures that this should return:
There are no sales recorded for Chicago, LA and Rio in the new measure because they each only witnessed a single sale.
Assume in all of the following that [Number of purchases]
is a measure with this formula:
Number of purchases = COUNTROWS(Sales)
Here’s a measure which you could use to try to solve this problem (although it won’t work):
Sales for multiple purchases = CALCULATE( // calculate total sales where ... SUMX( Sales, [Price]*[Quantity] ), // ... the number of purchases is more than 1 [Number of purchases] > 1 )
If you type in this measure, you’ll see the following error message:
This isn’t a brilliant description of the problem, which is that you can’t use a measure in the filtering part of a CALCULATE
function; you can only refer to columns. You can, however, solve this problem by rewriting it to incorporate a FILTER
function:
Sales for multiple purchases = CALCULATE( // calculate total sales but ... SUMX( Sales, [Price]*[Quantity] ), // only where the number of // purchases is more than 1 FILTER( City, [Number of purchases] > 1 ) )
This will calculate total sales, but only for those cities where the number of purchases was more than 1.
Summary
The FILTER function in DAX allows you to iterate down the rows of any table, creating a row context for each and testing whether the row should be included in your calculation. You can combine filters using keywords like AND and OR and also nest one filter within another. The FILTER function allows you to perform some tasks which the CALCULATE function can’t reach, and also (in my opinion) lets you create formulae which are easier to understand.
The post Using the FILTER Function in DAX appeared first on Simple Talk.
from Simple Talk http://bit.ly/2V5gknB
via
No comments:
Post a Comment