Sunday, August 28, 2016

SSIS Balanced Data Distributor transform

The Microsoft Download Center has a new download that should interest many SSIS professionals. It's a new transform named the Balanced Data Distributor, which takes a single input and distributes the incoming rows uniformly to one or more outputs via multithreading. Here is the description of the transform from the download page:

"Microsoft® SSIS Balanced Data Distributor (BDD) is a new SSIS transform. This transform takes a single input and distributes the incoming rows to one or more outputs uniformly via multithreading. The transform takes one pipeline buffer worth of rows at a time and moves it to the next output in a round robin fashion. It’s balanced and synchronous so if one of the downstream transforms or destinations is slower than the others, the rest of the pipeline will stall so this transform works best if all of the outputs have identical transforms and destinations. The intention of BDD is to improve performance via multi-threading. Several characteristics of the scenarios BDD applies to: 1) the destinations would be uniform, or at least be of the same type. 2) the input is faster than the output, for example, reading from flat file to OleDB. "
Judging by the results of the test performed on this transform, it seems to perform a little better than other transforms such as Script and Conditional Split. But keep in mind that this transform simply passes data buffer by buffer to its outputs, whereas the other two check the data against the specified logic and then divide it. It's a nice transform to have in your existing SSIS toolbelt.
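
To make the "buffer by buffer, round robin" behaviour from the description concrete, here is a minimal single-threaded sketch in Python. It is only a model of the idea, not the actual SSIS component: the function names `make_buffers` and `distribute_round_robin` and the tiny buffer size are my own assumptions for illustration, and the real transform hands buffers to outputs running on separate threads.

```python
from itertools import islice


def make_buffers(rows, buffer_size):
    """Group an iterable of rows into lists of at most buffer_size rows."""
    it = iter(rows)
    while True:
        buf = list(islice(it, buffer_size))
        if not buf:
            return
        yield buf


def distribute_round_robin(buffers, output_count):
    """Hand each whole buffer to the next output in turn.

    Rows are never inspected; only the buffer's position in the
    stream decides which output receives it.
    """
    outputs = [[] for _ in range(output_count)]
    for i, buf in enumerate(buffers):
        outputs[i % output_count].append(buf)
    return outputs


# Hypothetical input: 10 rows, 2 rows per buffer, 3 outputs.
outputs = distribute_round_robin(make_buffers(range(10), buffer_size=2), output_count=3)
for n, out in enumerate(outputs):
    print(f"output {n}: {out}")
```

With five buffers and three outputs, the first two outputs receive two buffers each and the third receives one, which is as uniform as a round robin can get for that input.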

The best use of this transform that I can think of is when your input is extremely fast and you have replicated blocks of logic to keep pace with the incoming data flow. In such a case BDD acts as a distributor bridge, pumping data to all of the logic pipelines. But if the data has to be split conditionally, Conditional Split remains the option. The speed of distribution achieved by this transform seems to come from two things: 1) it does not distribute data conditionally, it just passes buffer after buffer, and 2) its multithreaded architecture.
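
The first of those two reasons can be illustrated with a rough sketch that counts the work each approach does. This is a toy model under my own assumptions (the function names, the sample predicates, and the idea of counting evaluations are all hypothetical), not a benchmark and not how either SSIS component is implemented. It only shows that a conditional split has to look at every row, while a buffer-level distributor does one hand-off per buffer.

```python
def conditional_split(buffers, predicates):
    """Send each row to the first output whose predicate matches.

    Returns the outputs and the number of predicate evaluations performed.
    """
    outputs = [[] for _ in predicates]
    evaluations = 0
    for buf in buffers:
        for row in buf:
            for idx, pred in enumerate(predicates):
                evaluations += 1
                if pred(row):
                    outputs[idx].append(row)
                    break
    return outputs, evaluations


def buffer_pass(buffers, output_count):
    """Hand whole buffers to outputs in turn without inspecting any row.

    Returns the outputs and the number of buffer hand-offs performed.
    """
    outputs = [[] for _ in range(output_count)]
    handoffs = 0
    for i, buf in enumerate(buffers):
        outputs[i % output_count].extend(buf)
        handoffs += 1
    return outputs, handoffs


# Hypothetical data: two buffers of four rows each.
buffers = [[1, 2, 3, 4], [5, 6, 7, 8]]

# Even rows go to output 0; everything else falls through to output 1.
split_out, evaluations = conditional_split(
    buffers, [lambda r: r % 2 == 0, lambda r: True]
)
pass_out, handoffs = buffer_pass(buffers, output_count=2)

print("conditional split:", split_out, "predicate evaluations:", evaluations)
print("buffer pass:      ", pass_out, "buffer hand-offs:", handoffs)
```

The trade-off is visible in the outputs: the conditional split decides where each row belongs, while the buffer pass only guarantees an even spread, which is why it suits identical downstream pipelines.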

Some questions that come to my mind are:

1) Denali is already in CTP mode and more CTPs are expected. So why has this transform been released separately at this time, and so silently?

2) Will this transform ship as a regular transform with Denali, or will it remain a mysterious, separately downloaded transform?

3) This transform is available only for SSIS 2008 and SSIS 2008 R2, not SSIS 2005. Why? Actually, it's SSIS 2005 that needs more help from transforms like this, where it would add value for customers who have already invested in SSIS 2005!

Let's look forward to the next CTP of Denali to check out whether this transform gets a seat in SSIS Denali.
