Monday, October 24, 2011

Display page numbers in SSRS Reports

To display page numbers in an SSRS report:

  • Put a text box in the report footer.
  • Drag the PageNumber built-in field from the Report Data pane into the text box.
  • Then drag and drop the TotalPages built-in field from the same pane.
  • Alternatively, we can write an expression for this:

="page" & globals ! pagenumber & "of" & globals ! totalpages

SQL Server Real-Time Question

Can we call a trigger from within a stored procedure?

Performance Tuning Tip 1

Here I will post one SSIS performance tip each day.


TODAY'S TIP:
Avoid SELECT *:
The Data Flow Task (DFT) of SSIS uses a buffer-oriented architecture for data transfer and transformations. When data travels from the source to the destination, it first comes into a buffer, the required transformations are done in the buffer itself, and the data is then written to the destination.
The size of the buffer depends on several factors; one of them is the estimated row size, which is determined by summing the maximum size of all the columns in the row. So more columns in a row means fewer rows fit in a buffer, more buffers are required, and the result is performance degradation. Hence it is recommended to select only those columns which are required at the destination.
Even if you need all the columns from the source, you should name the columns explicitly in the SELECT statement; otherwise the source makes another round trip to gather metadata about the columns when you use SELECT *.

TIP: Try to fit as many rows into the buffer as possible; this reduces the number of buffers passing through the pipeline engine and improves performance.
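As a rough sketch (dbo.Customers and its columns are hypothetical names), prefer an explicit column list in the OLE DB Source query over SELECT *:

-- Avoid: pulls every column and costs an extra metadata round trip
-- SELECT * FROM dbo.Customers;

-- Prefer: only the columns the destination actually needs
SELECT CustomerID,
       FirstName,
       LastName,
       Email
FROM dbo.Customers;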


Go to SSIS performance tuning tip 2.

Thursday, October 20, 2011

Command Line Tools In SSIS

What are the command-line tools used to execute SQL Server Integration Services packages?

Post your answers as a comment.

Real-time scenarios in SSIS

The scenarios below are common in SSIS.


Scenario 1: Loop over a list of files & load each one

Tasks Required: Foreach Loop, Data Flow Task

Solution:
Configure the Foreach Loop to loop over the files in a particular directory. The loop should be configured to output the current file name
to a given variable. Map that variable to a connection manager by using expressions.

Scenario 2: Conditionally execute tasks

Solution:
Double-click the precedence constraint and set the Evaluation property to Expression and Constraint.
Type the condition that you want to evaluate in the Expression box.
Scenario 3: Pass in variables when scheduling or running a package

Solution:
Use the /SET option on the DTExec command line, or the Set Values page in the Package Execution Utility (DTExecUI), with a property path like:
\Package.Variables[User::VariableName].Properties[Value]
Scenario 4: Move and rename a file at the same time

Tasks Required:
File System Task

Solution:
Set the File System Task to rename the file and point it to the directory you'd like to move the file to. This enables you to rename and move the file in a single step.

Scenario 5: Loop over a set of rows in a table & perform a group of tasks for each row
Tasks Required: Execute SQL Task, Foreach Loop
Solution:
Use an Execute SQL Task to load the result set and send the data into an object variable. Loop over the variable in a Foreach Loop by using the ADO Enumerator.
Scenario 6: Perform an incremental load of data

Tasks Required: 2 Execute SQL Tasks, Data Flow Task

Solution:
Have the first Execute SQL Task retrieve, from a control table, the date when the target table was last loaded, and place that date into a variable.
In the Data Flow Task, use the variable to build a date range in your source query. Then update the control table using a second Execute SQL Task to record when the table was
last updated. A sketch of the queries follows.
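A minimal sketch of the three queries involved (dbo.LoadControl, dbo.Sales and their columns are hypothetical names):

-- Execute SQL Task 1: read the last load date into an SSIS variable
SELECT LastLoadDate
FROM dbo.LoadControl
WHERE TableName = 'Sales';

-- Data Flow source query: the ? placeholder is mapped to that variable
SELECT SaleID, SaleDate, Amount
FROM dbo.Sales
WHERE SaleDate > ?;

-- Execute SQL Task 2: record the new high-water mark
UPDATE dbo.LoadControl
SET LastLoadDate = GETDATE()
WHERE TableName = 'Sales';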
Scenario 7: Perform a conditional update & insert
Components Required: Data Flow Task, Conditional Split, Lookup Transform or Merge Join, OLE DB Command Transform

Solution:
Use the Lookup Transform or Merge Join to determine whether the row exists in the destination, and ignore a failed match. If the row yields a blank
key, then you know the row should be inserted into the target (via the Conditional Split). Otherwise, the row is a duplicate or an update. Determine whether the row
is an update by comparing the source value to the target value in the Conditional Split. The update can be done by an OLE DB Command Transform or by loading the data
into a staging table.

SQL SERVER Interview Questions

1. What is DTS in SQL Server?
Data Transformation Services (DTS) is used to transfer data from one source to a required destination. For example, if I have some data in SQL Server and need to transfer it to an Excel destination, that is easily possible with this dialog-based tool. More customization can be achieved using SSIS, a specialized tool used for such migration work.

2. What is the significance of a NULL value and why should we avoid permitting null values?
NULL means no entry has been made. It implies that the value is either unknown or undefined. We should avoid permitting NULL values because a column with NULL values can't have a PRIMARY KEY constraint, and certain calculations can be inaccurate if NULL columns are involved.
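For example, NULL propagates through expressions and is skipped by most aggregates (dbo.Employees here is a hypothetical table):

SELECT 10 + NULL;                        -- returns NULL, not 10
SELECT AVG(Salary) FROM dbo.Employees;   -- rows with a NULL Salary are ignored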

3. Difference between a primary key and a unique key?
Both constraints share a common property called uniqueness: the data in the column must be unique. The basic differences are:
· A primary key won't allow NULL values, whereas a unique key will accept NULL, but only one NULL value.
· Creating a primary key physically orders the data by creating a clustered index on the table by default; this characteristic is not associated with a unique key.
· Only one primary key can be created for a table. Any number of unique keys can be created for the table.
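A minimal sketch of both constraints (the table and columns are hypothetical):

CREATE TABLE dbo.Employees
(
    EmployeeID INT NOT NULL PRIMARY KEY,   -- no NULLs allowed; clustered index by default
    Email      VARCHAR(100) NULL UNIQUE    -- allows exactly one NULL; non-clustered by default
);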

4. What is a stored procedure?
Stored procedures are precompiled T-SQL statements combined to perform a single task or several tasks. A stored procedure is basically like a macro: when you invoke it, you actually run a set of statements. Because the statements are precompiled, execution of a stored procedure is comparatively fast when compared to running the same statements as ad-hoc T-SQL.
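A minimal sketch of creating and calling one (the names are hypothetical):

CREATE PROCEDURE dbo.GetEmployeesByDept
    @DeptID INT
AS
BEGIN
    SELECT EmployeeID, FirstName, LastName
    FROM dbo.Employees
    WHERE DeptID = @DeptID;
END;
GO

EXEC dbo.GetEmployeesByDept @DeptID = 10;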

5. What is the difference between the UNION ALL statement and UNION?
The main difference is that UNION ALL is much faster than UNION. The reason is that UNION ALL does not look for duplicate rows, while UNION removes duplicate rows, whether or not they exist.
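For example, with two identical single-row result sets:

SELECT 1 AS Val UNION     SELECT 1;   -- one row: duplicates removed
SELECT 1 AS Val UNION ALL SELECT 1;   -- two rows: duplicates kept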

6. Examples of stored procedures?
There are three kinds of stored procedures:
1. System stored procedures – start with sp_.
2. User-defined stored procedures – SPs created by the user.
3. Extended stored procedures – SPs used to invoke a process in an external system.
Examples of system stored procedures:
sp_helpdb – databases and their properties.
sp_who2 – gives details about the users currently connected to your server.
sp_renamedb – enables you to rename a database.

7. What is a trigger?
Triggers are precompiled statements similar to stored procedures. A trigger is invoked automatically for a particular operation (INSERT, UPDATE or DELETE) on its table. Triggers are basically used to implement business rules.
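A minimal sketch of an AFTER INSERT trigger (the table names are hypothetical):

CREATE TRIGGER trg_Employees_Insert
ON dbo.Employees
AFTER INSERT
AS
BEGIN
    -- Audit every newly inserted employee row into a log table
    INSERT INTO dbo.EmployeeAudit (EmployeeID, AuditDate)
    SELECT EmployeeID, GETDATE()
    FROM inserted;
END;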

8. What is a view?
If we have several tables in a database and we want to view only specific columns from specific tables, we can go for views. Views also help meet security needs, at times allowing specific users to see only specific columns based on the permissions we configure on the view. Views also reduce the effort required to write queries that access those specific columns every time.
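A minimal sketch (the underlying table and columns are hypothetical):

CREATE VIEW dbo.vw_EmployeeContacts
AS
SELECT EmployeeID, FirstName, LastName, Email   -- sensitive columns such as salary are excluded
FROM dbo.Employees;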

9. What is an index?
When queries are run against a database, an index basically controls how the data is sorted and looked up, so the query is processed faster and data retrieval is much quicker.

10. What are the types of indexes available in SQL Server?
There are basically two types of indexes that we use with SQL Server (an example of creating each follows the list):
Clustered index
1. It formats the entire table; in turn, it physically sorts the table.
2. Only one clustered index can be created per table.
3. The data itself is located at the leaf level.
4. By default, a primary key creates a clustered index on the table.
Non-clustered index
1. It does not touch the structure of the table.
2. It forms an index structure that holds references to the actual data.
3. A reference to the data is located at the leaf level.
4. Up to 249 non-clustered indexes can be created per table.
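A minimal sketch of creating each kind (the table and columns are hypothetical):

-- Clustered index: physically orders the table by EmployeeID
CREATE CLUSTERED INDEX IX_Employees_EmployeeID
    ON dbo.Employees (EmployeeID);

-- Non-clustered index: a separate structure that points back to the rows
CREATE NONCLUSTERED INDEX IX_Employees_LastName
    ON dbo.Employees (LastName);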

11. Extent vs Page?
Pages are the lowest-level unit for storing the actual data in SQL Server. The data is physically stored in the mdf, ndf and ldf files; in turn, pages are the logical units inside them. The size of a page is 8 KB.
Eight consecutive pages form an extent: 8 * 8 KB = 64 KB.
Thus I/O operations happen at the page level. Each page holds template information at its start (the page header):
1. the page number,
2. the page type,
3. the amount of free space on the page,
4. the allocation unit ID of the object that owns the page.
Extents are classified into two types:
1. Uniform extents
2. Mixed extents
Uniform extents: occupied or used by a single object; in turn, a single object holds all 8 pages.
Mixed extents: multiple objects use the same extent. SQL Server allows a maximum of eight objects to share a mixed extent.
A property of SQL Server: initially, when an object is created, SQL Server allocates it to a mixed extent, and once its size reaches 8 pages or more, a new uniform extent is allocated for that particular object.
Here come our fragmentation and reindexing concepts.

Some limits of SQL Server:
Maximum number of tables : No restriction (you can create tables until you run out of disk space)
Maximum bytes per row in the table : 8060 bytes
Maximum tables in a select statement - 256
Maximum references per table - 253
Maximum rows in the table - Unlimited (Depends on the storage)

Maximum columns per base table : 1024
Maximum columns per insert statement : 1024
Maximum columns per select statement : 1024 * 4 = 4096
Bytes per Group by or Order by = 8000

Index:
Maximum clustered index per table - 1
Maximum Non-clustered index per table - 249
Maximum Columns in an index (composite index) - 16

Keys:
Maximum Columns in a Primary key - 16
Maximum Columns in a Foreign key - 16

Objects:
Nested Stored procedure levels- 32
Nested triggers levels- 32
Parameters per stored procedure - 2100
Parameters per User defined functions - 2100
Maximum identifier length - 128

Database Level Details:
Number of instances for a server - 50
Files per database - 32767
Filegroups per database - 32767
Databases per instance of SQL Server - 32767

12. MSDE vs SQL Server:
MSDE is the downgraded version of SQL Server; the same engine, shipped as a limited desktop edition, was named MSDE (Microsoft SQL Server Desktop Engine).

13. Extent vs Page:
Pages are the smallest unit of storage, 8 KB in size. In turn, 8 pages form an extent.

14. DELETE vs TRUNCATE:
DELETE removes rows from the table one by one. The syntax is:
delete from tablename
We can give WHERE conditions to delete specific data.
Each delete is logged in the log file.
It is a DML statement.

TRUNCATE deallocates the pages that hold the data instead of deleting the data row by row.
It is a DDL statement. The syntax is:
Truncate table tablename
TRUNCATE cannot be used on tables that are referenced by foreign key constraints (because it will not perform any cascading deletes).
It is minimally logged: only the page deallocations are recorded in the log file.
It removes the data in one stretch and does not do any row-by-row manipulation.
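A quick demonstration sketch using a temporary table:

CREATE TABLE #Demo (ID INT);
INSERT INTO #Demo VALUES (1), (2), (3);

DELETE FROM #Demo WHERE ID = 1;   -- row-by-row, fully logged, WHERE allowed
TRUNCATE TABLE #Demo;             -- deallocates all pages, no WHERE clause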

15. What do you mean by the ACID properties?
Atomicity : a transaction is all or nothing; either every change in it is applied or none is.
Consistency : the data in the database should be consistent. If we are maintaining data in two different places, that data should match; if it differs, it creates confusion.
Isolation : a transaction should be isolated and should not affect other transactions. If transactions operate on the same data, appropriate locks should be taken before the transactions start.
Durability : once committed, the data should remain stable.


Wednesday, October 19, 2011

SSRS INTERVIEW QUESTIONS

First, I am providing basic SSRS interview questions.

1) What are the different kinds of SSRS Reports?
  • Reports can be categorized into operational and analytical reports. The distinction is based on the source of the data and the level of analysis a particular report facilitates. Operational reports are based on OLTP sources and are static, while analytical reports are based on OLAP sources and generally facilitate drill-down and drill-through for analysis. Technically, SSRS reports can be categorized into parameterized, linked, snapshot, cached, etc.
2) What are parameterized reports? What are cascading parameters in SSRS reports?
  • Reports that accept parameters from users to fetch and report data conditionally are known as parameterized reports. When a report has multiple parameters and the values of dependent parameters are populated dynamically based on the value of parent parameters, these are known as cascading parameters (a sketch of the dataset queries follows).
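As a rough sketch of cascading parameters, two dataset queries along these lines could back a parent @Country parameter and a dependent @City parameter (the table and columns are hypothetical):

-- Dataset for the parent parameter @Country
SELECT DISTINCT Country FROM dbo.Customers;

-- Dataset for the child parameter @City, filtered by the parent's current value
SELECT DISTINCT City FROM dbo.Customers WHERE Country = @Country;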
3) How would you go about developing a SSRS report?
  • The general development methodology for an SSRS report is to start by creating a data source. Based on the data source, create one or more datasets as needed for the parameters and the body of the report. Add the required controls from the toolbox, which act as containers for the fields in the dataset. Format the controls added to the report body. Verify and validate the report, and finally deploy it.
4) What is a dataset and what are the different types of datasets?
  • A dataset is similar to a query definition, which is executed when the report is executed. Datasets are of two types: Shared and Embedded. An embedded dataset is private to the report in which it exists and shared datasets can be shared across reports.
5) Would you store your query in a SSRS report or a Database server? State the reason why.
  • Storing SQL queries directly in text format in the dataset should be avoided. Ideally the query should be stored in a stored procedure in the database server. The benefit is that the SQL is in a compiled form in an SP and brings all the benefits of using an SP compared to using an ad-hoc query from the report.
6) What is Tablix?
  • A Tablix can be seen as a control with combined capabilities of a table and a matrix, which facilitates asymmetric and flexible, row and column level, static and dynamic groupings.
7) How would you access SSRS reports deployed on report server?
  • Report Manager is the most straightforward way to access SSRS reports deployed on a report server. Apart from executing the reports, it is the main administration console for the SSRS server to manage reports.
8) Have you used the Report Viewer control / web part? What are the limitations?
  • The Report Viewer control / web part is the basic control for integrating SSRS reports with external applications. Using these interfaces, applications can link to and display SSRS reports within the application. The parameters toolbar in these controls has a limited programmable interface to decorate it and blend it with the theme of the application.
9) Which is the latest version of SSRS and what are the new enhancements?
  • SSRS 2008 R2 is the latest RTM version of SSRS. Rich data visualizations, better programming functions for lookup and aggregation, improved performance and better control over exported data are some of the major enhancements.
10) What is Report Builder?
  • Report Builder is an ad-hoc report authoring tool primarily targeted to be used by business analysts to facilitate self-service report authoring. Report Builder 3.0 is the latest version available as of date.
11) How would you deploy SSRS Reports using out-of-box functionality and how can you automate SSRS report deployment?
  • Business Intelligence Development Studio is generally used to deploy SSRS reports. There is no out-of-box support in SSRS to automate reports deployment, but free third-party products like RSScripter can be used for this.
12) What is drill-down and drill-through in SSRS?
  • Drill-down is a mechanism of decomposing summarized information to a detailed level. Drill-through is a mechanism of decomposing the problem by navigating from one report to more detailed information, generally using more than one report.




How to add NULL defense for dates in SSIS packages

In order to create a test bed for our NULL Defense series, create a new database (name it something like TestDB) which we will use for simulating problems and developing solutions.

Create a new table called NULLDefense and populate it with the data shown in the picture below (a sketch follows). To keep the focus on the problem, we will keep the structure simple, but inclined towards the problem for which we will create our NULL defense.
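The original picture is not available here; as a minimal sketch, a table and sample data along these lines (the Name column and the exact values are assumptions) reproduce the scenario of one valid date, one NULL date and one blank string in a datetime column:

CREATE TABLE dbo.NULLDefense
(
    ID   INT IDENTITY(1,1),
    Name VARCHAR(50),
    DOB  DATETIME NULL
);

INSERT INTO dbo.NULLDefense (Name, DOB) VALUES ('Alice', '1980-05-12');
INSERT INTO dbo.NULLDefense (Name, DOB) VALUES ('Bob', NULL);
INSERT INTO dbo.NULLDefense (Name, DOB) VALUES ('Carol', '');   -- '' is stored as 1900-01-01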



Now follow the steps below to create the SSIS package which will simulate the problem in question.


  • Create a new SSIS Project
  • Add a new package to it and name it "NULLDefense".
  • Add a new Data Flow Task to the package.
  • Add an OLE DB Source and name it "NULLDefense - TestDB".
  • Edit the OLE DB Source and connect it to the table we just created.
  • Add a Conditional Split transformation and name it "Split valid and invalid rec". The expected use of this transformation is to split records which contain dates from records which do not contain dates. Edit this transformation as shown in the picture below and also make sure you change the Default output name to "Valid Records" as shown below.


Since this is only for demonstration, we are not going to waste time inserting these records, so we will just use two Multicasts for the destinations to analyze the number of records going to each destination.
  • Add a Multicast transform and name this "Invalid Records".  This should be tied to the "Dates Unavailable" output from the Conditional Split.
  • Add another Multicast transform and name this "Valid Records" and this should be tied to the "Valid Dates" output from the Conditional Split.
After you have completed these steps, your package should look something like the below picture.



Now if you look again at the data we inserted into our table, we should have two records on the invalid output side and one record on the valid side, as there are two records which do not have dates. The reason we inserted one record with NULL and one record with '' is that this is one of the scenarios you will find in real-life projects. Applications insert data into OLTP systems, and many times application developers insert data using '' as a placeholder for "no date" instead of using a constant for NULL from the data access libraries (something like SQLDBNULL), which would send a NULL value to the database instead of a blank value. But from a user or business point of view, any date that is not available is NULL, though blank and NULL might carry a different meaning and behavior from a technical standpoint.

Execute the package and you will find two records on the valid side and only one on the invalid side. You would find the same result even if you put a condition using the LEN function to measure the length of the value in the DOB field.



Go back to SSMS and view the data in the NULLDefense table we just created, and you will find the reason for this. Whenever a blank value is entered into a datetime field, a query against it returns the default value of "1900-01-01 00:00:00.000", as can be seen in the picture below.



We now move towards the solution for this.

I would not consider any option that makes changes at the database level, as we might not have the liberty to change anything for our ETL solution; many other applications might depend on the existing settings of the database.

Coming to the next alternative, we can place additional checks at the database level to ensure that the value is neither NULL nor the default value, but this kind of solution cannot be considered for a real-life implementation, as there might be numerous functions and conditions operating on this field, and everywhere a dual check or a replacement function would be required to fix the issue.

Finally, the option that I recommend is to bring uniformity to the package's perception of NULL by feeding the package only data that is either NULL or a genuine non-NULL value. This can be done through a very simple step at the database level itself, without changing any existing database settings, values or logic.

For our test package, just create a new "view" as shown in the picture below where we replace the default value with NULL.
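The picture is not available here, but a view along these lines (the names follow the hypothetical table sketched earlier) replaces the default date with NULL:

CREATE VIEW dbo.vw_NULLDefense
AS
SELECT ID,
       Name,
       NULLIF(DOB, '1900-01-01') AS DOB   -- the default date becomes NULL
FROM dbo.NULLDefense;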



Now change the connection of the source in our package to this view and execute the package. The results should now be as expected, which can be seen in the below figure.



Datetime is one special datatype which can trick you with NULLs and we just developed a defense against NULL date values.

Next Steps
  • Try to estimate the effort you would need to patch this NULL-or-blank date issue if you were fixing it in your existing SSIS packages.
  • Implement this solution and compare the benefit and issues of implementing it at the database level instead of at the package level.

DATEPART and DATENAME Functions in SQL SERVER

DATE Functions in SQL SERVER :

DATEPART(): DATEPART() is a SQL Server function that extracts a specific part of a date/time value. Its syntax is as follows:
DATEPART (datepart, date)
 
examples: 
1.SELECT DATEPART(month, GETDATE()) AS 'Month Number'
2.SELECT DATEPART(m, 0), DATEPART(d, 0), DATEPART(yy, 0) 


DATENAME(): the DATENAME() function returns a string value. With the DATENAME function, the only units of time that return values different from the DATEPART function are WEEKDAY and MONTH (see the DATENAME table at the end of this post).

DATEDIFF(): the DATEDIFF function is used to calculate the difference between two dates, and is available in both MySQL and SQL Server. The syntax for this date function differs between the two databases, so each one is shown below:
MySQL:
The usage of the DATEDIFF function is
DATEDIFF (expression1, expression2)

example: SELECT DATEDIFF('2000-01-10','2000-01-05');   -- returns 5
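SQL Server:
Here the first argument is the datepart in which to return the difference:
DATEDIFF (datepart, startdate, enddate)

example: SELECT DATEDIFF(day, '2000-01-05', '2000-01-10');   -- returns 5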

GETDATE(): this function is used to retrieve the current database system time in SQL Server. Its syntax is
GETDATE()
GETDATE does not require any argument.
Example: The SQL statement
SELECT GETDATE();
yields the following result:
2011-10-22 11:27:10.247
The GETDATE function is most useful when we need to record the time a particular transaction happens. In SQL Server, we simply insert the value of the GETDATE() function into the table to achieve this. We can also set a column's default value to GETDATE() to achieve the same purpose.
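A minimal sketch of both approaches (the Orders table and its columns are hypothetical):

-- Record the time explicitly on insert
INSERT INTO dbo.Orders (OrderID, CreatedAt) VALUES (1, GETDATE());

-- Or let a default constraint stamp it automatically
ALTER TABLE dbo.Orders
    ADD CONSTRAINT DF_Orders_CreatedAt DEFAULT (GETDATE()) FOR CreatedAt;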



DATEPART (The @Date value used is '2011-09-25 19:47:00.8631597')


Unit of time   Abbreviations   Query                                Result
ISO_WEEK       isowk, isoww    SELECT DATEPART(ISO_WEEK,@Date)      38
TZoffset       tz              SELECT DATEPART(TZoffset,@Date)      0
NANOSECOND     ns              SELECT DATEPART(NANOSECOND,@Date)    863159700
MICROSECOND    mcs             SELECT DATEPART(MICROSECOND,@Date)   863159
MILLISECOND    ms              SELECT DATEPART(MS,@Date)            863
SECOND         ss, s           SELECT DATEPART(SS,@Date)            0
MINUTE         mi, n           SELECT DATEPART(MINUTE,@Date)        47
HOUR           hh              SELECT DATEPART(HH,@Date)            19
WEEKDAY        dw              SELECT DATEPART(DW,@Date)            1
WEEK           wk, ww          SELECT DATEPART(WEEK,@Date)          40
DAY            dd, d           SELECT DATEPART(DAY,@Date)           25
DAYOFYEAR      dy, y           SELECT DATEPART(DAYOFYEAR,@Date)     268
MONTH          mm, m           SELECT DATEPART(MM,@Date)            9
QUARTER        qq, q           SELECT DATEPART(QUARTER,@Date)       3
YEAR           yy, yyyy        SELECT DATEPART(YYYY,@Date)          2011


DATENAME (The @Date value used is '2011-09-25 19:47:00.8631597')

Unit of time   Abbreviations   Query                                Result
WEEKDAY        dw              SELECT DATENAME(WEEKDAY,@Date)       Sunday
MONTH          mm, m           SELECT DATENAME(MM,@Date)            September