This article defines ten interview questions for senior SQL server developers and provides answers and examples. While there are hundreds of articles and blog posts related to SQL server interviews on the Internet, I have decided to share my own approach to interviewing senior-level database developers.
The idea here is asking a minimal number of questions during a fixed amount of time (one hour or less) and, at the same time, choose such questions that will maximally cover all major topics of database development. Thus, instead of asking several questions about one topic, I tried to define the questions so each question represents one entire topic.
For interviewing experienced candidates, the questions are designed to include quite deep theoretical aspects. Additionally, they are defined in a special way so that when the interviewees answer the main question, they should automatically answer several topic-related sub-questions. Although some questions can be seen as short, precise questions requiring simple answers, we are not expecting laconic “yes” or “no” answers. The candidate should provide the full description of the answer, the reason why it is possible or not, and ideally, an example explaining the answer. Therefore, this will show how deeply the candidates understand the whole material. This does not mean that all experienced, senior database developers are expected to answer all these questions. Indeed, it would be awesome and would probably positively surprise the interviewers if they receive correct and detailed answers to all questions. However, even candidates with rich experiences can forget something or can have less experience in a specific field or may never have used some database system features. Hence, if the candidates face difficulties answering the main question, the interviewer can provide hints encouraging them to answer the sub-questions to check the level of knowledge about the topic.
It is worth mentioning that although all these questions below are theoretical and do not require coding, what I believe is that without an extensive experience in database design development, it will not be possible to correctly answer these questions. Therefore, these can be useful for testing the candidates’ theoretical background and experience level. These questions can be very effective for a phone interview due to their complexity, time efficiency, and the absence of the necessity of code-writing. I would recommend doing a T-SQL coding interview after the candidate successfully passes this theoretical step.
Questions
Here are the questions. I would recommend reading the questions carefully, understanding the question correctly, and then trying to answer. Even if you do not know the answer but have an idea about the topic, try to think and guess the answer before reading the provided answer in the next section. Please find the questions below:
- Is it possible to create a primary key as a non-clustered index? If so, why might you need to do this?
- It is recommended to keep the size of the clustered index key as small as possible. What is the main reason for this?
- Why would WITH CHECK OPTION be used in a view creation?
- What are the differences between the JOIN and APPLY operators?
- Is it possible to call a stored procedure inside a user-defined function and vice versa? What are some differences between them?
- Is it possible to issue INSERT, UPDATE, and DELETE commands inside a user-defined function? If so, give an example of how to do that?
- Which errors cannot be handled by CATCH block? Tell just one example of such kind of an error.
- What are the main differences between AFTER and INSTEAD OF triggers?
- Which methods can be used to check whether there are active transactions in the current connection or not? How can you check how many active transactions are in the current connection?
- Which transaction isolation levels prevent all transaction phenomena (concurrency issues) and how do they achieve that? In other words, what are the highest isolation levels in SQL Server and how they differ from each other?
Answers
Below you can find the answers to the questions above. I have tried to describe the answer as detailed as it is possible in a couple of minutes. Explore the answers and check whether you were correct or miss something while trying to answer yourself.
- Is it possible to create a primary key as a non-clustered index? If so, why might you need to do this?
The answer to this question is yes, it is possible.
While creating the primary key as a clustered index (default option) is recommended as a best practice, there are some situations when you need to create it as a non-clustered index.
For example, assume you have a column in your table which is intensively used in queries as a key column for joins. Additionally, in many queries, this table is sorted mostly by this column. This column is a good candidate for a clustered index. Nevertheless, due to your application’s logic, this column allows nulls and duplicate values. Therefore, you cannot create the primary key on that column. Instead, a clustered index can be created on that column but not the primary key. As a primary key is also needed to uniquely identify each row in the table, you can have an identity, auto-incremented column with unique and not-nullable columns and create the primary key on it.
- It is recommended to keep the size of the clustered index key as small as possible. What is the main reason for this?
It is a good idea to keep the clustered index size minimal because the size of the clustered index affects the sizes of the non-clustered indexes on the same table. This is explained by the fact that the non-clustered indexes of a clustered table use the clustered key as a row locator to refer to the corresponding rows in the table. To do so, the clustered index key is stored in the leaf level nodes of the non-clustered indexes along with the non-clustered index column values. As a result, the bigger the clustered index key size, the bigger are the non-clustered index sizes.
.
- Why would WITH CHECK OPTION be used in a view creation?
If you want to update a table through a view and to ensure that the modified data is visible through the view, use the WITH CHECK option while creating that view. If a view is created using WITH CHECK option, the result of any INSERT or UPDATE statement issued against the view must meet the criteria in the WHERE clause of the view. In other words, if WITH CHECK is used, it is not possible to update a row in such a way that will make it disappear from the view, and it is not possible to insert such a row that will not appear in the view. Any attempts of row modifications that do not meet the WHERE criteria will fail with the clear error message, and the statement will be terminated. It is important to mention that the above-mentioned restrictions refer only to the data modifications using the view. These rows, however, can be successfully modified through the corresponding table.
- What are the differences between the JOIN and APPLY operators?
The APPLY operator has two variations – CROSS APPLY and OUTER APPLY. Like the INNER JOIN operator, CROSS APPLY returns only these rows from the left (outer) table (combined with the corresponding rows from the right table) for which the condition is met. The OUTER APPLY returns all rows from the left table regardless of the condition. However, the rows that met the condition in the result set are combined with the corresponding rows from the right table. These rows, for which condition is not true, have NULLs in the fields of the corresponding column values. Thus, OUTER APPLY is similar to the LEFT JOIN operator.
The difference is that APPLY operators can use a table-valued function as a right table that can receive columns as arguments from the left table, which is not possible in the case of the JOINs. This feature makes APPLY operators quite flexible for developing complex logic inside a table-valued function and then use it in an APPLY operator. Achieving the same using only JOINs will need much more effort, will make the code larger and more complicated (performance, however, can be much better). Also, it is worth mentioning that unlike the JOINs, there is no ON clause in the APPLY operators, and the condition is defined in the right table source expression.
- Is it possible to call a stored procedure inside a user-defined function and vice versa? What are some differences between them?
While it is possible to call functions inside a stored procedure, it is not possible to call stored procedures from functions. The methods of calling the stored procedures and user-defined functions are also different. In order to call a stored procedure, EXECUTE (or EXEC) command must be used. In contrast, the user-defined functions must be called as a part of an SQL statement (for example in a SELECT statement). Unlike stored procedures, functions cannot modify data. If it were possible to execute a stored procedure inside a function, it would mean that it could be possible to modify the data through that function. This is because the DML logic could be implemented inside the procedure and, therefore, change the database state with a function. Thus, it is logical that it is impossible to call a procedure from a function. However, it is possible to call other stored procedures from a stored procedure (the nesting level is 32). It is also possible to call a function from another user-defined function.
- Is it possible to issue INSERT, UPDATE, and DELETE commands inside a user-defined function? If so, give an example of how to do that?
While it can sound strange and surprising for many developers, the answer is – yes, it is possible to modify table-variables inside functions. Although neither local nor global temporary tables are allowed in UDFs, table-variables can be used inside functions. DML statements can be issued inside a UDF to modify table-variables’ data. For example, data can be inserted, updated, and deleted in a table-variables inside a UDF. It is important to mention that table-variables are stored in tempdb, like temporary tables, and not in memory. Nevertheless, temporary tables cannot even be used in UDFs. Thus, modifying data inside table-variables is not considered as a modification of the database state.
Generally, the statement that DML operations cannot be performed inside user-defined functions is not true. These operations cannot be performed against the permanent database objects (such as tables) but can be issued to modify table-variables.
- Which errors cannot be handled by the CATCH block? Tell just one example of such kind of an error.
The TRY…CATCH construct is used to implement an error handling mechanism in T-SQL. In case of the error occurrence in the TRY block, control is passed to the CATCH block. Usually, in the CATCH block, logic is developed in response to the error. Some errors, however, remain unaffected by the TRY…CATCH construct.
Errors with severity 10 or lower (that are in informal messages) and errors with the severity of 20 and higher (that indicate system problems and fatal errors) are such examples. Interrupted client requests, broken client connections, and killed sessions are also not trapped by the TRY…CATCH construct. In the case of statement-level recompilation errors (for instance, object name resolution errors) and compile errors preventing the batch from running (syntax errors, for example) control is not passed to the CATCH block. A table in the TRY block that does not exist in the database can be considered a common example of an error not caught by the CATCH block. So, if you use a non-existing table in the statements of the TRY block, the object name resolution error will be generated, and the control will not be passed to the CATCH block.
- What are the differences between AFTER and INSTEAD OF triggers?
Before discussing the differences of these triggers, here is a brief description of triggers. A trigger is considered as a special type of stored procedure. However, unlike regular stored procedures, triggers cannot be executed manually. They run automatically in response to special database-related events. There are three types of triggers: DML triggers – which fire in response to the data manipulation language event (INSERT, UPDATE, or DELETE statements), DDL triggers – which are associated with the data definition language events (DDL), and Logon that are related to logon events.
Both INSTEAD OF and AFTER triggers are DML triggers, which means that they fire due to the corresponding DML action. An AFTER trigger is executed only after the corresponding statement(s) on the related table is successfully executed (the statement(s) is launched, and the corresponding constraint checks and referential cascade actions are succeeded). For instance, if you want to track historical data and guarantee that all deleted rows from a table will be moved to another table, you can define an AFTER DELETE trigger on that table and define the trigger’s logic to insert the deleted rows into the corresponding “history” table. As mentioned above, an AFTER trigger will not run if the corresponding statements fail (for example, due to constraint violation and so on).
INSTEAD OF triggers run in place of the corresponding DML commands if special events, defined in the trigger’s logic occur. In other words, in predefined cases, it is possible to do a different action, defined in the trigger’s logic, rather than performing the expected DML operations. For example, if you want to restrict the insertion of some values into a column and track the attempts of such kind of failed insertions. You can use an INSTEAD OF INSERT trigger to perform insertion into another, log table in such cases. Thus, when someone tries to insert a prohibited value into the table, this attempt will be prevented and recorded in another table or just will not occur, depending on the INSTEAD OF triggers logic.
While you can have multiple AFTER triggers on a table for each DML action, only one INSTEAD OF trigger is allowed per each INSERT, DELETE, UPDATE command for a single table. INSTEAD OF triggers can be defined not only on tables but also on views, unlike AFTER triggers, that can be applied only to tables.
- Which methods can be used to check whether there are active transactions in the current connection or not? How can you check how many active transactions are in the current connection?
The @@TRANCOUNT and XACT_STATE() system functions can be used to determine if there are active transaction(s) in the current connection.
The XACT_STATE() function returns only three values : 1, 0, and -1. When XACT_STATE()=1, it means that the session has an active transaction(s) and if it returns zero, it means that there are no transactions inside the current session (if XACT_STATE()=-1, it means that there are uncommittable transactions). Thus, if XACT_STATE<>0 condition is true, there are active transactions in the session. If you want to detect only the committable ones, you can use XACT_STATE()=1 and use XACT_STATE()=-1 for uncommittable ones.
Each BEGIN TRANSACTION statement increments the @@TRANCOUNT variable by 1, and each COMMIT statement decrements it by 1 (at the beginning of a new session @@TRANCOUNT=0). The ROLLBACK statement sets the value of @@TRANCOUNT to 0 (except the ROLLBACK to a savepoint does not change its value). Therefore, if @@TRANCOUNT > 0 means that there are active transaction(s) in the current connection. Moreover, the value of the @@TRANCOUNT shows the nested transactions count. For example, if @@TRANCOUNT=3, indicates that there are three active transactions in the current connection. Hence, @@TRANCOUNT can be used to detect how many active transactions the current session has. To compare, the XACT_STATE() system function, is not applicable for this task as it shows only the fact that there are active transaction(s) but not their count. In turn, the @@TRANCOUNT cannot be used to detect uncommittable transactions, unlike the XACT_STATE().
- Which transaction isolation levels prevent all transaction phenomena (concurrency issues) and how do they achieve that? In other words, what are the highest isolation levels in SQL Server, and how do they differ from each other?
There are five transaction isolation levels in MS SQL Server, and only two of them eliminate all three phenomena (dirty reads, non-repeatable reads, phantoms). Therefore, these two are considered as the highest isolation levels. These two isolation levels are the serializable and snapshot isolation levels. As it is mentioned above, unlike the previous three isolation levels (read uncommitted – when dirty reads, possible and read committed – when non-repeatable reads are possible, and repeatable read – when phantoms are possible ) these isolation levels prevent all of these phenomena. In the case of these two isolation levels, it is not possible to read uncommitted data (dirty), the value of the row retrieved once in the transaction cannot be changed (preventing non-repeatable read), and the insertion of rows meeting select criteria inside a transaction is eliminated preventing phantom inserts.
While both isolation levels guarantee the highest isolation level, they use quite different approaches to achieve that. In the case of the serializable isolation level, locking is used to isolate data used by the current transaction from other transactions. This, in turn, reduces concurrency. The snapshot isolation level prevents all these phenomena by using row versioning. As its name suggests, the snapshot isolation level stores the snapshot (old version) of the rows modified by other transactions in the tempdb database. Only this snapshot data is visible inside the transaction. This level definitely increases concurrency, but it is worth to mention that it adds an additional load to the tempdb.
Conclusion
There are various approaches, techniques, methodologies, and styles of interviewing database developers. While some interviewers prefer to ask easier questions first and then move to more difficult ones to check the candidate’s level, others prefer to ask questions with random difficulties and from various topics. I have discussed my version of interview questions for experienced professionals.
The key criteria here is to design questions in a way that will make the process of the interview time-efficient and will help to objectively evaluate the level of the candidate. This is why the provided list includes a limited number of questions. At the same time, each of these questions relates to one of the main topics of database development. As a result, all questions together maximally touch almost all common aspects of the database development field.
Although each of these questions could be considered a complex, difficult question, it includes several simpler sub-questions. Thus, even if a candidate is unable to fully answer the whole question, he or she has a chance to provide answers to some core concepts included in the question. As there are no coding questions there and the number of questions is quite a few, this question list can be used for phone interviews.
The post SQL Server interview questions for experienced developers appeared first on Simple Talk.
from Simple Talk https://ift.tt/3rWP1tr
via
No comments:
Post a Comment