Dataiku blocking the tables overview of Object Explorer in Microsoft SSMS

Solved!
asertion
Level 1
Dataiku blocking the tables overview of Object Explorer in Microsoft SSMS

If I have an SQL Script that writes into a table on a database, e.g.:

SELECT *
INTO [database].[dbo].[new_table]
FROM [database].[dbo].[old_table]

then the tables overview of the corresponding database in Object Explorer of Microsoft SSMS is blocked during the time that it takes to execute the script in Dataiku. 

I, and my colleagues, recieve a 'Lock request time out period exceeded' error from Microsoft SSMS (see attachment). For working in team with multiple people working in the same database and using Dataiku and Microsoft SSMS at the same time, this becomes an issue quickly. 

DataikuBlocksMSSMS.PNG

โ€ƒ

I was wondering if there are ways to stop this from happening?

 

Other information:

- I work on

SQL Server Management Studio 15.0.18183.0
Dataiku DSS Version 5.1.5

0 Kudos
1 Solution
GCase
Dataiker

Hi Asertion,

What you are seeing here is a product of the MS SQL Server engine not Dataiku. By chance, can I inquire what version of MS SQL Server you are running, the size of the table, and the size of the database?

Anytime you make a query in MS SQL Server, the SQL Server engine makes a determination as to how much of a lock it needs to ensure an atomic commit of the transaction. For example, if you are inserting a single row, SQL Server may only hold a lock on the one-page where the insert will be made, letting all other queries continue. Cross-database commits are a bit different. In this case, you are asking SQL Server to create the schema on the new database and then do the insert.

So why is this blocking?

First, you can write sp_who2 and based on the Blk field see that Dataiku is potentially blocking other queries from happening. Why?

SELECT *
INTO [database].[dbo].[new_table]
FROM [database].[dbo].[old_table]

Let's take your query.

First, you have no WHERE clause and you are doing a SELECT *. To ensure consistency during the read SQL Server will immediately elevate your transaction to the highest-level, a table lock on old_table. Since you are requesting the entire schema and all the data, SQL Server has no choice.

Second, the INTO new_table clause means that SQL Server MUST lock the new_table in-place because it has no idea what schema will be awaiting. We now created a DDL schema lock. Neither of these locks can be released until the entire process completes to create transaction consistency. 

Third, these two operating together blocked the schema and thus blocked Object Explorer. Object Explorer is trying to return ALL the metadata from the SQL Server database BUT we are in the process of creating new objects thus its blocked.

So how do we get passed this?

Two things you can do with your script.

One - anytime you script out anything in SSMS you will see that it first creates the object and then inserts the data. This on-purpose to minimize these types of events.

Either use a CREATE TABLE script to build your table prior to insertion (best practice)

or

If you don't feel like writing out all the column names, you can do something like the following:

SELECT *
INTO [database].[dbo].[new_table]
FROM [database].[dbo].[old_table]
WHERE 1 = 0

Starting with this query first will create new_table but will not add any rows to it. This is a VERY fast query to build the table schema. Note: this doesn't do anything for indexes and if you want to do any querying later you should definitely do that through a CREATE TABLE or ALTER TABLE script and add indexes.

Now that the table has been created, do an INSERT INTO query. If your table doesn't include an IDENTITY field then this should be pretty straight forward. Best Practice: use column names on your insert but if both tables are exactly the same and don't include an IDENTITY field you can just do an INSERT INTO.

Finally, one last tip. If your data isn't really changing and you don't mind if you might miss a few rows you can add a query hint to your FROM clause on INSERT INTO called NOLOCK. No Lock tells SQL Server make a dirty read and do NOT block any other queries on the table you are reading INCLUDING INSERTS and UPDATES. This keeps your environment from locking up but CAN make your data inconsistent (buyer beware). Hopefully these tips help. If you are looking for more on SQL Server querying I highly recommend books by Itzik Ben-Gan - a SQL Query and Tuning guru I was fortunate enough to learn from previously.

Grant

SELECT *
INTO [database].[dbo].[new_table]
FROM [database].[dbo].[old_table]
WHERE 1 = 0

INSERT INTO new_table
SELECT *

FROM [database].[dbo].[old_table] (NOLOCK)

 

View solution in original post

1 Reply
GCase
Dataiker

Hi Asertion,

What you are seeing here is a product of the MS SQL Server engine not Dataiku. By chance, can I inquire what version of MS SQL Server you are running, the size of the table, and the size of the database?

Anytime you make a query in MS SQL Server, the SQL Server engine makes a determination as to how much of a lock it needs to ensure an atomic commit of the transaction. For example, if you are inserting a single row, SQL Server may only hold a lock on the one-page where the insert will be made, letting all other queries continue. Cross-database commits are a bit different. In this case, you are asking SQL Server to create the schema on the new database and then do the insert.

So why is this blocking?

First, you can write sp_who2 and based on the Blk field see that Dataiku is potentially blocking other queries from happening. Why?

SELECT *
INTO [database].[dbo].[new_table]
FROM [database].[dbo].[old_table]

Let's take your query.

First, you have no WHERE clause and you are doing a SELECT *. To ensure consistency during the read SQL Server will immediately elevate your transaction to the highest-level, a table lock on old_table. Since you are requesting the entire schema and all the data, SQL Server has no choice.

Second, the INTO new_table clause means that SQL Server MUST lock the new_table in-place because it has no idea what schema will be awaiting. We now created a DDL schema lock. Neither of these locks can be released until the entire process completes to create transaction consistency. 

Third, these two operating together blocked the schema and thus blocked Object Explorer. Object Explorer is trying to return ALL the metadata from the SQL Server database BUT we are in the process of creating new objects thus its blocked.

So how do we get passed this?

Two things you can do with your script.

One - anytime you script out anything in SSMS you will see that it first creates the object and then inserts the data. This on-purpose to minimize these types of events.

Either use a CREATE TABLE script to build your table prior to insertion (best practice)

or

If you don't feel like writing out all the column names, you can do something like the following:

SELECT *
INTO [database].[dbo].[new_table]
FROM [database].[dbo].[old_table]
WHERE 1 = 0

Starting with this query first will create new_table but will not add any rows to it. This is a VERY fast query to build the table schema. Note: this doesn't do anything for indexes and if you want to do any querying later you should definitely do that through a CREATE TABLE or ALTER TABLE script and add indexes.

Now that the table has been created, do an INSERT INTO query. If your table doesn't include an IDENTITY field then this should be pretty straight forward. Best Practice: use column names on your insert but if both tables are exactly the same and don't include an IDENTITY field you can just do an INSERT INTO.

Finally, one last tip. If your data isn't really changing and you don't mind if you might miss a few rows you can add a query hint to your FROM clause on INSERT INTO called NOLOCK. No Lock tells SQL Server make a dirty read and do NOT block any other queries on the table you are reading INCLUDING INSERTS and UPDATES. This keeps your environment from locking up but CAN make your data inconsistent (buyer beware). Hopefully these tips help. If you are looking for more on SQL Server querying I highly recommend books by Itzik Ben-Gan - a SQL Query and Tuning guru I was fortunate enough to learn from previously.

Grant

SELECT *
INTO [database].[dbo].[new_table]
FROM [database].[dbo].[old_table]
WHERE 1 = 0

INSERT INTO new_table
SELECT *

FROM [database].[dbo].[old_table] (NOLOCK)