Job taking long time to complete due to 'checking if table exists' step

Farhan · June 2022

[2022/06/06-19:03:09.858] [qtp1290795133-42] [DEBUG] [dku.sql.generic]  - Checking if table exists by querying meta schema=null table=abc types=["TABLE","VIEW","CALC VIEW"]
[2022/06/06-19:07:45.159] [qtp1290795133-42] [INFO] [dku.sql.generic]  - Table null.abc exists

Recently, I noticed that the step above is increasing my job duration quite significantly. As the timestamps show, it takes almost 5 minutes to find the table. From further investigation, this only happens for certain tables, and previously the step took less than a minute.
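For reference, the gap between the two log lines can be worked out from their timestamps. A minimal Python sketch, assuming the log format shown above (the function names are my own and not part of Dataiku):

```python
from datetime import datetime

# The two log lines quoted at the top of this post.
LOG_LINES = [
    '[2022/06/06-19:03:09.858] [qtp1290795133-42] [DEBUG] [dku.sql.generic]  - Checking if table exists by querying meta schema=null table=abc types=["TABLE","VIEW","CALC VIEW"]',
    '[2022/06/06-19:07:45.159] [qtp1290795133-42] [INFO] [dku.sql.generic]  - Table null.abc exists',
]


def parse_timestamp(line):
    """Parse the leading [YYYY/MM/DD-HH:MM:SS.mmm] field of a log line."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%Y/%m/%d-%H:%M:%S.%f")


def elapsed_seconds(start_line, end_line):
    """Return the number of seconds between two log lines."""
    delta = parse_timestamp(end_line) - parse_timestamp(start_line)
    return delta.total_seconds()


if __name__ == "__main__":
    # 19:03:09.858 -> 19:07:45.159 is 4 min 35.301 s
    print(f"{elapsed_seconds(*LOG_LINES):.3f} s")
```

So the existence check alone accounts for roughly 4 minutes 35 seconds of the job.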

This issue does not cause the job to fail; it only increases the processing time. The Spark engine is used to run the job.

I have been using Dataiku for almost a year but only started observing this last month.

Appreciate the help.

Note: 'abc' is used in place of the actual table name.

Operating system used: Windows 10

Alexandru · June 2022

Hi @farhanromli,

A few questions that may help narrow down the issue:

1) Is the issue intermittent?

2) Does it happen with specific databases only?

3) What Hadoop distro version are you on?

DSS will check for table existence as part of dependency checks before running a job.

What that call actually does is send a get_tables call to the Hive metastore and wait for a response.

In the Hive logs it would look like: OperationHandle [opType=GET_TABLES]

If you are suddenly seeing degraded performance, your database may have reached a very high number of tables, so this specific operation now takes longer, or there may be an issue with the Hive metastore that you need to look into further.
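One way to check whether the table lookup itself is slow is to time a table listing from outside DSS. Below is a rough sketch, assuming you have a Python DB-API cursor connected to the same database and that the backend accepts a Hive-style SHOW TABLES LIKE statement. The helper name is hypothetical, and DSS may issue its own lookup differently, so treat the timing as an approximation only:

```python
import time


def time_table_lookup(cursor, table_name):
    """Return (found, seconds) for a table listing run through a DB-API cursor.

    Hypothetical helper: `cursor` is any DB-API 2.0 cursor you already hold.
    """
    start = time.perf_counter()
    # Assumption: the backend understands Hive-style SHOW TABLES LIKE '<pattern>'.
    # table_name is interpolated directly, so only pass trusted names.
    cursor.execute(f"SHOW TABLES LIKE '{table_name}'")
    rows = cursor.fetchall()
    elapsed = time.perf_counter() - start
    return len(rows) > 0, elapsed
```

If this also takes minutes, the slowdown is more likely on the metastore side than in DSS.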

Farhan · August 2022

Hi

I have actually found a workaround for this.
When we create a table, we have the option either to "Read a database table" or to use "SQL query".

It seems this issue only happens if I go with the former option. If I use SQL query instead, it is able to generate the meta schema.
