Hands-On: Custom Metrics, Checks & Scenarios

tgb417 · June 2022

In this part of the hand-on exercise for Advanced Designer, the Automation module:

I was instructed to create a custom SQL step with the following code.

SELECT
    COUNT(*) AS "state_transactions",
    "merchant_state"
FROM "${projectKey}_transactions_joined_prepared"
WHERE "merchant_state" IS NOT NULL
GROUP BY "merchant_state"
ORDER BY "state_transactions" DESC
LIMIT 1

This consistently produced the error message.

ERROR: relation "DKU_TUT_AUTOMATION_transactions_joined_prepared" does not exist Position: 72

When looking at the Logs I discovered the following.

[2022/06/19-16:24:52.037] [FT-ScenarioThread-1u0skXZS-10233] [INFO] [dku.sql.runtime] scenario DKU_TUT_AUTOMATION.MYCUSTOMSENARIO#2022-06-19-16-24-51-997 - [ct: 16] Executing main statement: SELECT
    COUNT(*) AS "state_transactions",
    "merchant_state"
FROM "DKU_TUT_AUTOMATION_transactions_joined_prepared"
WHERE "merchant_state" IS NOT NULL
GROUP BY "merchant_state"
ORDER BY "state_transactions" DESC
LIMIT 1
[2022/06/19-16:24:52.039] [FT-ScenarioThread-1u0skXZS-10233] [DEBUG] [dku.resourceusage] scenario DKU_TUT_AUTOMATION.MYCUSTOMSENARIO#2022-06-19-16-24-51-997 - [ct: 18] Reporting start of CRU:{"type":"SQL_QUERY","id":"Cs5sNtsCuIspjaYz","startTime":1655670292039,"sqlQuery":{"connectionUsageId":"xKfFJ46cPK06UrAU","connection":"postgreSQL","query":"SELECT\n    COUNT(*) AS \"state_transactions\",\n    \"merchant_state\"\nFROM \"DKU_TUT_AUTOMATION_transactions_joined_prepared\"\nWHERE \"merchant_state\" IS NOT NULL\nGROUP BY \"merchant_state\"\nORDER BY \"state_transactions\" DESC\nLIMIT 1"}}
[2022/06/19-16:24:52.040] [FT-ScenarioThread-1u0skXZS-10233] [ERROR] [dku.sql.runtime] scenario DKU_TUT_AUTOMATION.MYCUSTOMSENARIO#2022-06-19-16-24-51-997 - Failed query
org.postgresql.util.PSQLException: ERROR: relation "DKU_TUT_AUTOMATION_transactions_joined_prepared" does not exist
  Position: 72
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2553)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2285)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:323)
    at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:473)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:393)
    at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:164)
    at org.postgresql.jdbc.PgPreparedStatement.execute(PgPreparedStatement.java:153)
    at com.dataiku.dip.queries.SQLQueryRuntime.executeAllStatementsUpToSelect(SQLQueryRuntime.java:363)
    at com.dataiku.dip.queries.SQLQueryRuntime.getPage(SQLQueryRuntime.java:558)
    at com.dataiku.dip.scheduler.steps.ExecuteSQLStepRunner.run(ExecuteSQLStepRunner.java:210)
    at com.dataiku.dip.server.services.ScenariosService.runStep(ScenariosService.java:655)
    at com.dataiku.dip.scheduler.scenarios.StepBasedScenarioRunner.run(StepBasedScenarioRunner.java:226)
    at com.dataiku.dip.scheduler.ScenarioThread.execute(ScenarioThread.java:146)
    at com.dataiku.dip.futures.FutureThreadBase.run(FutureThreadBase.java:96)
[2022/06/19-16:24:52.041] [FT-ScenarioThread-1u0skXZS-10233] [INFO] [dku.sql.runtime] scenario DKU_TUT_AUTOMATION.MYCUSTOMSENARIO#2022-06-19-16-24-51-997 - [ct: 20] Closing SQL Query runtime
[2022/06/19-16:24:52.041] [FT-ScenarioThread-1u0skXZS-10233] [INFO] [dku.sql.runtime] scenario DKU_TUT_AUTOMATION.MYCUSTOMSENARIO#2022-06-19-16-24-51-997 - [ct: 20] Closing main statement

I then took the query provided in the instructions and discovered it did not work in a SQL recipe by itself either . I was using one of my database connections that is working OK.

After some experimentation, I did discover by adding the SCHEMA for my connection manually. Things seems to work OK. In my case in the Connection the skema was listed as "dku". See below for the modification of the SQL.

SELECT COUNT(*) AS "state_transactions"
       , "merchant_state" 
FROM "dku"."${projectKey}_transactions_joined_prepared" 
WHERE "merchant_state" IS NOT NULL 
GROUP BY "merchant_state" 
ORDER BY "state_transactions" DESC 
LIMIT 1

Here are some configuration items.

DSS V10.0.7
PostrgreSQL V 13.2
Local OS is Mac OS 10.15.7

Am I missing something here?

Is my use of a Skema in a database connection unusual?

Is this a defect in 10.0.7? Most other places in DSS I do not run into this problem.

Hope this helps others.

tgb417 · June 2022

Also,

I discovered that the logic does not work out to correctly refresh the transactions_by_merchant_id to be Colorado Records. The instructions just above this point suggest that:

When the scenario finishes, open the transactions_by_merchant_id dataset 
to observe the change in output. It turns out that “Colorado” is the state 
with the most transactions.

Although, it is true that Colorado is the states with the most transaction the table transactions_by_merchant_id dataset does not get updated. Because the Steps in the Scenario have already update the stored variable before this check. See below in orange.

from dataiku.scenario import Scenario
import dataiku

# The Scenario object is the main handle from which you initiate steps
scenario = Scenario()
state_name = scenario.get_all_variables()["state_name"]
if state_name != "Colorado":
    scenario.clear_dataset("transactions_by_merchant_id")
    scenario.build_dataset("transactions_by_merchant_id")

if state_name != "Colorado": would never be true.

One might try.

from dataiku.scenario import Scenario
import dataiku

# The Scenario object is the main handle from which you initiate steps
scenario = Scenario()
state_name = scenario.get_all_variables()["state_name"]
if state_name == "Colorado":
    scenario.clear_dataset("transactions_by_merchant_id")
    scenario.build_dataset("transactions_by_merchant_id")

Then the statements in the Hands on instruction make some sense.

Hands-On: Custom Metrics, Checks & Scenarios

Answers

Categories

Setup Info

Tags