Issue with Dataiku Visual Recipe Failing to Save Data to SQL Database

KYOUNGJIN
KYOUNGJIN Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 8 ✭✭✭

Hello,

I am currently using an Other SQL Database connection in Dataiku and utilizing Visual Recipes to save data to a table.

In some cases, such as when uploading files or handling simple datasets, the table is successfully created and saved. However, there are instances where the process fails.

I would like to understand the reasons behind these failures. I have also attached the error messages that appear when the process does not work.

Could you please provide insight into why this might be happening? Any guidance would be greatly appreciated.

Thank you.

Operating system used: Linux

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,360 Neuron

    What is exactly your database technology and version?

  • KYOUNGJIN
    KYOUNGJIN Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 8 ✭✭✭

    Hi Turribeach,

    I'm using SingleStoreDB (version 8.5.22), which needs to be configured using the "Other SQL Database" option under Dataiku's SQL Database connections.

    Thank you!

  • yonghyun
    yonghyun Registered Posts: 9 ✭✭

    Looking at the logs, Dataiku tries to drop the table, but the drop fails.

    Please check the permissions of the account configured for that DB connection.

  • KYOUNGJIN
    KYOUNGJIN Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 8 ✭✭✭

    Hi Yonghyun,

    Thank you for reviewing the logs and looking into this issue.

    I'd like to briefly explain the accounts I'm currently using for Dataiku and the DB connection:

    • The Dataiku user account I'm using is a Designer account with administrator privileges.
    • For the DB connection, I'm using the root account.

    Additionally, since both the successful and failed flows in the previously uploaded images are using the same DB connection, I assumed the issue wasn't related to account permissions.

    Is there perhaps any permission or setting that I might have overlooked?

    Thank you!

  • yonghyun
    yonghyun Registered Posts: 9 ✭✭

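    These jobs all appear to be sync operations.

    I'd like to know whether the table is created when you run the sync for the first time on a new dataset,

    or whether the error occurs when you run the same sync recipe again after a sync has already completed once.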

  • KYOUNGJIN
    KYOUNGJIN Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 8 ✭✭✭

    This error occurs only during the initial run on a dataset.

    In other successful cases, re-running the flow works without issues.

    I've also tested other visual recipes, but the same error still occurs.

    For your reference, I've attached an image showing this issue.

  • yonghyun
    yonghyun Registered Posts: 9 ✭✭

    I'd like to see the error screen.

    Could you capture every screen in the job that shows an error?

    Example:

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,360 Neuron
  • KYOUNGJIN
    KYOUNGJIN Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 8 ✭✭✭

    Hello Yonghyun,

    Apologies for the late reply.

    I've attached the logs after re-running the requested operation.
    It seems like this issue is related to the schema. If, as Turribeach mentioned, this is caused by using an unsupported database, we may not be able to utilize this feature.

    Thank you again for your assistance.

    [2025/03/23-20:08:31.345] [ActivityExecutor-35] [INFO] [dku] running compute_year_2024_total_group_split_t_copy_NP - ----------------------------------------
    [2025/03/23-20:08:31.345] [ActivityExecutor-35] [INFO] [dku] running compute_year_2024_total_group_split_t_copy_NP - DSS startup: jek version:12.5.2
    [2025/03/23-20:08:31.345] [ActivityExecutor-35] [INFO] [dku] running compute_year_2024_total_group_split_t_copy_NP - DSS home: /home/penta/design_dss
    [2025/03/23-20:08:31.345] [ActivityExecutor-35] [INFO] [dku] running compute_year_2024_total_group_split_t_copy_NP - OS: Linux 4.18.0-545.el8.x86_64 amd64 - Java: Red Hat, Inc. 1.8.0_362
    [2025/03/23-20:08:31.344] [ActivityExecutor-35] [INFO] [dku.flow.jobrunner] running compute_year_2024_total_group_split_t_copy_NP - Allocated a slot for this activity!
    [2025/03/23-20:08:31.345] [ActivityExecutor-35] [INFO] [dku.flow.jobrunner] running compute_year_2024_total_group_split_t_copy_NP - Run activity
    [2025/03/23-20:08:31.371] [ActivityExecutor-35] [INFO] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Executing default pre-activity lifecycle hook
    [2025/03/23-20:08:31.377] [ActivityExecutor-35] [INFO] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Checking if sources are ready
    [2025/03/23-20:08:31.379] [ActivityExecutor-35] [DEBUG] [dku.db.internal] running compute_year_2024_total_group_split_t_copy_NP - Created DSSDBConnection dssdb-h2-flow_state-exIHke9
    [2025/03/23-20:08:31.381] [ActivityExecutor-35] [INFO] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Will check readiness of USER_EDU.year_2024_total_group_split_t p=NP
    [2025/03/23-20:08:31.392] [ActivityExecutor-35] [INFO] [dku.datasets.file] running compute_year_2024_total_group_split_t_copy_NP - Building Filesystem handler config: {"connection":"202_server","path":"USER_EDU/year_2024_total_group_split_t","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}}
    [2025/03/23-20:08:31.393] [ActivityExecutor-35] [DEBUG] [dku.datasets.fsbased] running compute_year_2024_total_group_split_t_copy_NP - getReadiness: will enumerate partition <partition:NP>
    [2025/03/23-20:08:31.394] [ActivityExecutor-35] [INFO] [dku.datasets.ftplike] running compute_year_2024_total_group_split_t_copy_NP - Enumerating Filesystem dataset prefix=
    [2025/03/23-20:08:31.395] [ActivityExecutor-35] [DEBUG] [dku.datasets.fsbased] running compute_year_2024_total_group_split_t_copy_NP - Building FS provider for dataset handler: USER_EDU.year_2024_total_group_split_t
    [2025/03/23-20:08:31.413] [ActivityExecutor-35] [DEBUG] [dku.datasets.fsbased] running compute_year_2024_total_group_split_t_copy_NP - FS Provider built
    [2025/03/23-20:08:31.413] [ActivityExecutor-35] [DEBUG] [dku.fs.local] running compute_year_2024_total_group_split_t_copy_NP - Enumerating local filesystem prefix=/
    [2025/03/23-20:08:31.414] [ActivityExecutor-35] [DEBUG] [dku.fs.local] running compute_year_2024_total_group_split_t_copy_NP - Enumeration done nb_paths=1 size=264
    [2025/03/23-20:08:31.415] [ActivityExecutor-35] [DEBUG] [dku.datasets.fsbased] running compute_year_2024_total_group_split_t_copy_NP - getReadiness: enumerated partition, found 1 paths, computing hash
    [2025/03/23-20:08:31.416] [ActivityExecutor-35] [INFO] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Checked source readiness USER_EDU.year_2024_total_group_split_t -> true
    [2025/03/23-20:08:31.416] [ActivityExecutor-35] [DEBUG] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Computing hashes to propagate BEFORE activity
    [2025/03/23-20:08:31.416] [ActivityExecutor-35] [DEBUG] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Recorded 1 hashes before activity run
    [2025/03/23-20:08:31.416] [ActivityExecutor-35] [DEBUG] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Building recipe runner of type
    [2025/03/23-20:08:31.421] [ActivityExecutor-35] [INFO] [dku.flow.recipes.prerunpropagate] running compute_year_2024_total_group_split_t_copy_NP - Propagating schema of recipe compute_year_2024_total_group_split_t_copy of type sync
    [2025/03/23-20:08:31.421] [ActivityExecutor-35] [INFO] [dku.flow.recipes.prerunpropagate] running compute_year_2024_total_group_split_t_copy_NP - Getting schema update result from backend
    [2025/03/23-20:08:31.443] [ActivityExecutor-35] [INFO] [dku.flow.recipes.prerunpropagate] running compute_year_2024_total_group_split_t_copy_NP - No change to do
    [2025/03/23-20:08:31.450] [ActivityExecutor-35] [INFO] [com.dataiku.dip.hive.HiveConfigurator] running compute_year_2024_total_group_split_t_copy_NP - Hive support is disabled (no hadoop)
    [2025/03/23-20:08:31.450] [ActivityExecutor-35] [INFO] [com.dataiku.dip.impala.ImpalaConfigurator] running compute_year_2024_total_group_split_t_copy_NP - Impala support is disabled (no hadoop)
    [2025/03/23-20:08:31.487] [ActivityExecutor-35] [INFO] [dku.flow.sync] running compute_year_2024_total_group_split_t_copy_NP - Selected engine type: DSS
    [2025/03/23-20:08:31.492] [ActivityExecutor-35] [INFO] [dku.flow.sync] running compute_year_2024_total_group_split_t_copy_NP - Using executor: com.dataiku.dip.dataflow.exec.sampling.SingleThreadAnyToAnySamplingExecutor
    [2025/03/23-20:08:31.494] [ActivityExecutor-35] [DEBUG] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Recipe runner built, will use 1 thread(s)
    [2025/03/23-20:08:31.494] [ActivityExecutor-35] [DEBUG] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Preparing execution thread: com.dataiku.dip.dataflow.exec.sync.SyncRecipeRunner@2457781
    [2025/03/23-20:08:31.494] [ActivityExecutor-35] [DEBUG] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Starting execution thread: Thread[Thread-23,5,main]
    [2025/03/23-20:08:31.495] [ActivityExecutor-35] [DEBUG] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Execution threads started, waiting for activity end
    [2025/03/23-20:08:31.497] [FRT-40-FlowRunnable] [INFO] [dku.flow.activity] act.compute_year_2024_total_group_split_t_copy_NP - Run thread for activity compute_year_2024_total_group_split_t_copy_NP starting
    [2025/03/23-20:08:31.508] [FRT-40-FlowRunnable] [INFO] [dip.connection.share] act.compute_year_2024_total_group_split_t_copy_NP - Take connection refCount=0
    [2025/03/23-20:08:31.508] [FRT-40-FlowRunnable] [INFO] [dip.connection.share] act.compute_year_2024_total_group_split_t_copy_NP -   > create connection
    [2025/03/23-20:08:31.553] [FRT-40-FlowRunnable] [INFO] [dku.connections.sql.provider] act.compute_year_2024_total_group_split_t_copy_NP - Connecting to jdbc:singlestore:loadbalance://192.168.100.204:3306,192.168.100.205:3306/news with props: {"user":"root","password":"***"} conn=singlestore_news-QcL0DTm
    [2025/03/23-20:08:31.557] [FRT-40-FlowRunnable] [DEBUG] [dku.connections.sql.driver] act.compute_year_2024_total_group_split_t_copy_NP - Driver version 1.2
    [2025/03/23-20:08:31.558] [FRT-40-FlowRunnable] [DEBUG] [dku.sql.connection.service.factory] act.compute_year_2024_total_group_split_t_copy_NP - SQL connection pool disabled (globally or for the connection)
    [2025/03/23-20:08:31.609] [FRT-40-FlowRunnable] [DEBUG] [com.singlestore.jdbc.client.impl.StandardClient] act.compute_year_2024_total_group_split_t_copy_NP - execute query: SET NAMES utf8mb4
    [2025/03/23-20:08:31.609] [FRT-40-FlowRunnable] [DEBUG] [com.singlestore.jdbc.client.impl.StandardClient] act.compute_year_2024_total_group_split_t_copy_NP - execute query: SELECT @@max_allowed_packet, @@aggregator_id
    [2025/03/23-20:08:31.639] [FRT-40-FlowRunnable] [DEBUG] [com.singlestore.jdbc.client.socket.impl.PacketWriter] act.compute_year_2024_total_group_split_t_copy_NP - set maxAllowedPacket = 104857600
    [2025/03/23-20:08:31.644] [FRT-40-FlowRunnable] [INFO] [dku.connections.sql.provider] act.compute_year_2024_total_group_split_t_copy_NP - Driver: SingleStore JDBC (JDBC 4.2) 1.2.2 (1.2)
    [2025/03/23-20:08:31.646] [FRT-40-FlowRunnable] [DEBUG] [com.singlestore.jdbc.client.impl.StandardClient] act.compute_year_2024_total_group_split_t_copy_NP - execute query: SELECT @@memsql_version;
    [2025/03/23-20:08:31.662] [FRT-40-FlowRunnable] [INFO] [dku.connections.sql.provider] act.compute_year_2024_total_group_split_t_copy_NP - Database: SingleStore 8.9.5 (8.9) rowSize=0 stmts=0
    [2025/03/23-20:08:31.665] [FRT-40-FlowRunnable] [DEBUG] [com.singlestore.jdbc.client.impl.StandardClient] act.compute_year_2024_total_group_split_t_copy_NP - execute query: set autocommit=false
    [2025/03/23-20:08:31.669] [FRT-40-FlowRunnable] [DEBUG] [dku.resourceusage] act.compute_year_2024_total_group_split_t_copy_NP - Reporting start of CRU:{"context":{"type":"JOB_ACTIVITY","authIdentifier":"tester1","projectKey":"USER_EDU","jobId":"Build_year_2024_total_group_split_t_copy__NP__2025-03-24T00-08-31.041","activityId":"compute_year_2024_total_group_split_t_copy_NP","activityType":"recipe","recipeType":"sync","recipeName":"compute_year_2024_total_group_split_t_copy"},"type":"SQL_CONNECTION","id":"RDvKaUOXqFSvFIr7","startTime":1742774911641,"sqlConnection":{"connection":"singlestore_news"}}
    [2025/03/23-20:08:31.670] [FRT-40-FlowRunnable] [INFO] [dku.sql.generic] act.compute_year_2024_total_group_split_t_copy_NP - Dropping table
    [2025/03/23-20:08:31.670] [FRT-40-FlowRunnable] [INFO] [dku.dataset.sql] act.compute_year_2024_total_group_split_t_copy_NP - Executing statement:
    [2025/03/23-20:08:31.670] [FRT-40-FlowRunnable] [INFO] [dku.dataset.sql] act.compute_year_2024_total_group_split_t_copy_NP - DROP TABLE `USER_EDU_year_2024_total_group_split_t_copy`
    [2025/03/23-20:08:31.671] [FRT-40-FlowRunnable] [DEBUG] [com.singlestore.jdbc.client.impl.StandardClient] act.compute_year_2024_total_group_split_t_copy_NP - execute query: DROP TABLE `USER_EDU_year_2024_total_group_split_t_copy`
    [2025/03/23-20:08:31.672] [FRT-40-FlowRunnable] [WARN] [com.singlestore.jdbc.message.server.ErrorPacket] act.compute_year_2024_total_group_split_t_copy_NP - Error: 1051-42S02: Unknown table 'USER_EDU_year_2024_total_group_split_t_copy'
    [2025/03/23-20:08:31.673] [FRT-40-FlowRunnable] [INFO] [dku.sql.generic] act.compute_year_2024_total_group_split_t_copy_NP - Drop table failed, table probably did not exist: (conn=57528) Unknown table 'USER_EDU_year_2024_total_group_split_t_copy'
    [2025/03/23-20:08:31.673] [FRT-40-FlowRunnable] [DEBUG] [dku.connections.sql.provider] act.compute_year_2024_total_group_split_t_copy_NP - Rollback conn=singlestore_news-QcL0DTm
    [2025/03/23-20:08:31.673] [FRT-40-FlowRunnable] [DEBUG] [com.singlestore.jdbc.client.impl.StandardClient] act.compute_year_2024_total_group_split_t_copy_NP - execute query: ROLLBACK
    [2025/03/23-20:08:31.673] [FRT-40-FlowRunnable] [INFO] [dku.sql.generic] act.compute_year_2024_total_group_split_t_copy_NP - Creating table
    [2025/03/23-20:08:31.676] [FRT-40-FlowRunnable] [INFO] [dku.flow.activity] act.compute_year_2024_total_group_split_t_copy_NP - Run thread failed for activity compute_year_2024_total_group_split_t_copy_NP
    java.lang.AssertionError
        at com.dataiku.dip.sql.MySQLDialect.getSQLType(MySQLDialect.java:97)
        at com.dataiku.dip.sql.GenericSQLDialect.getCreateTableFieldsSQL(GenericSQLDialect.java:563)
        at com.dataiku.dip.sql.GenericSQLDialect.generateTableStatementSQL(GenericSQLDialect.java:590)
        at com.dataiku.dip.sql.MySQLDialect.generateTableStatementSQL(MySQLDialect.java:51)
        at com.dataiku.dip.sql.GenericSQLDialect.getCreateTableStatementSQL(GenericSQLDialect.java:579)
        at com.dataiku.dip.sql.GenericSQLDialect.dropAndRecreateTableOrPartition(GenericSQLDialect.java:658)
        at com.dataiku.dip.datasets.sql.SQLTableOutput$SQLTableOutputWriter.init(SQLTableOutput.java:133)
        at com.dataiku.dip.dataflow.exec.stream.ToDatasetStreamer.init(ToDatasetStreamer.java:125)
        at com.dataiku.dip.dataflow.exec.stream.ToDatasetStreamer.getAsProcessor(ToDatasetStreamer.java:108)
        at com.dataiku.dip.dataflow.exec.stream.ToDatasetStreamer.getAsOutput(ToDatasetStreamer.java:112)
        at com.dataiku.dip.dataflow.exec.sampling.SingleThreadAnyToAnySamplingExecutor.run(SingleThreadAnyToAnySamplingExecutor.java:48)
        at com.dataiku.dip.dataflow.exec.sync.SyncRecipeRunner.run(SyncRecipeRunner.java:245)
        at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)
    [2025/03/23-20:08:31.695] [ActivityExecutor-35] [INFO] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - activity is finished
    [2025/03/23-20:08:31.696] [ActivityExecutor-35] [ERROR] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Activity failed
    java.lang.AssertionError
        at com.dataiku.dip.sql.MySQLDialect.getSQLType(MySQLDialect.java:97)
        at com.dataiku.dip.sql.GenericSQLDialect.getCreateTableFieldsSQL(GenericSQLDialect.java:563)
        at com.dataiku.dip.sql.GenericSQLDialect.generateTableStatementSQL(GenericSQLDialect.java:590)
        at com.dataiku.dip.sql.MySQLDialect.generateTableStatementSQL(MySQLDialect.java:51)
        at com.dataiku.dip.sql.GenericSQLDialect.getCreateTableStatementSQL(GenericSQLDialect.java:579)
        at com.dataiku.dip.sql.GenericSQLDialect.dropAndRecreateTableOrPartition(GenericSQLDialect.java:658)
        at com.dataiku.dip.datasets.sql.SQLTableOutput$SQLTableOutputWriter.init(SQLTableOutput.java:133)
        at com.dataiku.dip.dataflow.exec.stream.ToDatasetStreamer.init(ToDatasetStreamer.java:125)
        at com.dataiku.dip.dataflow.exec.stream.ToDatasetStreamer.getAsProcessor(ToDatasetStreamer.java:108)
        at com.dataiku.dip.dataflow.exec.stream.ToDatasetStreamer.getAsOutput(ToDatasetStreamer.java:112)
        at com.dataiku.dip.dataflow.exec.sampling.SingleThreadAnyToAnySamplingExecutor.run(SingleThreadAnyToAnySamplingExecutor.java:48)
        at com.dataiku.dip.dataflow.exec.sync.SyncRecipeRunner.run(SyncRecipeRunner.java:245)
        at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)
    [2025/03/23-20:08:31.696] [ActivityExecutor-35] [INFO] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Executing default post-activity lifecycle hook
    [2025/03/23-20:08:31.709] [ActivityExecutor-35] [INFO] [dku.flow.activity] running compute_year_2024_total_group_split_t_copy_NP - Done post-activity tasks
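
Job logs pasted into a forum often lose their line breaks, which makes it hard to see where the job actually fails. The following is a minimal sketch, not a Dataiku tool, for splitting such text back into timestamped entries and pulling out the first stack trace. The helper names `split_entries` and `first_failure` are hypothetical, the timestamp pattern is assumed from the log in this thread, and `SAMPLE` is a shortened excerpt of that log.

```python
import re

# Every DSS log entry in this thread starts with a timestamp like [2025/03/23-20:08:31.676].
ENTRY_START = re.compile(r"(?=\[\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3}\])")
# Java stack frames such as "at pkg.Class.method(File.java:97)", possibly glued together.
FRAME = re.compile(r"at ([\w.$]+\([\w.]+:\d+\))")


def split_entries(raw):
    """Split fused log text into one string per timestamped entry."""
    joined = raw.replace("\n", "")  # assumption: line wraps are paste artifacts
    return [e.strip() for e in ENTRY_START.split(joined) if e.strip()]


def first_failure(entries):
    """Return (message, frames) for the first entry that carries a stack trace."""
    for entry in entries:
        match = FRAME.search(entry)
        if match:
            return entry[: match.start()].strip(), FRAME.findall(entry)
    return None


ACT = "compute_year_2024_total_group_split_t_copy_NP"
# Shortened excerpt of the pasted log, fused the same way as in the post.
SAMPLE = (
    f"[2025/03/23-20:08:31.672] [FRT-40-FlowRunnable] [WARN] "
    f"[com.singlestore.jdbc.message.server.ErrorPacket] act.{ACT} - "
    f"Error: 1051-42S02: Unknown table 'USER_EDU_year_2024_total_group_split_t_copy'"
    f"[2025/03/23-20:08:31.673] [FRT-40-FlowRunnable] [INFO] [dku.sql.generic] "
    f"act.{ACT} - Creating table"
    f"[2025/03/23-20:08:31.676] [FRT-40-FlowRunnable] [INFO] [dku.flow.activity] "
    f"act.{ACT} - Run thread failed for activity {ACT}"
    "java.lang.AssertionError"
    "at com.dataiku.dip.sql.MySQLDialect.getSQLType(MySQLDialect.java:97)"
    "at com.dataiku.dip.sql.GenericSQLDialect.getCreateTableFieldsSQL(GenericSQLDialect.java:563)"
)

entries = split_entries(SAMPLE)
message, frames = first_failure(entries)
print(message)
for frame in frames:
    print("    at", frame)
```

Read this way, the `DROP TABLE` error in the log is followed by "table probably did not exist" and the job moves on to "Creating table". The first stack frame is `MySQLDialect.getSQLType`, called from `getCreateTableFieldsSQL`, which suggests the failure happens while a column type is being mapped for the `CREATE TABLE` statement rather than during the drop.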
    
  • KYOUNGJIN
    KYOUNGJIN Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 8 ✭✭✭

    Hello Turribeach,

    I initially suspected the issue was related to using an "Other database" connection, but it seems we need to test this again with an officially supported SQL connection.

    Thank you for reviewing this issue.
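
Since the stack trace fails in `getSQLType` while the `CREATE TABLE` statement is being generated, one way to follow up on the schema suspicion is to compare the column types of a failing output dataset with those of a working one. The sketch below only illustrates that comparison: the set of "risky" storage types is an assumption, not something confirmed in this thread or by Dataiku, and the example schema and column names are hypothetical. Inside DSS, a schema in this shape (a list of columns with `name` and `type`) can typically be obtained with `dataiku.Dataset("<dataset name>").read_schema()`.

```python
# Assumption: complex DSS storage types are the likeliest to lack a mapping
# in a generic SQL dialect. This set is a guess for illustration only.
ASSUMED_RISKY_TYPES = {"array", "map", "object", "geopoint", "geometry"}


def risky_columns(schema):
    """Return (name, type) pairs whose storage type is in the assumed risky set."""
    return [
        (col["name"], col["type"])
        for col in schema
        if col.get("type") in ASSUMED_RISKY_TYPES
    ]


# Hypothetical schema, in the list-of-dicts shape described above.
example_schema = [
    {"name": "year", "type": "bigint"},
    {"name": "group_name", "type": "string"},
    {"name": "tags", "type": "array"},
]

print(risky_columns(example_schema))
```

If a failing dataset contains a column that the working datasets do not, changing that column's storage type to `string` in the output dataset schema and re-running the recipe is one way to test whether the type mapping is the trigger.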