ModuleNotFoundError: No module named 'dataiku'

Solved!
wangqij

Hi,

I'm running a very simple Python recipe with local execution, but I get the following error:

on-recipe/pyoutic2pxHWrHs8t', '/data/lib/python', '/home/dataiku/dataiku-dss-10.0.2/python', '/usr/lib64/python36.zip', '/usr/lib64/python3.6', '/usr/lib64/python3.6/lib-dynload', '/data/pyenv/lib64/python3.6/site-packages', '/data/pyenv/lib/python3.6/site-packages', '/home/dataiku/dataiku-dss-10.0.2/python36.packages']
[01:58:55] [INFO] [dku.utils] - 2022-03-31 01:58:55,483 INFO Script file: /data/jobs/DKU_TSHIRTS/Build_uif_test__NP__2022-03-31T01-58-54.097/compute_uif-test_NP/cpython-recipe/pyoutic2pxHWrHs8t/script.py
[01:58:55] [INFO] [dku.utils] - Traceback (most recent call last):
[01:58:55] [INFO] [dku.utils] - File "/data/jobs/DKU_TSHIRTS/Build_uif_test__NP__2022-03-31T01-58-54.097/compute_uif-test_NP/cpython-recipe/pyoutic2pxHWrHs8t/python-exec-wrapper.py", line 196, in <module>
[01:58:55] [INFO] [dku.utils] - import dataiku
[01:58:55] [INFO] [dku.utils] - ModuleNotFoundError: No module named 'dataiku'
[01:58:55] [INFO] [dku.utils] - 2022-03-31 01:58:55,488 21392 INFO [Child] Process 21394 exited with exit=1 signal=0
[01:58:55] [INFO] [dku.utils] - 2022-03-31 01:58:55,488 21392 INFO Full child code: 1
[01:58:55] [WARN] [dku.resource] - stat file for pid 21394 does not exist. Process died?
[01:58:55] [DEBUG] [dku.resourceusage] - Reporting completion of CRU:{"context":{"type":"JOB_ACTIVITY","authIdentifier":"admin","projectKey":"DKU_TSHIRTS","jobId":"Build_uif_test__NP__2022-03-31T01-58-54.097","activityId":"compute_uif-test_NP","activityType":"recipe","recipeType":"python","recipeName":"compute_uif-test"},"type":"LOCAL_PROCESS","id":"dWsBSJrenjmYksDi","startTime":1648691935427,"localProcess":{"pid":21394,"commandName":"/data/bin/python","cpuUserTimeMS":10,"cpuSystemTimeMS":0,"cpuChildrenUserTimeMS":0,"cpuChildrenSystemTimeMS":0,"cpuTotalMS":10,"cpuCurrent":0.0,"vmSizeMB":121,"vmRSSMB":4,"vmHWMMB":4,"vmRSSAnonMB":1,"vmDataMB":1,"vmSizePeakMB":121,"vmRSSPeakMB":4,"vmRSSTotalMBS":0,"majorFaults":0,"childrenMajorFaults":0}}
[01:58:55] [INFO] [dku.flow.activity] - Run thread failed for activity compute_uif-test_NP
com.dataiku.dip.exceptions.ProcessDiedException: The Python process failed (exit code: 1). More info might be available in the logs.
at com.dataiku.dip.dataflow.common.CodeBasedThingHelper.throwSubprocessError(CodeBasedThingHelper.java:23)
at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.handleExecutionResult(JobExecutionResultHandler.java:26)
at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:74)
at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:54)
at com.dataiku.dip.recipes.code.python.PythonRecipeRunner.run(PythonRecipeRunner.java:64)
at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:374)
[01:58:55] [INFO] [dku.flow.activity] running compute_uif-test_NP - activity is finished
[01:58:55] [ERROR] [dku.flow.activity] running compute_uif-test_NP - Activity failed
com.dataiku.dip.exceptions.ProcessDiedException: The Python process failed (exit code: 1). More info might be available in the logs.
at com.dataiku.dip.dataflow.common.CodeBasedThingHelper.throwSubprocessError(CodeBasedThingHelper.java:23)
at com.dataiku.dip.dataflow.exec.JobExecutionResultHandler.handleExecutionResult(JobExecutionResultHandler.java:26)
at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:74)
at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:54)
/exit

 

 

Here is the Python code:

# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
#crm_and_web_history_enriched = dataiku.Dataset("crm_and_web_history_enriched")
crm_and_web_history_enriched = dataiku.Dataset("crm_and_web_history_enriched")
crm_and_web_history_enriched_df = crm_and_web_history_enriched.get_dataframe()


# Compute recipe outputs from inputs
# TODO: Replace this part by your actual code that computes the output, as a Pandas dataframe
# NB: DSS also supports other kinds of APIs for reading and writing data. Please see doc.

uif_test_df = crm_and_web_history_enriched_df # For this sample code, simply copy input to output


# Write recipe outputs
uif_test = dataiku.Dataset("uif-test")
uif_test.write_with_schema(uif_test_df)

 

 

BTW, User Isolation is enabled on this DSS instance, which runs on a local Linux machine.

Thanks.

sergeyd
Dataiker

Hi @wangqij 

Usually, this means that the impersonated user has no traversal access to DSS_INSTALL_DIR, which is where the dataiku package is located, since it ships with DSS.

Please check that the impersonated Unix user has access to DSS_INSTALL_DIR/python/.
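
One way to run that check is a small script like the one below. It is only a sketch, not an official Dataiku tool: the function name `find_blocked_dirs` is made up for illustration, and the default path is simply the DSS python directory that appears in the `sys.path` printed in the job log above, so adjust it to your own DSS_INSTALL_DIR. It walks from `/` down to the target and lists every directory the current user cannot traverse.

```python
import os
import sys
from pathlib import Path


def find_blocked_dirs(path):
    """Return the directories on the way to `path` that the current user cannot enter.

    Each ancestor needs execute (traversal) permission. The target itself also
    needs read permission so that Python can list and import what is inside.
    A directory that does not exist is reported as blocked as well.
    """
    target = Path(os.path.abspath(path))
    # Path.parents goes from the nearest parent up to "/", so reverse it.
    ancestors = list(target.parents)[::-1]

    blocked = []
    for directory in ancestors:
        if not os.access(directory, os.X_OK):
            blocked.append(str(directory))
    if not os.access(target, os.R_OK | os.X_OK):
        blocked.append(str(target))
    return blocked


if __name__ == "__main__":
    # Assumption: default taken from the sys.path entry shown in the job log.
    default_path = "/home/dataiku/dataiku-dss-10.0.2/python"
    path_to_check = sys.argv[1] if len(sys.argv) > 1 else default_path

    blocked_dirs = find_blocked_dirs(path_to_check)
    if blocked_dirs:
        print("Not accessible for this user:")
        for directory in blocked_dirs:
            print("  " + directory)
    else:
        print("All directories down to " + path_to_check + " are accessible.")
```

The result only means something when the script runs as the impersonated Unix user (for example through `sudo -u <user> python3 <script>`), not as the DSS service account. If a parent such as the home directory of the DSS account shows up in the output, that directory is probably missing the execute bit for other users.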
