Issues with pandas version

BillMurray
Level 1

Hello, I am evaluating DSS.



I am not an expert in it. I am trying to load a project that colleagues created in an earlier version, I think 4.0.



I am able to import the project (which contains Hadoop and Spark steps). The problem occurs when I try to build the whole flow.



I am receiving this error:



[11:54:31] [INFO] [dku.utils] - raise Exception("Base package %s is too recent: version %s was found. %s. You should not install overriding versions of DSS base packages. Run '$DATADIR/bin/pip uninstall %s'" % (name, p.__version__, error_details, name))
[11:54:31] [INFO] [dku.utils] - Exception: Base package pandas is too recent: version 0.23.0 was found. Expected version 0.20.X. You should not install overriding versions of DSS base packages. Run '$DATADIR/bin/pip uninstall pandas'
[11:54:31] [INFO] [dku.flow.activity] - Run thread failed for activity compute_Elbow_Table_NP
com.dataiku.dip.exceptions.ProcessDiedException: The Python process failed (exit code: 1). More info might be available in the logs.
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.throwSubprocessError(AbstractCodeBasedActivityRunner.java:373)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:363)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:276)
    at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:32)
    at com.dataiku.dip.recipes.code.python.PythonRecipeRunner.run(PythonRecipeRunner.java:56)
    at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:352)
[11:54:31] [INFO] [dku.flow.activity] running compute_Elbow_Table_NP - activity is finished
[11:54:31] [ERROR] [dku.flow.activity] running compute_Elbow_Table_NP - Activity failed
com.dataiku.dip.exceptions.ProcessDiedException: The Python process failed (exit code: 1). More info might be available in the logs.
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.throwSubprocessError(AbstractCodeBasedActivityRunner.java:373)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:363)
    at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:276)
    at com.dataiku.dip.dataflow.exec.AbstractPythonRecipeRunner.executeScript(AbstractPythonRecipeRunner.java:32)
    at com.dataiku.dip.recipes.code.python.PythonRecipeRunner.run(PythonRecipeRunner.java:56)
    at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:352)



Do you have any suggestions?



Thanks, Bill

6 Replies
Alex_Combessie
Dataiker Alumni

Hi Bill,



(I'm a big fan of your movies)



It seems someone made a change to the version of pandas installed in the built-in Python environment of Dataiku. As indicated in the error message, Dataiku requires pandas version 0.20.X to work. Can you or your Dataiku admin run the following Shell command:




$DATADIR/bin/pip uninstall pandas


Where $DATADIR is the data directory of your Dataiku DSS node, see https://www.dataiku.com/learn/guide/getting-started/dss-concepts/the-dss-datadir.html
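To confirm which pandas version the built-in environment actually sees, a small check like the one below may help. It is only a sketch: `matches_expected` and `installed_pandas_version` are hypothetical helper names, the "0.20.X" pattern is copied from the error message, and running it with `$DATADIR/bin/python` is an assumption made by analogy with the `$DATADIR/bin/pip` path shown in that message.

```python
def matches_expected(found, expected):
    """Return True if `found` (e.g. '0.20.3') fits a pattern like '0.20.X'.

    An 'X' component in `expected` accepts any value; other components
    must be equal. Hypothetical helper, not part of DSS.
    """
    want = expected.split(".")
    have = found.split(".")
    if len(have) < len(want):
        return False
    return all(w.upper() == "X" or w == h for w, h in zip(want, have))


def installed_pandas_version():
    """Return the pandas version visible to this interpreter, or None."""
    try:
        import pandas
    except ImportError:
        return None
    return pandas.__version__


if __name__ == "__main__":
    version = installed_pandas_version()
    if version is None:
        print("pandas is not installed in this environment")
    elif matches_expected(version, "0.20.X"):
        print("pandas %s matches 0.20.X" % version)
    else:
        print("pandas %s does NOT match 0.20.X" % version)
```

With the version from the error message, `matches_expected("0.23.0", "0.20.X")` is false, which is the condition DSS is complaining about.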



Cheers,



Alex

BillMurray
Level 1
Author
Hi, thanks for your appreciation!
In fact, I act as admin here; we are just testing.
I already tried uninstalling pandas, which succeeded, and then installing the version the error message asks for (0.20), which failed.
It says it cannot install that version because a file is missing.
You can find the full message, including the log, here: https://pastebin.com/HhdwaTLh
Thanks for your help!
Alex_Combessie
Dataiker Alumni
Hmmm. The logs indicate that some system packages are missing. You need to install the system development tools and the Python interpreter header files. More info:
https://doc.dataiku.com/dss/latest/installation/python.html#additional-prerequisites
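Before retrying the pandas install, it may be worth checking whether those prerequisites are visible to the interpreter. The sketch below uses only the standard library; `python_header_path` and `missing_build_prerequisites` are hypothetical names, and looking for `gcc` or `cc` on the PATH is an assumption about what "development tools" means on your system.

```python
import os
import shutil
import sysconfig


def python_header_path():
    """Return where Python.h is expected for the running interpreter."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def missing_build_prerequisites(compilers=("gcc", "cc")):
    """Return a list of prerequisites that could not be found.

    Checks for a C compiler on the PATH and for the Python header file.
    The compiler names are an assumption; adjust them for your system.
    """
    missing = []
    if not any(shutil.which(name) for name in compilers):
        missing.append("C compiler")
    if not os.path.isfile(python_header_path()):
        missing.append("Python.h")
    return missing


if __name__ == "__main__":
    problems = missing_build_prerequisites()
    if problems:
        print("Missing: " + ", ".join(problems))
    else:
        print("A compiler and Python.h were both found")
```

Note that `shutil.which` needs Python 3, and the result only describes the interpreter that runs the script, so it should be run with the same Python that DSS uses.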
BillMurray
Level 1
Author
OK, thanks.
I installed the development tools, but since the install was slow, I went for a coffee.
Unfortunately, the power went out. I restarted the system, but now it seems that DSS's internal DB is corrupted.
In backend.log I can find an endless stack trace; the interesting message, in my opinion, is:

Caused by: java.lang.IllegalStateException: Reading from nio:/opt/dataiku/databases/flow_state.mv.db failed; file length 499712 read length 1024 at 509121 [1.4.195/1]

How can I restore or discard this DB?
Thanks
Alex_Combessie
Dataiker Alumni
Hi, you can stop DSS, remove the corrupted DB, and start DSS again.
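If deleting the file outright feels risky, one option is to rename it so it can be inspected or restored later. This is a minimal sketch, not an official procedure: `set_aside` is a hypothetical helper, the path in the usage note is the one from the error message above, and it assumes DSS has already been stopped (typically with `$DATADIR/bin/dss stop`, which is an assumption here).

```python
import os
import shutil
import time


def set_aside(db_path, suffix=None):
    """Rename `db_path` to a '.corrupted-<suffix>' copy and return the new path.

    Assumes DSS is stopped so nothing holds the file open.
    The suffix defaults to a timestamp so repeated runs do not collide.
    """
    if not os.path.isfile(db_path):
        raise FileNotFoundError(db_path)
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    backup = "{}.corrupted-{}".format(db_path, suffix)
    if os.path.exists(backup):
        raise FileExistsError(backup)
    shutil.move(db_path, backup)
    return backup
```

For the file named in the error, that would be `set_aside("/opt/dataiku/databases/flow_state.mv.db")`, followed by starting DSS again.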
Alex_Combessie
Dataiker Alumni
Hi, did you solve your issue?