Problem with np.bool dependency in "dataiku" package in python
Hi,
I'm having a problem importing the "dataiku" package in python in the following environments since the update to 11.2:
python 3.6
python 3.7
python 3.9
# Example: load a DSS dataset as a Pandas dataframe
mydataset = dataiku.Dataset("PWHRn6UM")
import dataiku
/data1/dss/dataiku-dss-11.2.0/python/dataiku/core/schema_handling.py:17: FutureWarning: In the future `np.bool` will be defined as the corresponding NumPy scalar. (This may have returned Python scalars in past versions.
  'boolean': np.bool
At first this appears to be a soft warning, but any attempt to use the dataiku package fails:
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-3-4bdc522e44af> in <cell line: 2>()
      1 # Example: load a DSS dataset as a Pandas dataframe
----> 2 mydataset = dataiku.Dataset("PWHRn6UM")

NameError: name 'dataiku' is not defined
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-3acb3cd0f9dd> in <cell line: 1>()
----> 1 import dataiku

/data1/dss/dataiku-dss-11.2.0/python/dataiku/__init__.py in <module>
     11 from .base import remoterun
     12 from .core.base import is_container_exec
---> 13 from .core.dataset import Dataset, _dataset_writer_atexit_handler
     14 from .core.schema_handling import get_schema_from_df
     15

/data1/dss/dataiku-dss-11.2.0/python/dataiku/core/dataset.py in <module>
     33
     34 # Module code
---> 35 from dataiku.core import flow, base, schema_handling, dkuio
     36 from dataiku.core.platform_exec import read_dku_json
     37 from dataiku.core.dkujson import dump_to_filepath, load_from_filepath

/data1/dss/dataiku-dss-11.2.0/python/dataiku/core/schema_handling.py in <module>
     15     'float': np.float32,
     16     'double': np.float64,
---> 17     'boolean': np.bool
     18 }
     19

/data1/dss/dss_data/code-envs/python/GC_dbs_39/lib64/python3.9/site-packages/numpy/__init__.py in __getattr__(attr)
    282             return Tester
    283
--> 284         raise AttributeError("module {!r} has no attribute "
    285                              "{!r}".format(__name__, attr))
    286

AttributeError: module 'numpy' has no attribute 'bool'
Operating system used: Windows
Best Answer
-
Miguel Angel Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 118 Dataiker
Hi,
The error is not related to the newer DSS version. This behaviour comes from the version of the numpy package that the code env uses. While the deprecation warning is harmless, the AttributeError occurs because 'numpy.bool' is no longer available as of numpy 1.24.0 (released on 19th of December): https://github.com/numpy/numpy/releases/tag/v1.24.0
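For anyone whose own code is hit by the same removal: the old `np.bool` alias was just Python's builtin `bool`, so either `bool` or `np.bool_` can be used in its place. A minimal sketch, assuming a mapping shaped like the one visible in the traceback (the name `TYPE_MAP` is made up for illustration and is not the actual DSS source):

```python
import numpy as np

# Hypothetical mapping modelled on the lines visible in the traceback;
# TYPE_MAP is an illustrative name, not the actual DSS source.
TYPE_MAP = {
    "float": np.float32,
    "double": np.float64,
    "boolean": np.bool_,  # NumPy's boolean scalar type, used instead of the removed np.bool alias
}

# The builtin bool and np.bool_ describe the same dtype.
same_dtype = np.dtype(TYPE_MAP["boolean"]) == np.dtype(bool)
print(same_dtype)
```

This only helps for code you control; for third-party packages that still reference `np.bool`, pinning numpy remains the practical option.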
We are in the process of fixing this globally for DSS. In the meantime, you can apply a quick workaround on your code env:
1) Go to Administration > Code Envs > Select your code env
2) Add the following in 'Packages to install': numpy<1.24
3) Click 'UPDATE'
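After updating the code env, it can be worth confirming from a notebook that the installed numpy actually satisfies the pin. A small sketch, assuming you pass in the version string yourself (the helper name `satisfies_pin` is hypothetical and not part of any Dataiku or numpy API):

```python
import re

def satisfies_pin(version: str) -> bool:
    """Return True if a numpy version string is below 1.24 (i.e. matches 'numpy<1.24')."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: {!r}".format(version))
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison: (1, 23) < (1, 24) is True, (1, 24) < (1, 24) is False.
    return (major, minor) < (1, 24)

# Typical use inside the code env:
#   import numpy
#   print(numpy.__version__, satisfies_pin(numpy.__version__))
```

If this reports False after the update, the code env is probably still resolving a newer numpy through another package requirement.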
Answers
-
The issue is resolved if I use numpy==1.23 on python 3.9.
However, numpy 1.23 is not compatible with python 3.6 / 3.7.
-
Thanks, this resolves the issue with the earlier versions of python.
-
It looks like this hasn't been completed yet:
"We are in the process of fixing this globally for DSS."
Is this still WIP on the Dataiku side?
-
Hello,
As of version 12.2.3, we've removed this deprecated usage from our codebase. You may still get the warning since it's in use in some common dependencies like pandas or scikit-optimize, meaning we still don't recommend using numpy>=1.24 in general.
Best regards
-
Using one now with numpy 1.23.5, same error.
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,002 Neuron
Please start a new thread.