Python recipe output is written incorrectly to the output dataset

afostor Registered Posts: 6 ✭✭✭✭

I checked my output DataFrame in my Python code just before the dataiku.Dataset("output_dataset").write_with_schema(df) call. However, when I check the output dataset, some rows are missing and duplicates of other rows appear in their place. What could have happened?


Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker

    Hi,

    This type of issue can be due to an unsupported version of pandas being used in your code env.

    Can you please confirm the exact Python version and pandas version you are using? You can quickly check by running:

    import pandas as pd
    from platform import python_version
    print(python_version())
    print(pd.show_versions())

    You should be using one of the pandas versions listed for your code env under Packages to install -> Core package versions:

    Screenshot 2022-05-04 at 20.52.50.png
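To pin down which rows are lost or duplicated, it can also help to compare the DataFrame you pass to write_with_schema with what is read back from the dataset. The sketch below is only an illustration, not an official Dataiku utility: diff_rows is a hypothetical helper, and the sample rows are made up. It compares rows as plain tuples, so rows containing NaN will show up as differences even when they are unchanged.

```python
from collections import Counter


def diff_rows(before, after):
    """Compare two iterables of row tuples as multisets.

    Returns (missing, extra): rows that occur more often in `before`
    than in `after`, and rows that occur more often in `after`.
    """
    before_counts = Counter(before)
    after_counts = Counter(after)
    # Counter subtraction keeps only positive counts.
    missing = before_counts - after_counts
    extra = after_counts - before_counts
    return dict(missing), dict(extra)


# Made-up rows standing in for the real DataFrames.
in_memory = [(1, "a"), (2, "b"), (3, "c")]
read_back = [(1, "a"), (1, "a"), (3, "c")]  # row 2 dropped, row 1 duplicated

missing, extra = diff_rows(in_memory, read_back)
print("missing:", missing)
print("extra:", extra)

# In a DSS recipe or notebook, the rows could be obtained like this:
#   before = list(df.itertuples(index=False, name=None))
#   written = dataiku.Dataset("output_dataset").get_dataframe()
#   after = list(written.itertuples(index=False, name=None))
```

If both results are empty, the data was written as it was in memory and the problem is more likely in how the dataset is displayed or sampled.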

  • afostor
    afostor Registered Posts: 6 ✭✭✭✭

    These are my versions and, as far as I can tell, they match the ones supported by Dataiku.

    3.6.8
    INSTALLED VERSIONS
    ------------------
    commit           : b5958ee1999e9aead1938c0bba2b674378807b3d
    python           : 3.6.8.final.0
    python-bits      : 64
    OS               : Linux
    OS-release       : 3.10.0-1160.59.1.el7.x86_64
    Version          : #1 SMP Wed Feb 16 12:17:35 UTC 2022
    machine          : x86_64
    processor        : x86_64
    byteorder        : little
    LC_ALL           : en_US.UTF-8
    LANG             : en_US.UTF-8
    LOCALE           : en_US.UTF-8
    pandas           : 1.1.5
    numpy            : 1.19.5
    pytz             : 2020.5
    dateutil         : 2.8.1
    pip              : 21.3.1
    setuptools       : 51.3.3
    Cython           : None
    pytest           : None
    hypothesis       : None
    sphinx           : None
    blosc            : None
    feather          : None
    xlsxwriter       : None
    lxml.etree       : None
    html5lib         : None
    pymysql          : None
    psycopg2         : None
    jinja2           : 3.0.1
    IPython          : 7.16.1
    pandas_datareader: None
    bs4              : None
    bottleneck       : None
    fsspec           : 2021.08.1
    fastparquet      : None
    gcsfs            : 2021.08.1
    matplotlib       : 3.3.4
    numexpr          : 2.7.3
    odfpy            : None
    openpyxl         : None
    pandas_gbq       : 0.14.1
    pyarrow          : 5.0.0
    pytables         : None
    pyxlsb           : None
    s3fs             : None
    scipy            : 1.5.4
    sqlalchemy       : 1.4.23
    tables           : None
    tabulate         : 0.8.9
    xarray           : None
    xlrd             : 2.0.1
    xlwt             : None
    numba            : 0.53.1
    None

    However, I switched to a different, already configured code environment and it worked. I think some of the other libraries in the original environment may not be compatible with Dataiku.

    Thanks
