DSS Python pandas heavy datasets problem

Houssam_2000 Dataiku DSS Core Designer, Registered Posts: 4
edited July 16 in Setup & Configuration

Hello,

I am struggling with a problem in DSS:
- I am trying to read and process some tables stored in Hive using Python pandas. The tables are quite big, but I am trying to optimize my process. Here is what I see in the log:

Columns (53,139) have mixed types.Specify dtype option on import or set low_memory=False.
  exec(f.read())


I get many lines of this type and they take a long time to execute. Any ideas on how to speed up the processing?
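
For context, the message above is the pandas DtypeWarning, which the CSV parser emits when it guesses column types chunk by chunk and finds conflicting types in the same column. The warning text itself points at declaring dtypes up front. Below is a minimal sketch of that idea in plain pandas, assuming the data ends up going through `pandas.read_csv`; the sample data and the column names `id` and `code` are hypothetical, and how this maps onto the DSS dataset API is not shown here.

```python
import io

import pandas as pd

# Hypothetical sample: "code" holds numbers in most rows and text in one row,
# the kind of column that can trigger the mixed-types warning on large files.
csv_text = "id,code\n1,100\n2,200\n3,A7\n"

# Declaring the dtype for the ambiguous column means pandas does not have to
# guess its type chunk by chunk; every value is read as a string.
df = pd.read_csv(io.StringIO(csv_text), dtype={"code": str})

print(df["code"].tolist())
```

Reading only the columns that are actually needed (the `usecols` argument of `read_csv`) may also reduce the amount of data parsed, though whether it helps depends on the table.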

Thank you


Operating system used: Linux
