Error 500 when trying to refresh full sample

UserBird
UserBird Dataiker, Alpha Tester Posts: 535 Dataiker

I am discovering DSS and this is my test use case:

1- Connect to a single PostgreSQL table (around 4 million records)

2- Add a recipe to add a column (XYZ) and output to the managed filesystem.

3- Create a pivot table to calculate the average of a measure (speed), with 1 dimension (the new XYZ column, 5 possible values)

4- Get the answer based on the full dataset

So I build the recipe and then go to the dataset => chart or the analysis => chart to design the pivot table. It works fine with the 10,000-row sample, but when I ask for the full dataset as the sample (DSS engine), it throws an error 500 after 5 to 15 minutes.

Any idea?

Answers

  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker
    Hi,

    It would help to have the "run/backend.log" file. The helper process that manages the chart probably ran out of memory, which can happen if some of the columns have a very large number of unique values.

    For charts on data this large, we'd recommend that you switch the charts to the "In-database" engine (in the Sample settings). This pushes almost all of the computation down to the database and gives better performance.
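
To illustrate what pushing the computation down to the database means, here is a minimal sketch, not DSS's actual implementation. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the table name `trips`, the column names `xyz` and `speed`, and the helper `average_by_dimension` are all hypothetical. The point is that the database does the grouping and returns one row per dimension value instead of millions of raw rows.

```python
import sqlite3


def average_by_dimension(conn, table, dimension, measure):
    """Let the database compute the per-group average (hypothetical helper)."""
    # Identifiers are interpolated directly, so only pass trusted names here.
    query = (
        f'SELECT "{dimension}", AVG("{measure}") '
        f'FROM "{table}" GROUP BY "{dimension}" ORDER BY "{dimension}"'
    )
    return conn.execute(query).fetchall()


# Tiny in-memory table standing in for the real 4-million-row table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (xyz TEXT, speed REAL)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("a", 10.0), ("a", 20.0), ("b", 30.0)],
)

result = average_by_dimension(conn, "trips", "xyz", "speed")
print(result)
```

With 5 possible values in the dimension, a query of this shape would send back only 5 rows, whatever the size of the table.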
  • donvincenzo
    donvincenzo Registered Posts: 1 ✭✭✭✭
    The dataset is a managed dataset on the local filesystem (CSV), so no database engine is available. The doc says: "The DSS Charts Engine does not require that the chart data be loaded in memory, but is instead able to efficiently stream data from disk and perform queries on the fly. This allows you to perform visual analytics on very large data extracts that would not fit in RAM using commodity hardware."

    How can I use this Charts Engine for my chart (or pivot table)?
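
For reference, the kind of streaming aggregation the quoted documentation describes can be sketched as follows. This is only an illustration of the general technique, not how the DSS Charts Engine is implemented. The column names `xyz` and `speed` and the helper `streaming_average` are hypothetical. Rows are read one at a time, and only a running sum and count per dimension value are kept in memory.

```python
import csv
import io
from collections import defaultdict


def streaming_average(rows, dimension, measure):
    """Average `measure` per `dimension` value, reading rows one at a time."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        key = row[dimension]
        sums[key] += float(row[measure])
        counts[key] += 1
    # Memory use grows with the number of distinct keys, not the number of rows.
    return {key: sums[key] / counts[key] for key in sums}


# A small in-memory CSV stands in for the large file on disk.
sample = io.StringIO("xyz,speed\na,10\na,20\nb,30\n")
averages = streaming_average(csv.DictReader(sample), "xyz", "speed")
print(averages)
```

For a real file, `csv.DictReader(open(path, newline=""))` would stream it the same way. A dimension with only 5 values needs almost no memory in this scheme, whereas a column with a very large number of unique values would make the per-key state grow.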