Functionality questions on Dataiku

parneet_gill Registered Posts: 1
  • Can Spark be configured for ML algorithms? It looks like processing currently happens in memory.
    • Is a Spark processing option available for K-means clustering, PCA, and linear regression?
    • Is LightGBM available with Spark?
  • Is automated hyperparameter tuning available in Dataiku?
  • Schema visibility:
    • Our current Dataiku configuration lacks schema visibility. Is there a way to view the schema before datasets are created for Redshift/Redshift Spectrum and the S3 curated zone?
  • Currently Dataiku times out on queries against large datasets, and it’s difficult to troubleshoot a query in Dataiku. Is there a way to optimize this?
  • "Read table" on dataset creation doesn’t work for Redshift Spectrum. Is this a Dataiku bug?


  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,727 Neuron

    I suggest you split these into individual questions/threads, as you are asking too many questions in a single post. Where you get an error, you should post the error itself: "doesn't work" or "times out" doesn't really say much, so post the full error trace from the backend log. Where you say Dataiku lacks schema visibility, please post what you see (a screenshot) and explain why you think something is missing.
