Time series preparation: STL import error

raphanash
raphanash Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 9

Hi there,

I am working on a multivariate analysis of a time series dataset containing various commodity prices.

image (6).png

After preparation, the dataset consists of:

  • a parsed date, "Data_parsed"
  • 63 other columns of various commodity prices and indexes ranging from Fuels and Beverages to Metals

I am using the Time Series Preparation plugin to resample the dataset so that I can run a Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test, which requires a constant, regular time step for its results to be valid.
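
As a rough illustration of the "constant, regular time step" requirement, here is a minimal sketch that finds the most common gap between consecutive dates and lists the places where the gap differs. It uses only the standard library; the function name `step_report` and the sample dates are made up for illustration and are not part of the plugin. It assumes a fixed-length step such as daily or weekly; month-based steps vary in length and would need a different comparison.

```python
from collections import Counter
from datetime import date, timedelta


def step_report(dates):
    """Return the most common gap between consecutive dates and the
    pairs of neighbouring dates whose gap differs from it."""
    ordered = sorted(dates)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return None, []
    usual_gap, _ = Counter(gaps).most_common(1)[0]
    irregular = [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a != usual_gap]
    return usual_gap, irregular


# Hypothetical sample: daily observations with 2023-01-04 missing.
sample = [
    date(2023, 1, 1),
    date(2023, 1, 2),
    date(2023, 1, 3),
    date(2023, 1, 5),
    date(2023, 1, 6),
]
usual_gap, irregular = step_report(sample)
print("usual gap:", usual_gap)
print("irregular pairs:", irregular)
```

An empty list of irregular pairs would suggest the series is already regular; any pairs reported are the spots that resampling has to fill.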

However, I get an "ImportError" when STL is imported from "statsmodels.tsa.seasonal".

Here is the setup containing the error:

image (6) copy.png

For more detail, here is the error log:

image (6) copy 2.png

image (6) copy 3.png

I'm not sure whether this just needs a simple fix such as running "pip install statsmodels", or whether there is another cause behind the error.
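
One way to narrow this down is to check, from inside the same code environment the recipe runs in, whether statsmodels is installed at all and whether the exact import from the error succeeds. The sketch below is only a diagnostic aid: the helper name `diagnose_stl_import` is hypothetical, and it assumes you can run Python code (for example in a notebook) with that environment selected. A statsmodels release that predates STL would fail at the second step even though the package itself imports.

```python
def diagnose_stl_import():
    """Report whether statsmodels is installed, its version,
    and whether STL can be imported from statsmodels.tsa.seasonal."""
    try:
        import statsmodels
    except ImportError:
        # The package is missing from this environment altogether.
        return {"installed": False, "version": None, "stl_importable": False}

    version = getattr(statsmodels, "__version__", "unknown")
    try:
        # Same import that fails in the error log.
        from statsmodels.tsa.seasonal import STL  # noqa: F401
        stl_ok = True
    except ImportError:
        stl_ok = False
    return {"installed": True, "version": version, "stl_importable": stl_ok}


report = diagnose_stl_import()
print(report)
```

If the package is installed but STL is not importable, the installed version is the likely culprit, which points to a version pin rather than a plain reinstall.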

I am using the free version of Dataiku DSS, version 11.3.2.

Thanks in advance for any help.

Answers

  • JordanB
    JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 296 Dataiker
    edited July 17

    Hi @raphanash,

    Please try pinning this version (add it to your code env's package list and select Update):

    statsmodels==0.12.1

    Thanks!

    Jordan

  • raphanash
    raphanash Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 9

    Many thanks!

    In admin settings, I can't see any code environments.

    Do I need to create an environment and upload it to DSS or should there be an existing environment with basic packages to which I can add "statsmodels==0.12.1"?

  • JordanB
    JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 296 Dataiker

    Hi @raphanash,

    It sounds like you may need your DSS administrator to create a code environment for you or give you permission to create and/or manage a code env in order to add this package.

    Create new code env (Administration > Code Envs):

    Screen Shot 2023-03-20 at 2.06.02 PM.png

    Add package under "packages to install":

    Screen Shot 2023-03-20 at 2.06.36 PM.png

    Thanks!

    Jordan

  • raphanash
    raphanash Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 9

    Hi,

    I've added the statsmodels package to "packages to install", but I've encountered this error (below). DSS automatically added the required packages for Visual ML on top of statsmodels.

    I am running Python 3.10.

    image (6).png

    I appreciate your help!

  • raphanash
    raphanash Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 9

    I tried again without the additional packages, but the same error appears.

  • JordanB
    JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 296 Dataiker

    Hi @raphanash,

    Please try rebuilding the code environment from the Plugins page (it should be based on Python 3.6). Select Plugins > Installed > Time Series Preparation > Change or Dissociate code environment.

    Screen Shot 2023-03-21 at 3.50.46 PM.png

    This will automatically add the packages that you need to the code env.

    If you receive an error, please send a list of the packages that are installed in the new managed code env as well as the error message.

    Thanks!

    Jordan

  • raphanash
    raphanash Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 9

    Hi,

    Do I need to have Python 3.6 installed for this to work?

    I haven't been able to build the environment. Perhaps I am using an incorrect Python implementation:

    image (8).png

  • JordanB
    JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 296 Dataiker

    Hi @raphanash,

    According to your logs, you do have Python 3.7 installed as well. However, you need Python 3.6 installed to build this managed code env: https://github.com/dataiku/dss-plugin-timeseries-preparation/blob/master/code-env/python/desc.json#L3

    Once you've installed Python 3.6 on your host machine and added it to your PATH, please restart DSS.

    Thanks,

    Jordan

  • raphanash
    raphanash Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 9

    Hi there, everything works perfectly now!

    Many thanks for your help.
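
For anyone hitting the same build failure, the Python 3.6 requirement discussed in this thread can be checked before restarting DSS. The following is a minimal sketch, assuming a Linux or macOS host where the interpreter is exposed as `python3.6`; the helper name `find_interpreter` is hypothetical. It should be run as the same OS user that launches DSS, since PATH can differ between users.

```python
import shutil
import subprocess


def find_interpreter(name="python3.6"):
    """Return (path, version string) for an interpreter found on PATH,
    or (None, None) if no executable with that name is found."""
    path = shutil.which(name)
    if path is None:
        return None, None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return path, result.stdout.strip()


path, version = find_interpreter()
if path is None:
    print("python3.6 was not found on PATH")
else:
    print("found", path, "reporting", version)
```

If nothing is found, installing Python 3.6 and adding it to PATH, then restarting DSS as described above, is the step that resolved the issue here.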
