Time series must have at least 3 values

reeda19
Level 1

I am trying to get a forecast on some data I have, but this error keeps appearing, and when I try to Google it, nothing shows up. I have time sampled the data successfully. To do this, I used 4 different identifying columns to build an ID for each row, and then used this new ID column as the "Column with identifier" while time sampling. I then filtered out any rows with null values, leaving me with what should be filtered, sampled data. To forecast the data, I use the AutoML - Quick Prototypes option. When I try to model it, I get the following error:

```
Job failed: Error in Python process: At line 59: <class 'ValueError'>: Time series must have at least 3 values
```

I am not sure what the "3 values" is referring to.

 

I am fairly new to Dataiku, so I am not sure what other information is relevant. If I need to clarify anything, please let me know. Thanks

2 Replies
CoreyS
Dataiker Alumni

Hi, @reeda19! Can you provide any further details on the thread to help users find a solution (for example, your DSS version)? Also, can you let us know if you've tried any fixes already? This should lead to a quicker response from the community.

Krishna
Dataiker

Hi @reeda19,

It's possible that your data has not been prepared in the "Long format" https://doc.dataiku.com/dss/latest/time-series/data-formatting.html#long-format, which is the format that requires the "Column with Identifier".

The error message, together with your mention of "I used 4 different identifying columns to build an ID for each row", makes it sound like you've created a unique identifier for each record and passed that through as the "Column with Identifier". This would lead DSS to believe you have many time series with a single record each, whereas it expects at least 3 records per series.

If your input data does not contain multiple series 'stacked' together, then it is in wide format and you should not use the 'column with identifier' parameter. If it does, then as per the documentation example, you ought to use a column such as 'carriergroup' as the 'column with identifier'.
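If you want to check this outside of DSS, here is a minimal sketch in plain Python that counts how many records each identifier value has. The sample rows, the `row_id` column, and the helper names are made up for illustration; the threshold of 3 is taken from the error message, and `carriergroup` is borrowed from the documentation example. This is not how DSS performs the check internally, only a way to see the same symptom.

```python
from collections import Counter

MIN_VALUES = 3  # threshold taken from the error message


def series_lengths(rows, id_column):
    """Count how many records each identifier value (i.e. each series) has."""
    return Counter(row[id_column] for row in rows)


def short_series(rows, id_column, min_values=MIN_VALUES):
    """Return the identifiers whose series has fewer than min_values records."""
    counts = series_lengths(rows, id_column)
    return sorted(key for key, n in counts.items() if n < min_values)


# Hypothetical long-format data: two carrier groups, three dates each.
# "row_id" mimics an identifier that is unique for every record.
rows = [
    {"date": date, "carriergroup": group, "row_id": f"{group}_{date}"}
    for group in ("A", "B")
    for date in ("2020-01", "2020-02", "2020-03")
]

# A per-row identifier makes every series one record long.
print(len(short_series(rows, "row_id")))

# A per-series identifier leaves no series below the threshold.
print(short_series(rows, "carriergroup"))
```

With the per-row ID, all six records end up as separate one-record series, which is the situation that would trigger the error. With `carriergroup`, each series has three records and nothing is flagged.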

 

Hope that helps.
