Timeseries too short for training Error
Hello community,
We are using time series to predict the next candidate in the presidential elections over horizons of 1, 2 and 3 years. The data had no dates, so for each election (we used the last 3 elections) we generated a date from the election year, in the format yyyy-dd-mmTHH:MM:SS.ss, so every row of a given election carries the same date. We got the error below:
Failed to train :<class 'ValueError'>: Timeseries too short for training. Identifier : {"Candidate_Name":"Toto"}|Length: 5 < Minlength :<number>. try to decrease the evaluation set size and/or the season length (if applicable).
When we increment the date for each election instead, we get the same error. Does anyone have an idea?
Operating system used: Windows
Answers
-
Based on the error "<class 'ValueError'> : Timeseries too short for training. Identifier..." that you are receiving for the time series {"Candidate_Name":"Toto"}, this time series doesn't have enough data. It appears only 5 times in your training data, which is below the minimum length the model needs ("Length: 5 < Minlength :<number>").
You can either filter the rows for the time series {"Candidate_Name":"Toto"} out of the training dataset, or make sure the training dataset has enough data to meet the minimum length. An option to ignore too short time series during training has been added in DSS 12.1.
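If it helps, here is a minimal pandas sketch of the filtering idea, for example in a Python recipe before the training step. It is only an illustration under assumptions: the column names `date` and `votes`, the helper `split_by_series_length` and the threshold `min_length=22` are placeholders, and counting distinct timestamps per identifier is my guess at how the length is measured, not something taken from the DSS code. The real minimum depends on your evaluation set size and season length.

```python
import pandas as pd


def split_by_series_length(df, id_col, time_col, min_length):
    """Keep only the series that have at least `min_length` distinct timestamps.

    Returns (kept_rows, short_series), where short_series maps each
    dropped identifier to its number of distinct timestamps.
    """
    # Distinct timestamps per identifier: duplicated dates count only once.
    counts = df.groupby(id_col)[time_col].nunique()
    long_enough = counts[counts >= min_length].index
    kept_rows = df[df[id_col].isin(long_enough)]
    short_series = counts[counts < min_length]
    return kept_rows, short_series


# Toy data (hypothetical): "Toto" has 5 points, "Titi" has 30.
df = pd.DataFrame({
    "Candidate_Name": ["Toto"] * 5 + ["Titi"] * 30,
    "date": list(pd.date_range("2020-01-01", periods=5, freq="D"))
            + list(pd.date_range("2020-01-01", periods=30, freq="D")),
    "votes": range(35),
})

kept, short = split_by_series_length(df, "Candidate_Name", "date", min_length=22)
print(short)  # identifiers that would be too short, with their lengths
```

Printing `short` before training is also a quick way to see which identifiers are affected and how far they are from the minimum.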
-
Tokoro-San Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 2 ✭
We updated the version and now we get: Failed to train : <class 'ValueError'> : All input time series are shorter than the min required length of 22 for training. Check the logs for more details.
The Toto candidate, which is just an example, has about 1000 rows with different dates, so this is very strange. Do you know how we can solve this?