I am a student building a model to predict soybean yield in Midwestern states from two inputs: rainfall from April through October and nitrogen consumption. The standard Random Forest and Gradient Boosted Trees models are giving me an R2 score of less than 0.6, so I would like to tune the models to get a higher score. Are there any suggestions on what modifications I can make?
Hi, @ellejo! Can you provide any further details on the thread, such as your DSS version, to help users find a solution? Also, can you let us know if you've tried any fixes already? This should lead to a quicker response from the community.
There are a variety of ways for you to improve your R2 score, from feature engineering to data cleaning to choosing the right model.
One quick thing you can try through the visual ML in DSS is generating pairwise linear and polynomial combinations of your features. For your numerical features, this will automatically generate A+B, A-B, and A*B features, which can give your model more information to learn from. You can enable this feature in the design tab of the visual ML under "Feature generation".
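If it helps to see what those pairwise features look like outside the visual ML, here is a minimal pandas sketch. It is only an illustration of the A+B, A-B, and A*B idea, not how DSS implements it. The function name, the column names, and the values are all made up for the example.

```python
from itertools import combinations

import pandas as pd


def add_pairwise_features(df, columns):
    """Return a copy of df with A+B, A-B and A*B columns for every pair in columns."""
    out = df.copy()
    for a, b in combinations(columns, 2):
        out[f"{a}_plus_{b}"] = df[a] + df[b]
        out[f"{a}_minus_{b}"] = df[a] - df[b]
        out[f"{a}_times_{b}"] = df[a] * df[b]
    return out


# Hypothetical column names and made-up values, for illustration only.
data = pd.DataFrame(
    {
        "rainfall_mm": [500.0, 620.0, 710.0],
        "nitrogen_kg_ha": [10.0, 15.0, 12.0],
    }
)

features = add_pairwise_features(data, ["rainfall_mm", "nitrogen_kg_ha"])
print(features.columns.tolist())
```

With only two numerical inputs this adds three columns. With more inputs the number of pairs grows quickly, so it is worth checking which generated features actually help.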
You can also widen the hyperparameter grid search. Look at your previous runs to see which hyperparameter values worked best, then add further values close to those in the algorithm tab. DSS will automatically grid search through them and choose the combination that yields the best score.
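For reference, the same idea in code might look roughly like the sketch below, which uses scikit-learn's GridSearchCV with a Random Forest. This is not what DSS runs internally. The data is synthetic, and the grid values are placeholders I picked for the example, so you would swap in your own table and the values that looked promising in your earlier runs.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption): replace with your real rainfall,
# nitrogen and yield columns.
rng = np.random.default_rng(0)
rainfall = rng.uniform(400, 900, size=120)
nitrogen = rng.uniform(0, 30, size=120)
X = np.column_stack([rainfall, nitrogen])
y = 20 + 0.03 * rainfall + 0.4 * nitrogen + rng.normal(0, 2, size=120)

# Placeholder grid (assumption): centre it on values that did well before.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
    "min_samples_leaf": [1, 5],
}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="r2",  # same metric as in the question
    cv=3,          # 3-fold cross-validation for each combination
)
search.fit(X, y)

print("Best hyperparameters:", search.best_params_)
print("Best cross-validated R2:", search.best_score_)
```

The score reported here is a cross-validated average, so it is usually a more honest estimate than the R2 on a single train/test split.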
I hope this helps!