Is it better to run cross-validation and, when predicting new points, use the mean of the fold models' predictions? Or should I tune the model on the whole dataset and predict with that single model?
I have tested with 5-fold and 15-fold and I get the same RMSE in both cases. That tells me that if I were to train the model on all of the data, the RMSE should stay about the same, which gives me an idea of the expected error.
If I think as an ML person, predicting with 5 models and taking the mean is roughly the same as bagging with subsample=0.8. But if I think Bayesian, the model should benefit from fitting the complete dataset, as it will produce a better posterior.
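To make the comparison concrete, here is a minimal sketch of the two options on synthetic data. It is only an illustration: the least-squares line stands in for whatever model is actually being tuned, and the data, the helper names (`fit_line`, `kfold_models`, `rmse`) and the fold assignment are all assumptions made up for this example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def predict(model, xs):
    a, b = model
    return [a + b * x for x in xs]


def rmse(y_true, y_pred):
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n) ** 0.5


def kfold_models(xs, ys, k):
    """Fit one model per fold, each on the (k-1)/k of the data outside it."""
    models = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        models.append(fit_line([xs[i] for i in train], [ys[i] for i in train]))
    return models


# Synthetic data (an assumption): y = 2 + 0.5*x + Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]
x_new = [rng.uniform(0, 10) for _ in range(1000)]
y_new = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in x_new]

# Option 1: keep the 5 fold models and average their predictions.
fold_models = kfold_models(xs, ys, k=5)
fold_preds = [predict(m, x_new) for m in fold_models]
mean_pred = [sum(col) / len(col) for col in zip(*fold_preds)]

# Option 2: refit a single model on the whole dataset.
full_model = fit_line(xs, ys)
full_pred = predict(full_model, x_new)

rmse_ensemble = rmse(y_new, mean_pred)
rmse_full = rmse(y_new, full_pred)
print(f"mean of 5 fold models: RMSE = {rmse_ensemble:.4f}")
print(f"single full-data fit:  RMSE = {rmse_full:.4f}")
```

With a simple, stable model like this one the two numbers should come out very close, so the sketch says little about which option wins for a more flexible model; it only shows how the two prediction paths can be set up and scored on the same held-out points.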
What are your thoughts?