TabPFN with cross validation (mean of folds or refit on all?) #238

Open
luispintoc opened this issue Mar 12, 2025 · 0 comments

Comments

@luispintoc

Is it better to do cross-validation and, when predicting new points, use the mean of the fold models' predictions? Or is it better to fit the model on the whole dataset and predict with that?

I have tested with 5-fold and 15-fold CV and get the same RMSE, which suggests that training the model on all of the data would keep roughly the same RMSE; this gives me an estimate of the expected error.
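For reference, a minimal sketch of the CV setup described above, assuming the scikit-learn-style `TabPFNRegressor` from the `tabpfn` package (the `X`, `y` arrays are placeholders for your data):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error
from tabpfn import TabPFNRegressor

def cv_rmse(X, y, n_splits=5, seed=0):
    """Estimate expected error as the mean held-out RMSE over K folds."""
    rmses, fold_models = [], []
    for train_idx, val_idx in KFold(n_splits=n_splits, shuffle=True,
                                    random_state=seed).split(X):
        model = TabPFNRegressor()
        model.fit(X[train_idx], y[train_idx])
        preds = model.predict(X[val_idx])
        rmses.append(np.sqrt(mean_squared_error(y[val_idx], preds)))
        fold_models.append(model)  # keep the fold models for later ensembling
    return float(np.mean(rmses)), fold_models
```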

If I think about it as an ML person, predicting with the 5 models and taking the mean is essentially bagging with subsample=0.8. But if I think about it in Bayesian terms, the model should benefit from fitting on the complete dataset, since that gives a better posterior.
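To make the two options concrete, here is a sketch of the prediction step under each strategy, again assuming the scikit-learn-style `TabPFNRegressor`; `fold_models` is the hypothetical list of models fitted in the CV loop above:

```python
import numpy as np
from tabpfn import TabPFNRegressor

def predict_fold_mean(fold_models, X_new):
    # Bagging-style: average the K models, each fitted on ~(K-1)/K of the data.
    return np.mean([m.predict(X_new) for m in fold_models], axis=0)

def predict_full_refit(X, y, X_new):
    # Single model conditioned on the complete training set.
    model = TabPFNRegressor()
    model.fit(X, y)
    return model.predict(X_new)
```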

What are your thoughts?
