TabPFN with cross validation (mean of folds or refit on all?) #238

Open
luispintoc opened this issue Mar 12, 2025 · 0 comments

Comments

@luispintoc

Is it better to do cross-validation and, when predicting new points, use the mean of the fold models' predictions? Or is it better to fit the model on the whole dataset and predict with that?

I have tested with 5-fold and 15-fold CV and get the same RMSE, which suggests that training the model on all of the data would keep roughly the same RMSE; this gives me an estimate of the expected error.
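For reference, a minimal sketch of the CV setup described above, assuming the scikit-learn-style `TabPFNRegressor` from the `tabpfn` package (the `X`, `y` arrays are placeholders for your data):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error
from tabpfn import TabPFNRegressor

def cv_rmse(X, y, n_splits=5, seed=0):
    """Estimate expected error as the mean held-out RMSE over K folds."""
    rmses, fold_models = [], []
    for train_idx, val_idx in KFold(n_splits=n_splits, shuffle=True,
                                    random_state=seed).split(X):
        model = TabPFNRegressor()
        model.fit(X[train_idx], y[train_idx])
        preds = model.predict(X[val_idx])
        rmses.append(np.sqrt(mean_squared_error(y[val_idx], preds)))
        fold_models.append(model)  # keep the fold models for later ensembling
    return float(np.mean(rmses)), fold_models
```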

If I think about it as an ML person, predicting with the 5 models and taking the mean is essentially bagging with subsample=0.8. But if I think about it in Bayesian terms, the model should benefit from fitting on the complete dataset, since that gives a better posterior.
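To make the two options concrete, here is a sketch of the prediction step under each strategy, again assuming the scikit-learn-style `TabPFNRegressor`; `fold_models` is the hypothetical list of models fitted in the CV loop above:

```python
import numpy as np
from tabpfn import TabPFNRegressor

def predict_fold_mean(fold_models, X_new):
    # Bagging-style: average the K models, each fitted on ~(K-1)/K of the data.
    return np.mean([m.predict(X_new) for m in fold_models], axis=0)

def predict_full_refit(X, y, X_new):
    # Single model conditioned on the complete training set.
    model = TabPFNRegressor()
    model.fit(X, y)
    return model.predict(X_new)
```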

What are your thoughts?
