Removing deprecated files by scotthawes · Pull Request #855 · PriorLabs/TabPFN

scotthawes · 2026-04-02T10:18:34Z

No description provided.

…training
- Introduced `data_loader.py` for loading datasets from CSV or generating synthetic data, including preprocessing functions.
- Added `evaluation_metrics.py` to provide regression and classification metrics, including RMSE, MAE, accuracy, F1 score, and AUC calculations.
- Created `model_training.py` to define baseline models for regression and classification, along with functions for training, predicting, saving, and loading models. Includes cross-validation and evaluation capabilities.
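
As a rough illustration of the kind of helpers this commit describes for `evaluation_metrics.py`, the sketch below computes RMSE, MAE, accuracy, F1, and AUC with the standard library only. All function names here are hypothetical and not taken from the PR; the actual module may well wrap scikit-learn instead.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of labels predicted exactly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 score for the positive class; returns 0.0 when it is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a win. This is O(n^2), so it is only meant for small inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is chosen for readability; on datasets of the size mentioned in this PR a sort-based implementation would be the more practical choice.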

- Rename n_ensemble -> n_estimators in TABPFN_CONFIG (7.x rename)
- Remove use_wandb (no longer a TabPFN param)
- Switch device from 'cpu' to 'auto' (7.x auto GPU detection)
- Fix DATA_PATH -> DATA_DIR import in data_loader_class.py
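
The configuration changes listed above could be applied to an older settings dictionary roughly as follows. This is a sketch only: `migrate_tabpfn_config` and the sample legacy values are hypothetical, and the real `TABPFN_CONFIG` in this PR may hold other keys.

```python
from typing import Any, Dict


def migrate_tabpfn_config(old: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `old` with the renames described in the commit applied."""
    new = dict(old)  # work on a copy so the caller's dict is untouched
    if "n_ensemble" in new:
        # n_ensemble -> n_estimators
        new["n_estimators"] = new.pop("n_ensemble")
    # use_wandb is no longer accepted, so drop it if present
    new.pop("use_wandb", None)
    # let the library pick the device instead of forcing the CPU
    if new.get("device") == "cpu":
        new["device"] = "auto"
    return new


# Hypothetical legacy configuration, for illustration only.
legacy = {"n_ensemble": 8, "use_wandb": False, "device": "cpu"}
migrated = migrate_tabpfn_config(legacy)
```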

- Fix hardcoded LOCAL_CSV_PATH -> uses DATA_DIR from baseline_config
- Update GLOBAL_MAX_TRAIN comment (no longer API limit, now local model limit)
- Replace defunct pre_aux try/except block with clean TabPFNClassifier(n_estimators=8, device='auto')

…tiation
- TabPFNClassifier init now conditional: local uses n_estimators/device, client uses random_state only
- Full rerun complete on eudirectlapse.csv (23K rows, 10K cap, 80/20 split)
- tabpfn_extensions upgraded to fix AutoTabPFNClassifier import
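
The conditional initialisation mentioned here can be sketched as a small helper that picks constructor keyword arguments by backend. `build_classifier_kwargs` and the `backend` strings are assumptions made for illustration; `n_estimators=8` and `device='auto'` come from the commit notes, while the default seed is invented.

```python
from typing import Any, Dict


def build_classifier_kwargs(backend: str, random_state: int = 42) -> Dict[str, Any]:
    """Choose TabPFNClassifier keyword arguments for a given backend.

    "local" and "client" are hypothetical labels for the two cases named in
    the commit message.
    """
    if backend == "local":
        # the local model takes ensemble size and device
        return {"n_estimators": 8, "device": "auto"}
    if backend == "client":
        # per the commit note, the client variant is given random_state only
        return {"random_state": random_state}
    raise ValueError(f"unknown backend: {backend!r}")
```

The returned dictionary would then be unpacked into the constructor, for example `TabPFNClassifier(**build_classifier_kwargs("local"))`.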

Add scripts/download_datasets.py to fetch 3 additional public insurance classification datasets programmatically:
- coil2000.csv: COIL 2000 (Dutch insurer), OpenML ID 298, 9,822 rows, 85 features
- ausprivauto0405.csv: Australian vehicle insurance 2004-05, CASdatasets GitHub, 67,856 rows, 6 features, ClaimOcc target (6.8% pos rate)
- freMTPL2freq_binary.csv: French MTPL binarised (50K sample), 50,000 rows, 10 features, ClaimIndicator target (5.0% pos rate)

Add notebooks/baseline_experiments/07_multi_dataset_benchmark.ipynb which runs TabPFN vs GLM (+ CatBoost, RandomForest, XGBoost) across all 4 insurance datasets and produces a ROC/PR AUC comparison table and bar chart figure. All models capped at 10,000 training samples for fair comparison.
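
One way the 10,000-sample training cap could be combined with an 80/20 split is sketched below. The helper `split_and_cap`, its defaults, and the choice to split first and cap afterwards are assumptions; the notebook itself may stratify or order these steps differently.

```python
import random
from typing import List, Tuple


def split_and_cap(
    n_rows: int,
    test_fraction: float = 0.2,
    max_train: int = 10_000,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Shuffle row indices, hold out a test set, then cap the training set."""
    rng = random.Random(seed)  # seeded so the split is reproducible
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_test = int(round(n_rows * test_fraction))
    test_idx = indices[:n_test]
    # keep at most max_train of the remaining rows for training
    train_idx = indices[n_test:][:max_train]
    return train_idx, test_idx


# Row count of the largest dataset listed above.
train_idx, test_idx = split_and_cap(67_856)
```

Applying the same cap to every model keeps the comparison at equal training-set sizes, which is the fairness argument the commit message makes.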

- Implement `run_domain_finetune_stage_a.py` for controlled fine-tuning experiments on insurance datasets.
- Create batch scripts for fine-tuning trials: `run_finetune_crossover_batch_3000.sh`, `run_finetune_first_batch.sh`, and `run_finetune_stress_batch_2000.sh`.
- Introduce `run_small_finetune_classifier_trial.py` for smoke tests on TabPFN classifier fine-tuning.
- Enhance logging and result tracking in fine-tuning scripts.

gemini-code-assist · 2026-04-02T10:19:04Z

Note

The number of changes in this pull request is too large for Gemini Code Assist to generate a review.

Cillian-Williamson and others added 30 commits August 5, 2025 10:59

Create ADSWP Project

ccd363a

Delete ADSWP Project

0a1e77e

Create Placeholder

20245a7

Adding French Cars and Australian Cars analyses

e8135d4

Delete ADSWP Project/Placeholder

ea8befd

Adding EDA for enhancing baseline models by making use of embedding e…

833b824

…xtraction from Tabpfn

Adding freMTPL analysis (Python Colab Notebook)

bc84ca3

GBM v Tabpfn on a CASDataset (usautoBI)

51b1536

Update CatBoost training logs and time left estimates

4552ab7

- Modified the `events.out.tfevents` file to reflect recent training changes.
- Updated `time_left.tsv` with new iteration data, showing adjusted passed and remaining time for each iteration.

Add files via upload

fb310df

Merge pull request #1 from IFoA-ADSWP/eda/baselining_notebook

8ee5283

Add local baselining helpers and update training logs

Merge pull request #2 from IFoA-ADSWP/main

1ac5c7e

Updating local branch

Add files via upload

864461d

Add files via upload

45f2460

Here are the files which I have been working on. The baselining notebook contains all of the code used to perform the experiments; the baselining notebook summary markdown file gives a much simpler summary of the findings.

Add files via upload

60b7e13

feat: sync baseline utilities and notebook naming updates

d1903b8

merge: resolve conflicts with origin/main - relocate docs, accept new…

b7e344c

… v1.0 notebook

Merge pull request #3 from IFoA-ADSWP/eda/baselining_notebook

de38329

Update baseline utilities and notebook naming

Add time_left.tsv to track iteration progress and time estimates for …

c5aa5dc

…CatBoost experiments

Merge branch 'main' into feature/tabpfn-v2.5-update

54e1880

Merge pull request #4 from IFoA-ADSWP:feature/tabpfn-v2.5-update

41fae3e

Regressional Testing & analysis

feat: add reproducibility appendix and follow-up analysis for TabPFN …

994d98f

…insurance study

Merge pull request #5 from IFoA-ADSWP:feature/tabpfn-v2.5-update

0d0147f

Short Journal

feat: add Stage A and B findings report with key insights and recomme…

c0120f8

…ndations

scotthawes and others added 10 commits April 2, 2026 02:55

Merge pull request #6 from IFoA-ADSWP/feature/tabpfn-v2.5-update

bdcbcf3

Initial Fine Tuning Experiment

reproducibility section

722a9c5

Merge pull request #7 from IFoA-ADSWP:feature/tabpfn-v2.5-update

f2db98a

Adding reproducibility

Adding note to clarify whether regressor or classifier was used

63a6fb0

Merge pull request #8 from IFoA-ADSWP:feature/tabpfn-v2.5-update

b1e9253

Adding note to clarify use of classifier

Updating technical funding request

693cded

Merge pull request #9 from IFoA-ADSWP:feature/tabpfn-v2.5-update

55893c7

Updating funding request

Delete docs/reports/BEFORE_AFTER_COMPARISON.md

11df5be

Delete docs/reports/UNIFIED_PAPER_FINAL.md

cc52ba4

Delete docs/reports/ARTICLE_REVISED_COMPLETE.md

a169c06

scotthawes wants to merge 40 commits into PriorLabs:main from IFoA-ADSWP:scotthawes-patch-1
