Binary Classification Benchmarks
Dataset Overview
Dataset | Instances | Features | Numeric | Categorical | Target | Classes | Imbalance |
---|---|---|---|---|---|---|---|
Bioresponse | 3751 | 1777 | 1776 | 1 | Bioresponse | 2 | 1.18462 |
Credit approval | 690 | 15 | 6 | 9 | class | 2 | 1.24756 |
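The Imbalance column is not defined in this document. The reported values appear consistent with the ratio of majority-class count to minority-class count. A minimal sketch under that assumption follows; the helper name `imbalance_ratio` and the class counts 383 and 307 used for Credit approval are illustrative assumptions, not values stated here.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return majority-class count divided by minority-class count.

    Assumed definition: the document does not say how Imbalance is computed.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to compute a ratio")
    return max(counts.values()) / min(counts.values())


# Hypothetical label vector: 383 majority and 307 minority instances
# (assumed class counts for Credit approval, 690 instances in total).
labels = ["-"] * 383 + ["+"] * 307
ratio = imbalance_ratio(labels)
print(f"{ratio:.5f}")
```

Under this definition a perfectly balanced dataset scores 1.0, and the ratio for the assumed Credit approval counts is approximately 1.2476, in line with the table.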
Quick Check
Dataset | Type | Family / Variant | Test accuracy | Test ROC AUC |
---|---|---|---|---|
Bioresponse | Best classical | Trees / XGBoost | 0.81847 ± 0.00477 | 0.88961 ± 0.00122 |
Bioresponse | Best transformed | CNN / REFINED | 0.78686 ± 0.01036 | 0.85621 ± 0.00549 |
Credit approval | Best classical | MLP / MLP | 0.87692 ± 0.00430 | 0.94468 ± 0.00151 |
Credit approval | Best transformed | ViT / FeatureWrap | 0.89808 ± 0.03010 | 0.95735 ± 0.00327 |
Bioresponse
Leaderboard
Family | Best variant | Test Accuracy (↑) | Test ROC AUC (↑) | Train time (s) | #Params | FLOPs |
---|---|---|---|---|---|---|
Trees | XGBoost | 0.81847 ± 0.00477 | 0.88961 ± 0.00122 | 0.41230 | — | — |
MLP | MLP | 0.78615 ± 0.01182 | 0.84194 ± 0.00613 | 13.43436 | 1,041,409 | 2,082,048 |
ViT | REFINED | 0.76945 ± 0.01214 | 0.83747 ± 0.00358 | 67.72601 | 4,426,201 | 43,919,946 |
ViT+MLP | REFINED | 0.77620 ± 0.01066 | 0.84131 ± 0.00261 | 85.62639 | 5,496,729 | 637,718,480 |
CNN | REFINED | 0.78686 ± 0.01036 | 0.85621 ± 0.00549 | 56.44847 | 2,715,313 | 1,506,688,896 |
CNN+MLP | REFINED | 0.78686 ± 0.01036 | 0.85253 ± 0.01044 | 63.84759 | 3,799,729 | 1,508,856,848 |
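Metric cells in these tables use a `value ± spread` format. The document does not say what the spread is, so the sketch below only assumes it is some per-run dispersion such as a standard deviation. It parses such cells and ranks families by Test ROC AUC, using three rows copied from the leaderboard above; `parse_cell` is a hypothetical helper name.

```python
import re

# Matches cells such as "0.88961 ± 0.00122".
CELL = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*±\s*([0-9]*\.?[0-9]+)\s*$")


def parse_cell(cell):
    """Split a 'value ± spread' cell into a (value, spread) pair of floats."""
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"not a 'value ± spread' cell: {cell!r}")
    return float(match.group(1)), float(match.group(2))


# Test ROC AUC cells copied from the Bioresponse leaderboard.
roc_auc_cells = {
    "Trees": "0.88961 ± 0.00122",
    "MLP": "0.84194 ± 0.00613",
    "CNN": "0.85621 ± 0.00549",
}

parsed = {family: parse_cell(cell) for family, cell in roc_auc_cells.items()}

# Rank families by the central value only; the spread is kept for reporting.
ranking = sorted(parsed, key=lambda family: parsed[family][0], reverse=True)
best_family = ranking[0]
print(ranking)
```

Ranking on the central value alone ignores overlap between spreads, so close results (for example MLP and the ViT variants on this dataset) may not be meaningfully different.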
Architecture Results
Tree Baselines
Method | Test Acc (↑) | Test F1 (↑) | Test Precision | Test Recall | Test ROC AUC | Train time (s) |
---|---|---|---|---|---|---|
XGBoost | 0.81847 ± 0.00477 | 0.83107 ± 0.00477 | 0.82426 ± 0.00680 | 0.83800 ± 0.00371 | 0.88961 ± 0.00122 | 0.41230 |
CatBoost | 0.81457 ± 0.00463 | 0.82622 ± 0.00461 | 0.81377 ± 0.00783 | 0.83912 ± 0.00567 | 0.87989 ± 0.00397 | 25.20507 ± 0.05985 |
LightGBM | 0.80888 ± 0.00728 | 0.82135 ± 0.00736 | 0.81115 ± 0.01459 | 0.83205 ± 0.01089 | 0.87810 ± 0.00307 | 1.24869 ± 0.04968 |
MLP
Train loss | Val loss | Test loss | Test accuracy (↑) | Test precision | Test recall | Test F1 | Test ROC AUC | Test LogLoss | Test MCC | Total params | FLOPs |
---|---|---|---|---|---|---|---|---|---|---|---|
0.38982 ± 0.03167 | 0.52523 ± 0.00288 | 0.49655 ± 0.00660 | 0.78615 ± 0.01182 | 0.81263 ± 0.01049 | 0.78689 ± 0.02714 | 0.79930 ± 0.01397 | 0.84194 ± 0.00613 | 0.49659 ± 0.00617 | 0.57126 ± 0.02221 | 1,041,409 | 2,082,048 |
ViT
Method | Train time (s) | Test Acc (↑) | Test F1 (↑) | Test Precision | Test Recall | Test ROC AUC | Test LogLoss | #Params | FLOPs |
---|---|---|---|---|---|---|---|---|---|
TINTO | 39.12782 ± 0.22880 | 0.69130 ± 0.00918 | 0.74404 ± 0.01060 | 0.67546 ± 0.00969 | 0.82885 ± 0.02909 | 0.72977 ± 0.01074 | 0.59970 ± 0.00773 | 1,716,385 | 85,861,632 |
IGTD | 53.50183 ± 0.76464 | 0.74458 ± 0.02887 | 0.78704 ± 0.01479 | 0.72113 ± 0.03852 | 0.86885 ± 0.02820 | 0.82965 ± 0.01231 | 0.55979 ± 0.03481 | 4,550,263 | 155,284,479 |
REFINED | 67.72601 ± 1.34755 | 0.76945 ± 0.01214 | 0.78885 ± 0.00776 | 0.78407 ± 0.02504 | 0.79475 ± 0.02279 | 0.83747 ± 0.00358 | 0.51418 ± 0.00312 | 4,426,201 | 43,919,946 |
FeatureWrap | 63.76220 ± 1.49083 | 0.72220 ± 0.01321 | 0.73929 ± 0.00978 | 0.75338 ± 0.02814 | 0.72721 ± 0.02862 | 0.78393 ± 0.01199 | 0.57526 ± 0.01544 | 9,800,161 | 333,217,900 |
ViT + MLP
Method | Train time (s) | Test Acc (↑) | Test F1 (↑) | Test Precision | Test Recall | Test ROC AUC | Test LogLoss | #Params | FLOPs |
---|---|---|---|---|---|---|---|---|---|
TINTO | 62.18761 ± 6.29526 | 0.77336 ± 0.00463 | 0.78515 ± 0.00592 | 0.80698 ± 0.00511 | 0.76459 ± 0.01257 | 0.83574 ± 0.00424 | 0.50571 ± 0.00383 | 2,657,889 | 809,617,408 ± 160,014,690 |
IGTD | 73.41846 ± 1.17743 | 0.76909 ± 0.01554 | 0.79588 ± 0.00733 | 0.76581 ± 0.03178 | 0.83016 ± 0.02565 | 0.83846 ± 0.00896 | 0.50748 ± 0.01806 | 5,871,831 | 3,481,690,302 ± 305,835,318 |
REFINED | 85.62639 ± 2.60948 | 0.77620 ± 0.01066 | 0.78976 ± 0.01479 | 0.80488 ± 0.02779 | 0.77771 ± 0.04575 | 0.84131 ± 0.00261 | 0.51031 ± 0.01926 | 5,496,729 | 637,718,480 ± 126,040,181 |
FeatureWrap | 83.09175 ± 5.88656 | 0.78224 ± 0.01070 | 0.79224 ± 0.01093 | 0.81974 ± 0.00857 | 0.76656 ± 0.01340 | 0.83531 ± 0.00602 | 0.50986 ± 0.01106 | 10,885,697 | 7,426,666,584 ± 652,366,162 |
CNN
Method | Train time (s) | Test Acc (↑) | Test F1 (↑) | Test Precision | Test Recall | Test ROC AUC | Test LogLoss | #Params | FLOPs |
---|---|---|---|---|---|---|---|---|---|
TINTO | 34.07241 ± 0.46630 | 0.70444 ± 0.01650 | 0.75249 ± 0.00673 | 0.69077 ± 0.02870 | 0.82951 ± 0.04423 | 0.75853 ± 0.01253 | 0.58758 ± 0.00870 | 30,369 | 6,680,388 |
IGTD | 51.55007 ± 1.14411 | 0.63197 ± 0.08147 | 0.69982 ± 0.03777 | 0.65871 ± 0.10504 | 0.79934 ± 0.19212 | 0.75447 ± 0.03277 | 0.69259 ± 0.11148 | 5,417,457 | 978,246,400 |
REFINED | 56.44847 ± 0.36734 | 0.79183 ± 0.01769 | 0.80690 ± 0.01185 | 0.81448 ± 0.04146 | 0.80262 ± 0.04169 | 0.85621 ± 0.00549 | 0.48291 ± 0.01118 | 2,715,313 | 1,506,688,896 |
FeatureWrap | 77.51294 ± 0.31586 | 0.73996 ± 0.01592 | 0.75638 ± 0.03293 | 0.76865 ± 0.03896 | 0.75279 ± 0.08763 | 0.80743 ± 0.00882 | 0.54629 ± 0.01841 | 5,155,537 | 2,894,869,760 |
CNN + MLP
Method | Train time (s) | Test Acc (↑) | Test F1 (↑) | Test Precision | Test Recall | Test ROC AUC | Test LogLoss | #Params | FLOPs |
---|---|---|---|---|---|---|---|---|---|
TINTO | 52.17665 ± 2.32481 | 0.77869 ± 0.00749 | 0.79065 ± 0.01160 | 0.81068 ± 0.01265 | 0.77246 ± 0.03098 | 0.84023 ± 0.00584 | 0.50143 ± 0.00800 | 1,088,657 | 184,719,528 |
IGTD | 59.26457 ± 1.49678 | 0.70764 ± 0.02508 | 0.73493 ± 0.04397 | 0.73387 ± 0.08276 | 0.76656 ± 0.15670 | 0.80774 ± 0.02291 | 0.60415 ± 0.08857 | 6,056,049 | 979,523,520 |
REFINED | 63.84759 ± 1.33045 | 0.78686 ± 0.01036 | 0.80513 ± 0.00924 | 0.80144 ± 0.04783 | 0.81443 ± 0.05808 | 0.85253 ± 0.01044 | 0.50015 ± 0.02944 | 3,799,729 | 1,508,856,848 |
FeatureWrap | 81.22601 ± 0.07450 | 0.74423 ± 0.00879 | 0.75265 ± 0.02133 | 0.79149 ± 0.03435 | 0.72197 ± 0.06192 | 0.81223 ± 0.01007 | 0.54734 ± 0.02279 | 5,999,569 | 2,896,557,568 |
Credit approval
Leaderboard
Family | Best variant | Test Accuracy (↑) | Test ROC AUC (↑) | Train time (s) | #Params | FLOPs |
---|---|---|---|---|---|---|
Trees | LightGBM | 0.75577 ± 0.00527 | 0.95064 ± 0.00133 | 0.09859 | — | — |
MLP | MLP | 0.87692 ± 0.00430 | 0.94468 ± 0.00151 | 6.46258 | 27,137 | 53,760 |
ViT | FeatureWrap | 0.89808 ± 0.03010 | 0.95735 ± 0.00327 | 24.34938 | 7,265,761 | 72,676,493 |
ViT+MLP | BIE | 0.88462 ± 0.00962 | 0.95232 ± 0.00606 | 40.93451 | 1,859,585 | 1,387,447,200 |
CNN | BarGraph | 0.87115 ± 0.01994 | 0.94483 ± 0.00932 | 24.68416 | 189,889 | 132,799,648 |
CNN+MLP | DistanceMatrix | 0.88077 ± 0.01874 | 0.94430 ± 0.00532 | 35.87942 | 488,481 | 453,152,896 |
Architecture Results
Tree Baselines
Method | Test Acc (↑) | Test F1 (↑) | Test Precision | Test Recall | Test ROC AUC | Train time (s) |
---|---|---|---|---|---|---|
LightGBM | 0.75577 ± 0.00527 | 0.81780 ± 0.00321 | 0.98276 ± 0.00000 | 0.70027 ± 0.00470 | 0.95064 ± 0.00133 | 0.09859 ± 0.00641 |
XGBoost | 0.89808 ± 0.00860 | 0.90421 ± 0.00727 | 0.86207 ± 0.00000 | 0.95079 ± 0.01597 | 0.94535 ± 0.00056 | 0.41230 ± 0.00573 |
CatBoost | 0.88462 ± 0.00000 | 0.88679 ± 0.00000 | 0.81035 ± 0.00000 | 0.97917 ± 0.00000 | 0.94670 ± 0.00041 | 3.63444 ± 0.05896 |
MLP
Train loss | Val loss | Test loss | Test accuracy (↑) | Test precision | Test recall | Test F1 (↑) | Test ROC AUC | Test MCC | Total params | FLOPs |
---|---|---|---|---|---|---|---|---|---|---|
0.32544 ± 0.01337 | 0.32932 ± 0.00195 | 0.28471 ± 0.00117 | 0.87692 ± 0.00430 | 0.93474 ± 0.00934 | 0.83793 ± 0.00944 | 0.88363 ± 0.00421 | 0.94468 ± 0.00151 | 0.75903 ± 0.00888 | 27,137 | 53,760 |
ViT
Method | Train time (s) | (unlabeled) | (unlabeled) | Test Acc (↑) | Test Precision | Test Recall | Test F1 (↑) | #Params | FLOPs |
---|---|---|---|---|---|---|---|---|---|
TINTO | 24.18381 ± 0.14994 | 0.97334 ± 0.00296 | 0.82123 ± 0.03314 | 0.88654 ± 0.02193 | 0.86791 ± 0.03361 | 0.94138 ± 0.01966 | 0.90270 ± 0.01715 | 1,600,609 | 83,777,096 |
IGTD | 68.75373 ± 0.10999 | 0.97080 ± 0.00332 | 0.75752 ± 0.01060 | 0.84423 ± 0.01580 | 0.84125 ± 0.02458 | 0.88966 ± 0.02885 | 0.86430 ± 0.01354 | 19,520,281 | 667,902,402 |
REFINED | 29.86047 ± 0.45036 | 0.97190 ± 0.00150 | 0.76177 ± 0.01712 | 0.87500 ± 0.00680 | 0.91834 ± 0.00850 | 0.85172 ± 0.01542 | 0.88368 ± 0.00721 | 5,268,487 | 696,741,440 |
DistanceMatrix | 28.01472 ± 0.10554 | 0.97244 ± 0.00438 | 0.75849 ± 0.03001 | 0.86539 ± 0.02150 | 0.89253 ± 0.04424 | 0.86552 ± 0.03316 | 0.87773 ± 0.01787 | 3,595,591 | 71,253,567 |
BarGraph | 33.32826 ± 2.41426 | 0.97363 ± 0.00281 | 0.77823 ± 0.03742 | 0.86539 ± 0.01799 | 0.87735 ± 0.02393 | 0.88276 ± 0.02833 | 0.87968 ± 0.01632 | 1,036,999 | 20,876,839 |
Combination | 43.79120 ± 0.31413 | 0.97146 ± 0.00178 | 0.76056 ± 0.01574 | 0.86923 ± 0.01874 | 0.88397 ± 0.03368 | 0.88276 ± 0.00771 | 0.88301 ± 0.01423 | 2,364,615 | 47,539,495 |
SuperTML | 55.46818 ± 0.95789 | 0.97304 ± 0.00159 | 0.76590 ± 0.03576 | 0.86731 ± 0.01720 | 0.89317 ± 0.06675 | 0.87586 ± 0.06264 | 0.88065 ± 0.00824 | 11,972,449 | 1,580,762,012 |
FeatureWrap | 24.34938 ± 0.09730 | 0.97597 ± 0.00245 | 0.82721 ± 0.02674 | 0.89808 ± 0.03010 | 0.87395 ± 0.04923 | 0.95862 ± 0.00944 | 0.91357 ± 0.02319 | 7,265,761 | 72,676,493 |
BIE | 31.80109 ± 0.45090 | 0.97138 ± 0.00737 | 0.76480 ± 0.02098 | 0.87308 ± 0.01580 | 0.89001 ± 0.02562 | 0.88276 ± 0.03738 | 0.88563 ± 0.01540 | 1,831,681 | 62,989,904 |
ViT + MLP
Method | Train time (s) | (unlabeled) | (unlabeled) | Test Acc (↑) | Test Precision | Test Recall | Test F1 (↑) | #Params | FLOPs |
---|---|---|---|---|---|---|---|---|---|
TINTO | 36.11794 ± 0.54655 | 0.97265 ± 0.00400 | 0.78003 ± 0.02799 | 0.87308 ± 0.02394 | 0.88942 ± 0.05987 | 0.88966 ± 0.04496 | 0.88714 ± 0.01516 | 2,153,313 | 774,421,568 |
IGTD | 88.77799 ± 1.40963 | 0.97031 ± 0.00462 | 0.74201 ± 0.03719 | 0.85577 ± 0.02255 | 0.88956 ± 0.03197 | 0.84828 ± 0.04463 | 0.86746 ± 0.02201 | 19,548,953 | 6,611,922,832 |
REFINED | 37.56043 ± 0.44217 | 0.97315 ± 0.00257 | 0.75091 ± 0.02145 | 0.87692 ± 0.00430 | 0.93162 ± 0.01586 | 0.84138 ± 0.00771 | 0.88408 ± 0.00269 | 5,142,983 | 13,348,791,360 |
DistanceMatrix | 33.55061 ± 1.04348 | 0.97134 ± 0.00276 | 0.73496 ± 0.01761 | 0.87308 ± 0.00430 | 0.94842 ± 0.01573 | 0.81724 ± 0.01542 | 0.87775 ± 0.00454 | 3,738,311 | 779,787,256 |
BarGraph | 47.16871 ± 1.24050 | 0.96340 ± 0.00583 | 0.77224 ± 0.02284 | 0.87308 ± 0.01053 | 0.89774 ± 0.01961 | 0.87241 ± 0.02313 | 0.88457 ± 0.00988 | 1,084,551 | 803,412,036 |
Combination | 58.56323 ± 6.61817 | 0.96940 ± 0.00273 | 0.73836 ± 0.04004 | 0.87308 ± 0.00430 | 0.91746 ± 0.03598 | 0.85172 ± 0.04496 | 0.88189 ± 0.00790 | 2,436,903 | 527,485,752 |
SuperTML | 74.32829 ± 5.53914 | 0.97351 ± 0.00345 | 0.75122 ± 0.03228 | 0.86346 ± 0.01850 | 0.88901 ± 0.06414 | 0.87241 ± 0.06047 | 0.87717 ± 0.01092 | 12,460,001 | 30,112,373,304 |
FeatureWrap | 35.66323 ± 0.36383 | 0.97168 ± 0.00230 | 0.76905 ± 0.03416 | 0.89039 ± 0.01458 | 0.89089 ± 0.02661 | 0.91724 ± 0.03738 | 0.90311 ± 0.01369 | 7,325,569 | 3,665,691,148 |
BIE | 40.93451 ± 0.83433 | 0.97242 ± 0.00145 | 0.75235 ± 0.03215 | 0.88462 ± 0.00962 | 0.94643 ± 0.02228 | 0.84138 ± 0.01889 | 0.89051 ± 0.00897 | 1,859,585 | 1,387,447,200 |
CNN
Method | Train time (s) | (unlabeled) | (unlabeled) | Test Acc (↑) | Test Precision | Test Recall | Test F1 (↑) | #Params | FLOPs |
---|---|---|---|---|---|---|---|---|---|
TINTO | 22.76856 ± 0.21394 | 0.97207 ± 0.00154 | 0.76000 ± 0.03333 | 0.87885 ± 0.01609 | 0.91333 ± 0.01960 | 0.86552 ± 0.03316 | 0.88831 ± 0.01599 | 108,793 | 16,313,808 |
IGTD | 34.86336 ± 0.12589 | 0.96198 ± 0.00949 | 0.70385 ± 0.06975 | 0.85577 ± 0.01923 | 0.90302 ± 0.01110 | 0.83103 ± 0.04463 | 0.86486 ± 0.02211 | 2,758,497 | 26,344,896 |
REFINED | 19.94357 ± 0.08692 | 0.96625 ± 0.00279 | 0.72330 ± 0.00569 | 0.86731 ± 0.01971 | 0.90918 ± 0.04680 | 0.85172 ± 0.06744 | 0.87678 ± 0.02234 | 2,112,225 | 20,382,208 |
DistanceMatrix | 32.35910 ± 2.07113 | 0.96938 ± 0.00313 | 0.77864 ± 0.02282 | 0.85577 ± 0.02150 | 0.87447 ± 0.04377 | 0.86897 ± 0.03575 | 0.87059 ± 0.01765 | 431,553 | 321,349,776 |
BarGraph | 24.68416 ± 0.52450 | 0.96888 ± 0.00310 | 0.71735 ± 0.04057 | 0.87115 ± 0.01994 | 0.92317 ± 0.04319 | 0.84138 ± 0.02557 | 0.87950 ± 0.01609 | 189,889 | 132,799,648 |
Combination | 30.61405 ± 0.78141 | 0.96607 ± 0.00320 | 0.76374 ± 0.02576 | 0.87692 ± 0.02085 | 0.91658 ± 0.02910 | 0.85862 ± 0.03932 | 0.88591 ± 0.02025 | 1,126,289 | 729,226,960 |
SuperTML | 56.27273 ± 1.87451 | 0.97041 ± 0.00524 | 0.75280 ± 0.02933 | 0.85962 ± 0.03160 | 0.86397 ± 0.06470 | 0.89655 ± 0.04044 | 0.87766 ± 0.02171 | 357,681 | 244,017,024 |
FeatureWrap | 21.80783 ± 0.12512 | 0.97182 ± 0.00252 | 0.80075 ± 0.04721 | 0.87885 ± 0.02413 | 0.87491 ± 0.06307 | 0.92069 ± 0.04152 | 0.89497 ± 0.01722 | 1,679,265 | 26,047,680 |
BIE | 23.68857 ± 1.57080 | 0.97263 ± 0.00359 | 0.79526 ± 0.05025 | 0.82692 ± 0.02040 | 0.80599 ± 0.02899 | 0.91035 ± 0.01889 | 0.85457 ± 0.01417 | 1,219,369 | 452,641,280 |
CNN + MLP
Method | Train time (s) | (unlabeled) | (unlabeled) | Test Acc (↑) | Test Precision | Test Recall | Test F1 (↑) | #Params | FLOPs |
---|---|---|---|---|---|---|---|---|---|
TINTO | 23.51064 ± 1.06906 | 0.96707 ± 0.00614 | 0.75082 ± 0.02054 | 0.86731 ± 0.02489 | 0.89853 ± 0.04504 | 0.86207 ± 0.03448 | 0.87889 ± 0.02100 | 170,793 | 16,437,264 |
IGTD | 36.58546 ± 0.15293 | 0.95951 ± 0.00815 | 0.72782 ± 0.01887 | 0.88462 ± 0.00962 | 0.93942 ± 0.01918 | 0.84828 ± 0.01889 | 0.89128 ± 0.00918 | 2,735,457 | 26,298,688 |
REFINED | 21.89030 ± 0.07673 | 0.96761 ± 0.00358 | 0.72065 ± 0.02280 | 0.87885 ± 0.00860 | 0.94184 ± 0.01278 | 0.83448 ± 0.01542 | 0.88479 ± 0.00862 | 2,491,169 | 21,139,456 |
DistanceMatrix | 35.87942 ± 2.03683 | 0.97073 ± 0.00566 | 0.72532 ± 0.04591 | 0.88077 ± 0.01874 | 0.90919 ± 0.05523 | 0.87931 ± 0.05172 | 0.89164 ± 0.01540 | 488,481 | 321,463,144 |
BarGraph | 35.27408 ± 3.46612 | 0.97111 ± 0.00284 | 0.71134 ± 0.03457 | 0.87308 ± 0.01426 | 0.94121 ± 0.01852 | 0.82414 ± 0.01889 | 0.87865 ± 0.01386 | 219,425 | 132,858,336 |
Combination | 49.18553 ± 2.91262 | 0.96438 ± 0.00472 | 0.70106 ± 0.02357 | 0.85769 ± 0.02753 | 0.88503 ± 0.05529 | 0.86207 ± 0.05314 | 0.87115 ± 0.02328 | 1,317,009 | 15,321,762,960 |
SuperTML | 56.67474 ± 4.74708 | 0.96369 ± 0.00613 | 0.74369 ± 0.03724 | 0.86731 ± 0.00805 | 0.86725 ± 0.03440 | 0.90345 ± 0.05115 | 0.88339 ± 0.00958 | 441,649 | 244,184,512 |
FeatureWrap | 23.54730 ± 0.12061 | 0.97332 ± 0.00305 | 0.74736 ± 0.03495 | 0.87115 ± 0.00860 | 0.92557 ± 0.03114 | 0.83793 ± 0.03132 | 0.87876 ± 0.00864 | 1,775,201 | 26,239,200 |
BIE | 26.65689 ± 1.21066 | 0.96959 ± 0.00750 | 0.76029 ± 0.01694 | 0.85385 ± 0.02085 | 0.86260 ± 0.03206 | 0.87931 ± 0.02112 | 0.87046 ± 0.01693 | 1,475,625 | 453,152,896 |