TheoremQA

python3 run.py --models hf_internlm2_7b --datasets TheoremQA_5shot_gen_6f0af8 --debug
python3 run.py --models hf_internlm2_chat_7b --datasets TheoremQA_5shot_gen_6f0af8 --debug

Base Models

model TheoremQA
llama-7b-turbomind 10.25
llama-13b-turbomind 11.25
llama-30b-turbomind 14.25
llama-65b-turbomind 15.62
llama-2-7b-turbomind 12.62
llama-2-13b-turbomind 11.88
llama-2-70b-turbomind 15.62
llama-3-8b-turbomind 20.25
llama-3-70b-turbomind 33.62
internlm2-1.8b-turbomind 10.50
internlm2-7b-turbomind 21.88
internlm2-20b-turbomind 26.00
qwen-1.8b-turbomind 9.38
qwen-7b-turbomind 15.00
qwen-14b-turbomind 21.62
qwen-72b-turbomind 27.12
qwen1.5-0.5b-hf 5.88
qwen1.5-1.8b-hf 12.00
qwen1.5-4b-hf 13.75
qwen1.5-7b-hf 4.25
qwen1.5-14b-hf 12.62
qwen1.5-32b-hf 26.62
qwen1.5-72b-hf 26.62
qwen1.5-moe-a2-7b-hf 7.50
mistral-7b-v0.1-hf 17.00
mistral-7b-v0.2-hf 16.25
mixtral-8x7b-v0.1-hf 24.12
mixtral-8x22b-v0.1-hf 36.75
yi-6b-hf 13.88
yi-34b-hf 24.75
deepseek-7b-base-hf 12.38
deepseek-67b-base-hf 21.25

Chat Models

model TheoremQA
qwen1.5-0.5b-chat-hf 9.00
qwen1.5-1.8b-chat-hf 9.25
qwen1.5-4b-chat-hf 13.88
qwen1.5-7b-chat-hf 12.25
qwen1.5-14b-chat-hf 13.63
qwen1.5-32b-chat-hf 19.25
qwen1.5-72b-chat-hf 22.75
qwen1.5-110b-chat-hf 17.50
internlm2-chat-1.8b-hf 13.63
internlm2-chat-1.8b-sft-hf 12.88
internlm2-chat-7b-hf 18.50
internlm2-chat-7b-sft-hf 18.75
internlm2-chat-20b-hf 23.00
internlm2-chat-20b-sft-hf 25.12
llama-3-8b-instruct-hf 19.38
llama-3-70b-instruct-hf 36.25
llama-3-8b-instruct-lmdeploy 19.62
llama-3-70b-instruct-lmdeploy 34.50
mistral-7b-instruct-v0.1-hf 12.62
mistral-7b-instruct-v0.2-hf 11.38
mixtral-8x7b-instruct-v0.1-hf 26.00
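
To compare entries across the tables above, the rows can be parsed as plain "model score" pairs and sorted. The following is a minimal sketch, not part of the evaluation tooling: the helper names `parse_scores` and `top_k` are hypothetical, and the sample text is a handful of rows copied from the Chat Models table.

```python
from typing import List, Tuple

# A few rows copied verbatim from the Chat Models table (illustrative subset).
CHAT_SAMPLE = """
model TheoremQA
qwen1.5-72b-chat-hf 22.75
internlm2-chat-20b-sft-hf 25.12
llama-3-70b-instruct-hf 36.25
llama-3-70b-instruct-lmdeploy 34.50
mixtral-8x7b-instruct-v0.1-hf 26.00
"""


def parse_scores(text: str) -> List[Tuple[str, float]]:
    """Parse whitespace-separated 'model score' rows (hypothetical helper).

    Blank lines and rows whose second field is not a number, such as the
    'model TheoremQA' header, are skipped.
    """
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        name, raw = parts
        try:
            rows.append((name, float(raw)))
        except ValueError:
            continue  # header row or malformed score
    return rows


def top_k(rows: List[Tuple[str, float]], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring rows, best first (hypothetical helper)."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:k]


if __name__ == "__main__":
    for name, score in top_k(parse_scores(CHAT_SAMPLE)):
        print(f"{name}: {score:.2f}")
```

Pasting a full table into the sample string works the same way, since every row has the same two-field shape.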