```bash
python3 run.py --models hf_internlm2_7b --datasets TheoremQA_5shot_gen_6f0af8 --debug
python3 run.py --models hf_internlm2_chat_7b --datasets TheoremQA_5shot_gen_6f0af8 --debug
```
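If your OpenCompass checkout lets `--models` take several config names at once, a single invocation can reproduce multiple rows of the tables below. A minimal sketch, assuming the `hf_llama3_8b` and `hf_qwen1_5_7b` model config aliases exist under `configs/models/` (substitute the names in your own checkout):

```bash
# Sketch: score two of the listed base models in one run.
# hf_llama3_8b and hf_qwen1_5_7b are assumed config aliases; check configs/models/ for the real names.
python3 run.py --models hf_llama3_8b hf_qwen1_5_7b --datasets TheoremQA_5shot_gen_6f0af8 --debug
```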
## Base Models

|          model           | TheoremQA |
|:------------------------:|----------:|
|    llama-7b-turbomind    |     10.25 |
|   llama-13b-turbomind    |     11.25 |
|   llama-30b-turbomind    |     14.25 |
|   llama-65b-turbomind    |     15.62 |
|   llama-2-7b-turbomind   |     12.62 |
|  llama-2-13b-turbomind   |     11.88 |
|  llama-2-70b-turbomind   |     15.62 |
|   llama-3-8b-turbomind   |     20.25 |
|  llama-3-70b-turbomind   |     33.62 |
| internlm2-1.8b-turbomind |     10.50 |
|  internlm2-7b-turbomind  |     21.88 |
| internlm2-20b-turbomind  |     26.00 |
|   qwen-1.8b-turbomind    |      9.38 |
|    qwen-7b-turbomind     |     15.00 |
|    qwen-14b-turbomind    |     21.62 |
|    qwen-72b-turbomind    |     27.12 |
|     qwen1.5-0.5b-hf      |      5.88 |
|     qwen1.5-1.8b-hf      |     12.00 |
|      qwen1.5-4b-hf       |     13.75 |
|      qwen1.5-7b-hf       |      4.25 |
|      qwen1.5-14b-hf      |     12.62 |
|      qwen1.5-32b-hf      |     26.62 |
|      qwen1.5-72b-hf      |     26.62 |
|   qwen1.5-moe-a2-7b-hf   |      7.50 |
|    mistral-7b-v0.1-hf    |     17.00 |
|    mistral-7b-v0.2-hf    |     16.25 |
|   mixtral-8x7b-v0.1-hf   |     24.12 |
|  mixtral-8x22b-v0.1-hf   |     36.75 |
|         yi-6b-hf         |     13.88 |
|        yi-34b-hf         |     24.75 |
|   deepseek-7b-base-hf    |     12.38 |
|   deepseek-67b-base-hf   |     21.25 |
## Chat Models

|             model             | TheoremQA |
|:-----------------------------:|----------:|
|     qwen1.5-0.5b-chat-hf      |      9.00 |
|     qwen1.5-1.8b-chat-hf      |      9.25 |
|      qwen1.5-4b-chat-hf       |     13.88 |
|      qwen1.5-7b-chat-hf       |     12.25 |
|      qwen1.5-14b-chat-hf      |     13.63 |
|      qwen1.5-32b-chat-hf      |     19.25 |
|      qwen1.5-72b-chat-hf      |     22.75 |
|     qwen1.5-110b-chat-hf      |     17.50 |
|    internlm2-chat-1.8b-hf     |     13.63 |
|  internlm2-chat-1.8b-sft-hf   |     12.88 |
|     internlm2-chat-7b-hf      |     18.50 |
|   internlm2-chat-7b-sft-hf    |     18.75 |
|     internlm2-chat-20b-hf     |     23.00 |
|   internlm2-chat-20b-sft-hf   |     25.12 |
|    llama-3-8b-instruct-hf     |     19.38 |
|    llama-3-70b-instruct-hf    |     36.25 |
| llama-3-8b-instruct-lmdeploy  |     19.62 |
| llama-3-70b-instruct-lmdeploy |     34.50 |
|  mistral-7b-instruct-v0.1-hf  |     12.62 |
|  mistral-7b-instruct-v0.2-hf  |     11.38 |
| mixtral-8x7b-instruct-v0.1-hf |     26.00 |