The results of the description do not correspond to the paper #10

zhaodongliang678 · 2024-10-23T07:04:11Z

Hello, thank you very much for your work.

When testing on msvd-caption-test, I first used the fill_in_video_file you provided to add the video directory. I then ran inference and evaluation with the Tarsier-7B and Tarsier-34B models. The final CIDEr scores were 56.7 and 58.9 respectively, which differ significantly from those reported in the paper. I also ran inference and evaluation on MSR-VTT, where the Tarsier-34B result was 31.4.

I used two A800 GPUs, each with 80 GB of memory, and made no other modifications. Are there any other details that I might have overlooked? I look forward to your reply.

jwwang424 · 2024-10-30T11:56:41Z

Thanks for the reminder! We have just updated the prompts in the metadata to our latest version, which is consistent with the test results reported in the paper.
