Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more info to ANN_BENCH context #248

Merged
merged 4 commits into from
Jul 29, 2024

Conversation

achirkin
Copy link
Contributor

Add extra information to benchmark context for better reproducibility and performance analysis:

  1. Full command line used to call the executable (so you can copy-paste and run again).
  2. More CUDA device information: whether HMM, AST, or host atomics are available (how GPU can efficiently communicate with CPU).
  3. Host information: min/max frequences, used virtual processors and cores, available physical memory and swap (does the benchmark segfault due to not enough host memory? is SMT enabled? etc).

Addresses parts of #160

@achirkin achirkin requested a review from a team as a code owner July 24, 2024 09:30
@github-actions github-actions bot added the cpp label Jul 24, 2024
@achirkin
Copy link
Contributor Author

I'm thinking whether we should include some environment variables if present or the machine hostname? Wouldn't that be a bit fishy in terms of user privacy when sharing the output files?

Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Artem, LGTM, it is useful to extend the context information. I do not see any issues with adding the hostname.

@tfeher
Copy link
Contributor

tfeher commented Jul 24, 2024

For env vars, I am not sure what would be relevant.
For OpenMP context, you could add omp_get_max_threads()

@achirkin
Copy link
Contributor Author

We could get the openmp-related env vars to see if the limit to the number of threads is set explicitly.
I wouldn't like to add omp_get_max_threads() and other OpenMP functions to avoid introducing openmp dependency in the benchmark (for the algorithms that do not use OpenMP and for the ANN_BENCH executable in the single-exe mode). I'm also not sure if cuVS team decides to move away from using OpenMP in favor of standard C++ threads in future.

@cjnolet cjnolet added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Jul 24, 2024
@tfeher
Copy link
Contributor

tfeher commented Jul 29, 2024

/merge

@rapids-bot rapids-bot bot merged commit 98c07f9 into rapidsai:branch-24.08 Jul 29, 2024
54 checks passed
divyegala pushed a commit to divyegala/cuvs that referenced this pull request Jul 31, 2024
Add extra information to benchmark context for better reproducibility and performance analysis:

  1. Full command line used to call the executable (so you can copy-paste and run again).
  2. More CUDA device information: whether HMM, AST, or host atomics are available (how GPU can efficiently communicate with CPU).
  3. Host information: min/max frequences, used virtual processors and cores, available physical memory and swap (does the benchmark segfault due to not enough host memory? is SMT enabled? etc).

Addresses parts of rapidsai#160

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)

URL: rapidsai#248
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cpp improvement Improves an existing functionality non-breaking Introduces a non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants