Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: Introduce new tag FI_TAG_RPC #10792
base: main
Are you sure you want to change the base?
core: Introduce new tag FI_TAG_RPC #10792
Changes from all commits
e9d1ce8
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can callers ensure that a receive buffer is posted? The data transfer APIs are asynchronous. In theory, there is a race here which could result in fi_tsend being issued before provider/HW has processed fi_trecv. In addition, should this race be hit with FI_ORDER_SAS, API contract may be broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know the trecv APIs are async. As far as I can see, it's not async for the TCP provider.
This sounds like a serious issue to me. Because if there exists providers who require this for zerocopy or DMA, this behavior will make zerocopy/DMA impossible since the buffer is not ready while data comes. @shefty @iziemba can you share some more insights? It's unlikely for me to work on that but it is still interesting to know how you would approach it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assumption is that once fi_trecv() or fi_recv() return, that the buffers are available. This is needed by apps trying to implement flow control. If fi_trecv() or fi_recv() just mean that the buffer may someday in the future be made available, we're almost certainly making life impossible for apps. No credit-based flow control will ever work
The completion of a posted receive buffer is asynchronous. The posting should not be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think unexpected messaging closes the race between a buffer receive post and an incoming message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The concern is that the unexpected buffers would be unlimited and nobody knows how long they would stay in that buffer.
Coming back to this issue, if the fi_addr is not configured when the unexpected messages comes, it will disable polling the socket fd, which will choke the tcp connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand there might be other complications but this is certainly helping us. I wonder if we could do it the current way and we can always come back to improve it when new information is discovered. @shefty @iziemba as long as there are something to debate, it wouldn't be terribly wrong or an obvious mistake, lol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If mem_tag_format is to be used, part of the FI_TAG_RPC needs to address the format. I think Mercury today uses FI_TAG_BITS. FI_TAG_RPC could be defined to be the following:
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been out for about a week. IMO, it would be nice if tag_rpc could also define more semantic meaning around the tag format. For example, ignore bits aren't being used.
I also think it's worth considering this from the angle of resource management, but that's harder to fit a solution into without redefining resource_mgmt (it's currently an enum that's a boolean for practical purposes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@iziemba I think FI_TAG_RPC is closer to FI_TAG_CCL in the sense that exact match is required. Note that both FI_TAG_MPI and FI_TAG_CCL are derivatives of FI_TAG_BITS. They all use leading 0s to indicate number of reserved bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would agree that RPC is better aligned exact match than wildcard.