Skip to content

UCP/WIREUP: Introduce address request/reply wireup messages #10684

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

evgeny-leksikov
Copy link
Contributor

What?

  • Introduce address request/reply wireup messages
  • Refactored lane access functions to separate raw lane retrieval from the common access method.
  • Added new wireup message types: ADDR_REQUEST and ADDR_REPLY.
  • Refacored KA reply EP to be reused for ADDR_REPLY.

Why?

Part of #10640 to minimize its size

…tor lane access functions

- Refactored lane access functions to separate raw lane retrieval from the common access method.
- Added new wireup message types: ADDR_REQUEST and ADDR_REPLY.
 - Refacored KA reply EP to be reused for ADDR_REPLY.
@evgeny-leksikov evgeny-leksikov requested a review from gleon99 May 13, 2025 13:23
@gleon99 gleon99 requested a review from shasson5 May 13, 2025 13:48
Comment on lines +238 to +241
if (type == UCP_WIREUP_MSG_ADDR_REQUEST) {
/* Reuse KA reply EP for simplicity of EP lifetime management */
msg_hdr->dst_ep_id = UCS_PTR_MAP_KEY_INVALID;
} else if (ep->flags & UCP_EP_FLAG_REMOTE_ID) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When no EP on remote side need to create temporary EP to reply, the logic is the same as in case of KA

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the handler support the case of existing EP in peer (line 993).
no need to set msg_hdr->dst_ep_id to INVALID.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover, removed for simplicity

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So better block ep_remote_id initialization in ADDR_REPLY handler (rather than block here).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's blocked on both sides: sender does not send, receiver checks this with an assertion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not required here.
Instead:

void handle_address_reply()
{
    /* don't call set_remote_id() */
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ADDR_REPLY handler is not impplemented in this PR and this message must contain remote EP id bcz it targets specific EP

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but then we create dependency between on-demand lanes feature and keepalive feature.
what if we use the same logic as for ucp_ep_resolve_remote_id?
or, for to-iface case, we can extract the remote iface address from existing (pairwise) lanes and use it to connect all-to-all lanes?

Comment on lines +1060 to +1061
ucs_assert(ep == NULL);
ucp_wireup_send_reply_on_onetime_ep(worker, msg,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't EP be non-NULL? for example if some p2p lanes were previously connected

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can but it leads to other bugs which are currently not addressed, so for simplicity I decided always send ack on temporary EP, re-usage of existing EP could be a good next step optimization

Comment on lines +238 to +241
if (type == UCP_WIREUP_MSG_ADDR_REQUEST) {
/* Reuse KA reply EP for simplicity of EP lifetime management */
msg_hdr->dst_ep_id = UCS_PTR_MAP_KEY_INVALID;
} else if (ep->flags & UCP_EP_FLAG_REMOTE_ID) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So better block ep_remote_id initialization in ADDR_REPLY handler (rather than block here).

@evgeny-leksikov evgeny-leksikov requested a review from shasson5 May 21, 2025 08:23
@evgeny-leksikov evgeny-leksikov requested a review from yosefe May 21, 2025 12:20
Copy link
Contributor

@yosefe yosefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants