-
Notifications
You must be signed in to change notification settings - Fork 469
UCT/DC: Adjust DC FHS latency #10685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
src/uct/ib/mlx5/dc/dc_mlx5.c
Outdated
@@ -205,6 +211,46 @@ uct_dc_mlx5_ep_create_connected(const uct_ep_params_t *params, uct_ep_h* ep_p) | |||
} | |||
} | |||
|
|||
/* Check whether RDMA write is disabled */ | |||
static int uct_dc_mlx5_iface_rdma_wr_disabled(uct_dc_mlx5_iface_t *iface) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const
where possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getter functions used here receive non-const args
src/uct/ib/mlx5/dc/dc_mlx5.c
Outdated
{"FULL_HANDSHAKE_ADDED_LATENCY", "140ns", | ||
"Amount of latency added to performance estimation of DC due to full " | ||
"handshake (used when AR is enabled).", | ||
ucs_offsetof(uct_dc_mlx5_iface_config_t, fhs_added_latency), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_added_
is a bit weird. Maybe _extra_
instead? Or simple fhs_latency
?
(+ in all similar places)
src/uct/ib/mlx5/dc/dc_mlx5.c
Outdated
@@ -140,6 +140,12 @@ ucs_config_field_t uct_dc_mlx5_iface_config_sub_table[] = { | |||
ucs_offsetof(uct_dc_mlx5_iface_config_t, dcis_initial_capacity), | |||
UCS_CONFIG_TYPE_ULUNITS}, | |||
|
|||
{"FULL_HANDSHAKE_ADDED_LATENCY", "140ns", | |||
"Amount of latency added to performance estimation of DC due to full " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amount of
->
{"FULL_HANDSHAKE_LATENCY", "140ns", | ||
"DC Full Handshake extra latency", | ||
ucs_offsetof(uct_dc_mlx5_iface_config_t, fhs_latency), | ||
UCS_CONFIG_TYPE_TIME_UNITS}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there dependency on iface_attr->latency
? then maybe we could calculate default value based on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIU default value today doesn't reflect real latency (just a relation in comparison to other transports)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iface_attr->latency should be real latency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a separate issue (not related to this PR)
@@ -42,6 +42,12 @@ do | |||
new_caps=$(echo "$new_tl_caps" | grep -A 7 "Device: $device" || true) | |||
for cap in bandwidth latency overhead | |||
do | |||
# DC latency was modified to account for FHS since v1.20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just skip latency in line 43 since it is not affecting wire compatibility
{"FULL_HANDSHAKE_LATENCY", "140ns", | ||
"DC Full Handshake extra latency", | ||
ucs_offsetof(uct_dc_mlx5_iface_config_t, fhs_latency), | ||
UCS_CONFIG_TYPE_TIME_UNITS}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iface_attr->latency should be real latency
src/uct/ib/mlx5/dc/dc_mlx5.c
Outdated
return (iface->super.config.dp_ordering >= | ||
UCT_IB_MLX5_DP_ORDERING_OOO_RW) && | ||
(UCS_BIT(ib_iface->config.sl) & ooo_sl_mask) && | ||
/* If DevX isn't used for DCI, driver sets default value for | ||
* 'rdma_wr_disabled' */ | ||
(!(md->flags & UCT_IB_MLX5_MD_FLAG_DEVX_DCI) || | ||
/* RDMA write should be enabled for FHS to be used */ | ||
!uct_dc_mlx5_iface_rdma_wr_disabled(iface)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- simplify and split to several if conditions
- what about DDP (in CX8), do we have FHS also in this case?
- IMO there is something wrong with the logic. if a device exposes UCT_IB_MLX5_MD_FLAG_NO_RDMA_WR_OPTIMIZED, it will use FHS unless UCT_DC_MLX5_IFACE_FLAG_DISABLE_PUT is set and UCT_IB_MLX5_MD_FLAG_DEVX_DCI is enabled. Otherwise, it will use HHS. It would be easier to review the logic after refactoring (1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DDP case is covered here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems DDP is making a difference only if have AR because there is && between them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first condition includes DDP because it tests >= UCT_IB_MLX5_DP_ORDERING_OOO_RW
and DDP enabled when UCT_IB_MLX5_DP_ORDERING_OOO_ALL
What?
Adjust DC FHS latency
Why?
Impove perf by better selection of RC/DC switching point
How?
Distinguish AR enabled and forced FHS scenraios and add extra round trip latency