
Commit 2fba7dc

mfijalko authored and borkmann committed
ice: Add support for XDP multi-buffer on Rx side
The ice driver needs a bit of rework on its Rx data path in order to support multi-buffer XDP. For the skb path, it currently works in such a way that the Rx ring carries a pointer to the skb, so if the driver didn't manage to combine a fragmented frame within the current NAPI instance, it can restore the state on the next instance and keep looking for the last fragment (the descriptor with the EOP bit set).

What needs to be achieved is that the xdp_buff is combined that way (linear part + frags) in the first place. The skb is then ready to go in case of XDP_PASS or when no BPF program is present on the interface; if a BPF program is attached, it operates on the multi-buffer xdp_buff. At this point the xdp_buff resides directly on the Rx ring, so given that the skb will be built straight from the xdp_buff, there is no further need to carry an skb pointer on the Rx ring.

Besides removing the skb pointer from the Rx ring, many members of ice_rx_ring have been moved around. The first and foremost reason was to place rx_buf and xdp_buff on the same cacheline: since touching rx_buf is a step that precedes touching xdp_buff, the xdp_buff will already be hot in cache. Second, xdp_rxq is used rather rarely and occupies a separate cacheline, so it is better placed at the end of ice_rx_ring.

Another change to ice_rx_ring is the introduction of ice_rx_ring::first_desc. Its purpose is twofold. First, it propagates rx_buf->act to all parts of the current xdp_buff after running the XDP program, so that ice_put_rx_buf(), which got moved out of the main Rx processing loop, can take an appropriate action on each buffer. Second, it serves ice_construct_skb(), whose copybreak mechanism had an explicit impact on the xdp_buff->skb conversion in the new approach when the legacy Rx flag is toggled: the linear part is 256 bytes long, and if the frame is bigger than that, the remaining bytes go as a frag to skb_shared_info.
This means that while memcpying frags from the xdp_buff to the newly allocated skb, care needs to be taken when picking the destination frag array entry. By the time ice_construct_skb() is called on a fragmented frame, the current rx_buf points to the *last* fragment, but copybreak needs to be applied against the first one. That is where ice_rx_ring::first_desc helps.

When frame building spans NAPI polls (the DD bit is not set on the current descriptor and xdp->data is not NULL), the previous Rx buffer handling could misbehave. Calls to ice_put_rx_buf() were pulled out of the main Rx processing loop and scoped from cached_ntc to the current ntc, but that function relies on rx_buf->act, which is set within ice_run_xdp(), and ice_run_xdp() is only called once the EOP bit is found. So an Rx buffer could be put with rx_buf->act still *uninitialized*. To address this, the scoping now relies on first_desc on both boundaries.

This also implies that cleaned_count, which is the input to ice_alloc_rx_buffers() telling how many new buffers should be refilled, has to be adjusted; if it stayed as is, ntc could end up going past ntu. Therefore, remove cleaned_count altogether and have the allocation routine use the newly introduced ICE_RX_DESC_UNUSED() macro, an Rx-side equivalent of ICE_DESC_UNUSED() based on struct ice_rx_ring::first_desc instead of next_to_clean.

Signed-off-by: Maciej Fijalkowski <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Alexander Lobakin <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
1 parent 8a11b33 commit 2fba7dc

File tree

6 files changed: +200 −98 lines changed


drivers/net/ethernet/intel/ice/ice_base.c (+11 −6)

@@ -492,16 +492,18 @@ static int ice_setup_rx_ctx(struct ice_rx_ring *ring)
 int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
 {
 	struct device *dev = ice_pf_to_dev(ring->vsi->back);
-	u16 num_bufs = ICE_DESC_UNUSED(ring);
+	u32 num_bufs = ICE_RX_DESC_UNUSED(ring);
 	int err;
 
 	ring->rx_buf_len = ring->vsi->rx_buf_len;
 
 	if (ring->vsi->type == ICE_VSI_PF) {
 		if (!xdp_rxq_info_is_reg(&ring->xdp_rxq))
 			/* coverity[check_return] */
-			xdp_rxq_info_reg(&ring->xdp_rxq, ring->netdev,
-					 ring->q_index, ring->q_vector->napi.napi_id);
+			__xdp_rxq_info_reg(&ring->xdp_rxq, ring->netdev,
+					   ring->q_index,
+					   ring->q_vector->napi.napi_id,
+					   ring->vsi->rx_buf_len);
 
 		ring->xsk_pool = ice_xsk_pool(ring);
 		if (ring->xsk_pool) {
@@ -521,9 +523,11 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
 	} else {
 		if (!xdp_rxq_info_is_reg(&ring->xdp_rxq))
 			/* coverity[check_return] */
-			xdp_rxq_info_reg(&ring->xdp_rxq,
-					 ring->netdev,
-					 ring->q_index, ring->q_vector->napi.napi_id);
+			__xdp_rxq_info_reg(&ring->xdp_rxq,
+					   ring->netdev,
+					   ring->q_index,
+					   ring->q_vector->napi.napi_id,
+					   ring->vsi->rx_buf_len);
 
 		err = xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
 						 MEM_TYPE_PAGE_SHARED,
@@ -534,6 +538,7 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
 	}
 
 	xdp_init_buff(&ring->xdp, ice_rx_pg_size(ring) / 2, &ring->xdp_rxq);
+	ring->xdp.data = NULL;
 	err = ice_setup_rx_ctx(ring);
 	if (err) {
 		dev_err(dev, "ice_setup_rx_ctx failed for RxQ %d, err %d\n",

drivers/net/ethernet/intel/ice/ice_ethtool.c (+1 −1)

@@ -3092,7 +3092,7 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
 
 		/* allocate Rx buffers */
 		err = ice_alloc_rx_bufs(&rx_rings[i],
-					ICE_DESC_UNUSED(&rx_rings[i]));
+					ICE_RX_DESC_UNUSED(&rx_rings[i]));
 rx_unwind:
 		if (err) {
 			while (i) {

drivers/net/ethernet/intel/ice/ice_main.c (+9 −4)

@@ -2888,9 +2888,12 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog,
 	bool if_running = netif_running(vsi->netdev);
 	int ret = 0, xdp_ring_err = 0;
 
-	if (frame_size > ice_max_xdp_frame_size(vsi)) {
-		NL_SET_ERR_MSG_MOD(extack, "MTU too large for loading XDP");
-		return -EOPNOTSUPP;
+	if (prog && !prog->aux->xdp_has_frags) {
+		if (frame_size > ice_max_xdp_frame_size(vsi)) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "MTU is too large for linear frames and XDP prog does not support frags");
+			return -EOPNOTSUPP;
+		}
 	}
 
 	/* need to stop netdev while setting up the program for Rx rings */
@@ -7354,6 +7357,7 @@ static int ice_change_mtu(struct net_device *netdev, int new_mtu)
 	struct ice_netdev_priv *np = netdev_priv(netdev);
 	struct ice_vsi *vsi = np->vsi;
 	struct ice_pf *pf = vsi->back;
+	struct bpf_prog *prog;
 	u8 count = 0;
 	int err = 0;
 
@@ -7362,7 +7366,8 @@ static int ice_change_mtu(struct net_device *netdev, int new_mtu)
 		return 0;
 	}
 
-	if (ice_is_xdp_ena_vsi(vsi)) {
+	prog = vsi->xdp_prog;
+	if (prog && !prog->aux->xdp_has_frags) {
 		int frame_size = ice_max_xdp_frame_size(vsi);
 
 		if (new_mtu + ICE_ETH_PKT_HDR_PAD > frame_size) {
