Commit f86d1fb

Merge tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking changes from Paolo Abeni:

"Core:
 - Refactor the forward memory allocation to better cope with memory pressure with many open sockets, moving from a per socket cache to a per-CPU one
 - Replace rwlocks with RCU for better fairness in ping, raw sockets and IP multicast router.
 - Network-side support for IO uring zero-copy send.
 - A few skb drop reason improvements, including codegen the source file with string mapping instead of using macro magic.
 - Rename reference tracking helpers to a more consistent netdev_* schema.
 - Adapt u64_stats_t type to address load/store tearing issues.
 - Refine debug helper usage to reduce the log noise caused by bots.

BPF:
 - Improve socket map performance, avoiding skb cloning on read operation.
 - Add support for 64 bits enum, to match types exposed by kernel.
 - Introduce support for sleepable uprobes program.
 - Introduce support for enum textual representation in libbpf.
 - New helpers to implement synproxy with eBPF/XDP.
 - Improve loop performances, inlining indirect calls when possible.
 - Removed all the deprecated libbpf APIs.
 - Implement new eBPF-based LSM flavor.
 - Add type match support, which allow accurate queries to the eBPF used types.
 - A few TCP congestion control framework usability improvements.
 - Add new infrastructure to manipulate CT entries via eBPF programs.
 - Allow for livepatch (KLP) and BPF trampolines to attach to the same kernel function.

Protocols:
 - Introduce per network namespace lookup tables for unix sockets, increasing scalability and reducing contention.
 - Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support.
 - Add support to forcibly close TIME_WAIT TCP sockets via user-space tools.
 - Significant performance improvement for the TLS 1.3 receive path, both for zero-copy and not-zero-copy.
 - Support for changing the initial MPTCP subflow priority/backup status
 - Introduce virtually contiguous buffers for sockets over RDMA, to cope better with memory pressure.
 - Extend CAN ethtool support with timestamping capabilities
 - Refactor CAN build infrastructure to allow building only the needed features.

Driver API:
 - Remove devlink mutex to allow parallel commands on multiple links.
 - Add support for pause stats in distributed switch.
 - Implement devlink helpers to query and flash line cards.
 - New helper for phy mode to register conversion.

New hardware / drivers:
 - Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro.
 - Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch.
 - Ethernet DSA driver for the Microchip LAN937x switch.
 - Ethernet PHY driver for the Aquantia AQR113C EPHY.
 - CAN driver for the OBD-II ELM327 interface.
 - CAN driver for RZ/N1 SJA1000 CAN controller.
 - Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device.

Drivers:
 - Intel Ethernet NICs:
    - i40e: add support for vlan pruning
    - i40e: add support for XDP fragmented packets
    - ice: improved vlan offload support
    - ice: add support for PPPoE offload
 - Mellanox Ethernet (mlx5)
    - refactor packet steering offload for performance and scalability
    - extend support for TC offload
    - refactor devlink code to clean-up the locking schema
    - support stacked vlans for bridge offloads
    - use TLS objects pool to improve connection rate
 - Netronome Ethernet NICs (nfp):
    - extend support for IPv6 fields mangling offload
    - add support for vepa mode in HW bridge
    - better support for virtio data path acceleration (VDPA)
    - enable TSO by default
 - Microsoft vNIC driver (mana)
    - add support for XDP redirect
 - Others Ethernet drivers:
    - bonding: add per-port priority support
    - microchip lan743x: extend phy support
    - Fungible funeth: support UDP segmentation offload and XDP xmit
    - Solarflare EF100: add support for virtual function representors
    - MediaTek SoC: add XDP support
 - Mellanox Ethernet/IB switch (mlxsw):
    - dropped support for unreleased H/W (XM router).
    - improved stats accuracy
    - unified bridge model conversion improving scalability (parts 1-6)
    - support for PTP in Spectrum-2 asics
 - Broadcom PHYs
    - add PTP support for BCM54210E
    - add support for the BCM53128 internal PHY
 - Marvell Ethernet switches (prestera):
    - implement support for multicast forwarding offload
 - Embedded Ethernet switches:
    - refactor OcteonTx MAC filter for better scalability
    - improve TC H/W offload for the Felix driver
    - refactor the Microchip ksz8 and ksz9477 drivers to share the probe code (parts 1, 2), add support for phylink mac configuration
 - Other WiFi:
    - Microchip wilc1000: disable WEP support and enable WPA3
    - Atheros ath10k: encapsulation offload support

Old code removal:
 - Neterion vxge ethernet driver: this is untouched since more than 10 years"

* tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1890 commits)
  doc: sfp-phylink: Fix a broken reference
  wireguard: selftests: support UML
  wireguard: allowedips: don't corrupt stack when detecting overflow
  wireguard: selftests: update config fragments
  wireguard: ratelimiter: use hrtimer in selftest
  net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ
  net: usb: ax88179_178a: Bind only to vendor-specific interface
  selftests: net: fix IOAM test skip return code
  net: usb: make USB_RTL8153_ECM non user configurable
  net: marvell: prestera: remove reduntant code
  octeontx2-pf: Reduce minimum mtu size to 60
  net: devlink: Fix missing mutex_unlock() call
  net/tls: Remove redundant workqueue flush before destroy
  net: txgbe: Fix an error handling path in txgbe_probe()
  net: dsa: Fix spelling mistakes and cleanup code
  Documentation: devlink: add add devlink-selftests to the table of contents
  dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock
  net: ionic: fix error check for vlan flags in ionic_set_nic_features()
  net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr()
  nfp: flower: add support for tunnel offload without key ID
  ...
2 parents 526942b + 7c6327c commit f86d1fb

1,753 files changed

+93690
-64652
lines changed

Documentation/ABI/testing/sysfs-devices-platform-soc-ipa

+49 -13
@@ -46,33 +46,69 @@ Description:
 		that is supported by the hardware.  The possible values
 		are "MAPv4" or "MAPv5".

+What:		.../XXXXXXX.ipa/endpoint_id/
+Date:		July 2022
+KernelVersion:	v5.19
+Contact:	Alex Elder <[email protected]>
+Description:
+		The .../XXXXXXX.ipa/endpoint_id/ directory contains
+		attributes that define IDs associated with IPA
+		endpoints.  The "rx" or "tx" in an endpoint name is
+		from the perspective of the AP.  An endpoint ID is a
+		small unsigned integer.
+
+What:		.../XXXXXXX.ipa/endpoint_id/modem_rx
+Date:		July 2022
+KernelVersion:	v5.19
+Contact:	Alex Elder <[email protected]>
+Description:
+		The .../XXXXXXX.ipa/endpoint_id/modem_rx file contains
+		the ID of the AP endpoint on which packets originating
+		from the embedded modem are received.
+
+What:		.../XXXXXXX.ipa/endpoint_id/modem_tx
+Date:		July 2022
+KernelVersion:	v5.19
+Contact:	Alex Elder <[email protected]>
+Description:
+		The .../XXXXXXX.ipa/endpoint_id/modem_tx file contains
+		the ID of the AP endpoint on which packets destined
+		for the embedded modem are sent.
+
+What:		.../XXXXXXX.ipa/endpoint_id/monitor_rx
+Date:		July 2022
+KernelVersion:	v5.19
+Contact:	Alex Elder <[email protected]>
+Description:
+		The .../XXXXXXX.ipa/endpoint_id/monitor_rx file contains
+		the ID of the AP endpoint on which IPA "monitor" data is
+		received.  The monitor endpoint supplies replicas of
+		packets that enter the IPA hardware for processing.
+		Each replicated packet is preceded by a fixed-size "ODL"
+		header (see .../XXXXXXX.ipa/feature/monitor, above).
+		Large packets are truncated, to reduce the bandwidth
+		required to provide the monitor function.
+
 What:		.../XXXXXXX.ipa/modem/
 Date:		June 2021
 KernelVersion:	v5.14
 Contact:	Alex Elder <[email protected]>
 Description:
-		The .../XXXXXXX.ipa/modem/ directory contains a set of
-		attributes describing properties of the modem execution
-		environment reachable by the IPA hardware.
+		The .../XXXXXXX.ipa/modem/ directory contains attributes
+		describing properties of the modem embedded in the SoC.

 What:		.../XXXXXXX.ipa/modem/rx_endpoint_id
 Date:		June 2021
 KernelVersion:	v5.14
 Contact:	Alex Elder <[email protected]>
 Description:
-		The .../XXXXXXX.ipa/feature/rx_endpoint_id file contains
-		the AP endpoint ID that receives packets originating from
-		the modem execution environment.  The "rx" is from the
-		perspective of the AP; this endpoint is considered an "IPA
-		producer".  An endpoint ID is a small unsigned integer.
+		The .../XXXXXXX.ipa/modem/rx_endpoint_id file duplicates
+		the value found in .../XXXXXXX.ipa/endpoint_id/modem_rx.

 What:		.../XXXXXXX.ipa/modem/tx_endpoint_id
 Date:		June 2021
 KernelVersion:	v5.14
 Contact:	Alex Elder <[email protected]>
 Description:
-		The .../XXXXXXX.ipa/feature/tx_endpoint_id file contains
-		the AP endpoint ID used to transmit packets destined for
-		the modem execution environment.  The "tx" is from the
-		perspective of the AP; this endpoint is considered an "IPA
-		consumer".  An endpoint ID is a small unsigned integer.
+		The .../XXXXXXX.ipa/modem/tx_endpoint_id file duplicates
+		the value found in .../XXXXXXX.ipa/endpoint_id/modem_tx.

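The ABI text above says that modem/rx_endpoint_id now duplicates endpoint_id/modem_rx. A minimal user-space sketch of checking that is shown here. It assumes each attribute holds one decimal unsigned integer, and it takes the device directory (the elided ".../XXXXXXX.ipa" part) from the caller. The function names read_endpoint_id and modem_rx_ids_match are hypothetical and are not part of the kernel or of this commit.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read one small unsigned integer from a sysfs attribute file.
 * Assumption: the attribute is formatted as decimal text.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
int read_endpoint_id(const char *path, unsigned int *id)
{
    FILE *f;
    int ok;

    assert(path && id);
    f = fopen(path, "r");
    if (!f)
        return -1;
    ok = (fscanf(f, "%u", id) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/*
 * ipa_dir is the platform-specific sysfs directory of the IPA device,
 * supplied by the caller because the real path is not given here.
 * Returns 1 if the legacy and the new attribute agree, 0 if they
 * differ, and -1 if either file could not be read.
 */
int modem_rx_ids_match(const char *ipa_dir)
{
    char new_path[512];
    char old_path[512];
    unsigned int new_id;
    unsigned int old_id;

    assert(ipa_dir);
    snprintf(new_path, sizeof(new_path), "%s/endpoint_id/modem_rx", ipa_dir);
    snprintf(old_path, sizeof(old_path), "%s/modem/rx_endpoint_id", ipa_dir);
    if (read_endpoint_id(new_path, &new_id) != 0 ||
        read_endpoint_id(old_path, &old_id) != 0)
        return -1;
    return new_id == old_id;
}
```

The same pair of helpers would work for modem_tx and modem/tx_endpoint_id by swapping the two file names.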
Documentation/admin-guide/sysctl/net.rst

+12
@@ -391,6 +391,18 @@ GRO has decided not to coalesce, it is placed on a per-NAPI list. This
 list is then passed to the stack when the number of segments reaches the
 gro_normal_batch limit.

+high_order_alloc_disable
+------------------------
+
+By default the allocator for page frags tries to use high order pages (order-3
+on x86). While the default behavior gives good results in most cases, some users
+might have hit a contention in page allocations/freeing. This was especially
+true on older kernels (< 5.14) when high-order pages were not stored on per-cpu
+lists. This allows to opt-in for order-0 allocation instead but is now mostly of
+historical importance.
+
+Default: 0
+
 2. /proc/sys/net/unix - Parameters for Unix domain sockets
 ----------------------------------------------------------

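A small sketch of querying the knob documented above from user space follows. The path is an assumption: the entry is added just before the "/proc/sys/net/unix" section, which suggests it lives under /proc/sys/net/core, but the hunk itself does not spell the path out. The function name high_order_alloc_disabled is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed location of the sysctl; not stated in the hunk itself. */
#define HIGH_ORDER_ALLOC_DISABLE_PATH \
    "/proc/sys/net/core/high_order_alloc_disable"

/*
 * Returns 1 if the page frag allocator is restricted to order-0 pages,
 * 0 if high-order pages are still tried (the documented default),
 * and -1 if the file is missing or cannot be parsed.
 */
int high_order_alloc_disabled(const char *path)
{
    FILE *f;
    int value;
    int ok;

    assert(path);
    f = fopen(path, "r");
    if (!f)
        return -1;
    ok = (fscanf(f, "%d", &value) == 1);
    fclose(f);
    if (!ok)
        return -1;
    return value != 0;
}
```

A caller would normally pass HIGH_ORDER_ALLOC_DISABLE_PATH; taking the path as a parameter keeps the helper usable on systems where the assumption does not hold.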
Documentation/bpf/btf.rst

+42 -7
@@ -74,7 +74,7 @@ sequentially and type id is assigned to each recognized type starting from id
     #define BTF_KIND_ARRAY 3 /* Array */
     #define BTF_KIND_STRUCT 4 /* Struct */
     #define BTF_KIND_UNION 5 /* Union */
-    #define BTF_KIND_ENUM 6 /* Enumeration */
+    #define BTF_KIND_ENUM 6 /* Enumeration up to 32-bit values */
     #define BTF_KIND_FWD 7 /* Forward */
     #define BTF_KIND_TYPEDEF 8 /* Typedef */
     #define BTF_KIND_VOLATILE 9 /* Volatile */
@@ -87,6 +87,7 @@ sequentially and type id is assigned to each recognized type starting from id
     #define BTF_KIND_FLOAT 16 /* Floating point */
     #define BTF_KIND_DECL_TAG 17 /* Decl Tag */
     #define BTF_KIND_TYPE_TAG 18 /* Type Tag */
+    #define BTF_KIND_ENUM64 19 /* Enumeration up to 64-bit values */

 Note that the type section encodes debug info, not just pure types.
 ``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram.
@@ -101,10 +102,10 @@ Each type contains the following common data::
          * bits 24-28: kind (e.g. int, ptr, array...etc)
          * bits 29-30: unused
          * bit 31: kind_flag, currently used by
-         * struct, union and fwd
+         * struct, union, fwd, enum and enum64.
          */
         __u32 info;
-        /* "size" is used by INT, ENUM, STRUCT and UNION.
+        /* "size" is used by INT, ENUM, STRUCT, UNION and ENUM64.
          * "size" tells the size of the type it is describing.
          *
          * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
@@ -281,10 +282,10 @@ modes exist:

 ``struct btf_type`` encoding requirement:
   * ``name_off``: 0 or offset to a valid C identifier
-  * ``info.kind_flag``: 0
+  * ``info.kind_flag``: 0 for unsigned, 1 for signed
   * ``info.kind``: BTF_KIND_ENUM
   * ``info.vlen``: number of enum values
-  * ``size``: 4
+  * ``size``: 1/2/4/8

 ``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``.::

@@ -297,6 +298,10 @@ The ``btf_enum`` encoding:
   * ``name_off``: offset to a valid C identifier
   * ``val``: any value

+If the original enum value is signed and the size is less than 4,
+that value will be sign extended into 4 bytes. If the size is 8,
+the value will be truncated into 4 bytes.
+
 2.2.7 BTF_KIND_FWD
 ~~~~~~~~~~~~~~~~~~

@@ -364,7 +369,8 @@ No additional type data follow ``btf_type``.
   * ``name_off``: offset to a valid C identifier
   * ``info.kind_flag``: 0
   * ``info.kind``: BTF_KIND_FUNC
-  * ``info.vlen``: 0
+  * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
+                   or BTF_FUNC_EXTERN)
   * ``type``: a BTF_KIND_FUNC_PROTO type

 No additional type data follow ``btf_type``.
@@ -375,6 +381,9 @@ type. The BTF_KIND_FUNC may in turn be referenced by a func_info in the
 :ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load`
 (ABI).

+Currently, only linkage values of BTF_FUNC_STATIC and BTF_FUNC_GLOBAL are
+supported in the kernel.
+
 2.2.13 BTF_KIND_FUNC_PROTO
 ~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -493,7 +502,7 @@ the attribute is applied to a ``struct``/``union`` member or
 a ``func`` argument, and ``btf_decl_tag.component_idx`` should be a
 valid index (starting from 0) pointing to a member or an argument.

-2.2.17 BTF_KIND_TYPE_TAG
+2.2.18 BTF_KIND_TYPE_TAG
 ~~~~~~~~~~~~~~~~~~~~~~~~

 ``struct btf_type`` encoding requirement:
@@ -516,6 +525,32 @@ type_tag, then zero or more const/volatile/restrict/typedef
 and finally the base type. The base type is one of
 int, ptr, array, struct, union, enum, func_proto and float types.

+2.2.19 BTF_KIND_ENUM64
+~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0 or offset to a valid C identifier
+  * ``info.kind_flag``: 0 for unsigned, 1 for signed
+  * ``info.kind``: BTF_KIND_ENUM64
+  * ``info.vlen``: number of enum values
+  * ``size``: 1/2/4/8
+
+``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum64``.::
+
+    struct btf_enum64 {
+        __u32 name_off;
+        __u32 val_lo32;
+        __u32 val_hi32;
+    };
+
+The ``btf_enum64`` encoding:
+  * ``name_off``: offset to a valid C identifier
+  * ``val_lo32``: lower 32-bit value for a 64-bit value
+  * ``val_hi32``: high 32-bit value for a 64-bit value
+
+If the original enum value is signed and the size is less than 8,
+that value will be sign extended into 8 bytes.
+
 3. BTF Kernel API
 =================

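To make the BTF_KIND_ENUM64 encoding above concrete, here is a small sketch of how a consumer might decode the info word and rebuild a 64-bit enumerator from val_lo32 and val_hi32. The helper names and the local struct enum64_entry are hypothetical stand-ins that mirror the documented layout; they are not the kernel or libbpf API. The vlen position (bits 0-15) comes from the same BTF document rather than from the hunks shown.

```c
#include <assert.h>
#include <stdint.h>

/* Kind numbers as listed in the BTF documentation hunks above. */
#define BTF_KIND_ENUM    6
#define BTF_KIND_ENUM64 19

/* Local mirror of the documented struct btf_enum64 layout. */
struct enum64_entry {
    uint32_t name_off;
    uint32_t val_lo32;
    uint32_t val_hi32;
};

/* bits 0-15 of info: number of enumerators that follow btf_type. */
uint32_t btf_info_vlen(uint32_t info)
{
    return info & 0xffff;
}

/* bits 24-28 of info: the BTF kind. */
uint32_t btf_info_kind(uint32_t info)
{
    return (info >> 24) & 0x1f;
}

/* bit 31 of info: kind_flag, 1 means the enum is signed. */
uint32_t btf_info_kflag(uint32_t info)
{
    return info >> 31;
}

/* Join the two 32-bit halves into the raw 64-bit enumerator value. */
uint64_t enum64_value(const struct enum64_entry *e)
{
    assert(e);
    return ((uint64_t)e->val_hi32 << 32) | e->val_lo32;
}

/*
 * Signed view of the same bits, for use when kind_flag is 1.
 * Relies on two's complement conversion, which holds on the
 * platforms BPF targets.
 */
int64_t enum64_signed_value(const struct enum64_entry *e)
{
    return (int64_t)enum64_value(e);
}
```

A reader of BTF data would first check btf_info_kind() against BTF_KIND_ENUM64, then walk btf_info_vlen() entries and pick the signed or unsigned view based on btf_info_kflag().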
Documentation/bpf/index.rst

+1
@@ -19,6 +19,7 @@ that goes into great technical depth about the BPF Architecture.
    faq
    syscall_api
    helpers
+   kfuncs
    programs
    maps
    bpf_prog_run

Documentation/bpf/instruction-set.rst

+2 -2
@@ -127,7 +127,7 @@ BPF_XOR | BPF_K | BPF_ALU64 means::
 Byte swap instructions
 ----------------------

-The byte swap instructions use an instruction class of ``BFP_ALU`` and a 4-bit
+The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
 code field of ``BPF_END``.

 The byte swap instructions operate on the destination register
@@ -351,7 +351,7 @@ These instructions have seven implicit operands:
  * Register R0 is an implicit output which contains the data fetched from
    the packet.
  * Registers R1-R5 are scratch registers that are clobbered after a call to
-   ``BPF_ABS | BPF_LD`` or ``BPF_IND`` | BPF_LD instructions.
+   ``BPF_ABS | BPF_LD`` or ``BPF_IND | BPF_LD`` instructions.

 These instructions have an implicit program exit condition as well. When an
 eBPF program is trying to access the data beyond the packet boundary, the

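The corrected sentence about byte swap instructions (class ``BPF_ALU``, code field ``BPF_END``) can be illustrated by composing the opcode byte. This is only a sketch: the constants are redefined locally with the values used by the Linux UAPI BPF headers so that it builds on its own, the BPF_TO_LE/BPF_TO_BE source bit comes from the same instruction-set document rather than from the hunk shown, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the UAPI values, so the sketch is self-contained. */
#define BPF_ALU   0x04  /* instruction class, low 3 bits of the opcode */
#define BPF_END   0xd0  /* 4-bit code field, high 4 bits of the opcode */
#define BPF_TO_LE 0x00  /* source bit clear: convert to little endian */
#define BPF_TO_BE 0x08  /* source bit set: convert to big endian */

/* Returns 1 if the opcode has class BPF_ALU and code BPF_END. */
int bpf_is_bswap(uint8_t opcode)
{
    return (opcode & 0x07) == BPF_ALU && (opcode & 0xf0) == BPF_END;
}

/* Build the opcode byte of a byte swap instruction. */
uint8_t bpf_bswap_opcode(int to_big_endian)
{
    uint8_t op = (uint8_t)(BPF_ALU | BPF_END |
                           (to_big_endian ? BPF_TO_BE : BPF_TO_LE));

    assert(bpf_is_bswap(op));
    return op;
}
```

The swap width (16, 32 or 64 bits) is carried in the instruction's immediate field, not in the opcode, so it is left out of this sketch.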