---
title: Cluster configuration in Azure Kubernetes Services (AKS)
description: Learn how to configure a cluster in Azure Kubernetes Service (AKS)
services: container-service
ms.topic: article
ms.date: 02/09/2020
ms.author: jpalma
author: palma21
---
As part of creating an AKS cluster, you may need to customize your cluster configuration to suit your needs. This article introduces a few options for customizing your AKS cluster.
## OS configuration

AKS supports Ubuntu 18.04 as the default node operating system (OS) in general availability (GA) for clusters.
## Container runtime configuration

A container runtime is software that executes containers and manages container images on a node. The runtime helps abstract away sys-calls or operating system (OS) specific functionality to run containers on Linux or Windows. For Linux node pools, `containerd` is used on node pools using Kubernetes version 1.19 and greater. For Windows Server 2019 node pools, `containerd` is generally available and can be used in node pools using Kubernetes 1.20 and greater, but Docker is still used by default.

`Containerd` is an OCI (Open Container Initiative) compliant core container runtime that provides the minimum set of required functionality to execute containers and manage images on a node. It was donated to the Cloud Native Computing Foundation (CNCF) in March of 2017. The current Moby (upstream Docker) version that AKS uses already leverages and is built on top of `containerd`.
With `containerd`-based nodes and node pools, instead of talking to the `dockershim`, the kubelet talks directly to `containerd` via the CRI (container runtime interface) plugin. In the Moby/Docker architecture, the kubelet would talk to the `dockershim` and Docker engine before reaching `containerd`; removing those extra hops in the flow improves pod startup latency and reduces resource (CPU and memory) consumption by the container runtime.
`Containerd` works on every GA version of Kubernetes in AKS and in every upstream Kubernetes version above v1.19, and it supports all Kubernetes and AKS features.
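If you want to confirm which runtime each node is actually using, `kubectl` reports it in the `CONTAINER-RUNTIME` column (a quick check, assuming your kubeconfig already points at the cluster):

```bash
# Show nodes with extended columns; CONTAINER-RUNTIME displays values
# such as containerd://1.4.9 or docker://19.3.14.
kubectl get nodes -o wide
```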
> [!IMPORTANT]
> Clusters with Linux node pools created on Kubernetes v1.19 or greater default to `containerd` for their container runtime. Clusters with node pools on earlier supported Kubernetes versions receive Docker for their container runtime. Linux node pools will be updated to `containerd` once the node pool Kubernetes version is updated to a version that supports `containerd`. You can still use Docker node pools and clusters on older supported versions until those fall out of support.
>
> Using `containerd` with Windows Server 2019 node pools is generally available, although the default for node pools created on Kubernetes v1.22 and earlier is still Docker. For more details, see [Add a Windows Server node pool with containerd](/learn/aks-add-np-containerd).
>
> It's highly recommended to test your workloads on AKS node pools with `containerd` before using clusters with a Kubernetes version that supports `containerd` for your node pools.
### `Containerd` limitations/differences

- For `containerd`, we recommend using `crictl` as a replacement CLI instead of the Docker CLI for troubleshooting pods, containers, and container images on Kubernetes nodes (for example, `crictl ps`; see the sketch after this list).
   - It doesn't provide the complete functionality of the Docker CLI. It's intended for troubleshooting only.
   - `crictl` offers a more Kubernetes-friendly view of containers, with concepts like pods being present.
- `Containerd` sets up logging using the standardized `cri` logging format, which is different from what you currently get from Docker's json driver. Your logging solution needs to support the `cri` logging format (like Azure Monitor for Containers).
- You can no longer access the Docker engine, `/var/run/docker.sock`, or use Docker-in-Docker (DinD).
   - If you currently extract application logs or monitoring data from the Docker engine, use something like Azure Monitor for Containers instead. Additionally, AKS doesn't support running any out-of-band commands on the agent nodes that could cause instability.
   - Even when using Docker, building images and directly leveraging the Docker engine via the methods above is strongly discouraged. Kubernetes isn't fully aware of those consumed resources, and those approaches present numerous issues, as detailed here and here, for example.
- Building images - You can continue to use your current Docker build workflow as normal, unless you are building images inside your AKS cluster. In this case, consider switching to the recommended approach for building images using ACR Tasks, or a more secure in-cluster option like Docker Buildx.
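As a minimal troubleshooting sketch (assuming you already have a shell on a `containerd`-based node, for example over SSH or with `kubectl debug`), the `crictl` equivalents of common Docker CLI checks look like this:

```bash
# List the pod sandboxes the runtime knows about on this node.
crictl pods

# List running containers (the rough analogue of 'docker ps').
crictl ps

# Stream logs for one container; take <container-id> from 'crictl ps'.
crictl logs <container-id>
```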
## Generation 2 virtual machines

Azure supports Generation 2 (Gen2) virtual machines (VMs). Generation 2 VMs support key features that aren't supported in Generation 1 (Gen1) VMs. These features include increased memory, Intel Software Guard Extensions (Intel SGX), and virtualized persistent memory (vPMEM).

Generation 2 VMs use the new UEFI-based boot architecture rather than the BIOS-based architecture used by Generation 1 VMs. Only specific SKUs and sizes support Gen2 VMs. Check the list of supported sizes to see if your SKU supports or requires Gen2.

Additionally, not all VM images support Gen2. On AKS, Gen2 VMs will use the new AKS Ubuntu 18.04 image, which supports all Gen2 SKUs and sizes.
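If you want to check which sizes in a region can run Gen2 VMs, one option is to query the `HyperVGenerations` capability reported by `az vm list-skus`; the sketch below assumes the `eastus` region:

```azurecli
# List VM sizes in eastus whose HyperVGenerations capability includes V2,
# that is, sizes able to run Generation 2 VMs.
az vm list-skus --location eastus --resource-type virtualMachines \
    --query "[?capabilities[?name=='HyperVGenerations' && contains(value, 'V2')]].name" -o table
```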
## Ephemeral OS

By default, Azure automatically replicates the operating system disk for a virtual machine to Azure storage to avoid data loss should the VM need to be relocated to another host. However, since containers aren't designed to have local state persisted, this behavior offers limited value while providing some drawbacks, including slower node provisioning and higher read/write latency.
By contrast, ephemeral OS disks are stored only on the host machine, just like a temporary disk. This provides lower read/write latency, along with faster node scaling and cluster upgrades.
Like the temporary disk, an ephemeral OS disk is included in the price of the virtual machine, so you incur no additional storage costs.
> [!IMPORTANT]
> When a user does not explicitly request managed disks for the OS, AKS will default to ephemeral OS if possible for a given node pool configuration.
When using ephemeral OS, the OS disk must fit in the VM cache. The sizes for VM cache are available in the Azure documentation in parentheses next to IO throughput ("cache size in GiB").
Using the AKS default VM size Standard_DS2_v2 with the default OS disk size of 100GB as an example: this VM size supports ephemeral OS but only has 86GB of cache. This configuration would default to managed disks if the user does not explicitly specify otherwise. If a user explicitly requested ephemeral OS, they would receive a validation error.
If a user requests the same Standard_DS2_v2 with a 60GB OS disk, this configuration would default to ephemeral OS: the requested size of 60GB is smaller than the maximum cache size of 86GB.
Using Standard_D8s_v3 with 100GB OS disk, this VM size supports ephemeral OS and has 200GB of cache space. If a user does not specify the OS disk type, the node pool would receive ephemeral OS by default.
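To look up the cache size for a VM size yourself, you can read the `CachedDiskBytes` capability from `az vm list-skus`; a sketch, assuming the `eastus` region (divide the value by 1024^3 to get GiB):

```azurecli
# Print the cache size in bytes for Standard_DS3_v2.
az vm list-skus --location eastus --size Standard_DS3_v2 \
    --query "[0].capabilities[?name=='CachedDiskBytes']" -o table
```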
Ephemeral OS requires at least version 2.15.0 of the Azure CLI.

### Use Ephemeral OS on new clusters

Configure the cluster to use ephemeral OS disks when the cluster is created. Use the `--node-osdisk-type` flag to set Ephemeral OS as the OS disk type for the new cluster.
```azurecli
az aks create --name myAKSCluster --resource-group myResourceGroup -s Standard_DS3_v2 --node-osdisk-type Ephemeral
```
If you want to create a regular cluster using network-attached OS disks, you can do so by specifying `--node-osdisk-type=Managed`. You can also choose to add more ephemeral OS node pools, as shown below.

### Use Ephemeral OS on existing clusters

Configure a new node pool to use ephemeral OS disks. Use the `--node-osdisk-type` flag to set Ephemeral OS as the OS disk type for that node pool.
```azurecli
az aks nodepool add --name ephemeral --cluster-name myAKSCluster --resource-group myResourceGroup -s Standard_DS3_v2 --node-osdisk-type Ephemeral
```
> [!IMPORTANT]
> With ephemeral OS you can deploy VM and instance images up to the size of the VM cache. In the AKS case, the default node OS disk configuration uses 128GB, which means that you need a VM size that has a cache larger than 128GB. The default Standard_DS2_v2 has a cache size of 86GB, which is not large enough. The Standard_DS3_v2 has a cache size of 172GB, which is large enough. You can also reduce the default size of the OS disk by using `--node-osdisk-size`. The minimum size for AKS images is 30GB.

If you want to create node pools with network-attached OS disks, you can do so by specifying `--node-osdisk-type Managed`.
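For example, following the Standard_DS2_v2 numbers above, a 60GB OS disk fits in that size's 86GB cache, so an explicit ephemeral request with a reduced disk size should pass validation. This is a sketch reusing the placeholder names from earlier; the node pool name `ephemeral60` is just an example:

```azurecli
# Explicitly request ephemeral OS on Standard_DS2_v2 by shrinking the
# OS disk to 60GB so it fits within the 86GB VM cache.
az aks nodepool add --name ephemeral60 --cluster-name myAKSCluster --resource-group myResourceGroup \
    -s Standard_DS2_v2 --node-osdisk-type Ephemeral --node-osdisk-size 60
```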
## Custom resource group name

When you deploy an Azure Kubernetes Service cluster in Azure, a second resource group gets created for the worker nodes. By default, AKS will name the node resource group `MC_resourcegroupname_clustername_location`, but you can also provide your own name.
To specify your own resource group name, install the aks-preview Azure CLI extension version 0.3.2 or later. Using the Azure CLI, use the `--node-resource-group` parameter of the `az aks create` command to specify a custom name for the resource group. If you use an Azure Resource Manager template to deploy an AKS cluster, you can define the resource group name by using the `nodeResourceGroup` property.
```azurecli
az aks create --name myAKSCluster --resource-group myResourceGroup --node-resource-group myNodeResourceGroup
```
The secondary resource group is automatically created by the Azure resource provider in your own subscription. You can only specify the custom resource group name when the cluster is created.
As you work with the node resource group, keep in mind that you can't:
- Specify an existing resource group for the node resource group.
- Specify a different subscription for the node resource group.
- Change the node resource group name after the cluster has been created.
- Specify names for the managed resources within the node resource group.
- Modify or delete Azure-created tags of managed resources within the node resource group.
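To check which node resource group a cluster received, you can read it back from the cluster resource (reusing the placeholder names above):

```azurecli
# Print the name of the cluster's node resource group.
az aks show --name myAKSCluster --resource-group myResourceGroup --query nodeResourceGroup -o tsv
```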
## OIDC Issuer preview

This enables an OIDC Issuer URL of the provider, which allows the API server to discover public signing keys.
[!INCLUDE preview features callout]
You must have the following resources installed:

- The Azure CLI
- The `aks-preview` extension version 0.5.50 or later
- Kubernetes version 1.19.x or above
To use the OIDC Issuer feature, you must enable the `EnableOIDCIssuerPreview` feature flag on your subscription.
```azurecli
az feature register --name EnableOIDCIssuerPreview --namespace Microsoft.ContainerService
```
You can check on the registration status by using the `az feature list` command:
```azurecli
az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/EnableOIDCIssuerPreview')].{Name:name,State:properties.state}"
```
When ready, refresh the registration of the Microsoft.ContainerService resource provider by using the `az provider register` command:
```azurecli
az provider register --namespace Microsoft.ContainerService
```
```azurecli
# Install the aks-preview extension
az extension add --name aks-preview

# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview
```
To create a cluster using the OIDC Issuer, run:

```azurecli
az group create --name myResourceGroup --location eastus
az aks create -n aks -g myResourceGroup --enable-oidc-issuer
```
To update a cluster to use the OIDC Issuer, run:

```azurecli
az aks update -n aks -g myResourceGroup --enable-oidc-issuer
```

To get the OIDC Issuer URL, run:

```azurecli
az aks show -n aks -g myResourceGroup --query "oidcIssuerProfile.issuerUrl" -o tsv
```
## Next steps

- Learn how to upgrade the node images in your cluster.
- See Upgrade an Azure Kubernetes Service (AKS) cluster to learn how to upgrade your cluster to the latest version of Kubernetes.
- Read more about `containerd` and Kubernetes.
- See the list of Frequently asked questions about AKS to find answers to some common AKS questions.
- Read more about Ephemeral OS disks.