
Workshop: Kubernetes AKS, Kube-State-Metrics, and Prometheus
A Hands-On Guide to Monitoring Kubernetes with Azure AKS (with an Example NGINX Workload)
Table of Contents
- Prerequisites
- Setup
- Create an AKS Cluster
- Connect Kubectl to AKS
- Deploy an Example NGINX Workload
- Install Kube-State-Metrics using Helm
- Install Prometheus using Helm
- Access Prometheus Dashboard
- Query Kube-State-Metrics in Prometheus
- Advanced PromQL: Joining Metrics with Labels
- Delete Helm Releases
- Delete the Example NGINX Workload
- Delete the AKS Cluster
- Delete the Resource Group
1. Prerequisites
Before you begin, ensure you have:
- An Azure account with an active subscription.
- Administrator privileges on your local machine to install software.
2. Setup
This section covers installing the necessary command-line tools and logging into your Azure account.
2.1 Install Azure CLI
The Azure CLI is a command-line tool for managing Azure resources.
Windows:
- Recommended: Install Azure CLI using Windows Package Manager (winget):
winget install --exact --id Microsoft.AzureCLI
- Alternative: Download the MSI installer from the official Microsoft documentation: Install Azure CLI on Windows and run the installer, following the prompts.
- After installation, open a new command prompt or PowerShell and verify the installation:
az --version
macOS/Linux:
Follow the instructions on the official Microsoft documentation: Install Azure CLI
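For example, the documentation covers package-manager installs; on macOS, Homebrew is one common route, and for Debian/Ubuntu Microsoft publishes a convenience script:
brew update && brew install azure-cli                    # macOS (Homebrew)
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash   # Debian/Ubuntu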
2.2 Install Kubectl
kubectl is the Kubernetes command-line tool, used to run commands against Kubernetes clusters.
Windows (Recommended):
- Install kubectl using Windows Package Manager (winget):
winget install -e --id Kubernetes.kubectl
- After installation, open a new command prompt or PowerShell and verify the installation:
kubectl version --client
Windows (Alternative - Manual Download):
- Download the latest stable kubectl executable:
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/windows/amd64/kubectl.exe"
- Create a directory for kubectl, for example, C:\kubectl.
- Move the downloaded kubectl.exe to this directory.
- Add C:\kubectl to your system’s PATH environment variable:
  - Search for “Environment Variables” in the Windows search bar and open “Edit the system environment variables”.
  - Click “Environment Variables…”.
  - Under “User variables for…”, select Path and click “Edit…”.
  - Click “New” and add C:\kubectl. Click “OK” on all open windows.
- Close and reopen your command prompt or PowerShell to apply the changes.
- Verify the installation:
kubectl version --client
macOS/Linux:
Follow the instructions on the official Kubernetes documentation: Install Kubectl
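For example, on macOS with Homebrew (one of the documented options):
brew install kubectl
kubectl version --client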
2.3 Install Helm
Helm is a package manager for Kubernetes, which we’ll use to deploy kube-state-metrics and Prometheus.
Windows (using Windows Package Manager - Recommended):
- Open a command prompt or PowerShell.
- Install Helm using winget:
winget install Helm.Helm
- Verify the installation:
helm version
Windows (using Chocolatey - Alternative):
- If you don’t have Chocolatey, install it by following the instructions on their website: Chocolatey Installation
- Open an elevated PowerShell or Command Prompt (Run as Administrator).
- Install Helm:
choco install kubernetes-helm
- Verify the installation:
helm version
Windows (manual download - Alternative):
- Download the desired version of Helm from the official releases page: Helm Releases (look for helm-vX.Y.Z-windows-amd64.zip).
- Unzip the downloaded file. You’ll find a helm.exe executable.
- Move helm.exe to a directory included in your system’s PATH (e.g., C:\Program Files\Helm and add it to PATH, or place it in C:\kubectl if you added that to your PATH).
- Verify the installation:
helm version
macOS/Linux:
Follow the instructions on the official Helm documentation: Installing Helm
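For example, on macOS with Homebrew (one of the documented options):
brew install helm
helm version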
2.4 Log in to Azure
Open your terminal or command prompt and log in to your Azure account:
az login
This command will open a web browser for you to complete the login process.
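If a browser is not available (for example, when working over SSH), you can use a device-code login instead, and then confirm which subscription is active:
az login --use-device-code                            # browser-less login via a device code
az account show --output table                        # confirm the active subscription
az account set --subscription "<SUBSCRIPTION-ID>"     # optional: switch subscriptions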
2.5 Create a Resource Group
A resource group is a logical container for Azure resources.
RESOURCE_GROUP_NAME="myAKSWorkshopRG"
LOCATION="eastus" # You can choose a different region, e.g., westus, centralus
az group create --name $RESOURCE_GROUP_NAME --location $LOCATION
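The snippet above uses Bash variable syntax. If you are following along in PowerShell on Windows, the equivalent looks like this (the same pattern applies to variables used later, such as AKS_CLUSTER_NAME):
$RESOURCE_GROUP_NAME = "myAKSWorkshopRG"
$LOCATION = "eastus"
az group create --name $RESOURCE_GROUP_NAME --location $LOCATION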
3. Workshop Steps
Now let’s get to the core of the workshop: creating an AKS cluster, deploying a sample application, installing the monitoring tools, and querying metrics.
3.1 Create an AKS Cluster
This command creates a basic AKS cluster. For production environments, you’d want to consider more advanced configurations (e.g., multiple node pools, advanced networking).
AKS_CLUSTER_NAME="myAKSCluster"
az aks create \
--resource-group $RESOURCE_GROUP_NAME \
--name $AKS_CLUSTER_NAME \
--node-count 1 \
--generate-ssh-keys \
--node-vm-size standard_ds2_v2 # A common general-purpose VM size
This command will take a few minutes to complete.
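You can confirm the cluster finished provisioning before moving on:
az aks show --resource-group $RESOURCE_GROUP_NAME --name $AKS_CLUSTER_NAME --query provisioningState --output tsv   # should print Succeeded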
3.2 Connect Kubectl to AKS
Once the cluster is created, configure kubectl to connect to your new AKS cluster.
az aks get-credentials --resource-group $RESOURCE_GROUP_NAME --name $AKS_CLUSTER_NAME
Verify kubectl can communicate with your cluster:
kubectl get nodes
You should see your AKS node(s) listed with a Ready status.
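If you work with multiple clusters, it also helps to confirm kubectl is pointing at the new AKS context:
kubectl config current-context   # should print the cluster name, e.g., myAKSCluster
kubectl cluster-info             # shows the API server endpoint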
3.3 Deploy an Example NGINX Workload
We’ll deploy a simple NGINX web server as our example workload. This will create a Deployment and a Service.
Create a file named nginx-example.yaml with the following content:
# nginx-example.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3 # We'll deploy 3 replicas to generate more metrics
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: LoadBalancer # Expose NGINX via a LoadBalancer for external access (optional for this workshop, but good practice)
Apply this manifest to your cluster:
kubectl apply -f nginx-example.yaml
Verify the NGINX deployment and pods are running:
kubectl get deployment nginx-deployment
kubectl get pods -l app=nginx
kubectl get service nginx-service
You should see 3 NGINX pods in Running status and a service with an EXTERNAL-IP (it might show <pending> for a moment).
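Once the EXTERNAL-IP is populated, you can smoke-test NGINX from your machine. A small sketch (Bash syntax; wait for the LoadBalancer IP first):
kubectl get service nginx-service --watch   # wait until EXTERNAL-IP appears, then press Ctrl+C
curl http://$(kubectl get service nginx-service -o jsonpath='{.status.loadBalancer.ingress[0].ip}')   # should return the NGINX welcome page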
3.4 Install Kube-State-Metrics using Helm
kube-state-metrics listens to the Kubernetes API server and generates metrics about the state of Kubernetes objects (e.g., deployments, pods, nodes).
First, add the Prometheus community Helm repository:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
Now, install kube-state-metrics:
helm install kube-state-metrics prometheus-community/kube-state-metrics --namespace kube-system --create-namespace
Verify that kube-state-metrics is running:
kubectl get pods -n kube-system -l app.kubernetes.io/name=kube-state-metrics
You should see a pod named kube-state-metrics-... with a Running status.
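Optionally, you can inspect the raw metrics endpoint before Prometheus scrapes it. A quick sketch, assuming the chart’s default service name (kube-state-metrics) and HTTP port (8080):
kubectl port-forward -n kube-system svc/kube-state-metrics 8080:8080
# In a second terminal:
curl -s http://localhost:8080/metrics | grep kube_deployment_spec_replicas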
3.5 Install Prometheus using Helm
Next, we’ll install Prometheus, which will scrape metrics from kube-state-metrics.
Create a file named prometheus-values.yaml with the following content:
# prometheus-values.yaml
server:
  service:
    type: LoadBalancer # Expose Prometheus via a LoadBalancer for easy access
alertmanager:
  enabled: false
kube-state-metrics:
  enabled: false # We installed it separately, so disable the one bundled with Prometheus
Now, install Prometheus using the Helm chart with your custom values:
helm install prometheus prometheus-community/prometheus -f prometheus-values.yaml --namespace monitoring --create-namespace
This will deploy the Prometheus server along with its associated service accounts and roles. The LoadBalancer service type will provision an external IP address for Prometheus; it may take a few minutes for the external IP to be assigned.
Verify Prometheus pods are running:
kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus
You should see pods like prometheus-server-... with Running status.
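You can also list everything the chart installed in the monitoring namespace:
helm list -n monitoring
kubectl get all -n monitoring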
3.6 Access Prometheus Dashboard
Get the external IP address of the Prometheus server:
kubectl get svc -n monitoring prometheus-server
Look for the EXTERNAL-IP of the prometheus-server service. It might show <pending> for a few minutes; keep running the command until an IP address appears.
Once you have the external IP, open your web browser and navigate to http://<EXTERNAL-IP>:80.
You should see the Prometheus UI.
Alternative: Access Prometheus Locally via Port Forwarding
If you don’t have an external IP or prefer not to expose Prometheus externally, you can use kubectl port-forward to access the dashboard locally:
kubectl port-forward -n monitoring svc/prometheus-server 9090:80
Then, open your browser and go to http://localhost:9090.
This approach keeps Prometheus off the public internet and works even if your cluster does not provision a public IP.
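Either way, you can confirm the server is up from the command line; Prometheus exposes simple health and readiness endpoints:
curl http://localhost:9090/-/healthy   # via the port-forward; use http://<EXTERNAL-IP>/-/healthy for the LoadBalancer
curl http://localhost:9090/-/ready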
3.7 Query Kube-State-Metrics in Prometheus
In the Prometheus UI, go to the “Graph” tab.
In the expression input box, you can enter various PromQL queries to explore metrics from kube-state-metrics.
Here are some example queries related to your NGINX workload:
- Number of running NGINX pods:
kube_pod_status_phase{pod=~"nginx-deployment.*", phase="Running"}
- Number of desired NGINX replicas for the deployment:
kube_deployment_spec_replicas{deployment="nginx-deployment"}
- Current number of available replicas for the NGINX deployment:
kube_deployment_status_replicas_available{deployment="nginx-deployment"}
- Pod restart count (if any NGINX pods have restarted):
kube_pod_container_status_restarts_total{pod=~"nginx-deployment.*"}
- Creation timestamp of Kubernetes deployments (including NGINX):
kube_deployment_created
Type kube_ in the expression box and explore the auto-completion suggestions. You’ll find many metrics providing insights into the state of your NGINX deployment and other Kubernetes objects.
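The same queries can also be run outside the UI through Prometheus’s HTTP API, which is handy for scripting. A minimal sketch using the port-forward address from the previous section (swap in the LoadBalancer IP if you exposed Prometheus externally):
curl -sG 'http://localhost:9090/api/v1/query' --data-urlencode 'query=kube_deployment_status_replicas_available{deployment="nginx-deployment"}'
The response is JSON containing the matching series and their current values.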
3.8 Advanced PromQL: Joining Metrics with Labels
Prometheus allows you to join data from different metrics using label matching. This is useful for correlating information across resources, such as deployments and pods, or pods and nodes. Here are some advanced examples you can try in the Prometheus UI:
- Join NGINX pod restarts with pod phase (show restart counts only for running pods):
kube_pod_container_status_restarts_total{pod=~"nginx-deployment.*"} * on(pod) group_left(phase) kube_pod_status_phase{phase="Running"}
- Show desired vs. available replicas for all deployments (side by side):
kube_deployment_spec_replicas - kube_deployment_status_replicas_available
This shows the difference between desired and available replicas for each deployment.
- List NGINX pods with their node assignment:
kube_pod_info{pod=~"nginx-deployment.*"}
This metric includes labels for pod, namespace, and node, allowing you to see which node each NGINX pod is running on.
Explore more by using the on() and group_left()/group_right() operators in PromQL to join metrics on shared labels. This enables powerful cross-resource queries and deeper insights into your Kubernetes workloads.
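As a further example of the info-metric join pattern, you can attach the node label from kube_pod_info to the restart counter, so each restart count is annotated with the node its pod runs on. A minimal sketch, assuming the default kube-state-metrics label names (namespace, pod, node):
kube_pod_container_status_restarts_total{pod=~"nginx-deployment.*"} * on(namespace, pod) group_left(node) kube_pod_info
Because kube_pod_info always has the value 1, the multiplication leaves the restart count unchanged and simply copies the node label onto the result.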
4. Teardown
It’s crucial to clean up your Azure resources to avoid incurring unnecessary costs.
4.1 Delete Helm Releases
Delete the Helm releases for Prometheus and Kube-State-Metrics:
helm uninstall prometheus -n monitoring
helm uninstall kube-state-metrics -n kube-system
4.2 Delete the Example NGINX Workload
Delete the NGINX deployment and service:
kubectl delete -f nginx-example.yaml
4.3 Delete the AKS Cluster
az aks delete --resource-group $RESOURCE_GROUP_NAME --name $AKS_CLUSTER_NAME --yes
Without the --yes flag, this command would prompt for confirmation; the flag bypasses the prompt.
4.4 Delete the Resource Group
Deleting the resource group will remove all resources contained within it, including the AKS cluster, virtual networks, public IPs, and storage accounts created by AKS.
az group delete --name $RESOURCE_GROUP_NAME --yes --no-wait
The --no-wait flag tells the command to return immediately without waiting for the deletion to complete, which can take several minutes.
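Because of --no-wait, deletion continues in the background. You can poll until the resource group is gone:
az group exists --name $RESOURCE_GROUP_NAME   # prints false once deletion has completed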