Deploying WSO2 API Manager in Production-Grade Kubernetes
- Andrea Perera
- Software Engineer Trainee - WSO2
Kubernetes is a leading open source container orchestration solution for managing containerized applications across multiple hosts. It allows users to easily deploy, maintain, and scale applications in containers. WSO2 API Manager is a fully open source solution for managing all aspects of the API lifecycle and is designed for massively scalable deployments.
In this article, we will explore:
- Why we would need to deploy WSO2 API Manager in Kubernetes
- Autoscaling WSO2 API Manager based on the production load
- Applying rolling updates on WSO2 API Manager with zero downtime
Why Do We Need to Deploy WSO2 API Manager in Kubernetes?
The technology industry is changing rapidly, especially in DevOps, which has shifted from a virtual machine (VM) based approach to a container-based approach. The VM-based approach has several disadvantages, including:
- The need to set up and make sure everything works well
- Inefficient management of resources such as RAM, CPU, and storage
Unlike VMs, containers are lightweight and self-contained. Only the app code, runtime, system tools, and libraries are packaged inside a container; a full guest operating system is not needed. Multiple containers can run on one machine and share its OS kernel.
Docker is an open source containerization platform that can be used to create, deploy, share, and run any application anywhere. Docker containers are isolated from the host, which makes them fast, efficient, reliable, and scalable. Even though containers are scalable, scaling them takes manual work, and the additional containers then need to be managed. This is where Kubernetes comes into the picture.
Kubernetes is an open source container orchestration system for automating application deployment, scaling, and management.
Some advantages of deploying WSO2 API Manager in Kubernetes include:
- Availability and scalability
If the desired number of pods is not available or a container stops, Kubernetes restarts the container or creates a new pod to keep the service running. Kubernetes can also scale pods and nodes up and down dynamically based on the production load, using the Cluster Autoscaler, Vertical Pod Autoscaler, and Horizontal Pod Autoscaler.
- Networking and port mapping
Kubernetes supports service discovery. A Service provides a stable abstraction over pod IP addresses, so containers are wired up automatically when pods are scaled up or restarted. You can also run workloads across multiple hosts, expose internal services externally via Ingress, and manage traffic.
- Storage
Kubernetes supports many types of volumes, such as ConfigMaps, persistent volume claims, Cinder, and NFS, which you can use to share data across multiple machines/nodes.
- Health checks and monitoring
Monitoring helps reduce response time to incidents and enables detecting, troubleshooting, and debugging systemic problems. Deploying WSO2 API Manager on Kubernetes provides this through probes: if a container is unable to make progress, its liveness probe fails and the container is restarted, while readiness probes indicate when a container is ready to start accepting traffic. Kubernetes can also be integrated with monitoring tools to maintain the health and availability of the system, and its dashboard shows logs, the status of pods, services, and nodes, and CPU usage, among other things.
- Orchestration and other DevOps responsibilities
End users interact with Kubernetes through its REST API. Artifacts are declared in YAML, a human-friendly data serialization standard that works with all programming languages and is easy for a beginner to read. In addition to YAML files, you can use kubectl commands to deploy your artifacts.
The YAML file below can be used to deploy a service called “hello kubernetes”:
    apiVersion: "v1"
    kind: "Service"
    metadata:
      annotations: {}
      finalizers: []
      labels:
        app: "helloKubernetesService"
      name: "kubernetes-hello-v1"
      ownerReferences: []
    spec:
      externalIPs: []
      loadBalancerSourceRanges: []
      ports:
      - port: 9090
        protocol: "TCP"
        targetPort: 9090
      selector:
        app: "helloKubernetesService"
      type: "ClusterIP"
Later in this article, both YAML files and kubectl commands are used to deploy WSO2 API Manager.
- Resource management
Minimum and maximum values for resources such as memory and CPU can be specified for each container in a pod. These values can be managed and scaled independently according to the user's requirements, which allows schedulers and autoscalers to make better decisions.
All these things can be achieved by deploying WSO2 API Manager in Kubernetes.
Deployment Architecture
Figure 1
In order to deploy WSO2 API Manager in Kubernetes, we need to have:
- A Kubernetes cluster
- A single-node file server to share persistent data among all the nodes
- Two persistent volumes for the database and WSO2 API Manager
- Persistent volume claims to define persistent storage
- Ingress to expose services externally
The deployment in Figure 1 consists of one WSO2 API Manager with Analytics, a MySQL database, and a sample backend service. All the Docker images are hosted in the WSO2 Docker Registry.
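The persistent volume claim and Ingress items in the list above could be declared roughly as follows. This is a minimal sketch, not the manifests behind Figure 1: the resource names, the storage size, the host name, the NGINX ingress class and annotation, and the backend Service name are all assumptions made for illustration. Port 9443 and the wso2 namespace are borrowed from the samples later in this article.

```yaml
# Hypothetical claim for the storage shared by WSO2 API Manager pods.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wso2apim-shared-claim          # assumed name
  namespace: wso2
spec:
  accessModes:
  - ReadWriteMany                      # several pods mount it; the volume backend must support this mode
  storageClassName: ""                 # bind to a pre-created persistent volume, no dynamic provisioning
  resources:
    requests:
      storage: 1Gi                     # assumed size
---
# Hypothetical Ingress exposing the API Manager service outside the cluster.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: wso2apim-ingress               # assumed name
  namespace: wso2
  annotations:
    # Assumes the NGINX ingress controller and an HTTPS backend on 9443.
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  ingressClassName: nginx              # assumed ingress class
  rules:
  - host: wso2apim.example.com         # placeholder host name
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: wso2apim-service     # assumed Service name
            port:
              number: 9443
```

A claim with an empty storageClassName only binds if a matching persistent volume already exists, which fits the file-server setup described above. With a dynamic provisioner, a storage class name would be set instead.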
Autoscaling WSO2 API Manager Based on the Production Load
One of the major features of Kubernetes is autoscaling. In a VM-based deployment of WSO2 API Manager, you have to manually scale up the allocated resources when the production load is high. In Kubernetes, autoscalers dynamically adjust the number of nodes in the cluster and the number of pods in a deployment to meet end user demand. This requires coordination between two layers of scalability:
- Cluster-level scalability
  - Cluster Autoscaler (CA)
    Scales the number of nodes in your cluster up or down. It periodically checks for pods in the pending state and grows the cluster by adding nodes if more resources are needed.
- Pod-level scalability
  - Vertical Pod Autoscaler (VPA)
    Scales infrastructure vertically. VPA allocates more CPU or memory to existing pods, increasing the size of each pod.
  - Horizontal Pod Autoscaler (HPA)
    Scales infrastructure horizontally by changing the number of pods. This is explained in detail below.
We can use Kubernetes to autoscale WSO2 API Manager based on CPU usage. This can be accomplished by using HPA.
HPA scales the number of pods in a replication controller, deployment, or replica set based on observed CPU utilization or other provided metrics. For HPA to calculate CPU utilization, the container has to declare its own resource requests, and optionally limits, under resources: requests and resources: limits.
These lines should be added under each container at the spec.template.spec.containers level in the Deployment.yaml file:
    resources:
      requests:
        memory: "2Gi"
        cpu: "2000m"
      limits:
        memory: "3Gi"
        cpu: "3000m"
The above sample requests a minimum of 2000 millicores (2 cores) of CPU and 2Gi of memory for the container; the scheduler only places the pod on a node that can reserve these amounts. The limits cap what the container may consume under load: it can use up to 3000 millicores of CPU, beyond which it is throttled, and up to 3Gi of memory, beyond which it is terminated. HPA measures CPU utilization as a percentage of the requested CPU.
Figure 2 below shows how HPA works. Kubernetes has a built-in HPA controller that continuously checks the predefined metrics against the target CPU utilization. It monitors all HPA objects and, if the target is exceeded, increases the number of replicas to bring average CPU utilization across all pods back toward the target.
Figure 2
We can create an HPA with a kubectl command (for example, kubectl autoscale) or from an HPA YAML file such as the following:
    apiVersion: autoscaling/v1
    kind: HorizontalPodAutoscaler
    metadata:
      creationTimestamp: 2019-04-10T05:59:12Z
      name: wso2apim-with-analytics-apim
      namespace: wso2
      resourceVersion: "11546677"
      selfLink: /apis/autoscaling/v1/namespaces/wso2/horizontalpodautoscalers/wso2apim-with-analytics-apim
      uid: c3714b90-5b55-11e9-a15f-42010aa000b7
    spec:
      maxReplicas: 2
      minReplicas: 1
      scaleTargetRef:
        apiVersion: extensions/v1beta1
        kind: Deployment
        name: wso2apim-with-analytics-apim
      targetCPUUtilizationPercentage: 70
You define what to scale in spec.scaleTargetRef. It can be a Deployment, ReplicaSet, or ReplicationController. The above sample scales the “wso2apim-with-analytics-apim” deployment. HPA adds replicas when average CPU utilization rises above the 70% target, keeping the pod count between 1 and 2, and scales down again when CPU utilization falls below the target.
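The HPA sample above appears to have been exported from a running cluster, since it carries server-populated fields such as creationTimestamp, resourceVersion, selfLink, and uid. A hand-written equivalent might look like the sketch below. It assumes a cluster recent enough to serve the autoscaling/v2 and apps/v1 APIs, which is an assumption beyond the original sample; the names, replica bounds, and 70% target are taken from that sample.

```yaml
# Sketch only: assumes autoscaling/v2 and apps/v1 are available on the cluster.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: wso2apim-with-analytics-apim
  namespace: wso2
spec:
  scaleTargetRef:
    apiVersion: apps/v1                # assumed; the original sample targets extensions/v1beta1
    kind: Deployment
    name: wso2apim-with-analytics-apim
  minReplicas: 1
  maxReplicas: 2
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70         # percentage of the CPU request, averaged over all pods
```

The controller derives the replica count as ceil(currentReplicas × currentUtilization / targetUtilization). For instance, one pod running at 140% of its CPU request against a 70% target gives ceil(1 × 140 / 70) = 2 replicas, which is also the configured maximum here.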
Apply Rolling Updates on WSO2 API Manager with Zero Downtime
In a VM-based deployment, updating WSO2 API Manager without downtime means first deploying the latest version in parallel with the already running version and then switching over one instance at a time. All the configurations also have to be redone manually, which takes time and extra resources.
A rolling update in Kubernetes replaces pods gradually, which lets us catch errors during the process and roll back before all of our end users are affected. To perform one, simply change the image tag to the latest version and set the strategy to RollingUpdate in the deployment YAML file.
You can make these changes with kubectl edit or by editing the YAML file as follows:
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: wso2apim-with-analytics-apim
    spec:
      replicas: 1
      minReadySeconds: 30
      strategy:
        rollingUpdate:
          maxSurge: 1
          maxUnavailable: 0
        type: RollingUpdate
      template:
        metadata:
          labels:
            deployment: wso2apim-with-analytics-apim
        spec:
          containers:
          - name: wso2apim-with-analytics-apim-worker
            image: docker.wso2.com/wso2am:latest
            livenessProbe:
              exec:
                command:
                - /bin/bash
                - -c
                - nc -z localhost 9443
              initialDelaySeconds: 100
              periodSeconds: 10
            readinessProbe:
              exec:
                command:
                - /bin/bash
                - -c
                - nc -z localhost 9443
              initialDelaySeconds: 100
The RollingUpdate strategy itself is configured in this part of the file:
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: wso2apim-with-analytics-apim
    spec:
      replicas: 1
      minReadySeconds: 30
      strategy:
        rollingUpdate:
          maxSurge: 1
          maxUnavailable: 0
        type: RollingUpdate
maxSurge specifies the maximum number of pods that can be created over the desired number of pods. With maxSurge set to 1, the Deployment ensures that only one replica of the new version is created at a time. Because maxUnavailable is 0, at least one pod will be available throughout the update process. The readiness probe ensures that an old pod won’t be terminated until the new pod is ready to receive requests, and the old pods are never terminated all at once. initialDelaySeconds is the number of seconds to wait after the container starts before performing the first probe.
periodSeconds specifies that the kubelet should perform the liveness probe every 10 seconds. If the liveness probe keeps failing, the kubelet restarts the container.
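Since both probes above only check whether port 9443 accepts connections, the same check could probably be expressed with the built-in tcpSocket probe, which does not depend on bash or nc being present in the image. The fragment below is a sketch of that alternative, not the configuration used in this article. It assumes the server listens on the pod's IP address rather than only on localhost, because the kubelet connects from outside the container. The timing values are copied from the sample above, except the readiness periodSeconds, which is an added assumption.

```yaml
# Sketch: goes under the container entry in spec.template.spec.containers.
livenessProbe:
  tcpSocket:
    port: 9443                 # kubelet opens a TCP connection to this port on the pod IP
  initialDelaySeconds: 100     # wait 100 seconds after container start before the first probe
  periodSeconds: 10            # then probe every 10 seconds
readinessProbe:
  tcpSocket:
    port: 9443
  initialDelaySeconds: 100
  periodSeconds: 10            # assumed; the original readiness probe does not set it
```

If the port is bound only to localhost inside the container, the exec-based probes from the sample remain the safer choice.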
Finally, we can redeploy using kubectl apply -f:
kubectl apply -f wso2apim-update-deployment.yaml -n wso2
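The Deployment samples in this article use the extensions/v1beta1 API group, which Kubernetes stopped serving for Deployments in version 1.16. On newer clusters, the same rollout could be sketched with apps/v1, which additionally requires a spec.selector that matches the pod template labels. The manifest below is such a sketch rather than a tested configuration; everything except the apiVersion, the selector, and the namespace line is carried over from the sample above.

```yaml
# Sketch only: apps/v1 equivalent of the rolling update Deployment shown earlier.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wso2apim-with-analytics-apim
  namespace: wso2
spec:
  replicas: 1
  minReadySeconds: 30
  selector:                            # required in apps/v1; must match the template labels
    matchLabels:
      deployment: wso2apim-with-analytics-apim
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                      # at most one extra pod during the update
      maxUnavailable: 0                # never drop below the desired replica count
  template:
    metadata:
      labels:
        deployment: wso2apim-with-analytics-apim
    spec:
      containers:
      - name: wso2apim-with-analytics-apim-worker
        image: docker.wso2.com/wso2am:latest   # as in the original; changing the tag triggers a rollout
        livenessProbe:
          exec:
            command: ["/bin/bash", "-c", "nc -z localhost 9443"]
          initialDelaySeconds: 100
          periodSeconds: 10
        readinessProbe:
          exec:
            command: ["/bin/bash", "-c", "nc -z localhost 9443"]
          initialDelaySeconds: 100
```

After applying it, kubectl rollout status deployment/wso2apim-with-analytics-apim -n wso2 reports progress, and kubectl rollout undo with the same arguments reverts to the previous revision if the new pods misbehave.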
This article has discussed how to deploy WSO2 API Manager in Kubernetes, how to autoscale it, and how to apply rolling updates. You can also check our recent webinar on this topic and find the resources in the samples-apim repository.
Conclusion
DevOps has shifted from a VM-based approach to a container-based approach. Docker is the most widely used containerization platform. Kubernetes helps with container orchestration and supports many complex scenarios. Some key benefits include autoscaling when the production load changes, launching new containers when there’s a failure, high availability, and monitoring capabilities. This article explored how these benefits can be leveraged by deploying WSO2 API Manager on Kubernetes.