Scaling a worker environment with k8s
27 Sep 2023
After setting up the consumer app as a service and deploying it to the kubernetes cluster, I have set a minimum of 2 replicas with horizontal scaling up to 8. The consumer (worker) service opens connections to postgres and rabbitmq on startup and has a set of queue jobs to run. The api, which is a separate service, publishes the jobs to rabbitmq; the consumer service picks them up and, based on the job type and payload, builds the appropriate job and runs it.
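On startup each consumer dials rabbitmq and then routes every delivery by job type. A minimal sketch of that dispatch, assuming the amqp091-go client; the Job envelope, the job type names, and the handler stubs are hypothetical illustrations, not taken from the actual codebase:

package main

import (
	"encoding/json"
	"log"

	amqp "github.com/rabbitmq/amqp091-go"
)

// Job is a hypothetical envelope: Type selects the handler,
// Payload carries the job-specific data.
type Job struct {
	Type    string          `json:"type"`
	Payload json.RawMessage `json:"payload"`
}

func handle(d amqp.Delivery) {
	var job Job
	if err := json.Unmarshal(d.Body, &job); err != nil {
		log.Printf("bad job: %v", err)
		d.Nack(false, false) // drop malformed messages
		return
	}
	switch job.Type {
	case "send_email":
		// build the email job from job.Payload and run it
	case "resize_image":
		// build the image job from job.Payload and run it
	default:
		log.Printf("unknown job type %q", job.Type)
	}
	d.Ack(false)
}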
I run the consumers as a separate service because I want them to scale horizontally under higher load: each pod holds its own connection to rabbitmq, and the queues are consumed over channels by goroutines. Under higher load kubernetes adds one or more pods, and each of them creates a new rabbitmq connection with its own queues/channels. For example, with one connection subscribing 4 queues to an exchange, each consumed by 5 channels/goroutines, there are 20 goroutines running, so effectively I have 20 consumers. Once k8s adds another pod, that number doubles. When the traffic subsides the pods scale back down, and the number of goroutines drops with them.
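The fan-out itself can be a couple of nested loops, reusing the handle function from the sketch above, with one AMQP channel per consumer goroutine. The queue names and counts below are illustrative:

func startConsumers(conn *amqp.Connection, queues []string, perQueue int) error {
	for _, q := range queues {
		for i := 0; i < perQueue; i++ {
			ch, err := conn.Channel() // one AMQP channel per consumer goroutine
			if err != nil {
				return err
			}
			deliveries, err := ch.Consume(q, "", false, false, false, false, nil)
			if err != nil {
				return err
			}
			go func() {
				for d := range deliveries {
					handle(d)
				}
			}()
		}
	}
	return nil
}

// e.g. startConsumers(conn, []string{"emails", "images", "reports", "exports"}, 5)
// gives 4 x 5 = 20 consumer goroutines in this pod.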
The other part is publishing messages. Each api service keeps a persistent rabbitmq connection with 4 channels for publishing. Requests coming in on the api endpoints that publish to rabbitmq reuse that connection. Under high traffic the api is scaled up, new pods are added, and new publishing connections are created.
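A sketch of how that publisher side could look; the pool type and the round-robin choice are my own illustration, not necessarily the exact implementation. Each publish grabs one of the 4 pooled channels on the shared connection, with a mutex per channel so concurrent HTTP handlers never interleave on the same channel:

// Publisher round-robins publishes over a fixed pool of channels
// on one persistent connection; 4 slots mirrors the setup above.
type Publisher struct {
	slots []*slot
	next  atomic.Uint64
}

type slot struct {
	mu sync.Mutex
	ch *amqp.Channel
}

func NewPublisher(conn *amqp.Connection, size int) (*Publisher, error) {
	p := &Publisher{}
	for i := 0; i < size; i++ {
		ch, err := conn.Channel()
		if err != nil {
			return nil, err
		}
		p.slots = append(p.slots, &slot{ch: ch})
	}
	return p, nil
}

func (p *Publisher) Publish(ctx context.Context, exchange, key string, body []byte) error {
	s := p.slots[p.next.Add(1)%uint64(len(p.slots))]
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.ch.PublishWithContext(ctx, exchange, key, false, false,
		amqp.Publishing{ContentType: "application/json", Body: body})
}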
Both services, api and consumers, connect to postgres with max/min open connections configured. This matters on the publishing path, because I first fetch things from the database and then publish by ID to the queue.
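With database/sql the pool limits are one-liners. The numbers below are placeholders; the real budget has to account for postgres max_connections across all api and consumer pods at maximum scale:

package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq" // postgres driver
)

func openDB(dsn string) *sql.DB {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		log.Fatal(err)
	}
	// Illustrative numbers: at most 10 open connections per pod, so
	// 8 api pods + 8 consumer pods stay within a 160-connection budget.
	db.SetMaxOpenConns(10)
	db.SetMaxIdleConns(5)
	db.SetConnMaxLifetime(30 * time.Minute)
	return db
}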
To check the active open connections to postgres:
SELECT count(*) FROM pg_stat_activity;
To watch the k8s cluster adding and removing pods:
minikube addons enable metrics-server
kubectl top pods
With the connection pool configured on postgres, I run the load test:
ab -n 3000 -c 100 http://127.0.0.1:53942/api/test/1
Concurrency Level: 100
Time taken for tests: 6.065 seconds
Complete requests: 3000
Failed requests: 0
Total transferred: 2370000 bytes
HTML transferred: 1998000 bytes
Requests per second: 494.62 [#/sec] (mean)
Time per request: 202.177 [ms] (mean)
Time per request: 2.022 [ms] (mean, across all concurrent requests)
Transfer rate: 381.59 [Kbytes/sec] received
This means it queried the database and published 3K messages in about 6 seconds (494.62 req/s x 6.065 s is roughly 3000 requests).
Before:
kubectl top pods
NAME CPU(cores) MEMORY(bytes)
api-shop-deployment-dcbfc5877-t5kfn 1m 34Mi
api-shop-deployment-dcbfc5877-xpfdx 5m 42Mi
consumers-deployment-757d987d9d-kgdmb 2m 39Mi
consumers-deployment-757d987d9d-pkngn 3m 42Mi
Now I run the ab command to publish the 3K requests and check the scaling of the pods.
During the run, both deployments scaled up:
kubectl top pods
NAME CPU(cores) MEMORY(bytes)
api-shop-deployment-dcbfc5877-pxd7f 56m 65Mi
api-shop-deployment-dcbfc5877-t5kfn 61m 63Mi
api-shop-deployment-dcbfc5877-tcv6x 63m 57Mi
api-shop-deployment-dcbfc5877-xpfdx 59m 64Mi
consumers-deployment-757d987d9d-jqh6j 141m 59Mi
consumers-deployment-757d987d9d-kgdmb 140m 60Mi
consumers-deployment-757d987d9d-pkngn 131m 59Mi
consumers-deployment-757d987d9d-qwlhm 138m 58Mi
After the traffic has been processed by the consumers and the api has nothing more to publish, the pods go back to 2 for the api and 2 for the consumers. In the rabbitmq admin panel I can monitor the number of queues/channels (one per consumer goroutine) going up and down with the traffic.
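One caveat on the scale-down path: kubernetes stops a pod by sending SIGTERM, so the consumer should close its rabbitmq connection cleanly; unacked deliveries are then requeued for the remaining pods. A minimal shutdown hook, again assuming the amqp091-go client:

package main

import (
	"log"
	"os"
	"os/signal"
	"syscall"

	amqp "github.com/rabbitmq/amqp091-go"
)

func waitForShutdown(conn *amqp.Connection) {
	sig := make(chan os.Signal, 1)
	signal.Notify(sig, syscall.SIGTERM, syscall.SIGINT)
	<-sig
	// Closing the connection closes all its channels; the Consume
	// delivery channels close too, so the consumer goroutines drain
	// their range loops and exit.
	if err := conn.Close(); err != nil {
		log.Printf("closing rabbitmq connection: %v", err)
	}
}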