This is good stuff. The author's cluster was having responsiveness issues that went away whenever he restarted the pods or services. He asked some k8s developers on Twitter for help, and one of them suggested it might be conntrack exhaustion and told him to check dmesg on the nodes. dmesg showed “TCP: out of memory -- consider tuning tcp_mem.” Read the rest to find out how it was resolved and what was proposed to the Kubernetes project so that kubelet can hopefully mark such a node as unhealthy.
The article touches on many subjects and is aimed at intermediate users.
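If you want to see how close a node is to the tcp_mem limit before that dmesg message shows up, a rough check is to compare the TCP "mem" counter in /proc/net/sockstat with the three thresholds in /proc/sys/net/ipv4/tcp_mem (both are counted in pages). The sketch below is my own illustration, not something taken from the article; the function names are made up, and the "ok / pressure / high" classification is a simplification of what the kernel actually does.

```python
from pathlib import Path

SOCKSTAT = Path("/proc/net/sockstat")
TCP_MEM = Path("/proc/sys/net/ipv4/tcp_mem")


def parse_tcp_mem_pages(sockstat_text):
    """Return the 'mem' value (in pages) from the TCP line of sockstat."""
    for line in sockstat_text.splitlines():
        if line.startswith("TCP:"):
            fields = line.split()
            # After "TCP:" the fields are name/value pairs, e.g. "inuse 5 mem 3".
            pairs = dict(zip(fields[1::2], fields[2::2]))
            return int(pairs["mem"])
    raise ValueError("no TCP line found in sockstat text")


def parse_tcp_mem_limits(tcp_mem_text):
    """Return the (low, pressure, high) thresholds, in pages."""
    low, pressure, high = (int(value) for value in tcp_mem_text.split())
    return low, pressure, high


def tcp_mem_status(used_pages, limits):
    """Classify usage against the thresholds (a simplified heuristic)."""
    _low, pressure, high = limits
    if used_pages >= high:
        return "at or over the high limit"
    if used_pages >= pressure:
        return "under memory pressure"
    return "ok"


if __name__ == "__main__":
    # Only meaningful on a Linux node; skip quietly elsewhere.
    if SOCKSTAT.exists() and TCP_MEM.exists():
        used = parse_tcp_mem_pages(SOCKSTAT.read_text())
        limits = parse_tcp_mem_limits(TCP_MEM.read_text())
        print(f"TCP mem: {used} pages, limits {limits}: {tcp_mem_status(used, limits)}")
```

Run it on the node itself (or in a pod that shares the host network namespace), since those proc files are per network namespace.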