Kafka pods failing
If Kafka pods are failing with
Session 0x0 for sever localhost/0:0:0:0:0:0:0:1:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
java.net.ConnectException: Connection refused
and the ZooKeeper pods are failing silently (the pods still show green) with a PKIX certificate verification error, the steps to fix it are:
Delete Secrets:
gv-kafka-cluster-cluster-operator-certs
gv-kafka-cluster-entity-topic-operator-certs
gv-kafka-cluster-entity-user-operator-certs
gv-kafka-cluster-kafka-brokers
Wait for the Operator to recreate these Secrets
Delete Pods:
gv-kafka-cluster-zookeeper-0
gv-kafka-cluster-kafka-0
Wait for them to be recreated
Redeploy all services that use Kafka, forcing them to recreate any missing topics. (For example, the classification pipeline won't fully start if the regex topics are missing.)
Data stored in Kafka might be lost, but a cluster in this state is dysfunctional anyway.
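The Secret and Pod steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: it shells out to `kubectl`, the `NAMESPACE` value is a placeholder (this page does not name the namespace), and the helper names, timeouts, and polling intervals are illustrative. Redeploying the dependent services is left out because the redeploy mechanism is not specified here.

```python
import subprocess
import time

# Assumption: replace with the namespace the Kafka cluster actually runs in.
NAMESPACE = "default"

# Resource names taken from the steps above.
SECRETS = [
    "gv-kafka-cluster-cluster-operator-certs",
    "gv-kafka-cluster-entity-topic-operator-certs",
    "gv-kafka-cluster-entity-user-operator-certs",
    "gv-kafka-cluster-kafka-brokers",
]
PODS = ["gv-kafka-cluster-zookeeper-0", "gv-kafka-cluster-kafka-0"]


def delete_cmd(kind, names, namespace=NAMESPACE):
    """Build a `kubectl delete` command for one resource kind and several names."""
    return ["kubectl", "delete", kind, *names, "-n", namespace]


def get_cmd(kind, name, namespace=NAMESPACE):
    """Build a `kubectl get` command, used here only to check that a resource exists."""
    return ["kubectl", "get", kind, name, "-n", namespace]


def wait_ready_cmd(pod, namespace=NAMESPACE, timeout="300s"):
    """Build a `kubectl wait` command that blocks until the pod reports Ready."""
    return ["kubectl", "wait", "--for=condition=Ready", f"pod/{pod}",
            "-n", namespace, f"--timeout={timeout}"]


def wait_until_exists(kind, names, namespace=NAMESPACE, attempts=60, delay=5):
    """Poll until every named resource exists again; return True on success."""
    for _ in range(attempts):
        results = [
            subprocess.run(get_cmd(kind, n, namespace), capture_output=True)
            for n in names
        ]
        if all(r.returncode == 0 for r in results):
            return True
        time.sleep(delay)
    return False


def recover(namespace=NAMESPACE):
    """Delete the Secrets, wait for them to reappear, then recycle the pods."""
    subprocess.run(delete_cmd("secret", SECRETS, namespace), check=True)
    if not wait_until_exists("secret", SECRETS, namespace):
        raise RuntimeError("Secrets were not recreated in time")

    subprocess.run(delete_cmd("pod", PODS, namespace), check=True)
    # The pods must exist again before `kubectl wait` can watch them.
    if not wait_until_exists("pod", PODS, namespace):
        raise RuntimeError("Pods were not recreated in time")
    for pod in PODS:
        subprocess.run(wait_ready_cmd(pod, namespace), check=True)
```

Calling `recover("<namespace>")` from a machine with `kubectl` access to the cluster would run the sequence. Nothing is executed on import, so the command builders can be inspected first.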
Classified as Getvisibility - Partner/Customer Confidential