Keycloak High Availability Setup in Kubernetes

06 / Sep / 2021 by Satyam Singh

Introduction

Keycloak is an authentication framework that provides user federation and single sign-on (SSO) capabilities to applications. In this blog we will discuss the key concepts to keep in mind while deploying a Keycloak cluster on top of Kubernetes.

Keycloak Cluster Setup

First of all, every instance in a Keycloak cluster must use the same database. There are three options for clustering, and all of them are based on the discovery protocols of JGroups:
1. PING
2. TCPPING
3. JDBC_PING
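
The shared-database requirement mentioned above can be sketched as container environment variables. This is a minimal sketch, assuming the legacy jboss/keycloak image, which reads its database settings from DB_* variables. The database host, database name, and Secret name are hypothetical placeholders and not values from this setup.

```yaml
# Sketch only: every replica receives the same DB_* values, so all
# Keycloak instances point at one shared database.
env:
  - name: DB_VENDOR
    value: postgres
  - name: DB_ADDR
    value: keycloak-postgres   # hypothetical Service name of the database
  - name: DB_PORT
    value: "5432"
  - name: DB_DATABASE
    value: keycloak            # hypothetical database name
  - name: DB_USER
    valueFrom:
      secretKeyRef:
        name: keycloak-db      # hypothetical Secret holding the credentials
        key: username
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: keycloak-db
        key: password
```

Because these variables live in the pod template, every replica created from it uses the same database without any per-instance configuration.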

Keycloak uses the Infinispan cache, and Infinispan uses JGroups to discover nodes. We will be focusing on JDBC_PING in this blog.

JDBC_PING

JDBC_PING uses TCP on port 7600, just like TCPPING. The difference is that TCPPING requires you to configure the IP and port of every instance, whereas JDBC_PING only needs the IP and port of the current instance.

This is because, with JDBC_PING, each instance inserts its own information into the database, and the instances discover their peers by reading this ping data back from the database.
If you don't set the JGROUPS_DISCOVERY_EXTERNAL_IP environment variable, the pod IP is used. This means that on Kubernetes you can simply set JGROUPS_DISCOVERY_PROTOCOL=JDBC_PING and your Keycloak cluster is good to go.
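
If you would rather set the address explicitly than rely on the default, one possible approach is the Kubernetes Downward API, which can expose the pod's own IP as an environment variable. This is a sketch under that assumption; whether JGROUPS_DISCOVERY_EXTERNAL_IP is honored depends on the JDBC_PING.cli script you use.

```yaml
# Sketch only: pass the pod's own IP to Keycloak explicitly.
env:
  - name: JGROUPS_DISCOVERY_PROTOCOL
    value: JDBC_PING
  - name: JGROUPS_DISCOVERY_EXTERNAL_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP   # resolved by Kubernetes when the pod starts
```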

You also have to add the JDBC_PING.cli file to the /opt/jboss/tools/cli/jgroups/discovery/ directory of the container. You can find it at JDBC_PING.cli

- name: JGROUPS_DISCOVERY_PROTOCOL
  value: JDBC_PING
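
To show where these pieces fit, here is a minimal Deployment sketch. It assumes the jboss/keycloak image and that JDBC_PING.cli is stored in a ConfigMap. The resource names, labels, replica count, image tag, and ConfigMap name are all hypothetical, and the database settings are left out for brevity.

```yaml
# Sketch only: names, labels, replica count and image tag are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keycloak-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: keycloak-app
  template:
    metadata:
      labels:
        app: keycloak-app
    spec:
      containers:
        - name: keycloak
          image: jboss/keycloak:15.0.2   # assumed tag
          ports:
            - name: http
              containerPort: 8080
            - name: jgroups
              containerPort: 7600        # TCP port used by JGroups with JDBC_PING
          env:
            - name: JGROUPS_DISCOVERY_PROTOCOL
              value: JDBC_PING
          volumeMounts:
            - name: jdbc-ping-cli
              # subPath mounts only this file, so other files already in
              # the discovery directory stay visible
              mountPath: /opt/jboss/tools/cli/jgroups/discovery/JDBC_PING.cli
              subPath: JDBC_PING.cli
      volumes:
        - name: jdbc-ping-cli
          configMap:
            name: keycloak-jdbc-ping     # hypothetical ConfigMap holding JDBC_PING.cli
```

Mounting the file with subPath instead of mounting the whole directory is a deliberate choice here: a directory mount would hide whatever the image already ships in that location.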

Possible Issues

A very common issue is that cluster nodes take a long time to start or never become stable. The JGROUPSPING table is often the culprit behind this.

The cluster state is managed through the JGROUPSPING table. Each time a node is created, it must know its own IP address so that it can add it to this table and join the cluster. Nodes that go away without removing their entry, for example pods that are killed or rescheduled, leave stale JGroups node entries behind, and these accumulate in the table over time. When a new node starts, it tries to contact every address in the table and waits for each response one by one. With many stale entries this often ends in timeouts and prevents the node from becoming stable.

Resolution

Add the following properties to JGROUPS_DISCOVERY_PROPERTIES:
remove_old_coords_on_view_change="true",remove_all_data_on_view_change="true"

- name: JGROUPS_DISCOVERY_PROPERTIES
  value: 'dns_query=keycloak-app,remove_old_coords_on_view_change="true",remove_all_data_on_view_change="true"'

This configuration handles the case where all IP addresses in the JGROUPSPING table belong to unhealthy nodes. When a new node starts and cannot contact the other addresses, it cleans the table on its own.
