PostgreSQL is a powerful object-relational database system (ORDBMS). With the ACID-compliant, transactional nature and having earned a reputation for reliability, feature robustness, and performance, many companies use it for managing their master data. For production deployments, it is important for applications to maintain a highly available (HA) environment. PostgreSQL offers HA, but there were challenges to get it set up correctly on Kubernetes environment. Stolon is a cloud-native PostgreSQL manager to maintain HA. It has the ability to run on a Kubernetes environment as well and utilizes the PostgreSQL native cluster mechanism to add more value to the high availability feature.
Why ‘Stolon’ PostgreSQL Cluster??
Implementing a PostgreSQL cluster inside Kubernetes is always a challenge since it cannot be directly integrated with stateful services. The well knows methods for implementing clusters are sorintlab’s Stolon, CrunchyData PostgreSQL cluster, and Zalando‘s Patroni/Spilo PostgreSQL cluster.
As it stands, it is my opinion that Stolon is the best method for implementing a PostgreSQL cluster inside Kubernetes because of:
- The High Availability of PostgreSQL data
- Open source data storage services
- Better customization of PostgreSQL versions based on application requirements
- Its ability to easily modify service names, database names, and user access privileges
- Automated failover switching with very minimal delay
- High resiliency
- Easy cluster scaling
- Easy replication
Some information about Postgres Cluster configuration
A PostgreSQL cluster consists of:
- Stolon Cluster
- Stolon Sentinel(s)
- Stolon keepers
- Stolon Proxies
A highly available PostgreSQL Cluster is implemented with the help of a Stolon cluster in Kubernetes and all the configurations passed through configmap (stolon-cluster-kube-stolon) using stolon cluster. Any update in the Postgres parameter can also be passed as a rolling update through the configmap.
Note: For PostgreSQL cluster setup (including all three of the components mentioned above) wait for this Stolon component to be available.
A PostgreSQL database engine runs as a Stolon keeper service and is implemented as a Statefulset with persistent volume. Each pod in statefulsets is the master and stand-alone of the cluster. Data synchronous between each of the cluster candidates (master & stand-alone) are performed with the help of a separate Postgres user. Every keeper MUST have a different UID which can either be manually provided (–uid option) or one will automatically be generated. Based on this UID, the master election takes place. After the first start, the keeper id (provided or generated) is saved inside the keeper’s data directory.
A Sentinel discovers and monitors Stolon keepers and calculates the optimal cluster view. The Sentinel uses the UID of master & standalone(s) to monitor and keep a track of a Stolon keeper. Sentinel service is set up as a deployment type.
A Stolon Proxy will enable the PostgreSQL service endpoint with a fixed IP and DNS name for accessing the PostgreSQL service. This proxy will help switch the master connection based on master failover change. The stolon-proxy is a sort of fencer since it’ll close connections to old masters and direct new connections to the current master.
Stolon requires two kinds of users:
- manages/queries the keepers’ controlled instances. (AKA Normal connection users)
- executes (if enabled) pg_rewind based resync
The Replication user
- manages/queries the keepers’ controlled instances
- performs replication between postgres instances
Postgres Cluster for HA Production deployments
In order to obtain high availability and resilience, we have customized the default PostgreSQL cluster parameters. What follows is a description of the setup of our environment.
PostgreSQL has “synchronous replication” (SR) as an option for data availability. By default this option is disabled. We have enabled this function so that transactions are committed on one or more replicas before a success message is returned to the database client. This guarantees that if the client saw a commit message that the data is persisted to at least two nodes (master & stand-alone). This option is important for instances when data is so valuable that we’d rather have the database reject writes than risk losing them after the commit on the database cluster master fails.
This is the interval after the first fail to declare that a master (keeper) as not healthy by the stolon sentinel service. The default value is 20 seconds, but we modified it to 5 seconds for faster recovery.
Our current setup eliminates the requirement of shared storage for master and standalone since it uses Postgres streaming replication. Using this streaming replication all the standalones are in sync with the master keeper.
We can alter Postgres parameters utilizing the Stolon cluster features thus eliminating the need for downtime more than our failover switch time. The failover switch mechanism will make sure the change is done by rolling updates.
Max_connections increased from 100 to 10000 connections so that the concurrent process can handle the maximum amount of transactions at a time.
Failover master Switching
Once the existing master is lost, the stand-alone pod will be elected as the new master by Stolon Sentinel and will accordingly maintain the connection with the help of the stolon proxy service. Since the data is synchronized between all pods with the help of streaming synchronization, there will not be any data mismatches. This new master will serve until it experiences an issue. During the master change, the proxy service will also redirect the connection to that new master pod.
The master switch happens within a delay interval of 10 – 12 seconds. Once the master connection is lost, the cluster elects another stand-by as the new master and switches connections to the new one within 12 seconds.
Need More Information?
Thank you for reading! Feel free to drop in your comments, thoughts, or corrections down below! If you liked the article, do not forget to click on the ‘Like’ icon and subscribe for more articles!