Synchronous Multi-Master Clustering with MariaDB and Galera

With the vast adoption of MySQL over the years by many businesses and software vendors, there has always been a strong driver for resilience, scalability and fault tolerance. MySQL has always offered various methods for achieving this, but in my opinion none of them have been clean, simple, “set it up and forget it” type solutions.

For many years I’ve looked for a true multi-master solution for open source databases and I’ve always ended up in a cold dark corner with no simple answer to my requirements.

Since the forking of the MySQL project and the creation of MariaDB, I’ve been meaning to re-investigate this requirement.

Today I have spent a bit of time looking into Galera. Galera is a synchronous multi-master clustering solution developed by Codership and shipped with MariaDB, and I have to say, I am definitely impressed.

All in all, Galera has addressed every requirement I have had for avoiding a single point of failure in an open source database solution. Not only does each cluster member hold a completely consistent copy of your databases, but placing the cluster behind a load balancing solution gives you fault tolerance and high availability as well.

Here we have zero single points of failure, which is exactly what we are all searching for in our infrastructure.

In this article I’ll walk through the setup of a 3-node MariaDB Galera cluster running on Red Hat Enterprise Linux 6.

Prerequisites

All systems must be running Red Hat Enterprise Linux 6 x86_64.
All systems must have the MariaDB Galera 5.5.29 yum repository.


Details

I will be using the below host details in this environment.

db01.example.com 10.0.1.1
db02.example.com 10.0.1.2
db03.example.com 10.0.1.3


Software

First things first, make sure all your systems are up to date.
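
On Red Hat Enterprise Linux 6 this is typically a single yum command, for example:

yum -y update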

Next, on all 3 nodes, create the file /etc/yum.repos.d/mariadb.repo with the below details. This will enable the Galera yum repository mentioned in the prerequisites.

[mariadb]
name=MariaDB
baseurl=http://yum.mariadb.org/5.5.29/rhel6-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
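
Assuming the file was saved correctly, a quick check should now show the new repository to yum:

yum repolist enabled | grep -i mariadb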

Once you have set up the yum repository, install the MariaDB and Galera packages.

yum install -y MariaDB-Galera-server MariaDB-client
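
As a quick sanity check, you can confirm both packages are installed by querying the RPM database:

rpm -qa | grep -i mariadb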

Configuration

First up, let’s make sure the mysql service will start when the systems are rebooted. To do this, run the following on all servers.

chkconfig mysql on

Now, so we can continue with the config, start mysql on all servers.

service mysql start

On each of the 3 nodes, we need to set up a default password for root, both for access from localhost and for access from the other systems in the cluster.

To do this, run the following.
Note: Don’t forget to change the word ‘password’ to something more secure.

mysql -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'password' WITH GRANT OPTION;"
mysql -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY 'password' WITH GRANT OPTION;"

Next, we need to edit /etc/my.cnf with the details of the new cluster.

On all nodes, with the exception of db01.example.com, add the below to the end of the /etc/my.cnf file.
Note: If you changed your root mysql password in the above steps, you will also need to update that password on the line starting with “wsrep_sst_auth”.

[mariadb]
wsrep_cluster_address=gcomm://db01.example.com
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_sst_auth=root:password
log-error=/var/log/mysql.log


Although Galera can initialise a new cluster on any node, I will be using db01.example.com for the purpose of this article.

On db01.example.com, my /etc/my.cnf file looks as follows. You’ll notice that it is configured to contact db02.example.com when joining a cluster. This is for the event that db01 becomes unavailable and later needs to rejoin the existing cluster.

[mariadb]
wsrep_cluster_address=gcomm://db02.example.com
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_sst_auth=root:password
log-error=/var/log/mysql.log

In the above configuration, every node uses db01.example.com as its point of contact when joining the cluster, except db01 itself, which uses db02.example.com. It is entirely up to you which system you elect as the join point for an already active cluster; db01.example.com points to db02.example.com purely so that it can rejoin the cluster in the event that it goes offline for any reason.
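
As an aside, recent Galera provider versions also accept a comma-separated list of peers in the cluster address, which lets a restarting node try several members in turn; check the documentation for your version before relying on this. A sketch:

[mariadb]
wsrep_cluster_address=gcomm://db01.example.com,db02.example.com,db03.example.com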

You may be thinking: how does db01.example.com join the cluster via db02.example.com when db02 hasn’t actually joined anything yet? This sounds exactly like a chicken-and-egg situation, which brings us to initializing the cluster.

Initializing the cluster

Above, we set everything up to survive a reboot; now it’s time to bring the cluster online. First things first, make sure mysql is stopped on all servers.

service mysql stop

Now, with everything offline, we need to bring db01.example.com online first, so that all other nodes can connect to it.

You’ll remember that the config file on db01.example.com actually points to db02.example.com. When first bringing the cluster online, however, we override that address on the command line.

To start db01.example.com, when no other servers are online in the cluster, run the following.

Note: Specifying the gcomm address on the command line overrides the address in the config file.

service mysql start --wsrep-cluster-address="gcomm://"

Once you have your console back, verify that the mysql service is still running. If everything went smoothly, it will still show as online. If something went wrong, the service will be stopped and you can look through /var/log/mysql.log for clues as to what caused it to fail.

To check the status, run the following.

service mysql status

A successful status will look as follows.

[root@db01 ~]# service mysql status
SUCCESS! MySQL running (2713)
[root@db01 ~]#
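
You can also ask the server itself whether Galera is up by querying the wsrep status variables. At this stage, with only one node online, the cluster size should report 1.

mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size';"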

Now that db01.example.com is online, you can start the remaining servers in the cluster.

To do this, simply run the following on db02 and db03.
Notice that we are not specifying the gcomm address this time; it now comes from the config file.

service mysql start
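
Once all three nodes are up, the same wsrep_cluster_size check from earlier should report 3 when run on any node.

mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size';"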


Verify the setup

Verifying your cluster setup is actually really easy. As this is a multi-master replicated cluster, simply creating a database on one of the servers and checking that it appears on the others will prove that replication is working. Creating a new database on each of the servers verifies the multi-master element as well.


For the purpose of this test, I will create a database called db01 on the server db01.example.com, a database called db02 on the server db02.example.com and a database called db03 on the server db03.example.com.

To do this, I have run the following commands, each on its own server.

For db01:

[root@db01 ~]# mysql -u root -p -e "create database db01;"
Enter password:
[root@db01 ~]#

For db02:

[root@db02 ~]# mysql -u root -p -e "create database db02;"
Enter password:
[root@db02 ~]#

For db03:

[root@db03 ~]# mysql -u root -p -e "create database db03;"
Enter password:
[root@db03 ~]#


You can use the mysql command “show databases;” to print the list of databases on each server. If everything is working without error, you will see the following on all 3 servers.

Use the command as follows.

[root@db01 ~]# mysql -u root -p -e "show databases;"
Enter password:
+--------------------+
| Database           |
+--------------------+
| information_schema |
| db01               |
| db02               |
| db03               |
| mysql              |
| performance_schema |
| test               |
+--------------------+
[root@db01 ~]#

If you see the above output in your own setup, then you have successfully set up a MariaDB Galera cluster.

7 comments on “Synchronous Multi-Master Clustering with MariaDB and Galera”

  1. Visi November 11, 2013 11:13

    It is important to have rsync installed, or you can’t join servers to the cluster, as they will fail to sync.

  2. John December 12, 2013 13:30

    What are you using for load balancing?

  3. tquang April 24, 2014 14:14

    Dear Dale Macartney,

    Please explain this part of the configuration for me:
    On DB1: gcomm://db02.example.com
    On DB2 and DB3: gcomm://db01.example.com

    In this case, if DB1 dies, will DB2 and DB3 keep working, or will they also die? And if DB1 dies, is there any way to recover it and bring it back online?

    Thanks, and apologies for my bad questions!

  4. Dinesh April 30, 2014 14:17

    Thanks, nice explanation. It would be good if you added details about which ports we need to manually open in the firewall.

    Anyhow, thanks for your detailed explanation.
