How to Replicate Storage Across Servers Using GlusterFS on Ubuntu 14.04
GlusterFS is a highly-available, replicated storage layer that transparently expands storage across servers. If you’re looking for a way to access the same files on multiple web servers, or to share storage across a containerized cluster, GlusterFS is a great way of ensuring that your data is located with the apps and services that require it. In this guide we’ll set up an Ubuntu 14.04 LTS GlusterFS cluster on which you can run other services that need a clustered network filesystem.
Getting Started
This guide expects that you have the following hardware, real or virtual, in place before beginning:
• Web server (Cloud Server or Dedicated Server) with configured LAN and WAN interfaces
• Two storage servers (gluster1 and gluster2 in this guide), each with only a LAN interface configured
• Root access on all servers
Tutorial
This guide sets up a replicated document root for a web server. The principles can be used for any other shared storage scenario by identifying the parts of the filesystem you’d like to cluster, and by changing the GlusterFS setup accordingly.
Begin by setting up the LAN network interface on each server. This guide uses the following addresses:
• web1: 10.0.0.47
• gluster1: 10.0.0.48
• gluster2: 10.0.0.49
So that the machines can reach each other by name, edit /etc/hosts on both GlusterFS nodes and on web1 (which later mounts the volume as gluster1:/glustervol1), and add an entry for each storage node.
nano /etc/hosts
10.0.0.48 gluster1
10.0.0.49 gluster2
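If you would rather script the hosts entries than edit the file by hand, the following is a minimal sketch. The helper name add_host_entry and the HOSTS_FILE variable are made up for this illustration. By default it writes to a scratch file, so it is safe to try before pointing it at /etc/hosts as root.

```shell
#!/bin/sh
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME
# already appears as a whole word on a non-comment line.
add_host_entry() {
    hosts_file=$1; ip=$2; name=$3
    if grep -v '^[[:space:]]*#' "$hosts_file" | grep -qw -- "$name"; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Defaults to a scratch file; run with HOSTS_FILE=/etc/hosts (as root) for real use.
HOSTS_FILE=${HOSTS_FILE:-$(mktemp)}
add_host_entry "$HOSTS_FILE" 10.0.0.48 gluster1
add_host_entry "$HOSTS_FILE" 10.0.0.49 gluster2
add_host_entry "$HOSTS_FILE" 10.0.0.48 gluster1   # repeated call is a no-op
cat "$HOSTS_FILE"
```

Because the helper checks before appending, it can be re-run on every server without creating duplicate lines.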
GlusterFS needs a directory on each storage node where files are to be stored. Files written to the volume will be transparently replicated between nodes, and the volume will later be mounted as the web server’s document root, serving up files via HTTP. Create the directory on both gluster1 and gluster2.
mkdir /data
With the nodes configured and the directory made, let’s install GlusterFS itself onto each node in the cluster.
apt-get install glusterfs-server
From gluster1, you’ll now need to peer with the second node.
gluster peer probe gluster2
peer probe: success.
With that accomplished, let’s display the status of the configured trusted storage pool.
gluster peer status
Number of Peers: 1
Hostname: gluster2
Port: 24007
Uuid: c9648c88-2502-46c6-8f70-1d70d83126b5
State: Peer in Cluster (Connected)
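For scripted health checks, the peer status output can be parsed rather than read by eye. The sketch below assumes the output format shown above. The function name all_peers_connected is invented for this example, and the demo feeds it a saved sample instead of calling gluster.

```shell
#!/bin/sh
# Hypothetical helper: reads `gluster peer status` output on stdin and
# succeeds only if every reported peer is "Peer in Cluster (Connected)".
all_peers_connected() {
    awk '
        /^Number of Peers:/                     { expected = $4 }
        /^State: Peer in Cluster \(Connected\)/ { connected++ }
        END { exit !(expected > 0 && connected == expected) }
    '
}

# Sample taken from the output above; on a node, use instead:
#   gluster peer status | all_peers_connected
sample='Number of Peers: 1

Hostname: gluster2
Port: 24007
Uuid: c9648c88-2502-46c6-8f70-1d70d83126b5
State: Peer in Cluster (Connected)'

if printf '%s\n' "$sample" | all_peers_connected; then
    echo "pool healthy"
else
    echo "pool degraded"
fi
```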
Each Gluster node needs a brick directory, which holds that node’s copy of the volume’s files along with GlusterFS’ internal metadata. Here we set up these directories on each node in the cluster.
Gluster1:
mkdir /data/brick1
Gluster2:
mkdir /data/brick2
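On gluster1, create a storage volume that replicates across the two bricks. If the brick directories sit on the root partition (the same disk as the operating system, /dev/sda for example), GlusterFS refuses to create the volume unless you append the “force” keyword to the end of the command, as shown here.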
gluster volume create glustervol1 replica 2 transport tcp gluster1:/data/brick1 gluster2:/data/brick2 force
volume create: glustervol1: success: please start the volume to access data
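A replicated volume needs a brick count that is a multiple of the replica count. When growing this setup, a small wrapper can catch a mismatched brick list before anything reaches GlusterFS. This is only a sketch: build_create_cmd is a hypothetical name, it prints the command rather than running it, and it leaves out the trailing force keyword.

```shell
#!/bin/sh
# Hypothetical helper: print a `gluster volume create` command after
# checking that the brick count is a non-zero multiple of the replica count.
# Usage: build_create_cmd VOLUME REPLICA BRICK...
build_create_cmd() {
    volume=$1; replica=$2; shift 2
    if [ "$#" -eq 0 ] || [ "$replica" -lt 2 ] || [ $(( $# % replica )) -ne 0 ]; then
        echo "brick count ($#) must be a non-zero multiple of replica count ($replica)" >&2
        return 1
    fi
    echo "gluster volume create $volume replica $replica transport tcp $*"
}

# Two bricks with replica 2: accepted, the command line is printed.
build_create_cmd glustervol1 2 gluster1:/data/brick1 gluster2:/data/brick2

# One brick with replica 2: rejected with a message on stderr.
build_create_cmd glustervol1 2 gluster1:/data/brick1 || echo "rejected"
```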
Now launch the volume you’ve just created on gluster1.
gluster volume start glustervol1
volume start: glustervol1: success
The local IP range must now be configured to have access to the newly-created volume. We’ll take care of that here.
gluster volume set glustervol1 auth.allow 10.0.0.*
volume set: success
Let’s view the status of the volume after the changes we’ve just made.
gluster volume info
Volume Name: glustervol1
Type: Replicate
Volume ID: 8044a7e7-8812-47b9-8f96-9d09fae1298a
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gluster1:/data/brick1
Brick2: gluster2:/data/brick2
Options Reconfigured:
auth.allow: 10.0.0.*
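Now we’ll add some fine-tuning for the new volume. Treat these values as starting points rather than universal recommendations. Note that performance.write-behind-window-size only matters while performance.write-behind is on, so with write-behind disabled here it has no effect.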
gluster volume set glustervol1 performance.write-behind off
gluster volume set glustervol1 performance.io-thread-count 64
gluster volume set glustervol1 network.ping-timeout "5"
gluster volume set glustervol1 performance.write-behind-window-size 524288
gluster volume set glustervol1 performance.cache-refresh-timeout 1
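If you expect to adjust these options over time, keeping them in a small list and applying them in a loop is easier to maintain than repeated commands. The sketch below uses the same options as above. The RUN and VOLUME variables and the apply_options function are assumptions of this example, and by default it only prints the commands.

```shell
#!/bin/sh
VOLUME=${VOLUME:-glustervol1}
# Command prefix: "echo" prints each command (dry run).
# Run with RUN= (empty) on a Gluster node to execute them for real.
RUN=${RUN-echo}

# Hypothetical helper: reads "option value" pairs from stdin, skipping
# blank lines and lines starting with '#'.
apply_options() {
    while read -r key value; do
        case $key in ''|'#'*) continue ;; esac
        # stdin is redirected so the command cannot swallow the option list
        $RUN gluster volume set "$VOLUME" "$key" "$value" < /dev/null
    done
}

apply_options <<'EOF'
performance.write-behind off
performance.io-thread-count 64
network.ping-timeout 5
performance.write-behind-window-size 524288
performance.cache-refresh-timeout 1
EOF
```

In dry-run mode the script prints one gluster volume set line per option, which makes it easy to review the changes before applying them.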
Great, our volume is operating. Next we’ll set up the web server to retrieve files from the clustered filesystem. On web1, install a basic web server (Apache) and the GlusterFS client.
apt-get install apache2 -y
apt-get install glusterfs-client -y
Next we’ll mount the GlusterFS volume at /var/www/html.
mount.glusterfs gluster1:/glustervol1 /var/www/html/
Having done this, we now confirm that the mount succeeded.
mount | grep glusterfs
gluster1:/glustervol1 on /var/www/html type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
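The same check can be turned into a pass/fail test for use in provisioning scripts. This is a minimal sketch that assumes the mount output format shown above. The function name is_gluster_mount is invented here, and the demo runs against a saved sample line rather than the live mount table.

```shell
#!/bin/sh
# Hypothetical helper: reads `mount` output on stdin and succeeds if the
# given mount point is listed with filesystem type fuse.glusterfs.
is_gluster_mount() {
    awk -v mp="$1" '
        $2 == "on" && $3 == mp && $5 == "fuse.glusterfs" { found = 1 }
        END { exit !found }
    '
}

# Sample line from the output above; on web1, use instead:
#   mount | is_gluster_mount /var/www/html
sample='gluster1:/glustervol1 on /var/www/html type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)'

if printf '%s\n' "$sample" | is_gluster_mount /var/www/html; then
    echo "document root is served from GlusterFS"
else
    echo "document root is NOT a GlusterFS mount"
fi
```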
To ensure that the filesystem is mounted on every boot, we’ll add the GlusterFS mount to /etc/fstab on the web server.
nano /etc/fstab
Add this line at the end:
gluster1:/glustervol1 /var/www/html glusterfs defaults,_netdev,direct-io-mode=disable 0 0
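The _netdev option marks the filesystem as one that needs the network, so it is worth confirming that it survived any later edits to the file. The sketch below is one way to check. The function name fstab_gluster_ok is hypothetical, and the demo inspects a scratch file containing the line above rather than the real /etc/fstab.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file has at least one glusterfs
# entry and every glusterfs entry lists _netdev among its mount options.
fstab_gluster_ok() {
    awk '
        $1 !~ /^#/ && $3 == "glusterfs" {
            seen = 1
            if (("," $4 ",") !~ /,_netdev,/) bad = 1
        }
        END { exit !(seen && !bad) }
    ' "$1"
}

# Scratch copy for the demo; on web1, check /etc/fstab instead.
FSTAB=$(mktemp)
printf '%s\n' 'gluster1:/glustervol1 /var/www/html glusterfs defaults,_netdev,direct-io-mode=disable 0 0' > "$FSTAB"

if fstab_gluster_ok "$FSTAB"; then
    echo "glusterfs entries look fine"
else
    echo "glusterfs entry missing or lacks _netdev"
fi
```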
Now it’s time to put GlusterFS’ replication through its paces. Create a file on the web1 web server, in its document root.
cd /var/www/html
touch index.html
Check that the file exists on both nodes of the GlusterFS cluster.
Gluster1:
ls /data/brick1
index.html
Gluster2:
ls /data/brick2
index.html
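An empty file only proves that the name was replicated. To compare contents, write some text into the file on web1, run sha256sum on the brick copy on each node, and compare the digests. The helpers below sketch that comparison. The names digest and same_content are made up for this example, and the demo uses two scratch directories as stand-ins for the bricks, since the real bricks live on different machines.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two copies of a file.
digest() { sha256sum "$1" | awk '{ print $1 }'; }          # print only the SHA-256 hash
same_content() { [ "$(digest "$1")" = "$(digest "$2")" ]; }  # succeed if the hashes match

# Stand-ins for /data/brick1 and /data/brick2 so the sketch runs anywhere.
brick1=$(mktemp -d)
brick2=$(mktemp -d)
printf '<h1>Hello from GlusterFS</h1>\n' > "$brick1/index.html"
cp "$brick1/index.html" "$brick2/index.html"

if same_content "$brick1/index.html" "$brick2/index.html"; then
    echo "replicas match"
else
    echo "replicas differ"
fi
```

On the real cluster, running digest /data/brick1/index.html on gluster1 and digest /data/brick2/index.html on gluster2 should print the same hash.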
Conclusion
You’re now serving up files that are replicated across this simple 2-node cluster. Expanding this setup to more machines is a good way to explore what GlusterFS can do, and the same approach works for anyone who needs a highly-available storage layer.