BTMash

Blob of contradictions

Using lsyncd to continuously mirror directories

Fri, 05/10/2019 - 10:34 -- btmash

At my workplace, we used to use GlusterFS as the network filesystem shared between various servers. It stored uploaded user files, various versions of our assets, etc. GlusterFS has had a love-hate relationship in our company. When it works, it's fine. When it doesn't, it fails horribly (we have lost data and had to figure out ways to recover it). There have also been version incompatibility issues, which meant we avoided upgrading our infrastructure to new versions. That held until March of this year, when Ubuntu 14.04 was about to lose support and we needed to upgrade those servers. However, upgrading to Ubuntu 16.04 or 18.04 would bring its own Gluster version incompatibilities, and that meant we could face another crappy situation. Thankfully, the apps on our servers that use these directories reach them through local symlinks to the real locations, so the switchover itself would be pretty simple (remove the old symlink; replace it with a symlink to the new location), which we'll get into below.

Since we are on AWS, we used this as an opportunity to move away from GlusterFS to EFS (Amazon's Elastic File System, a managed NFS service). But our dataset was over 300 GB, with files constantly being added and deleted. Running rsync was time consuming: a run took well over an hour, we ran into timeout issues, and by the time rsync finished there would be new files to sync (which meant another hour...which meant another sync...you can see where this is going). We needed a way to continuously sync data from one (or multiple) directories to another. Until then I had mostly done one-time syncs of much smaller datasets, so I wasn't sure which tool to use (rsyncd, lsyncd, drbd, syncthing, etc.), but I had a general idea of what such a tool does in the background: it uses inotify to notify a daemon of filesystem changes, and the daemon then acts on them.

Of the set of tools, lsyncd seemed to do what we needed and also seemed easy to set up. With that said, there is a HUGE caveat: the target directory is always kept in sync with what is in the source directory, meaning that any new files added directly to the target directory will get removed. So our strategy for the switchover would be the following:

  1. Configure and enable lsyncd to start copying
  2. Test that everything is working as expected (by adding new files and seeing that they end up in the new directory)
  3. Ensure as many of the files are copied over as possible
  4. Stop lsyncd
  5. Make switchover on servers from old directory to new directory
  6. Run rsync for the few remaining files
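Step 5 is the one that touches live servers, so it helps to script it. The following is a minimal sketch, assuming each app reaches the shared directory through a single local symlink as described earlier. The function name switch_symlink and the paths in the usage note are hypothetical, not taken from our actual setup.

```shell
#!/bin/sh
# Hypothetical helper: repoint an app's local symlink at the new location.
# Both arguments are placeholders for your own paths.
switch_symlink() {
    link=$1
    new_target=$2

    # Refuse to touch anything that is not already a symlink,
    # so a real directory is never removed by mistake.
    if [ ! -L "$link" ]; then
        echo "not a symlink: $link" >&2
        return 1
    fi

    # The new location must exist before we point at it.
    if [ ! -d "$new_target" ]; then
        echo "missing target directory: $new_target" >&2
        return 1
    fi

    # Remove the old symlink, then create the new one.
    rm "$link" && ln -s "$new_target" "$link"
}
```

For example, switch_symlink /var/www/app/uploads /mnt/efs/uploads (both paths made up for illustration). There is a brief moment between the rm and the ln where the link does not exist, which is one reason to do this during a planned switchover window.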

So I decided to give it a try. First came the installation, which was simple since it's part of the Ubuntu repos:

apt install lsyncd

Next came the setup, which defines which directories to sync and where the logs go. This ends up in /etc/lsyncd/lsyncd.conf

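settings {
    logfile = "/var/log/lsyncd.log",            -- where lsyncd writes its log
    statusFile = "/var/log/lsyncd-status.log",  -- periodic status report
    statusInterval = 20                         -- write the status file at most every 20 seconds
}

sync {
    default.rsync,                              -- use rsync to do the copying
    source = "/path/to/original/directory",
    target = "/path/to/new/location",
    rsync  = {
        binary   = "/usr/bin/rsync",
        archive  = true,    -- rsync -a: recurse and preserve times, ownership, etc.
        compress = true,    -- rsync -z
        perms    = true,    -- rsync -p (already implied by archive)
        update   = false    -- do not pass --update
    }
}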

This syncs files from /path/to/original/directory to /path/to/new/location, using rsync to do so. We can see a running tally of what is syncing or has been synced so far in /var/log/lsyncd.log. Then it is simply a matter of starting lsyncd:

service lsyncd start
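Once the daemon is running, steps 2 and 3 of the plan can be checked from a shell. This is a rough sketch and not part of lsyncd: wait_for_sync and count_missing are hypothetical helper names, and the default timeout is an assumption (lsyncd batches events for a while before acting, so the canary will not appear instantly).

```shell
#!/bin/sh
# Hypothetical helpers for checking a running sync; not part of lsyncd.

# Step 2: drop a canary file into SRC and wait up to TIMEOUT seconds
# (default 60, an arbitrary choice) for it to appear in DST.
wait_for_sync() {
    src=$1
    dst=$2
    timeout=${3:-60}
    canary=".sync-canary-$$"

    : > "$src/$canary" || return 1
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if [ -e "$dst/$canary" ]; then
            # Clean up in the source; since the target mirrors the source,
            # the copy there should disappear as well.
            rm -f "$src/$canary"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    rm -f "$src/$canary"
    return 1
}

# Step 3: print how many regular files exist under SRC but not under DST.
# File names containing newlines are not handled.
count_missing() {
    [ -d "$1" ] && [ -d "$2" ] || return 1
    ( cd "$1" && find . -type f ) | while IFS= read -r rel; do
        [ -e "$2/$rel" ] || printf '%s\n' "$rel"
    done | wc -l | tr -d ' '
}
```

Pass absolute paths to both helpers, for example wait_for_sync /path/to/original/directory /path/to/new/location 120 followed by count_missing on the same two directories. On a dataset of this size the count itself takes a while, so treat it as an occasional progress check, not something to run in a tight loop.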

And you'll see lsyncd start to do its magic of syncing over files continuously. We used that time to test, verify that files were landing on EFS as expected, and plan out a date for the switchover (doing dry runs to see how long it would take and to make sure we didn't miss anything). When the day of the switchover came (steps 4 - 6), it took less than 5 minutes across all of our servers, followed by a final rsync (which took its hour; this was fine since only a handful of files had not been transferred instead of hundreds or thousands), and we were very happy with the migration. I'm also happy to say:

  1. We uninstalled GlusterFS
  2. EFS is performing about as well as we expected (not as well as Gluster, but more than good enough)
  3. Our remaining upgrades went swimmingly

So all in all, a very good day :)