ssync


Author: Michael W. Shaffer <mwshaffer@yahoo.com>
Current Version: 2.2
Status: Stable
Release Date: 2002-03-21
Source Archive: ssync-2.2.tar.gz
DEB Package (i386 potato): ssync_2.2-1-potato_i386.deb
DEB Package (i386 woody): ssync_2.2-1-woody_i386.deb
RPM Package (i386 RH6.2): ssync-2.2-1.i386.rpm
SRPM Package: ssync-2.2-1.src.rpm
Change Log: CHANGES
License: GPL

What is it?

Ssync is a minimalistic tool for keeping filesystems in synchronization. My main goals in writing ssync were correctness, simplicity, speed, low resource consumption, and portability. It features a number of options to control how things are synchronized and under what conditions, as well as useful dry-run and verbose modes.

I have been using ssync on Debian and RedHat Linux systems for about nine months in a production environment, and I have reports that it has worked well when compiled from source on at least FreeBSD and SCO. I have also built and tested it on Yellow Dog Linux (PPC) and HP-UX 10.20 and 11.00. If you have success (or problems) using ssync on other platforms, please let me know. I believe that it will build and function correctly on most UNIX-like platforms with a working ANSI or C89 compliant compiler.

Why another synchronization tool?

The name ssync is a contraction of [s]imple filesystem [sync]hronizer. It was designed to be an extremely simple and reliable solution to a significant operational need. On the network I manage, I recently put into production a pair of loosely coupled, highly available Linux file servers which run Samba, NFS, and dhttpd to service the file sharing needs of about 500 users with client machines running Windows and various UNIX platforms. I chose not to use any of the currently available HA packages to manage these systems, for various reasons. The actual monitoring and failover features are handled by a separate daemon I created called peerd.

Since the implementation does not rely on a shared disk subsystem, some means of keeping the two separate filesystems of the peer machines in relatively close synchronization was needed. Originally, the solution to this requirement was a shell script which ran various rsync commands, first using a connection to an rsync server process on the master machine and later relying on a couple of NFS filesystems exported on the master and mounted on the slave specifically for the replication. As it turned out, this solution was less than satisfactory, since rsync would randomly but fairly frequently fail to complete the synchronization of one or more directory trees, either hanging indefinitely or barfing out numerous puzzling and spurious errors.

The more I thought about it, the more I was convinced that what was needed was something much less complex and hopefully more reliable than rsync seemed to be in this application, and thus was born ssync / ssyncd. I don't pretend that this program is useful for anything besides the rather narrow mission for which it was designed (and it may not even be useful for that). I do think, however, that it at least provides an alternative sync tool for certain situations, and I was unable to find any viable alternative to rsync in the open source world when I wrote this.

Features


Limitations

The basic function of ssync is simply to make the directories, files, and links on a destination filesystem match those on a source filesystem. The default behavior is to read a list of paths to sync from a specified file and recursively process each of them. You may also specify the paths to sync with the (-f | --src-path) and (-t | --dst-path) command line options if you just want to quickly sync two paths without bothering to create configuration and work files.
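To make the idea of "make the destination match the source" concrete, here is a minimal Python sketch. It is not how ssync itself is implemented (ssync is written in C); the function name plan_sync, the action labels, and the size/mtime comparison used to decide that a file differs are all assumptions made for the illustration. Like the test option, it only reports what would be done and modifies nothing.

```python
import os

def plan_sync(src, dst):
    """Return a sorted list of (action, relative_path) that would make dst match src.

    Hypothetical helper for illustration only; it plans and never modifies dst.
    Type mismatches (file vs. directory) are not handled in this sketch.
    """
    actions = []

    # Pass 1: everything under the source must exist in the destination.
    for root, dirs, files in os.walk(src):
        rel_root = os.path.relpath(root, src)
        for name in dirs + files:
            rel = os.path.normpath(os.path.join(rel_root, name))
            s = os.path.join(src, rel)
            d = os.path.join(dst, rel)
            if not os.path.lexists(d):
                actions.append(("create", rel))
            elif os.path.isfile(s) and os.path.isfile(d):
                # Assumed change test: differing size or mtime means "update".
                if (os.path.getsize(s) != os.path.getsize(d)
                        or os.path.getmtime(s) != os.path.getmtime(d)):
                    actions.append(("update", rel))

    # Pass 2: anything that exists only in the destination must go.
    for root, dirs, files in os.walk(dst):
        rel_root = os.path.relpath(root, dst)
        for name in dirs + files:
            rel = os.path.normpath(os.path.join(rel_root, name))
            if not os.path.lexists(os.path.join(src, rel)):
                actions.append(("delete", rel))

    return sorted(actions)
```

Calling plan_sync("/mnt/peer/etc/cron.d", "/etc/cron.d") would list the entries to create, update, or delete for that one pair of paths, which is roughly the information a dry run is meant to give you.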

Building and Installing

As of version 1.8, binary packages are available. If you have a Linux system which uses either the .rpm or .deb package format, then all you have to do is install the package and edit the config files. I have tested and deployed ssync on both RedHat and Debian Linux. I am not aware of any Linux-specific features which it uses, so I think it will work fine on most other UNIX-like platforms as well.

As of the 2.0 release, I have eliminated what few GCC-isms the code contained and added the GCC -ansi and -pedantic flags to the makefile, so I think it will now build and work on most UNIX systems with a reasonably ANSI or C89 compliant compiler. With the GCC -ansi flag on, and because I did use snprintf(), lstat(), lchown(), and a couple of other not-strictly-POSIX things, it does require -D_BSD_SOURCE to build on Linux. If your platform does not have any of these functions for some reason, just let me know and I'll see if there are any workarounds.

There is no configure script since I just didn't feel like writing one and I don't really think one is necessary at this point. There may be one in the future. You may need to change the makefile if you don't have gcc available. Otherwise, a plain old make should do it. The build will produce two binaries: ssync (the interactive version) and ssyncd (the daemon). Also included is a rather generic ssyncd.init startup script which can be copied to /etc/init.d or wherever your distribution puts startup files.

Examples of the config files /etc/ssyncd.conf and /etc/ssyncd.work are provided, and they should be edited as appropriate to your situation. If you are running the interactive ssync version, it will obey whatever command line options you give as well as any configuration it might find in a file called .ssyncrc in the current directory. I have not yet gotten around to implementing any behavior for ssync to look for a .ssyncrc file in the user's home directory.

Configuration

All of the available configuration options are shown in the example ssyncd.conf configuration file and can be set either in this file (for ssyncd), in .ssyncrc (for ssync), or on the command line (for both). A summary of config options is below. The -c option, of course, only makes sense on the command line (duh). You will see a complete list of all updates, deletions, and exceptions at the default log-level of 0. If you want to suppress everything except errors, set log level 3 (warn). Log level 2 (info) is probably what most people want.

Config file     Long Option       Short Option   Comment
-----------     -----------       ------------   -------
-               --help            -h             display usage message and version
conf-path       --conf-path       -c             read the config file from this path instead of the default
interval        --interval        -i             number of seconds to sleep between completing one run and starting the next
work-file       --work-file       -w             path for file containing work paths (see also src-path and dst-path)
src-path        --src-path        -f             alternative way to specify a single source path
dst-path        --dst-path        -t             alternative way to specify a single destination path
priority        --priority        -n             scheduling priority (-20 - +20), see renice(8)
no-detach       --no-detach       -F             do not daemonize (use with log-mode: stderr)
no-sync-data    --no-sync-data    -D             do not sync data (content) of files
no-sync-time    --no-sync-time    -T             do not sync atime / mtime
no-sync-meta    --no-sync-meta    -M             do not sync meta-data (uid / gid / mode)
update-only     --update-only     -U             only sync things if source mtime is > destination mtime
test            --test            -X             run sync procedure and collect statistics without actually modifying anything
pid-path        --pid-path        -p             path for pid file
log-mode        --log-mode        -m             [file|syslog|stderr] logging mode
log-path        --log-path        -l             path for log file if using file based logging
log-ident       --log-ident       -s             identification string if using syslog based logging
log-level       --log-level       -v             logging verbosity (0 - 5), lower levels are more verbose (2 is normal, 3 is errors only, 0 lists all updates and deletions)
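
The log-level numbering can be confusing because lower numbers mean more output. The following Python sketch shows the threshold rule as I read it from the description above: a message is emitted when its own level is at or above the configured log-level. The level names come from the example ssyncd.conf; the function names should_log and visible_levels are made up for the illustration, and the rule itself is an assumption about the behavior, not a copy of the ssync source.

```python
# Level numbers and names as listed in the example ssyncd.conf.
LEVELS = {0: "ALL", 1: "TRACE", 2: "INFO", 3: "WARN", 4: "SEVERE", 5: "FATAL"}

def should_log(message_level, log_level):
    """Assumed rule: emit a message when its level is >= the configured log-level."""
    return message_level >= log_level

def visible_levels(log_level):
    """Names of the message levels that would still be logged at this setting."""
    return [name for lvl, name in sorted(LEVELS.items()) if should_log(lvl, log_level)]
```

Under this reading, visible_levels(3) leaves only WARN, SEVERE, and FATAL, which matches the "3 is errors only" note in the table, and a setting of 0 lets everything through.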

Here's the example ssyncd.conf file:


#
# ssyncd.conf
#

interval:		300			# time between sync runs in seconds
work-file:              /etc/ssyncd.work        # list of paths to synchronize (you can also specify
                                                # a single source and destination in the config file
                                                # or on the command line with src-path and dst-path)
#src-path:		/src/path		# alternative specification of one source path
#dst-path:		/dst/path		# alternative specification of one destination path

priority:		0			# scheduling priority (range -20 - +20)
                                                # be careful with this! and read renice(8)
                                                # if you don't know what it means

#no-detach:             yes                     # [y|n] do not detach from terminal
#no-sync-data:		yes			# [y|n] do not sync data (file contents)
#no-sync-time:		yes			# [y|n] do not sync atime / mtime
#no-sync-meta:		yes			# [y|n] do not sync meta-data (uid / gid / mode)
#update-only:		yes			# [y|n] update only if source mtime > dest mtime
#test:			yes			# [y|n] test only (modify nothing in dest.)

pid-path:		/var/run/ssyncd.pid	# path for pid file

log-mode:               file                    # [file|syslog|stderr] logging mode
log-path:		/var/log/ssyncd.log	# path for file based logging
log-ident:		ssyncd			# id for syslog based logging
log-level:		2	# 0 - ALL
				# 1 - TRACE
				# 2 - INFO
				# 3 - WARN
				# 4 - SEVERE
				# 5 - FATAL
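
For anyone who wants to generate or check a config file from a script, the "key: value # comment" layout shown above is easy to read back. The Python sketch below is one way to do it, under two assumptions that are mine rather than taken from the ssync source: a '#' anywhere on a line starts a comment, and the key ends at the first ':'. It also throws away all whitespace, which mirrors the simple-minded parsing described for ssync's own config routines. The name parse_conf is hypothetical.

```python
def parse_conf(text):
    """Parse 'key: value  # comment' lines into a dict of strings.

    Illustrative only: assumes '#' always starts a comment and that
    the key ends at the first ':' on the line.
    """
    options = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]      # drop any trailing comment
        line = "".join(line.split())      # discard ALL whitespace
        if not line:
            continue                      # blank or comment-only line
        key, sep, value = line.partition(":")
        if not sep or not key or not value:
            raise ValueError("malformed config line: %r" % line)
        options[key] = value
    return options
```

All values come back as strings, so a caller would still need to convert things like interval or log-level to integers.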


The work file just contains a list of work items, one per line, in the form:

/source/path | /destination/path

The paths can be either files or directories, and source directories will be processed recursively. There is no form of substitution or environment variable parsing, and there is no facility for excluding things.

If the destination is a different type than the source (e.g. source is a file and destination is a directory), then the program will unlink the destination object (recursively) and re-create it as the new type. This means that if you want to sync a file into a directory, you should give the full path name of the destination including the file name. This 'feature' might also have some disastrously unexpected effects if you specify a symlink to a directory or file as the source path and a real directory or file as the destination.

The config file parsing routines are really simple-minded and will just discard all whitespace in either config file (meaning paths with whitespace will not be parsed correctly). If it causes a lot of issues, I may refine this behavior in the future. Here's the example ssyncd.work file:

#
# ssyncd.work:   Example work file for ssync / ssyncd
#
# Each line must be of the form:
#
#   source path | destination path
#

# Individual files
/mnt/peer/etc/aliases          | /etc/aliases
/mnt/peer/etc/group            | /etc/group
/mnt/peer/etc/group-           | /etc/group-
/mnt/peer/etc/gshadow-         | /etc/gshadow-
/mnt/peer/etc/gshadow          | /etc/gshadow
/mnt/peer/etc/passwd           | /etc/passwd
/mnt/peer/etc/passwd-          | /etc/passwd-
/mnt/peer/etc/shadow-          | /etc/shadow-
/mnt/peer/etc/shadow           | /etc/shadow

# Directory trees
/mnt/peer/etc/cron.d           | /etc/cron.d
/mnt/peer/etc/cron.daily       | /etc/cron.daily
/mnt/peer/etc/cron.monthly     | /etc/cron.monthly
/mnt/peer/etc/cron.weekly      | /etc/cron.weekly
/mnt/peer/etc/init.d           | /etc/init.d
/mnt/peer/etc/logrotate.d      | /etc/logrotate.d
/mnt/peer/etc/rc0.d            | /etc/rc0.d
/mnt/peer/etc/rc1.d            | /etc/rc1.d
/mnt/peer/etc/rc2.d            | /etc/rc2.d
/mnt/peer/etc/rc3.d            | /etc/rc3.d
/mnt/peer/etc/rc4.d            | /etc/rc4.d
/mnt/peer/etc/rc5.d            | /etc/rc5.d
/mnt/peer/etc/rc6.d            | /etc/rc6.d
/mnt/peer/etc/rcS.d            | /etc/rcS.d
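
The work file format is simple enough to validate with a short script before handing it to ssyncd. The Python sketch below splits each line on '|' after discarding comments and all whitespace, which is how the parsing is described above; treating '#' as a comment marker anywhere on the line is my assumption, and parse_work_file is a made-up name. It also shows why paths containing spaces do not survive.

```python
def parse_work_file(text):
    """Return a list of (source, destination) pairs from work file text.

    Illustrative only: whitespace is removed everywhere, so a path such as
    '/a b/c' comes back as '/ab/c'.
    """
    items = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]      # drop comments (assumed to start at '#')
        line = "".join(line.split())      # discard ALL whitespace
        if not line:
            continue
        src, sep, dst = line.partition("|")
        if not sep or not src or not dst:
            raise ValueError("expected 'source | destination': %r" % line)
        items.append((src, dst))
    return items
```

Running it over the example file above would yield one pair per line, such as ("/mnt/peer/etc/aliases", "/etc/aliases"), and raise an error on any line missing the '|' separator.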
