 

 Detecting Failed Drives

In our experience, a drive that is about to fail floods /var/log/kern.log with error messages. The swift-drive-audit script can be run from cron to watch for bad drives. If it detects errors, it unmounts the bad drive so that OpenStack Object Storage can work around it. The script takes a configuration file with the following settings:

            [drive-audit]
            Option        Default     Description
            log_facility  LOG_LOCAL0  Syslog log facility
            log_level     INFO        Log level
            device_dir    /srv/node   Directory devices are mounted under
            minutes       60          Number of minutes to look back in /var/log/kern.log
            error_limit   1           Number of errors to find before a device is unmounted
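
To make the settings concrete, here is a minimal sketch of how a [drive-audit] section could be read with the defaults from the table applied. It is an illustration only, not the code swift-drive-audit uses: the function name load_drive_audit_settings and the SAMPLE text are hypothetical, and the sketch parses a string rather than assuming where the configuration file lives.

```python
import configparser

# Defaults copied from the table above.
DEFAULTS = {
    "log_facility": "LOG_LOCAL0",
    "log_level": "INFO",
    "device_dir": "/srv/node",
    "minutes": "60",
    "error_limit": "1",
}


def load_drive_audit_settings(text):
    """Parse config text and return the [drive-audit] settings as a dict.

    Hypothetical helper for illustration; options missing from the
    text fall back to DEFAULTS.
    """
    parser = configparser.ConfigParser(defaults=DEFAULTS)
    parser.read_string(text)
    if not parser.has_section("drive-audit"):
        # No section at all: behave as if every option were left at its default.
        parser.add_section("drive-audit")
    section = parser["drive-audit"]
    return {
        "log_facility": section.get("log_facility"),
        "log_level": section.get("log_level"),
        "device_dir": section.get("device_dir"),
        "minutes": section.getint("minutes"),
        "error_limit": section.getint("error_limit"),
    }


# Hypothetical configuration: look back two hours and tolerate one error
# before a device counts as bad.
SAMPLE = """
[drive-audit]
device_dir = /srv/node
minutes = 120
error_limit = 2
"""

settings = load_drive_audit_settings(SAMPLE)
print(settings)
```

Options left out of the file, such as log_level in this sample, keep the default values listed in the table.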
            

This script has only been tested on Ubuntu 10.04. If you are using a different distribution or operating system, test it carefully before using it in production.
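
When evaluating the script on another distribution, it can help to understand the kind of check the minutes and error_limit settings describe. The following is a rough sketch of that idea, not the swift-drive-audit implementation. It assumes syslog-style lines that begin with a timestamp such as "Mar  3 14:02:11" and device names of the form sdX; the function names, the regular expressions, and the sample log lines are all made up for illustration. It only reports devices and does not unmount anything.

```python
import re
from datetime import datetime, timedelta

# Assumptions for this sketch: an error line contains the word "error"
# and names a device such as "sdb" or "sdb1".
ERROR_WORD = re.compile(r"\berror\b", re.IGNORECASE)
DEVICE = re.compile(r"\b(sd[a-z]+\d*)\b")


def count_recent_errors(lines, now, minutes=60):
    """Return {device: error_count} for error lines from the last `minutes`."""
    cutoff = now - timedelta(minutes=minutes)
    counts = {}
    for line in lines:
        if not ERROR_WORD.search(line):
            continue
        try:
            # Syslog timestamps omit the year, so borrow it from `now`.
            # (This simplification ignores logs that span a new year.)
            stamp = datetime.strptime(
                f"{now.year} {line[:15]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # Line does not start with the assumed timestamp.
        if stamp < cutoff or stamp > now:
            continue
        for device in set(DEVICE.findall(line)):
            counts[device] = counts.get(device, 0) + 1
    return counts


def devices_over_limit(counts, error_limit=1):
    """Return the devices whose error count has reached `error_limit`."""
    return sorted(d for d, n in counts.items() if n >= error_limit)


# Made-up log lines in the assumed format.
LINES = [
    "Mar  3 14:02:11 node1 kernel: [1234.5] end_request: I/O error, dev sdb, sector 1024",
    "Mar  3 14:05:40 node1 kernel: [1443.2] Buffer I/O error on device sdb1, logical block 128",
    "Mar  3 09:00:00 node1 kernel: [10.0] end_request: I/O error, dev sdc, sector 8",
    "Mar  3 14:06:00 node1 kernel: [1463.0] EXT4-fs (sdd1): mounted filesystem",
]

now = datetime(2024, 3, 3, 14, 30, 0)
counts = count_recent_errors(LINES, now, minutes=60)
print(devices_over_limit(counts, error_limit=1))
```

If the kernel log on your system uses a different timestamp or message format, a check of this kind would find nothing, which is one reason to test carefully on platforms other than Ubuntu 10.04.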


