by Scott Nolin, John Lalande — July 2014
These documents are copied from University of Wisconsin SSEC working documentation and may be useful for some, but we provide no guarantee of accuracy, correctness, or safety. Use at your own risk.
For up-to-date information, see the Robinhood Policy Engine website: https://github.com/cea-hpc/robinhood/wiki
Robinhood collects file metadata on Lustre filesystems, allowing us to easily generate lists of files (search by owner, by age, etc.) and to perform corresponding actions on said files (e.g., tmpwatch).
For an overview with statistics on the file systems, the web interfaces may be helpful.
Other useful commands are rbh-find and rbh-report, which must be run on SERVERNAME.
See the man pages for these commands or the robinhood documentation for more info.
Our robinhood server is already up and running, so if you just need to add a new file system to those already monitored on SERVERNAME:
Create the MySQL database that Robinhood will use for the new Lustre filesystem. Run: rbh-config create_db
The create_db command will interactively set up a new database. The databases are stored on an SSD volume (e.g., /ssd-s4).
Each file system that robinhood monitors will have a corresponding config file in /etc/robinhood.d/tmpfs.
Edit your filesystem configuration file carefully: if you leave it as the default, robinhood can tmpwatch (purge) all files older than 8 days! Don't do that. Consider starting from a known good config file from TC.
See the existing config files for examples of systems that are tmpwatched and ones that aren't, and consult the robinhood documentation when crafting the config file. You can use an existing config file as a template, but again, be careful.
You will need to turn on changelog support and run an initial scan.
If the initial scan looks good, check that changelogs are being read by the robinhood server. Run: rbh-report -a -f VolumeName (e.g., rbh-report -a -f cdata). Note: this may not show up right away -- you may need to wait an hour or two.
There are two different web interfaces for the robinhood data: robinhood-webui, which is an RPM included in the robinhood distribution, and robinhood-multifs-web (github.com/abrenner/robinhood-multifs-web). These are both available at SERVERNAME.
To add a file system to the standard robinhood-webui:
To add a file system to the robinhood-multifs-web interface:
INSERT INTO `rbh_stats`.`config` (`fullpath`, `friendlyName`, `fsInodeNumber`, `dbGroup`, `label`, `description`) VALUES ('/cscratch', '/cscratch', NULL, 'cscratch', 'primary', 'S4-Cardinal-Scratch');
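To avoid hand-editing the statement for each new file system, it can be generated. The following is a minimal sketch, not part of robinhood-multifs-web: the function name multifs_insert_sql is hypothetical, the column list and the 'primary' label are copied from the example above, and the values are assumed to contain no single quotes (nothing is escaped).

```shell
#!/bin/sh
# Print the INSERT statement that registers one file system with
# robinhood-multifs-web. Column names and the 'primary' label are copied
# from the example above; the function name is hypothetical.
# Arguments: $1 = mount path, $2 = dbGroup, $3 = description
# Assumption: no argument contains a single quote (values are not escaped).
multifs_insert_sql() {
    printf "INSERT INTO \`rbh_stats\`.\`config\` (\`fullpath\`, \`friendlyName\`, \`fsInodeNumber\`, \`dbGroup\`, \`label\`, \`description\`) VALUES ('%s', '%s', NULL, '%s', 'primary', '%s');\n" \
        "$1" "$1" "$2" "$3"
}

# Example: multifs_insert_sql /cscratch cscratch S4-Cardinal-Scratch
```

Review the printed statement before feeding it to the mysql client.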
This work is now done, but in case we need to set up a new Robinhood server, here's how SERVERNAME was set up.
To install robinhood policy engine, we followed the documentation PDFs included with the release (current documentation available at sourceforge.net/projects/robinhood/files/robinhood/2.5.2/doc/).
To summarize the process, and with a few SSEC-specific twists:
Download the RPMs of the latest version from sourceforge (sourceforge.net/projects/robinhood/files/robinhood/) and install the robinhood-adm and robinhood-tmpfs RPMs.
After Robinhood has been installed, you will need to create the MySQL databases for Robinhood to store data in. You will need one database per filesystem.
Prerequisites for creating the databases:
#RobinHood Policy Engine tuning
innodb_file_per_table
# 50% to 90% of the physical memory
innodb_buffer_pool_size=55G
# 2*nbr_cpu_cores
innodb_thread_concurrency=32
# memory cache tuning
innodb_max_dirty_pages_pct=15
# robinhood is massively multithreaded: set enough connections
# for its threads, and its multiple instances
max_connections=256
# If you get DB connection failures, increase this parameter:
connect_timeout=60
# This parameter appears to have a significant impact on performances:
# see this article to tune it appropriately:
# http://www.mysqlperformanceblog.com/2008/11/21/how-to-calculate-a-good-inn
innodb_log_file_size=500
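The two sizing rules in the comments above (50% to 90% of physical memory, and 2 times the number of CPU cores) can be turned into numbers for a given host. This is only a sketch: the function name suggest_innodb_sizing and the 80% default are our own choices, not robinhood or MySQL recommendations.

```shell
#!/bin/sh
# Suggest values for the two sizing parameters in the tuning block above.
# Arguments: $1 = physical memory in kB, $2 = number of CPU cores,
#            $3 = percent of memory for the buffer pool (assumed default: 80)
suggest_innodb_sizing() {
    mem_kb=$1
    cores=$2
    pct=${3:-80}
    # Integer arithmetic: kB -> GB (1 GB = 1048576 kB), rounded down.
    pool_gb=$(( mem_kb * pct / 100 / 1048576 ))
    echo "innodb_buffer_pool_size=${pool_gb}G"
    echo "innodb_thread_concurrency=$(( cores * 2 ))"
}

# On Linux, the inputs can be read from the running system:
#   suggest_innodb_sizing "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)" "$(nproc)"
```

Treat the output as a starting point and leave room for the other services on the database host.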
To easily create the robinhood database, you can use the rbh-config script. Run this script on the database host to check your system configuration and perform the database creation steps:
Check database requirements: rbh-config precheck_db
Create the database: rbh-config create_db
The create_db command will interactively set up a new database.
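The two steps above can be chained so that database creation only starts after the requirements check passes. This sketch assumes that rbh-config precheck_db exits non-zero when a requirement is missing, which we have not verified. The RBH_CONFIG variable is a hypothetical override added here so the flow can be dry-run without robinhood installed.

```shell
#!/bin/sh
# Run the requirements check first and start the interactive database
# creation only if the check succeeds.
# Assumption: precheck_db returns a non-zero exit status on failure.
# RBH_CONFIG is a hypothetical override (defaults to rbh-config).
create_rbh_db() {
    rbh_config=${RBH_CONFIG:-rbh-config}
    if "$rbh_config" precheck_db; then
        "$rbh_config" create_db
    else
        echo "precheck_db failed; not creating a database" >&2
        return 1
    fi
}

# Example (run on the database host): create_rbh_db
```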
Robinhood's MySQL database access is very disk I/O intensive, so on SERVERNAME, we stored these databases on solid state disk.
To do this, we:
Each file system that robinhood monitors will have a corresponding config file in /etc/robinhood.d/tmpfs. You can use one of the existing config files as a template, but be very careful when creating the config file! Robinhood was built to purge files -- it's great for tmpwatching Lustre filesystems. Which means it could also be great at deleting files that you didn't mean to delete but were caught by a policy in your config file.
See the existing config files for examples of systems that are tmpwatched and ones that aren't, and consult the robinhood documentation when crafting the config file.
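Before starting the service with a new config file, it may help to list any deletion-related blocks it contains. The sketch below is a plain grep, not a robinhood tool. The block names it looks for (purge_policies, purge_trigger, rmdir_policy) are our assumption about the tmpfs config syntax, so check them against the robinhood documentation. The function name has_delete_policies is hypothetical.

```shell
#!/bin/sh
# Report whether a robinhood config file opens any purge or rmdir block.
# Prints matching lines with line numbers; returns 0 if any were found.
# Lines commented out with '#' are ignored.
# Assumption: the block names below match the tmpfs config syntax.
has_delete_policies() {
    grep -n -i -E '^[[:space:]]*(purge_policies|purge_trigger|rmdir_policy)' "$1"
}

# Example: has_delete_policies /etc/robinhood.d/tmpfs/configfile.cfg
```

No output means none of these block names were found. It does not prove the config is safe, so still read it through.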
Turn on changelog support:
Once you have your config file finished and have enabled changelog support, start the robinhood service (service robinhood start) -- filesystems that are already being monitored will continue, but robinhood will start monitoring the new filesystem you added a config file for.
You should also run your first scan on the file system. Run: robinhood --once -C -S -f configfile.cfg
On file systems with many files, this may take a few hours to complete.
Check that changelogs are being read by the robinhood server. Run: rbh-report -a -f VolumeName (e.g., rbh-report -a -f cdata). Note: this may not show up right away -- you may need to wait an hour or two.
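When several file systems are monitored, the same check can be looped over all of them. This is a small sketch using only the rbh-report invocation shown above. The function name report_volumes and the RBH_REPORT override (for dry runs) are hypothetical.

```shell
#!/bin/sh
# Run "rbh-report -a -f <volume>" for each volume name given as an argument.
# RBH_REPORT is a hypothetical override (defaults to rbh-report).
report_volumes() {
    rbh_report=${RBH_REPORT:-rbh-report}
    for vol in "$@"; do
        echo "== $vol =="
        # Keep going if one volume fails, but say so on stderr.
        "$rbh_report" -a -f "$vol" || echo "rbh-report failed for $vol" >&2
    done
}

# Example: report_volumes cdata cscratch
```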
INSERT INTO `rbh_stats`.`config` (`fullpath`, `friendlyName`, `fsInodeNumber`, `dbGroup`, `label`, `description`) VALUES ('/cscratch', '/cscratch', NULL, 'cscratch', 'primary', 'S4-Cardinal-Scratch');
An hourly cron job calls the multifs getStats page:
0 * * * * /usr/bin/lynx --dump http://SERVERNAME/multifs/index.php/cron/getStats > /dev/null 2>&1
Indeed, robinhood is not notified by Lustre on access, only on changes (it could be, but that is not recommended on a production system, as it would represent a huge flow of events). This explains why atime is outdated in the rbh DB, unlike change time. Access times in the DB can be updated by scanning your filesystem regularly. They are also refreshed when a policy is applied to entries. You can see this outdated atime in rbh-find output, but don't worry: atime is handled correctly when applying a policy (like rmdir). Robinhood refreshes an entry's atime when it applies a policy to it, to ensure it respects the policy criteria on access times. Regards, Thomas