Overview

A common problem for many users is having files auto-purged from scratch before they have had a chance to identify those files and move them to home or research space.  This method combines a simple script, which identifies files approaching the auto-purge deadline, with a user cron job.  At a designated interval, the script emails the user a list of files older than an age threshold that the user specifies.

The Script

The following script checks the timestamps of everything under the specified directory and lists the files whose modification time (mtime) and change time (ctime) are both more than a specified number of days old.  The resulting list is then sent to the user via email:

 

#!/bin/bash

## Change the days cutoff to suit your needs
timeLimit=40

## Specify the directory you want to check
mydir="/mnt/scratch/someuser"

## Your email below
EMAIL="someuser@msu.edu"

## Alter the Subject if desired
SUBJECT="Scratch timestamp report"

## Stop if the directory cannot be entered, so the wrong place is never scanned
cd "$mydir" || exit 1

## List everything that is not a directory and whose mtime and ctime are both
## over the cutoff. pwd -P resolves symlinks, so physical paths are reported.
## Letting find do the filtering keeps file names containing spaces intact.
filelist=$(find "$(pwd -P)" ! -type d -mtime +"${timeLimit}" -ctime +"${timeLimit}")

## Build the message body: a heading, a blank line, then one path per line
message="The following are the files in $mydir over $timeLimit days:

$filelist"

/bin/mail -s "$SUBJECT" "$EMAIL" <<< "$message"

 

Save this script as checkScratch.sh with any modifications (at a minimum the email address, the directory path, and the number of days you want to use), and set the execution bits:

 

chmod 755 checkScratch.sh
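When choosing the timeLimit value, it may help to see how find counts days. The sketch below is only an illustration: it assumes GNU touch and find, works in a throwaway directory, and uses made-up file names. It backdates the mtime of two files and then applies the same kind of tests the script uses.

```shell
#!/bin/bash
# Throwaway demo directory; nothing here touches real scratch space.
demo=$(mktemp -d)

# Backdate mtime only: 972 hours = 40.5 days, 996 hours = 41.5 days.
touch -d "972 hours ago" "$demo/age_40.5_days"
touch -d "996 hours ago" "$demo/age_41.5_days"

# find drops the fractional part of the age, so +40 means "41 whole days or more".
by_mtime=$(find "$demo" -type f -mtime +40 -printf '%f\n')
echo "mtime +40 matches: ${by_mtime:-none}"

# ctime was set when touch ran (just now), so adding -ctime +40 matches nothing.
by_both=$(find "$demo" -type f -mtime +40 -ctime +40 -printf '%f\n')
echo "mtime and ctime +40 matches: ${by_both:-none}"

rm -rf "$demo"
```

Only the 41.5-day-old file passes the mtime test, and neither file passes once the ctime test is added. In practice this means a cutoff of 40 reports files from their 41st day onward, and a file whose ctime changed recently (for example after a chmod or a move) will not be reported even if its contents are old.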

 

Set Up a User Crontab

Now we need to set up a user cron job to run this script at the desired interval (daily, weekly, etc.).  To do this, log in to a dev-node of your choice and edit your crontab there:

 

ssh dev-intel10
crontab -e

 

The "crontab -e" command opens your crontab for that dev-node in "vi" (the file is empty the first time).  At this point, you can add directives specifying when the script should be run:

 

SHELL=/bin/bash
PATH=~/bin:/sbin:/bin:/usr/sbin:/usr/bin
# For details see man 5 crontab
 
# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * command to be executed
 
0 4  *  *  * ~/bin/checkScratch.sh

 

In this example, my script will run from "~/bin" at 4:00 am every day.  Only the very last line shown above is strictly needed; the rest is mostly for information.
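For a weekly rather than daily report, only the day-of-week field needs to change. A minimal sketch, assuming the script lives in the same "~/bin" location as in the example above:

```
# Run at 4:00 am every Monday (day of week 1)
0 4  *  *  1 ~/bin/checkScratch.sh
```

With a weekly schedule, choose a days cutoff that leaves at least a week between a file first being reported and the purge deadline.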

When you are done editing your crontab, save it and then check it:

 

crontab -l

 

Cron jobs are specific to the dev-node you choose: a crontab created on one dev-node cannot be edited or run from another. You should not need more than one crontab, since there is no reason to run the same script more than once at the same time, but you do need to remember which dev-node holds your crontab in order to make changes to it.

Results

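Here's what I get in my inbox when my scratch check script runs.  The listed paths start with /mnt/lustre_scratch_2012 rather than /mnt/scratch because the script uses "pwd -P", which reports the physical directory with symbolic links resolved: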

 

The following are the files in /mnt/scratch/johnj over 40 days:
 
/mnt/lustre_scratch_2012/johnj/chr2.fa
/mnt/lustre_scratch_2012/johnj/chrUn_gl000232.fa
/mnt/lustre_scratch_2012/johnj/chr8_gl000197_random.fa
/mnt/lustre_scratch_2012/johnj/wg.fna
/mnt/lustre_scratch_2012/johnj/chrUn_gl000227.fa
/mnt/lustre_scratch_2012/johnj/chrUn_gl000228.fa
/mnt/lustre_scratch_2012/johnj/chr17_gl000203_random.fa
/mnt/lustre_scratch_2012/johnj/chr18_gl000207_random.fa
/mnt/lustre_scratch_2012/johnj/chrUn_gl000213.fa
/mnt/lustre_scratch_2012/johnj/v_phaser_2.zip
/mnt/lustre_scratch_2012/johnj/chrUn_gl000240.fa
/mnt/lustre_scratch_2012/johnj/chr16.fa
/mnt/lustre_scratch_2012/johnj/chrUn_gl000215.fa
.......
.......