


The Augustus gene prediction program provides several training annotation files for various species.  It also permits the user to do their own training on another species or to retrain for one of the provided species.  In cases where Augustus has been installed in a central location for multi-user environments (e.g. HPCC), write privileges may not be available for the "config" directory containing these training files.  This tutorial provides a workaround for those cases.

Available Training Annotation Files

The following species are provided with the Augustus package:

Acyrthosiphon pisum (pea aphid) | Candida tropicalis | Gallus gallus | Saccharomyces cerevisiae
Aedes aegypti (mosquito) | Chaetomium globosum | Histoplasma capsulatum | Schistosoma mansoni (worm)
Amphimedon queenslandica (sponge) | Chlamydomonas reinhardtii (green algae) | Homo sapiens (human) | Schizosaccharomyces pombe
Ancylostoma ceylanicum | Coccidioides immitis | Kluyveromyces lactis | Solanum lycopersicum (tomato)
Arabidopsis thaliana (plant) | Coprinus cinereus (fungus) | Laccaria bicolor | Staphylococcus aureus
Aspergillus fumigatus | Cryptococcus neoformans gattii | Leishmania tarentolae (protozoa, intronless) | Tetrahymena thermophila (ciliate)
Aspergillus nidulans | Cryptococcus neoformans neoformans | Lodderomyces elongisporus | Theobroma cacao (cacao)
Aspergillus oryzae | Danio rerio | Magnaporthe grisea | Thermoanaerobacter tengcongensis (a bacterium)
Aspergillus terreus | Debaryomyces hansenii | Nasonia vitripennis (wasp) | Toxoplasma gondii (parasitic protozoa)
Botrytis cinerea | Drosophila melanogaster (fruit fly) | Neurospora crassa | Tribolium castaneum (bug)
Brugia malayi (nematode) | Encephalitozoon cuniculi | Nicotiana attenuata (coyote tobacco) | Trichinella spiralis
Caenorhabditis elegans (worm) | Eremothecium gossypii | Petromyzon marinus (sea lamprey) | Triticum aestivum (wheat)
Callorhinchus milii | Escherichia coli | Phanerochaete chrysosporium | Ustilago maydis
Candida albicans | Fusarium graminearum | Pichia stipitis | Yarrowia lipolytica
Candida guilliermondii | Galdieria sulphuraria (red algae) | Rhizopus oryzae | Zea mays (maize)

Specifying an Alternate Config Directory

To get full write access to the species training annotation files, you need to point Augustus at a different "config" directory.  The environment variable "AUGUSTUS_CONFIG_PATH" specifies the location of this directory.  You can change it to a location where you have write permissions, and pass that location on to Augustus, in several different ways.

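First, copy the existing config directory to a place where you have full write permissions.  To do that, you need to identify the default "config" path.  Loading the augustus module sets AUGUSTUS_CONFIG_PATH, so printing the variable afterwards shows the default path:

module load augustus
echo $AUGUSTUS_CONFIG_PATH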

Then "rsync" the full config directory to the alternate location:

rsync -av /opt/software/augustus/2.7.0--GCC-4.4.5/config /mnt/home/someUser/
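
The copy step above can be wrapped in a small helper that also confirms the result is writable. This is only a sketch: the function name copy_augustus_config is hypothetical (it is not part of Augustus), and it falls back to "cp -R" on the assumption that rsync may not be installed everywhere.

```shell
# Hypothetical helper (not part of Augustus): copy a config directory into a
# destination directory and confirm that the copy is writable.
copy_augustus_config() {
    local src="${1%/}"   # strip a trailing slash so the directory itself is copied
    local dest="$2"      # parent directory that will receive the copy

    if [ ! -d "$src" ]; then
        echo "source config directory not found: $src" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1

    # Prefer rsync, as in the tutorial; fall back to cp if rsync is unavailable.
    if command -v rsync >/dev/null 2>&1; then
        rsync -a "$src" "$dest/" || return 1
    else
        cp -R "$src" "$dest/" || return 1
    fi

    local copy
    copy="$dest/$(basename "$src")"

    # Print the new location only if it exists and is writable by this user.
    if [ -d "$copy" ] && [ -w "$copy" ]; then
        echo "$copy"
    else
        echo "copy is missing or not writable: $copy" >&2
        return 1
    fi
}
```

With the paths used in this tutorial, the call would be: copy_augustus_config /opt/software/augustus/2.7.0--GCC-4.4.5/config /mnt/home/someUser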

Next, use one of the following methods:

Method 1

Simply redefine the environment variable from the command line.  Please note that this change lasts only for your current login session:

export AUGUSTUS_CONFIG_PATH=/mnt/home/someUser/config
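
After redefining the variable, it may be worth checking that it really points at a usable copy. The sketch below assumes that the config directory contains a "species" subdirectory holding the per-species training files; the function name check_augustus_config is hypothetical and not an Augustus command.

```shell
# Hypothetical helper (not an Augustus command): verify that
# AUGUSTUS_CONFIG_PATH is set and points at a writable config directory.
# Assumes the config directory contains a "species" subdirectory.
check_augustus_config() {
    local dir="${AUGUSTUS_CONFIG_PATH:-}"

    if [ -z "$dir" ]; then
        echo "AUGUSTUS_CONFIG_PATH is not set" >&2
        return 1
    fi
    if [ ! -d "$dir/species" ]; then
        echo "no species directory under: $dir" >&2
        return 1
    fi
    if [ ! -w "$dir/species" ]; then
        echo "species directory is not writable: $dir/species" >&2
        return 1
    fi
    echo "OK: $dir"
}
```

Running check_augustus_config right after the export line reports a problem on stderr and returns a non-zero status if the directory is missing or read-only.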

Method 2

A more permanent method is to add the path to your "~/.bashrc" file so that it is redefined automatically each time you log in.  Open the file in an editor:

vim ~/.bashrc

Then add these lines:

module load augustus
export AUGUSTUS_CONFIG_PATH=/mnt/home/someUser/config

The modifications above apply only to systems (like the HPCC) that use a modules system. The module must be loaded first, because loading it sets AUGUSTUS_CONFIG_PATH; the export line then overrides that value. On systems without modules, just leave out the "module load" line.

To make this take effect during the current session:

source ~/.bashrc

This is only needed in the session in which you edit the file.  Subsequent logins will load the new path automatically.
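
If you prefer not to edit the file by hand, the export line can be appended from the shell. The following is a minimal sketch: the function name set_augustus_config_in_rc is hypothetical, it assumes the config path contains no spaces, and it appends to the end of the file so that the line lands after any existing "module load augustus" line. It adds the line only if an identical line is not already present.

```shell
# Hypothetical helper: append the export line to an rc file exactly once.
# Assumes the config path contains no spaces or shell metacharacters.
set_augustus_config_in_rc() {
    local rc_file="$1"      # e.g. ~/.bashrc
    local config_dir="$2"   # e.g. /mnt/home/someUser/config
    local line="export AUGUSTUS_CONFIG_PATH=$config_dir"

    touch "$rc_file" || return 1

    # -F fixed string, -x whole-line match, -q quiet: skip if already present.
    if grep -Fxq "$line" "$rc_file"; then
        return 0
    fi
    printf '%s\n' "$line" >> "$rc_file"
}
```

For example: set_augustus_config_in_rc ~/.bashrc /mnt/home/someUser/config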

On the MSU HPCC, once these lines are in your .bashrc file, do NOT load augustus again with "module load". The .bashrc file has already loaded the module, and loading augustus again will munge the path you redefined!

Method 3

You can also pass the config path directly to Augustus as a command-line option at run time.  This applies only to that one run:

augustus --AUGUSTUS_CONFIG_PATH=/mnt/home/someUser/config/ --species=SPECIES queryfilename
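
The run-time form above can be wrapped so that a missing species directory is caught before Augustus starts. This is a sketch under assumptions: the function name augustus_with_config and the AUGUSTUS_BIN override are hypothetical, and it assumes each species has its own directory under "species" inside the config directory.

```shell
# Hypothetical wrapper: run augustus against a private config directory.
# AUGUSTUS_BIN is an assumed override for the executable name (default: augustus).
# Assumes species parameters live in <config_dir>/species/<species>.
augustus_with_config() {
    local config_dir="${1%/}"   # strip a trailing slash for tidy messages
    local species="$2"
    local query="$3"

    if [ ! -d "$config_dir/species/$species" ]; then
        echo "no parameters for species '$species' under $config_dir/species" >&2
        return 1
    fi

    "${AUGUSTUS_BIN:-augustus}" \
        --AUGUSTUS_CONFIG_PATH="$config_dir" \
        --species="$species" \
        "$query"
}
```

For example: augustus_with_config /mnt/home/someUser/config SPECIES queryfilename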

More Information

For more information on the specifics of augustus training and retraining, the following links may prove useful: