The SnpEff configuration file specifies key run parameters, most importantly the location of the databases used in your analysis. A configuration file template is provided in the following common directory:
To use SnpEff, first copy this file to your working directory and make any changes your analysis requires. Note that the configuration parameter "data_dir" (the database location) defaults to the following path in your home directory space:
You may leave this as-is, provided that you place your databases in this path. Otherwise, update it to match the correct location.
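As a minimal sketch of that update, the helper below rewrites the "data_dir" line in your own copy of the configuration file and keeps a backup of the original. The function name set_data_dir is hypothetical, and the sketch assumes the setting appears on a single line of the form "data_dir = <path>"; some SnpEff releases name this key differently, so check your copy of the template first.

```shell
#!/bin/sh
# set_data_dir CONFIG NEW_DIR  (hypothetical helper, not part of SnpEff)
# Rewrites the data_dir line of a SnpEff config file in place.
# Assumes the line looks like "data_dir = <path>".
# A backup of the original is kept as CONFIG.bak.
# Note: NEW_DIR must not contain the characters '|' or '&',
# which are special in the sed expression below.
set_data_dir() {
    config="$1"
    new_dir="$2"
    cp "$config" "$config.bak"
    sed "s|^data_dir[[:space:]]*=.*|data_dir = $new_dir|" "$config.bak" > "$config"
}
```

For example, set_data_dir ./snpEff.config /path/to/your/databases would point your working copy at a different database directory; the path shown is a placeholder.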
The current list of pre-built databases available for SnpEff can be obtained by using the following:
For your convenience, this list has been saved to a file in the following common directory path:
Running grep on that file is probably the easiest way to find out whether your genome is supported.
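The search described above can be wrapped in a small function. This is only a sketch: find_genome is a hypothetical name, and the list file is passed as an argument because its location depends on your site.

```shell
#!/bin/sh
# find_genome PATTERN LIST_FILE  (hypothetical helper)
# Prints every line of the supported-database list that matches PATTERN,
# ignoring case. The exit status is non-zero when nothing matches.
find_genome() {
    pattern="$1"
    list_file="$2"
    grep -i -- "$pattern" "$list_file"
}
```

Calling find_genome with a genome or organism name and the path to the list file prints the matching entries, if any. Because the match is case-insensitive, "grch37" and "GRCh37" return the same lines.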
However, the recommended method of obtaining the most recent pre-built databases is to use the SnpEff command itself. For example:
Using the above would place a copy of the human genome database in the directory ~/snpEff/data.
Note that running grep on the "supported_dbs" file yields the following:
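To avoid downloading a database that is already present, a check along the following lines may help. It is a sketch under several assumptions: the function names are hypothetical, the jar and configuration file paths are placeholders, and it relies on a downloaded database living in a sub-directory named after the genome that contains a file called snpEffectPredictor.bin.

```shell
#!/bin/sh
# Placeholder locations; adjust both for your installation.
SNPEFF_JAR="${SNPEFF_JAR:-snpEff.jar}"
SNPEFF_CONFIG="${SNPEFF_CONFIG:-./snpEff.config}"

# genome_installed DATA_DIR GENOME  (hypothetical helper)
# Succeeds if DATA_DIR/GENOME/snpEffectPredictor.bin exists.
genome_installed() {
    data_dir="$1"
    genome="$2"
    [ -f "$data_dir/$genome/snpEffectPredictor.bin" ]
}

# ensure_genome DATA_DIR GENOME  (hypothetical helper)
# Downloads GENOME with SnpEff's "download" command only if it is missing.
ensure_genome() {
    data_dir="$1"
    genome="$2"
    if genome_installed "$data_dir" "$genome"; then
        echo "$genome already present in $data_dir"
    else
        java -jar "$SNPEFF_JAR" download -c "$SNPEFF_CONFIG" -v "$genome"
    fi
}
```

A call such as ensure_genome "$HOME/snpEff/data" GRCh37.75 uses a database name chosen purely for illustration; substitute the name reported in the list of supported databases.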
Once you have made the necessary modifications to your configuration file and downloaded the databases you need, you are ready to run SnpEff. When executing the SnpEff command, make sure to specify the location of the configuration file you wish to use for the run.
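One way to make sure the configuration file is always passed is to wrap the call in a function, as sketched below. The function name, the default paths, and the genome name are assumptions for illustration only; "-c" is SnpEff's option for naming the configuration file. Setting DRY_RUN=1 prints the command instead of running it, which is useful for checking paths before a long job.

```shell
#!/bin/sh
# Placeholder values; override them in your environment or edit them here.
SNPEFF_JAR="${SNPEFF_JAR:-snpEff.jar}"
SNPEFF_CONFIG="${SNPEFF_CONFIG:-./snpEff.config}"
GENOME="${GENOME:-GRCh37.75}"

# snpeff_annotate INPUT_VCF  (hypothetical wrapper)
# Runs SnpEff on INPUT_VCF with an explicit configuration file.
# Annotated output goes to standard output, so redirect it to a file.
# With DRY_RUN=1 the command line is printed rather than executed.
snpeff_annotate() {
    input_vcf="$1"
    set -- java -jar "$SNPEFF_JAR" -c "$SNPEFF_CONFIG" -v "$GENOME" "$input_vcf"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

A typical call would be snpeff_annotate input.vcf > annotated.vcf, where both file names are placeholders.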
For more information on using SnpEff, please refer to the following documentation: