Data downloaded from the NCBI website or prepared by users can, in most cases, be easily converted for use with BLAST. This brief tutorial illustrates a basic scenario in which the user downloads a set of FASTA sequences from the NCBI website and prepares them for BLAST searches.
The simplest way to do this is to copy the link to the FASTA file and pass it to either the "wget" or the "curl" command. For example:
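The original command is not reproduced here, so the following is only a minimal sketch. The URL is a placeholder (an assumption, not the real NCBI location) and `fetch_fasta` is a hypothetical helper name; substitute the link you copied from the NCBI page.

```shell
# Placeholder URL (assumption): replace with the FASTA link copied from the NCBI website.
FASTA_URL="https://example.org/path/to/Ta.seq.all.gz"

# The local file name is the last path component of the URL.
FASTA_GZ="${FASTA_URL##*/}"

# Hypothetical helper: uses wget if it is installed, otherwise falls back to curl.
fetch_fasta() {
    if command -v wget >/dev/null 2>&1; then
        wget -O "$FASTA_GZ" "$FASTA_URL"
    else
        curl -L -o "$FASTA_GZ" "$FASTA_URL"
    fi
}
```

Once the real URL is set, calling `fetch_fasta` saves the file under its original name in the current directory.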
This will download the file "Ta.seq.all.gz" into the current directory. Now decompress the file:
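A minimal sketch of the decompression step, assuming the file was saved as "Ta.seq.all.gz" in the current directory. The integrity check with `gzip -t` is an optional extra, not something the original text requires.

```shell
FASTA_GZ="Ta.seq.all.gz"

if [ -f "$FASTA_GZ" ]; then
    # gzip -t checks the archive's integrity without extracting it;
    # gunzip then replaces the .gz file with the uncompressed file.
    gzip -t "$FASTA_GZ" && gunzip "$FASTA_GZ"
else
    echo "$FASTA_GZ not found in $(pwd)" >&2
fi
```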
This will leave a file called "Ta.seq.all" in your directory.
Preparing the Indices
To prepare the BLAST indices for nucleotides:
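The exact command is not shown here. Since the tutorial later refers to "formatdb.log", the sketch below assumes the legacy NCBI `formatdb` tool is installed and on your PATH; the options the author actually used may have differed.

```shell
FASTA="Ta.seq.all"

# -i names the input FASTA file; -p F marks the sequences as nucleotide.
if command -v formatdb >/dev/null 2>&1 && [ -f "$FASTA" ]; then
    formatdb -i "$FASTA" -p F
else
    echo "formatdb or $FASTA not found; nothing was indexed" >&2
fi
```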
The command above will produce several files, such as:
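The listing itself is not reproduced here. For a single-volume nucleotide database, formatdb normally writes at least a header file (.nhr), an index file (.nin), and a sequence file (.nsq). The sketch below assumes those three extensions and the database name "Ta.seq.all", and reports which of the files are present.

```shell
DB="Ta.seq.all"

# Core index files for a single-volume nucleotide database (assumed extensions).
for ext in nhr nin nsq; do
    if [ -f "$DB.$ext" ]; then
        echo "found   $DB.$ext"
    else
        echo "missing $DB.$ext"
    fi
done
```

Additional files may appear depending on the options passed to formatdb, and very large inputs may be split into numbered volumes, which this check does not cover.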
If you want to produce protein indices instead of, or in addition to, nucleotide indices, run:
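Again, the original command is not shown, so this is a sketch only. The input name "my_proteins.fasta" is hypothetical; point it at your own protein FASTA file. As before, it assumes the legacy `formatdb` tool.

```shell
# Hypothetical file name (assumption): replace with your own protein FASTA file.
PROTEIN_FASTA="my_proteins.fasta"

# -i names the input FASTA file; -p T marks the sequences as protein.
if command -v formatdb >/dev/null 2>&1 && [ -f "$PROTEIN_FASTA" ]; then
    formatdb -i "$PROTEIN_FASTA" -p T
else
    echo "formatdb or $PROTEIN_FASTA not found; nothing was indexed" >&2
fi
```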
In this case, formatdb produces index files with protein extensions, such as .phr, .pin, and .psq.
You can verify whether the BLAST formatting succeeded by looking at the "formatdb.log" file, which should now be present in your directory.
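One way to inspect the log is sketched below. It assumes the log is in the current directory and that any problems are reported on lines containing the word "error"; that is a guess about the log's wording, so read the log itself if in doubt.

```shell
LOG="formatdb.log"

if [ -f "$LOG" ]; then
    # Show the most recent entries, then flag any lines that mention an error.
    tail -n 20 "$LOG"
    grep -i "error" "$LOG" || echo "no lines mentioning 'error' in $LOG"
else
    echo "$LOG not found in $(pwd)" >&2
fi
```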