To map against a set of contigs of total length L (in bp), using K spaced seeds of weight W, SHRiMP2 needs the following minimum amount of RAM, in bytes:
The third term represents working memory.
The second term varies with the weight W of the seeds used. With the default 4 seeds of weight 12, on 64-bit machines (with 8-byte pointers), this amounts to 0.75GB. (**) By hashing k-mers (see the -H parameter), the exponent can be brought
down to 12, at the cost of some speed.
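As a rough check of the 0.75GB figure, the sketch below works backwards to the per-entry cost implied by the second term. It assumes the term has the form K x 4^W x (bytes per entry) and that GB here means 2^30 bytes. Both are assumptions made for illustration, and `bytes_per_entry` is a hypothetical name.

```python
# Assumed form of the second term: K * 4**W * bytes_per_entry.
# The form and the 2**30 reading of "GB" are assumptions, not taken from SHRiMP2.
K = 4    # default number of spaced seeds
W = 12   # default seed weight
GIB = 2**30

entries = K * 4**W                 # one table entry per possible k-mer, per seed
second_term_bytes = 0.75 * GIB     # figure quoted in the text
bytes_per_entry = second_term_bytes / entries

print(entries)          # 67108864
print(bytes_per_entry)  # 12.0
```

Under these assumptions each entry costs 12 bytes, which would be consistent with an 8-byte pointer plus 4 further bytes. The text does not confirm that breakdown.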
The first term generally dominates. For example, with the default settings (K=4 seeds), mapping against the full hg18 (L=3*10^9 bp) makes the first term 3*10^9 x 4 x 4 bytes = 48GB of RAM.
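The arithmetic for the first term can be reproduced with a short sketch. It takes the term to be L x K x 4 bytes, as in the hg18 example. The function and parameter names are hypothetical and not part of SHRiMP2.

```python
def first_term_bytes(total_length_bp: int, num_seeds: int, bytes_per_slot: int = 4) -> int:
    """First term of the RAM estimate: L * K * 4 bytes.

    bytes_per_slot is the constant factor of 4 from the hg18 example;
    the name is illustrative only.
    """
    return total_length_bp * num_seeds * bytes_per_slot


hg18_bytes = first_term_bytes(3 * 10**9, 4)
print(hg18_bytes / 10**9)  # 48.0 (GB, counting 10**9 bytes per GB)
```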
To make mapping feasible on machines with less RAM, SHRiMP2 provides a mechanism to split a genome into several chunks, each of which carries the overhead of terms 2 and 3. E.g., in our tests, we split hg18 into 4 chunks, each using about 13GB of RAM, which fits comfortably on a machine with 16GB of RAM.
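The trade-off between chunk count and per-chunk RAM can be estimated with a minimal sketch. It assumes equal-sized chunks and a fixed per-chunk overhead standing in for terms 2 and 3. The 1GB overhead used in the example is a placeholder chosen for illustration, not a figure from SHRiMP2, and the function names are hypothetical.

```python
def chunk_ram_bytes(total_length_bp: int, num_seeds: int,
                    num_chunks: int, overhead_bytes: int) -> int:
    """Estimated RAM per chunk: an equal share of the first term plus overhead."""
    first_term = total_length_bp * num_seeds * 4
    share = -(-first_term // num_chunks)  # integer ceiling division
    return share + overhead_bytes


def min_chunks(total_length_bp: int, num_seeds: int,
               ram_budget_bytes: int, overhead_bytes: int) -> int:
    """Smallest chunk count whose per-chunk estimate fits the RAM budget."""
    usable = ram_budget_bytes - overhead_bytes
    if usable <= 0:
        raise ValueError("RAM budget does not cover the per-chunk overhead")
    first_term = total_length_bp * num_seeds * 4
    return -(-first_term // usable)


GB = 10**9
OVERHEAD = 1 * GB  # assumed placeholder for terms 2 and 3

print(chunk_ram_bytes(3 * 10**9, 4, 4, OVERHEAD) / GB)  # 13.0
print(min_chunks(3 * 10**9, 4, 16 * GB, OVERHEAD))      # 4
```

With that placeholder overhead, the estimate agrees with the hg18 test above: 4 chunks of about 13GB each on a 16GB machine. Actual chunk sizes depend on how the genome is split, so treat the result as a lower bound for planning.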