Overview

HOMER was developed to discover enriched motifs in ChIP-Seq peaks using "findMotifsGenome.pl".  The general usage form of this utility is:

findMotifsGenome.pl <peak/BED file> <genome> <output directory> [options]

This tutorial will discuss how to use this tool on the HPCC.

Data Sets

HOMER can use one of several preconfigured reference genomes, or a custom reference supplied by the user in FASTA format.  Available HOMER data sets include the following:

  • hg19 v5.4 human genome and annotation for UCSC hg19
  • tair10 v5.4 arabidopsis genome and annotation (tair10)
  • dm3 v5.4 fly genome and annotation for UCSC dm3
  • mm9 v5.4 mouse genome and annotation for UCSC mm9
  • galGal4 v5.4 chicken genome and annotation for UCSC galGal4
  • sacCer3 v5.4 yeast genome and annotation for UCSC sacCer3
  • hg17 v5.4 human genome and annotation for UCSC hg17
  • xenTro3 v5.4 frog genome and annotation for UCSC xenTro3
  • sacCer2 v5.4 yeast genome and annotation for UCSC sacCer2
  • ASM294v2 v5.4 pombe genome and annotation (ASM294v2)
  • panTro3 v5.4 chimpanzee genome and annotation for UCSC panTro3
  • danRer7 v5.4 zebrafish genome and annotation for UCSC danRer7
  • hg38 v5.4 human genome and annotation for UCSC hg38
  • rn4 v5.4 rat genome and annotation for UCSC rn4
  • mm10 v5.4 mouse genome and annotation for UCSC mm10
  • ce6 v5.4 worm genome and annotation for UCSC ce6
  • hg18 v5.4 human genome and annotation for UCSC hg18
  • IRGSP-1 v5.4 rice genome and annotation (IRGSP-1)
  • mm8 v5.4 mouse genome and annotation for UCSC mm8
  • papAnu2 v5.4 baboon genome and annotation for UCSC papAnu2
  • ce10 v5.4 worm genome and annotation for UCSC ce10
  • tetNig2 v5.4 tetraodon genome and annotation for UCSC tetNig2
  • rn5 v5.4 rat genome and annotation for UCSC rn5
  • xenTro2 v5.4 frog genome and annotation for UCSC xenTro2
  • laevis7.1 v5.4 laevis genome and annotation (laevis7.1)

As of this writing, only the dm3, hg19, hg38, mm9, mm10, sacCer3, and tair10 genomes have been installed on the HPCC.  To request additional genomes, please contact the HPCC.

Copies of the available data sets can be found in:
 

/mnt/research/common-data/Bio/HOMER/data

Using HOMER Data Sets

When you start a motif search, HOMER attempts to create a subdirectory named "preparsed" inside the data directory that contains the genome reference data set.  For central installations such as the one on the HPCC, users do not have write permission to the installation directory.  Moreover, at the time of writing it is not possible to specify an alternate location for "preparsed".  Finally, the contents of "preparsed" depend not only on the reference data set but also on the fragment size used.  Because different users work with different sizes, the directory cannot be pre-populated in a way that suits everyone.
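
Whether "preparsed" could be created in a given data directory can be checked before starting a run. The following is a minimal sketch, not part of HOMER: the function name is hypothetical, and it assumes that "preparsed" is created directly inside the directory passed in.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of HOMER): report whether the current user
# could create or write to a "preparsed" subdirectory inside a data directory.
can_create_preparsed() {
    local data_dir="$1"
    if [ -d "$data_dir/preparsed" ]; then
        # An existing preparsed directory must itself be writable.
        [ -w "$data_dir/preparsed" ]
    else
        # Otherwise the parent must exist and be writable so mkdir can succeed.
        [ -d "$data_dir" ] && [ -w "$data_dir" ]
    fi
}

# Example:
# can_create_preparsed /mnt/research/common-data/Bio/HOMER/data && echo "writable"
```

On the central installation this check is expected to fail for ordinary users, which is why a private copy of the data is needed.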

There are a couple of approaches to dealing with this problem:

Copy the Data Manually

Copy the data set you are interested in, along with its directory structure, from the /mnt/research space to a location where you have write permission, such as "home" or "scratch".  Then treat the data set like a "custom" genome by passing its full path to "findMotifsGenome.pl" (discussed in the next section).
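
The manual copy can be sketched as a small shell function. This is an illustration only: the function name is hypothetical, and the "genomes/dm3" subpath in the example is an assumption about how the data directory is laid out, so check the actual contents of /mnt/research/common-data/Bio/HOMER/data first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy one HOMER data set into a writable location.
#   $1 = source data set directory
#   $2 = destination directory (created if missing)
copy_homer_genome() {
    local src="$1" dest="$2"
    if [ ! -d "$src" ]; then
        echo "Source data set not found: $src" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    # "src/." copies the contents of src, keeping its subdirectory structure.
    cp -r "$src/." "$dest/"
}

# Example (the "genomes/dm3" subpath is an assumption about the layout):
# copy_homer_genome /mnt/research/common-data/Bio/HOMER/data/genomes/dm3 \
#                   "$HOME/data/genomes/dm3"
```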

HOMER Configure Utility

Copy the three files at the top level of the /mnt/research/common-data/Bio/HOMER repository to the location where you'd like to place the data sets:
 

cp /mnt/research/common-data/Bio/HOMER/configureHomer.pl .
cp /mnt/research/common-data/Bio/HOMER/config.txt .
cp /mnt/research/common-data/Bio/HOMER/update.txt .

 

Then run the "configureHomer.pl" script to install the desired data set, for example dm3:
 

perl ./configureHomer.pl -install dm3


To see a list of the available data sets use the following:
 

perl ./configureHomer.pl -list

Finding Motifs

To see how a motif search can be performed, let's assume that our reference data is located in:

/mnt/home/someUser/data/genomes/dm3

and that our BED file is named "mybedResults.bed".  We can then simply run the following:
 

module load HOMER
findMotifsGenome.pl mybedResults.bed /mnt/home/someUser/data/genomes/dm3 /mnt/home/someUser/myMotifs

 

This will search our BED input file against the "dm3" data set and write the results to the "myMotifs" subdirectory located in our home space.
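
For repeated runs, the call can be wrapped in a function that checks its inputs first. This is a hedged sketch: the function name is hypothetical, and it assumes "module load HOMER" has already been run so that findMotifsGenome.pl is on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: validate inputs, then call findMotifsGenome.pl.
#   $1 = BED file, $2 = genome directory, $3 = output directory
#   any further arguments are passed through to findMotifsGenome.pl unchanged
run_motif_search() {
    if [ "$#" -lt 3 ]; then
        echo "usage: run_motif_search <bed> <genome dir> <output dir> [options]" >&2
        return 2
    fi
    local bed="$1" genome="$2" out="$3"
    shift 3
    [ -f "$bed" ] || { echo "BED file not found: $bed" >&2; return 1; }
    [ -d "$genome" ] || { echo "Genome directory not found: $genome" >&2; return 1; }
    mkdir -p "$out" || return 1
    findMotifsGenome.pl "$bed" "$genome" "$out" "$@"
}

# Example:
# run_motif_search mybedResults.bed /mnt/home/someUser/data/genomes/dm3 \
#                  /mnt/home/someUser/myMotifs
```

Extra options go after the output directory, for instance "-size 200" to set the fragment size, provided your HOMER version lists that option in its usage message.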

For more advanced options, simply run the "findMotifsGenome.pl" application with no input arguments.