IGV
This option enables additional files to be associated with the FASTA reference sequence file, as described below. These files are archived in a zip with with a .genome extension. This option also allows the reference sequence to be defined as a directory of FASTA files, rather than a single FASTA.
Prerequisites:
- Either (1) a FASTA file that contains the sequence data for each chromosome, or (2) a directory. Directories of zip archives and gzipped FASTAs are no longer supported.
- A cytoband file, which IGV uses to display the chromosome ideogram. (Optional)
- An annotation file, which IGV uses to display the reference gene track. The file can be in BED format, GFF format, or any variation of the genePred table format. (Optional)
- An alias file defining alternative names for chromosomes. (Optional)
Note: If you are choosing files from the NCBI directory, you will generally want to use the .fna or .ffn file (nucleic acid sequences), as opposed to the .faa (amino acids). Choose the .gff file for the annotation file.
Step-by-step:
- Click Genomes>Create .genome File. IGV displays the a window where you enter the information.
- Enter an ID and a descriptive name for the genome.
- Enter the path on your file system or a web URL to the FASTA file for the genome. If the FASTA file has not already been indexed, an index will be created during the import process. This will generate a file with a “.fai” extension which must be in the same directory as the FASTA file; thus it is necessary that the directory containing the file be writable.
- Optionally, specify the cytoband file and the annotation (gene) file.
-
If the sequence (chromosome) names differ between your FASTA and annotation files, you might need to create an alias file to provide a mapping between the different names. Certain well-known aliases are built into IGV and do not require an alias file. These include mappings that involve adding or removing the prefix “chr” to the name, for example 1 > chr1 and chr1 > 1. Also, NCBI identifiers of the form **gi 125745044 ref NC_002229.3 in a FASTA file will be mapped to names of the form NC_002229.3** in the corresponding GFF file. - Click Save. IGV displays the Genome Archive window.
- Select the directory in which to save the genome archive (.genome) file and click *Save. IGV saves the genome and loads it into IGV.