You are viewing the Vignette for R package chromoMap on bitbiology.com.

Author: Lakshay Anand ( lakshayanand15@gmail.com )

chromoMap provides interactive, configurable, and elegant visualizations of the human chromosomes, allowing users to map chromosome elements (such as genes and SNPs) on the chromosome plot. It introduces a special plot, the “chromosome heatmap”, that, in addition to mapping elements, visualizes the data associated with them (e.g. gene expression) as heat colors, which can aid scientific interpretation. Because chromosomes are very large, it is impractical to show every element individually on one plot. Instead, chromoMap provides a magnified view of each chromosome location that renders additional information and visualization specific to that location. You can map thousands of genes and still view all mappings easily. Users can inspect detailed information about the mappings (such as gene names or the total number of genes mapped at a location) or view a magnified single- or double-stranded view of the location showing each mapped element in sequential order. The plots can be saved as customizable HTML documents that are easy to share. In addition, you can include them in R Markdown documents or R Shiny applications.

Introduction

Chromosomes are genetic material consisting of essential elements, such as genes, that span the length of the chromosome. Mapping an element on a chromosome means marking the element's position on that chromosome. Chromosomes are usually very large and can contain several thousand elements. This poses a problem for visualization, because not every element can be marked individually on a chromosome within a limited canvas size. chromoMap is designed to create chromosome visualizations that map elements by their position on the chromosome while still allowing users to view each mapping on the same plot. The plot displays 22 autosomes, an X chromosome, a Y chromosome, and the mitochondrial chromosome (Mt). All chromosomes except the Mt chromosome are represented as linear bars with approximate relative lengths and centromere positions, while the Mt chromosome is represented as a circular ring. To cover all positions, each chromosome is divided into sections called chromosome loci. Each locus is a genomic range (measured in bp) on the chromosome. As expected, more than one element can map to a single locus. In that case, the plot shows the mapping by painting the locus bar with color, and users can hover over the locus bar to inspect the detailed mappings for that locus. Users can also create chromosome heatmaps to visualize element-associated data on the plot.
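The idea of loci can be illustrated with a few lines of base R. This is only a sketch of the concept, not chromoMap's internal code: the gene names, the positions, and the locus width of 1 Mb are hypothetical values chosen for illustration.

```r
# Hypothetical elements on one chromosome (start positions in bp).
elements <- data.frame(
  name  = c("geneA", "geneB", "geneC", "geneD"),
  start = c(1200000, 1850000, 7400000, 7950000),
  stringsAsFactors = FALSE
)

# Assumed locus width; the width chromoMap actually uses may differ.
locus_width <- 1e6

# Assign each element to a 1-based locus index from its start position.
elements$locus <- floor(elements$start / locus_width) + 1

# Count the elements per locus and list the names that share each locus.
counts <- table(elements$locus)
genes_by_locus <- split(elements$name, elements$locus)

print(counts)
print(genes_by_locus)
```

Here geneA and geneB fall into the same locus, as do geneC and geneD, which is the situation where the plot paints one locus bar and shows the individual elements on hover.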

chromoMap lets users create three types of chromosome plots: annotation plots, heatmap-single plots, and heatmap-double plots.

Installation

Users can install the package with the following command in the R console:

install.packages("chromoMap")

Input Data

The data required for creating chromoMaps must be an R data.frame in which each column provides a specific aspect of the data.

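For basic annotation the data must contain the following columns: name (the element name), chrom (the chromosome name, e.g. chr21), and start (the element's start position).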

For heatmaps, you need additional columns that provide the data values for each element. These are data (the primary values) and, for heatmap-double plots, secondData (the secondary values).
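A minimal sketch of an input data.frame in this layout is shown below. The column names follow those of the bundled pancandata dataset; the gene names and values are made up for illustration.

```r
# Hypothetical elements; replace these with your own annotations.
my_data <- data.frame(
  name       = c("geneA", "geneB", "geneC"),
  chrom      = c("chr1", "chr1", "chrX"),
  start      = c(1200000, 7400000, 49341113),
  data       = c(2.5, -1.2, 0.4),   # primary values (used by heatmaps)
  secondData = c(3.1, -0.8, 0.4),   # secondary values (used by heatmap-double)
  stringsAsFactors = FALSE
)

# Inspect the column names and types before plotting.
str(my_data)
```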

The package includes a dataset, pancandata, to get you started. You can load it with the following commands:

library(chromoMap)
data("pancandata")

This loads the dataset into the R workspace. pancandata is a pancreatic cancer dataset obtained from The Cancer Genome Atlas database. It consists of two data frames, data1 and data2.

data1 consists of annotation information for 630 genes that were predicted to be differentially expressed in pancreatic cancer. It is a data.frame in the input format that chromoMap requires for annotation. You can access it as pancandata$data1:

head(pancandata$data1,10)
#>        name chrom     start     data
#> 1      TFF1 chr21 -42362281 6.559301
#> 2   UGT1A10  chr2 233636476 6.485715
#> 3    MUC5AC chr11   1157952 6.469104
#> 4      PSCA  chr8 142670307 6.410684
#> 5   CEACAM5 chr19  41708584 6.400217
#> 6      TFF2 chr21 -42346357 6.220588
#> 7      AGR2  chr7 -16792639 6.077799
#> 8  SERPINB5 chr18  63476910 6.014451
#> 9    FAM83A  chr8 123179046 6.011718
#> 10     KLK7 chr19 -50976478 5.999328

To demonstrate the “heatmap-double” type of chromosome plot, a second data frame, data2, is provided that also contains secondary data. data2 holds gene expression values of 25,465 genes for one normal sample and one tumor sample. It is also used to demonstrate how thousands of genes can be annotated on the plot.

To view data2 use:

head(pancandata$data2,10)
#>          name chrom      start       data secondData
#> 1        A1BG chr19  -58346805 -3.6030487 -2.3910927
#> 2        NAT2  chr8   18391244 -1.4393052 -3.7294315
#> 3         ADA chr20  -44619518  2.8010899  2.9129588
#> 4        CDH2 chr18  -27950962  2.0287984  3.2640842
#> 5        AKT3  chr1 -243499718  4.3607969  5.3076926
#> 6     GAGE12F  chrX   49341113 -7.1130151 -7.1130151
#> 7   RNA5-8SN5 chr22    8212570 -7.1130151 -7.1130151
#> 8  ZBTB11-AS1  chr3  101676429  0.3534994  0.3825793
#> 9        MED6 chr14  -70583220  3.6320227  3.4703585
#> 10      NR2E3 chr15   71810547 -2.3910927 -2.3910927

The data column contains the normal gene expression values, while the secondData column contains the tumor expression values.
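As a small illustration of how the two columns relate, the sketch below recreates the first three rows of the output above by hand and computes the tumor-minus-normal difference. The diff column is a name introduced here for illustration and is not part of the dataset.

```r
# First three rows of data2, copied from the printed output above.
d2 <- data.frame(
  name       = c("A1BG", "NAT2", "ADA"),
  data       = c(-3.6030487, -1.4393052, 2.8010899),  # normal sample
  secondData = c(-2.3910927, -3.7294315, 2.9129588),  # tumor sample
  stringsAsFactors = FALSE
)

# Difference between tumor and normal expression (illustrative column).
d2$diff <- d2$secondData - d2$data

print(d2)
```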

Note: chromoMap accepts data in this format only. Make sure the column names are as given above. The dataset included in this package has been pre-processed and converted into the format required for mapping. We will use this dataset throughout the vignette.
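Because the column names matter, it can help to check a data.frame before plotting. The helper below is a hypothetical sketch and not part of chromoMap; the mapping of required columns to each plot type is an assumption based on the column names shown in this vignette.

```r
# Hypothetical helper (not part of chromoMap): stop early if expected
# columns are missing. The per-type column sets are an assumption.
check_chromomap_input <- function(df,
                                  type = c("annotation",
                                           "heatmap-single",
                                           "heatmap-double")) {
  type <- match.arg(type)
  required <- c("name", "chrom", "start")
  if (type != "annotation") required <- c(required, "data")
  if (type == "heatmap-double") required <- c(required, "secondData")

  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Made-up example rows with every column present.
expr <- data.frame(
  name = c("geneA", "geneB"),
  chrom = c("chr2", "chr8"),
  start = c(500000, 18000000),
  data = c(1.0, -0.5),
  secondData = c(0.7, 0.2),
  stringsAsFactors = FALSE
)

check_chromomap_input(expr, "heatmap-double")  # passes silently
```

Calling the helper with a data.frame that lacks the data column and type "heatmap-single" would raise an error naming the missing column.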

Basic Annotation

The simplest chromosome plots that chromoMap provides are annotation plots, which can be constructed as follows:

library(chromoMap)
data("pancandata")
chromoMap(pancandata$data1,type = "annotation")

The genes are mapped or annotated on the chromosomes as shown. Hover over each of the annotated locations to view more information.

The tooltips will show the following data:

The Chromosome Heatmaps

chromoMap can be used to plot chromosome heatmaps that, in addition to mapping elements on the chromosomes, visualize the data as heat colors. Two categories of chromosome heatmap plots can be created:

Heatmap-single

heatmap-single takes a single data value for each element and visualizes the data on a single strand.

library(chromoMap)
data("pancandata")
chromoMap(pancandata$data1,type = "heatmap-single")

Note: The plot as a whole shows an averaged heatmap for each mapped location, i.e., the heat color is assigned based on the average data value at that location. To view the actual data heatmap for each mapped gene, hover over the locations. The tooltips show the following data:
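The averaging described in the note can be sketched in base R as follows. This is an illustration of the idea rather than chromoMap's implementation: the gene data and the 1 Mb locus width are hypothetical, and taking the absolute value of start assumes that the sign of a position only encodes the strand.

```r
# Hypothetical genes with one data value each.
genes <- data.frame(
  name  = c("geneA", "geneB", "geneC", "geneD"),
  chrom = c("chr1", "chr1", "chr1", "chr2"),
  start = c(1200000, -1850000, 7400000, 300000),
  data  = c(6.0, 4.0, -2.0, 1.5),
  stringsAsFactors = FALSE
)

# Assumed locus width; the sign of start is assumed to mark the strand only.
locus_width <- 1e6
genes$locus <- floor(abs(genes$start) / locus_width) + 1

# Average the data values of all genes that share a chromosome and locus.
locus_means <- aggregate(
  genes["data"],
  by = list(chrom = genes$chrom, locus = genes$locus),
  FUN = mean
)

print(locus_means)
```

In this example geneA and geneB share a locus on chr1, so that locus would be colored by their mean value of 5, while the per-gene values 6 and 4 are what a per-gene view would show.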

Heatmap-double

A special kind of heatmap plot is the heatmap-double plot, which takes secondary data in addition to the primary data for each element.

Such a plot can be obtained by:

library(chromoMap)
data("pancandata")
chromoMap(pancandata$data2,type = "heatmap-double")