Discipline: Biological Sciences
Subcategory: Genetics
Syed Hussain Ather - Indiana University-Bloomington
Co-Author(s): Chunyu Liu, University of Illinois at Chicago, IL; Miguel Brown, University of Chicago, IL
To study gene expression profiles, sequenced reads must be compared against a reference transcriptome, the collection of known transcripts of an organism. As a result, the amount of transcript information that can be extracted from RNA-Seq is limited by the completeness of that reference. It is therefore necessary to use the most up-to-date reference transcriptome so that extracted RNA sequences can be annotated with as much information as is available. The purpose of this study is to identify novel genomic information in the human brain from newly released ISO-Seq data. ISO-Seq data consist of full-length transcripts covering various alternative splicing combinations, which provides a unique opportunity to observe full-length transcripts in a specific tissue. We therefore use this information to evaluate the completeness of the current reference transcriptome, specifically for our study of human brain gene expression.
We first compared the ISO-Seq data with the current reference transcriptome, Gencode V19, which we used for RNA-Seq analysis, and identified 166 genes that were detected by ISO-Seq but are not part of the Gencode reference. We then added these “novel” genes to a customized reference and re-analyzed our RNA-Seq data. Of the 166 novel transcripts identified, 96 had mapped reads in our brain RNA-Seq data and 73 were significantly expressed.
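The comparison and filtering steps described above can be sketched in Python. This is only an illustrative simplification, not the pipeline used in the study: it assumes each gene is represented by a plain identifier string, whereas a real comparison against Gencode would typically work on genomic coordinates and exon structure. The function names, the `min_reads` threshold, and all identifiers and counts below are hypothetical toy values.

```python
from typing import Dict, Iterable, List


def find_novel_genes(isoseq_ids: Iterable[str],
                     reference_ids: Iterable[str]) -> List[str]:
    """Return identifiers seen in the ISO-Seq set but absent from the reference."""
    return sorted(set(isoseq_ids) - set(reference_ids))


def with_mapped_reads(novel_ids: Iterable[str],
                      read_counts: Dict[str, int],
                      min_reads: int = 1) -> List[str]:
    """Keep novel identifiers with at least `min_reads` mapped RNA-Seq reads.

    `min_reads` is an assumed threshold for illustration only.
    """
    return [g for g in novel_ids if read_counts.get(g, 0) >= min_reads]


# Toy, made-up identifiers and read counts (not data from the study).
reference = {"GENE_A", "GENE_B", "GENE_C"}
isoseq = {"GENE_A", "GENE_C", "NOVEL_1", "NOVEL_2", "NOVEL_3"}
counts = {"NOVEL_1": 42, "NOVEL_2": 0, "GENE_A": 310}

novel = find_novel_genes(isoseq, reference)
supported = with_mapped_reads(novel, counts)

print(novel)      # ['NOVEL_1', 'NOVEL_2', 'NOVEL_3']
print(supported)  # ['NOVEL_1']
```

The sketch stops at read support; testing whether a supported gene is significantly expressed would require a statistical model that the abstract does not specify.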
This research sheds light on potential novel genes and pathways expressed in the human brain. The next step for this project is to examine the overlapping transcripts, in addition to the non-overlapping transcripts, to get a more thorough picture of the novel genomic data. Further investigation will be necessary to determine whether the newly added information is truly novel or, for instance, only an expressed sequence tag.
Funder Acknowledgement(s): This study was supported, in part, by a grant from the NIH Conte Center awarded to Barry Aprison, PhD, Director of Education & Outreach, University of Chicago, Chicago, IL.
Faculty Advisor: Chunyu Liu,