Bioinformatics of RNA structure and transcriptome regulation

Synopsis:

We develop dedicated computational methods and algorithms for investigating how RNA structural features and RNA-RNA interactions regulate the expression of protein-coding and non-coding genes. Our primary focus is on the transcriptomes of eukaryotic model organisms such as the fruit fly (Drosophila melanogaster), the mouse (Mus musculus) and the human (Homo sapiens). To investigate these, we combine computational analysis pipelines that we develop ourselves with large-scale data sets generated using state-of-the-art high-throughput sequencing and investigation protocols.

Figure: Conserved RNA structure with corresponding multiple sequence alignment (top) overlapping the splice site of a fruit fly gene (bottom). This structure may regulate the alternative splicing of the gene via structural changes induced by RNA editing. Red arrows highlight RNA editing sites.

Scientific questions:

We currently (as of 2016) focus on the following scientific questions:

  • Which roles do local RNA structures play in regulating the alternative splicing of eukaryotic genes?
  • And what are the underlying molecular mechanisms?
  • Is it possible to devise methods for predicting new types of trans RNA-RNA interactions that can readily handle full-length transcripts?
  • Which functional roles do trans RNA-RNA interactions play in the nucleus and in the cytoplasm of eukaryotic cells?

Methods employed:

The computational methods we employ are usually custom-built for the specific biological question we want to investigate. For us, the scientific question determines the most appropriate method, not the other way around. We typically prefer probabilistic methods, as these allow us to train any free parameters in a principled way and to assign measures of confidence to our predictions. So far, we have developed methods and algorithms based on, for example, hidden Markov models (HMMs), pairHMMs, probabilistic models of evolution, stochastic context-free grammars (SCFGs), phylo-SCFGs and graphical models. We have shown in numerous settings that biological questions usually benefit from a comparative approach, i.e. from simultaneously analysing data from a range of evolutionarily related organisms. In the past, this has allowed us to devise powerful methods for, among other tasks, gene prediction, RNA secondary-structure prediction and RNA-RNA interaction prediction.
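To illustrate what "assigning measures of confidence to predictions" can look like in a probabilistic model, here is a minimal sketch of posterior decoding in a two-state HMM using the forward-backward algorithm. It is a textbook toy example, not one of the group's published methods: the state names (`gc_rich`, `at_rich`), all probabilities and the function name `posterior_state_probabilities` are assumptions chosen purely for illustration.

```python
from typing import Dict, List

# Hypothetical toy model: two hidden states emitting nucleotides.
STATES = ["gc_rich", "at_rich"]
START = {"gc_rich": 0.5, "at_rich": 0.5}
TRANS = {
    "gc_rich": {"gc_rich": 0.9, "at_rich": 0.1},
    "at_rich": {"gc_rich": 0.1, "at_rich": 0.9},
}
EMIT = {
    "gc_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "at_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def posterior_state_probabilities(
    seq: str,
    states: List[str],
    start: Dict[str, float],
    trans: Dict[str, Dict[str, float]],
    emit: Dict[str, Dict[str, float]],
) -> List[Dict[str, float]]:
    """Return, for every position, the posterior probability of each state.

    Uses the unscaled forward-backward recursions, which is fine for short
    sequences; long sequences would need scaling or log-space arithmetic.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)

    # Forward pass: fwd[i][s] = P(seq[0..i], state at position i is s)
    fwd = [{s: start[s] * emit[s][seq[0]] for s in states}]
    for i in range(1, n):
        fwd.append({
            s: emit[s][seq[i]] * sum(fwd[i - 1][r] * trans[r][s] for r in states)
            for s in states
        })

    # Backward pass: bwd[i][s] = P(seq[i+1..n-1] | state at position i is s)
    bwd = [{s: 1.0 for s in states} for _ in range(n)]
    for i in range(n - 2, -1, -1):
        bwd[i] = {
            s: sum(trans[s][r] * emit[r][seq[i + 1]] * bwd[i + 1][r] for r in states)
            for s in states
        }

    # Total probability of the observed sequence under the model
    likelihood = sum(fwd[n - 1][s] for s in states)

    # Posterior: P(state at position i is s | whole sequence)
    return [
        {s: fwd[i][s] * bwd[i][s] / likelihood for s in states}
        for i in range(n)
    ]


if __name__ == "__main__":
    for base, post in zip(
        "GGGGAAAA",
        posterior_state_probabilities("GGGGAAAA", STATES, START, TRANS, EMIT),
    ):
        print(base, {s: round(p, 3) for s, p in post.items()})
```

Each position receives a probability for every state rather than a single hard label, so a prediction can be reported together with its confidence. In practice the parameters fixed by hand here would be estimated from training data.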

Future collaborations:

I joined the MDC in early 2016 from the University of British Columbia in Vancouver, Canada, and look forward to setting up collaborations with new colleagues at the MDC and in the wider scientific community in Berlin, Germany and Europe. If you are interested, please get in touch via email.