English Info

Computational biology is a research area that analyzes biological data, such as genome sequences, to find new knowledge that helps biological and medical research, and that develops efficient methods to support such analysis.

Biology is one of the most fascinating applications of computer science. Twenty years ago, the main task in biological research was to verify a hypothesis experimentally, and computational analysis of data was considered only a minor support. Today, however, biological research cannot succeed without computational analysis, owing to the dramatic improvement in DNA sequencing technology. Developing efficient algorithms based on state-of-the-art computer science will be the key to success in the new biology.


Research Topics

A state-of-the-art DNA sequencer generates 200 gigabases per day, which is hundreds of thousands of times as much data as the technology of 15 years ago could generate [1]. This dramatic change in data generation has raised many issues in DNA sequence analysis. In our lab, we aim to develop efficient methods for utilizing such biological big data.

Our topics include, but are not limited to:

Developing efficient methods for fundamental genome sequence analysis

The output of a genome sequencer is a huge number of sequence fragments. Before downstream analyses can be conducted, it is therefore necessary to align each fragment to a previously determined reference genome, to assemble the fragments to reconstruct the original sequence, or to cluster similar fragments. We aim to develop more efficient methods for these fundamental tasks by using efficient string processing and/or graph mining algorithms.
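As an illustration of the alignment task described above, the following is a minimal sketch of seed-based read mapping, not the lab's own method. It assumes ungapped alignment with a small mismatch budget; the function names (build_kmer_index, map_read), the parameter max_mismatches, and the toy sequences are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, index, k, max_mismatches=2):
    """Return (start, mismatches) of the best ungapped placement, or None.

    Each k-mer of the read that occurs in the reference proposes a candidate
    start position (the "seed"); candidates are then verified by Hamming
    distance against the reference (the "extend" step, without gaps here).
    """
    candidates = set()
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset
            if 0 <= start <= len(reference) - len(read):
                candidates.add(start)

    best = None
    for start in sorted(candidates):
        d = hamming(read, reference[start:start + len(read)])
        if d <= max_mismatches and (best is None or d < best[1]):
            best = (start, d)
    return best


# Toy data (hypothetical): a short reference and two reads.
reference = "ACGTACGTTAGCCGATAGGCTTAACGGATCC"
index = build_kmer_index(reference, k=4)

exact = map_read("TAGCCGATAG", reference, index, k=4)  # read copied from the reference
noisy = map_read("TAGCCGTTAG", reference, index, k=4)  # same read with one substitution
unmapped = map_read("GGGGGGGGGG", reference, index, k=4)  # shares no k-mer with the reference
```

A hash-based k-mer index is used here only because it is easy to read. Practical read mappers typically rely on compressed string indexes and gapped alignment, which this sketch does not attempt.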

Reference genome graph

Currently, the reference genome is represented as a single sequence. Since each individual's genome sequence is unique, it is more natural to represent the reference as a graph that captures the sequence diversity within a species, rather than as a single sequence. We aim to develop new methods for constructing an efficient data structure for the reference genome graph.
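To make the idea of a graph-shaped reference concrete, here is a small sketch under simplifying assumptions: the graph is a directed acyclic graph whose nodes carry sequence segments, and each source-to-sink path spells one possible haplotype. The class SequenceGraph, its methods, and the toy variants are hypothetical and only meant for illustration; they do not describe a specific data structure developed by the lab.

```python
from collections import defaultdict


class SequenceGraph:
    """A tiny sequence graph: nodes hold DNA segments, edges join adjacent segments."""

    def __init__(self):
        self.labels = {}                 # node id -> DNA segment
        self.edges = defaultdict(list)   # node id -> list of successor ids

    def add_node(self, node_id, sequence):
        self.labels[node_id] = sequence

    def add_edge(self, src, dst):
        self.edges[src].append(dst)

    def spell(self, path):
        """Concatenate the segments along a path, checking that every step is an edge."""
        for a, b in zip(path, path[1:]):
            if b not in self.edges.get(a, ()):
                raise ValueError(f"no edge from {a} to {b}")
        return "".join(self.labels[n] for n in path)

    def all_paths(self, source, sink):
        """Enumerate all source-to-sink paths by depth-first search (assumes a DAG)."""
        paths = []
        stack = [[source]]
        while stack:
            path = stack.pop()
            if path[-1] == sink:
                paths.append(path)
                continue
            for nxt in self.edges.get(path[-1], ()):
                stack.append(path + [nxt])
        return paths


# Toy example (hypothetical): one SNP (A or G) followed by an optional insertion "GA".
graph = SequenceGraph()
for node_id, seq in [(1, "ACGT"), (2, "A"), (3, "G"), (4, "TTC"), (5, "GA"), (6, "CC")]:
    graph.add_node(node_id, seq)
for src, dst in [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]:
    graph.add_edge(src, dst)

haplotypes = sorted(graph.spell(p) for p in graph.all_paths(1, 6))
```

Six short nodes encode four distinct sequences here. Enumerating every path is only feasible for toy graphs, since the number of paths grows exponentially with the number of variant sites.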

Privacy-preserving data mining for biological data

The sharp drop in the cost of DNA sequencing has encouraged large-scale personal genome sequencing. However, genomic data that include personal information are not fully utilized at present, because privacy issues hinder flexible analyses for finding novel knowledge. To tackle this problem, we aim to develop efficient algorithms that enable several parties to jointly analyze biological data over their inputs while keeping those inputs private, using cryptographic techniques such as homomorphic encryption.
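The following sketch shows one simple instance of this idea: summing private counts with an additively homomorphic scheme. The Paillier cryptosystem is chosen here for illustration and is not named in the text above. The scenario (three institutions each holding a count of variant carriers), the function names, and the fixed primes are all assumptions. The primes are far too small for real use, and the code needs Python 3.8 or later for the modular inverse via pow.

```python
import math
import secrets


def generate_keypair(p, q):
    """Build a toy Paillier key pair from two distinct primes (generator g = n + 1)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # with g = n + 1, mu is the inverse of lam modulo n
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n with fresh randomness."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    return (c1 * c2) % (n * n)


def decrypt(n, private_key, c):
    """Recover the plaintext from a ciphertext using the private key."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


# Demo only: two Mersenne primes (2^31 - 1 and 2^61 - 1). NOT secure key sizes.
n, private_key = generate_keypair(2**31 - 1, 2**61 - 1)

# Hypothetical private inputs: carrier counts held by three institutions.
counts = [12, 7, 30]
ciphertexts = [encrypt(n, c) for c in counts]

# An aggregator combines the ciphertexts without seeing any individual count.
encrypted_total = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_total = add_encrypted(n, encrypted_total, c)

total = decrypt(n, private_key, encrypted_total)
```

Only the holder of the private key learns the total, and the aggregator sees ciphertexts only. A real protocol would also have to address key management, collusion between parties, and analyses beyond simple sums, none of which this sketch covers.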

Other topics (only keywords)

Sequence compression, finding associations between genes and diseases, analysis of genome structural variation

© The Computational Biology Research Laboratory at Waseda University 2018. All rights reserved.
