Bioinformatics lectures at Rice University
Lecture 1
Li Zhang
Department of Bioinformatics and Computational Biology, MD Anderson Cancer Center
March-April 2012
Contact information: Li Zhang. Phone: 713-563-4298 (office), 713-962-6661 (cell). Email: [email protected]
Why should we study bioinformatics?
Why is it important to study bioinformatics?
Let us see a few growth charts …
The Protein Data Bank (PDB) is a repository for the 3-D structural data of large biological molecules, such as proteins and nucleic acids. Most structures are determined by X-ray diffraction, but about 15% of structures are determined by NMR.
Large-scale organized efforts by the Structural Genomics Initiative and the International Structural Genomics Consortium have greatly accelerated the pace of growth.
The Gene Expression Omnibus (GEO) database holds over 10 000 experiments comprising 300 000 samples, 16 billion individual abundance measurements, for over 500 organisms, submitted by 5000 laboratories from around the world. The database typically receives over 60 000 query hits and 10 000 bulk FTP downloads per day, and has been cited in over 5000 manuscripts.
There are 126 billion bases in 135 million sequence records in the traditional GenBank divisions and 191 billion bases in 62 million sequence records in the WGS division as of April 2011.
A brief history of the big bang of the digital universe
“The story is similar in fields as varied as science and sports, advertising and public health — a drift toward data-driven discovery and decision-making. It’s a revolution. We’re really just getting under way. But the march of quantification, made possible by enormous new sources of data, will sweep through academia, business and government. There is no area that is going to be untouched.”
-------- Steve Lohr, “The Age of Big Data”, The New York Times, 2012.
Simply put, it is big and complex.
The value of big data is that its analysis can lead to
enhanced decision making,
insight discovery and
In business, big data can help to identify unknown needs, customize advertising, and monitor and evaluate operations, which leads to large profits and savings. In science, big data is a huge resource for scientific discovery.
A brief introduction of molecular biology
James Watson and Francis Crick
Oxford Nanopore, long the sleeper project to watch in the field of mapping DNA, just announced two products that could dramatically change the field of DNA sequencing: a new DNA sequencer that may be able to handle a human genome in 15 minutes, and a USB thumb drive DNA sequencer that can read DNA directly from blood with no prep work.
“‘Game changer’ is an understatement,” says George Church of Harvard University. (Church was one of the inventors on one of the patents licensed to Oxford Nanopore that led to the device.) He ticks off the device’s specs: Tiny instruments for $900. Able to read DNA in 10,000-letter stretches, compared to a couple hundred for current technologies. Able to sequence a human genome in fifteen minutes (although you’d need 20 of the server-size devices coming in 2013, not just the USB stick).
A long series of breakthroughs in micro-electronics and nano-electronics has produced instruments that quantify and/or characterize large numbers of biological molecules in parallel using very small amounts of biomaterial.
Such technical advances have made it possible to comprehensively characterize and quantify the building blocks (DNA, RNA, protein) in a biological system.
List of sequenced genomes of mammals:
Bioinformatics provides tools to catalyze the transformations
Ion semiconductor sensing