An unprecedented wealth of data is being generated by genome sequencing projects and other experimental efforts to determine the structure and function of biological molecules. The demands and opportunities for interpreting these data are expanding rapidly. Bioinformatics is the development and application of computer methods for management, analysis, interpretation, and prediction, as well as for the design of experiments. Machine learning approaches (e.g., neural networks, hidden Markov models, and belief networks) are ideally suited for areas where there is a lot of data but little theory, which is the situation in molecular biology. The goal in machine learning is to extract useful information from a body of data by building good probabilistic models--and to automate the process as much as possible.
In this book Pierre Baldi and Søren Brunak present the key machine learning approaches and apply them to the computational problems encountered in the analysis of biological data. The book is aimed both at biologists and biochemists who need to understand new data-driven algorithms and at those with a primary background in physics, mathematics, statistics, or computer science who need to know more about applications in molecular biology.
This new second edition contains expanded coverage of probabilistic graphical models and of the applications of neural networks, as well as a new chapter on microarrays and gene expression. The entire text has been extensively revised.
"With this work, Baldi and Brunak have provided a sound foundation for the process of classifying and interconnecting the hierarchy of parts encoded by genomic sequence data and their variability. Not only is the book appropriate for students new to this intersection between computation and biology, it will also prove useful for long-time workers on 'classic' problems in computational molecular biology. The book has a continuity from beginning to end that helps a reader to develop an understanding of machine learning techniques and how to apply them to molecular biology... this book is one of four indispensable books for the bioinformatician's library."
Pierre Baldi and Søren Brunak present the key machine learning approaches and apply them to the computational problems encountered in the analysis of biological data. The book is aimed at two types of researchers and students. First are the biologists and biochemists who need to understand new data-driven algorithms, such as neural networks and hidden Markov models, in the context of biological sequences and their molecular structure and function. Second are those with a primary background in physics, mathematics, statistics, or computer science who need to know more about specific applications in molecular biology.
Pierre Baldi is Chairman of the Board, Net-ID, Inc. Søren Brunak is Director, Center for Biological Sequence Analysis, The Technical University of Denmark.
Their Bayesian presentation of machine learning algorithms can be hard to follow at times, but the authors cover a large amount of very current practical and theoretical material. One of the book's unique features is its broad scope. The authors discuss neural networks, hidden Markov models, clustering, Gaussian processes, and support vector machines. The bibliography contains some of the most useful references for those wishing to implement bioinformatics algorithms. The fast pace may leave some wanting more complete explanations.
You should disregard the claim that this book could be used by those unfamiliar with either molecular biology or computer science. To really make the most of this book, you should be comfortable with the material in Pattern Classification (Duda, Hart, and Stork), Biological Sequence Analysis (Durbin, Eddy, Krogh, and Mitchison), and Molecular Biology of the Cell (Alberts et al.). That said, this is the best bioinformatics book on the market.
Very well written, clear, and self-contained. The authors provide a masterly treatment of machine learning methods (neural networks, hidden Markov models, etc.) and their applications to fundamental problems in sequence analysis and biology. The book goes all the way from first principles to advanced research topics and should be valuable for both students and researchers. The second edition has many new topics, including DNA microarrays. Requires some concentration, but mathematical details are summarized in the appendices. I strongly recommend it for anyone with an interest in bioinformatics and/or machine learning.
This is one of the best books on bioinformatics, if not the best. Very up to date. Covers biology and machine learning (Bayesian statistics). Very useful for reading, teaching, and research. Covers foundations and advanced applications. Clear and well written. The authors have a good sense of humor!
The book tried to introduce new "theories" in every paragraph. I'd rather see something practical.
Lots of fantastic data, information, and ideas can be found in this book. Besides, it is well written and easy to understand, even for readers who are not experts in neural networks, hidden Markov models, and similar subjects. To my surprise, Baldi and Brunak thoroughly present the important keys of the theory, including the mathematics and physics in it.
Looking forward to seeing the second edition being published. I am a fan of them.
"Bioinformatics", by Baldi and Brunak, is a very well-written treatment of current stochastic algorithmics of genomics and proteomics. It is profitable reading for both the computer scientist learning relevant biology and the computational biologist learning relevant computer science. It probably favours the biologist slightly in this regard, as witnessed by my own enthusiasm for this work. Of particular value are the chapters on hidden Markov processes and stochastic grammars. The treatment builds smoothly from Bayesian fundamentals in chapter 2, to Markov chain Monte Carlo processes in chapter 3, followed by theory and applications of neural networks, three chapters on hidden Markov processes (a fascinating and vital field in modern genomics), and lastly an introductory chapter on the equally important area of stochastic grammars. Other appreciated features include: an up-to-date 452-reference bibliography; a comprehensive survey of web-based resources covering both genomic databases and available search engines for DNA, RNA, and protein sequence patterns; and, in the appendices, concise definitional reviews of the coupling of information theory with entropy and of aspects of HMMs. Lastly, the price is right, as is most often the case with books from MIT Press.
The above authors have succeeded well in illuminating a large piece of a very large (and growing) object: the landscape of modern informational biology. They of course cannot cover it all. Another recent book (1997) that complements this book's particular focus is that of Setubal and Meidanis ("Introduction to Computational Molecular Biology"). These authors offer a greater emphasis on string and graph theoretic approaches to sequencing algorithms and deal more directly with various heuristic approaches to fragment assembly and hybridization mapping.
The book provides an abundance of excellent information on machine learning techniques as applied to biology. I found the presentation of the material to be clear and detailed, with a wealth of supporting data regarding many of the complex issues of BI. Thanks to Baldi and Brunak, ideas such as hidden Markov models and their applications in molecular biology are made dramatically clear.
This is an excellent book. It contains a broad introduction to the main problems of computational molecular biology and a rigorous description of the foundations of machine learning and other statistical methods. Several chapters cover a variety of applications, from DNA to RNA to protein problems. Unlike other books on these topics, this book has very few errors in it, if any.
This book is fantastic! It opens a panoramic view over the future, the new era of biology. I thoroughly enjoyed reading and studying it. It is full of interesting ideas, insights, and techniques. It should prove very valuable to both biologists and computer scientists. Very clear and rigorous foundations. Mr. Baldi and Mr. Brunak are also funny. Nice entertaining pictures.