Combinatorial Pattern Matching Algorithms in Computational Biology Using Perl and R

Combinatorial Pattern Matching Algorithms in Computational Biology Using Perl and R

Free Standard Shipping

Purchasing Options

ISBN 9781420069730
Cat# C6973
Add to cart
ISBN 9781420069747
Cat# CE6973


  • Presents combinatorial pattern matching problems in a uniform framework
  • Provides an intuitive presentation of the algorithms, followed by a detailed exposition in pseudo-code, making it easy to comprehend the algorithmic solutions
  • Offers alternative implementations of the algorithms in Perl and R to enable the testing of algorithms and the building of projects based on the code
  • Establishes a basis for applying and extending the latest results in the field by including material from the specialized research literature
  • Includes the Perl and R source code for all the algorithms on the author’s website


Emphasizing the search for patterns within and between biological sequences, trees, and graphs, Combinatorial Pattern Matching Algorithms in Computational Biology Using Perl and R shows how combinatorial pattern matching algorithms can solve computational biology problems that arise in the analysis of genomic, transcriptomic, proteomic, metabolomic, and interactomic data. It implements the algorithms in Perl and R, two widely used scripting languages in computational biology.

The book provides a well-rounded explanation of traditional issues as well as an up-to-date account of more recent developments, such as graph similarity and search. It is organized around the specific algorithmic problems that arise when dealing with structures that are commonly found in computational biology, including biological sequences, trees, and graphs. For each of these structures, the author makes a clear distinction between problems that arise in the analysis of one structure and in the comparative analysis of two or more structures. He also presents phylogenetic trees and networks as examples of trees and graphs in computational biology.

This book supplies a comprehensive view of the whole field of combinatorial pattern matching from a computational biology perspective. Along with thorough discussions of each biological problem, it includes detailed algorithmic solutions in pseudo-code, full Perl and R implementation, and pointers to other software, such as those on CPAN and CRAN.

Table of Contents


Combinatorial Pattern Matching

Computational Biology

A Motivating Example: Gene Prediction



Sequences in Mathematics

Sequences in Computer Science

Sequences in Computational Biology

Simple Pattern Matching in Sequences

Finding Words in Sequences

General Pattern Matching in Sequences

Finding Subsequences

Finding Common Subsequences

Comparing Sequences



Trees in Mathematics

Trees in Computer Science

Trees in Computational Biology

Simple Pattern Matching in Trees

Finding Paths in Unrooted Trees

Finding Paths in Rooted Trees

General Pattern Matching in Trees

Finding Subtrees

Finding Common Subtrees

Comparing Trees



Graphs in Mathematics

Graphs in Computer Science

Graphs in Computational Biology

Simple Pattern Matching in Graphs

Finding Paths in Graphs

Finding Trees in Graphs

General Pattern Matching in Graphs

Finding Subgraphs

Finding Common Subgraphs

Comparing Graphs

Appendix A: Elements of Perl

Perl Scripts

Overview of Perl

Perl Quick Reference Card

Appendix B: Elements of R

R Scripts

Overview of R

R Quick Reference Card



Bibliographic Notes appear at the end of each chapter.

Author Bio(s)

Gabriel Valiente is Associate Professor in the Department of Software at the Technical University of Catalonia. His research interests include computational biology, bioinformatics, exact and approximate matching in graphs and patterns, and graph transformation.

Editorial Reviews

I like the hands-on approach this book offers, and the very pedagogical structure it follows … . The book also has tons of examples, thoughtfully chosen and beautifully laid out … the book is very well-written and accessible, undoubtedly written by an author who takes great care in preparing his manuscripts and teaching about an area he enjoys working on.
—Anthony Labarre, SIGACT News, July 2012

This text provides a solid foundation to the field. It will work as a practical handbook for pattern matching applications in computational biology. 
—Michael Goldberg, Computing Reviews, February 2010

… the book makes a clear distinction between problems that emerge in the analysis of the structure and in the comparative analysis of two or more structures. … Well-known computational biology tools that allow searching nucleotide and protein databases for local sequence alignment are based on CPM algorithms only. The techniques presented in this book go beyond that. … detailed algorithm solutions in pseudocode, full Perl and R implementation, and pointers to software and implementation are presented. This … is what makes Valiente’s effort unique. …
—Ernesto D’Avanzo, Computing Reviews, February 2010

… It is a well-sorted collection of pattern matching algorithms that are used to work with problems in computational biology. … You can find all of the sources on the author’s website, which come in handy when you actually want to use them, since you do not have to retype them. And there is an introduction to Perl as well as to R, showing how to decode DNA/RNA-triplets to amino acids and giving some basic overview over standard functions. … I certainly recommend this as an introduction and reference to some algorithms of pattern matching in computational biology. You actually learn how algorithms over the most important data types are designed in a straightforward, logical way. …
—Jannik Pewny, IACR Book Reviews, 2009

…after a few minutes of random browsing, I was left with a feeling of total appreciation of the book, admiration for Prof. Gabriel Valiente, and a realization that this book will be part of my fundamental library for me and my group from the moment it is published. There are so many good things to say that I do not know where to start. The organization is straightforward with major sections that extend from simple sequences to trees to graphs. … This parallel structure makes it easy to apply lessons used on the simplest object (sequences) to objects of medium (trees) and significant (graphs) difficulty. …a wonderful way to learn leveraging … The Perl is beautifully clear and the examples have already taught me how to improve my own code.
—Michael Levitt, Professor and Chair, Department of Structural Biology, Stanford University, California, USA

…Balancing a careful mixture of formal methods, programming, and examples, Gabriel Valiente has managed to harmoniously bridge languages and contents into a self-contained source of lasting influence. It is not difficult to predict that this book will be studied indifferently by the specialist of biology and computer science, helping each to walk a few steps toward the other. It will entice new generations of scholars to engage in its beautiful subject.
—From the Foreword, Alberto Apostolico, Professor, College of Computing, Georgia Tech, Atlanta, USA

Unlocks the power for R for Perl programmers, and vice versa. Reveals R to be a powerful and accessible tool for bioinformatics. The title is a mouthful, but the use of both R and Perl for bioinformatics is revealing.
—Steven Skiena, Professor, Department of Computer Science, Stony Brook University, New York, USA

Other CRC Press Sites
Featured Authors
Facebook Page for CRC Press Twitter Page for CRC Press You Tube Channel for CRC Press LinkedIn Page for CRC Press Google Plus Page for CRC Press
Sign Up for Email Alerts
© 2014 Taylor & Francis Group, LLC. All Rights Reserved. Privacy Policy | Cookie Use | Shipping Policy | Contact Us