SPSpfam 2.3.1 Documentation

Introduction

SPSpfam is an accelerated version of Sean Eddy's popular Hmmpfam, one of the important modules in the HMMER 2.0 suite of hidden Markov model (HMM) codes. We have modified the algorithms extensively to provide greatly improved throughput across the entire range of query characteristics.
SPSpfam is identical to Hmmpfam in its command-line interface and input/output formats. It is designed to replace Hmmpfam both in stand-alone mode and in embedded applications. It is intended for users who run Hmmpfam on a regular basis for hundreds or thousands of queries. All of the documentation that is pertinent to Hmmpfam also pertains to SPSpfam.
There are only two differences between the applications as far as the user is concerned:
1)  SPSpfam requires its input HMM library in binary format. A special pre-processing formatter, SPSconvert, is supplied to convert the text-based HMMER 2.0 HMM libraries and prepare them for use by SPSpfam. One advantage of the binary library is that it is about half the size of the equivalent HMMER 2.0 text-based HMM library. SPSconvert is fast, requiring less than 2 minutes to convert Pfam_ls 8.0 or Pfam_fs 8.0 (~5100 models) from text to SPS binary format on a 2 GHz Pentium IV processor.
2)  With SPSpfam, the --cpu option must be used to ensure that multiple threads will execute on machines with more than one processor. By contrast, some versions of Hmmpfam have been compiled with the "have_threads" option; in those versions the threads can automatically use multiple processors without the explicit use of the --cpu command-line option.
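As an illustration of the second difference, the following minimal Python sketch assembles an SPSpfam command line that always passes --cpu explicitly. It relies on the standard Hmmpfam argument order (options, then HMM library, then sequence file), which SPSpfam shares. The helper name build_spspfam_command and the file names Pfam_ls.bin and queries.fasta are hypothetical and are used only for illustration.

```python
from typing import List


def build_spspfam_command(hmm_library: str,
                          query_file: str,
                          cpus: int,
                          executable: str = "SPSpfam") -> List[str]:
    """Build an argument list for SPSpfam (hypothetical helper).

    The layout follows the Hmmpfam convention:
        <executable> [options] <hmm library> <sequence file>
    The --cpu option is always passed, because SPSpfam needs it
    in order to run multiple threads on a multiprocessor machine.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return [executable, "--cpu", str(cpus), hmm_library, query_file]


if __name__ == "__main__":
    # File names are placeholders; the library must already be in
    # SPS binary format (produced by SPSconvert).
    command = build_spspfam_command("Pfam_ls.bin", "queries.fasta", cpus=2)
    print(" ".join(command))
```

The resulting list can be handed to subprocess.run on a machine where SPSpfam is installed.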
The detailed documentation for HMMER, including Hmmpfam, can be obtained from the Washington University Pfam website at hmmer.wustl.edu.

Performance Characteristics

The performance gains of SPSpfam over Hmmpfam can vary widely depending on the nature of the input queries. You should thoroughly understand the performance profile of SPSpfam before you consider its use. However, if your processing requirements fit the SPSpfam performance profile well, you will see substantial performance benefits.
First, SPSpfam is designed to process several queries in one pass. It is far less profitable to use SPSpfam as a single-query processor. Below is a graph of the speedup of SPSpfam relative to Hmmpfam as a function of the number of queries, using a collection of 100 amino acid sequences as the example. The benchmark details can be found in Appendix A.
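Because throughput depends on submitting many queries in one pass, it can help to gather individual query files into a single multi-sequence file before running SPSpfam. The sketch below assumes the queries are stored as FASTA files, one or more records per file; the function name merge_fasta_queries and all file names are hypothetical.

```python
from typing import Iterable


def merge_fasta_queries(query_paths: Iterable[str], merged_path: str) -> int:
    """Concatenate FASTA query files into one file (hypothetical helper).

    Blank lines are dropped so that records from different files
    follow each other cleanly. Returns the number of sequences
    written, counted by their '>' header lines.
    """
    sequence_count = 0
    with open(merged_path, "w") as merged:
        for path in query_paths:
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\r\n")
                    if not line.strip():
                        continue  # skip empty lines between records
                    if line.startswith(">"):
                        sequence_count += 1
                    merged.write(line + "\n")
    return sequence_count
```

The merged file can then be passed to SPSpfam as the single sequence-file argument, so that all queries are processed in one run.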
 


Copyright © 2003 Southwest Parallel Software, Inc.
All Rights Reserved.