SPSpfam 2.3.2 Overview
SPSpfam is an accelerated version of Sean Eddy's popular Hmmpfam,
one of the important modules in the HMMER 2.0 suite of hidden Markov model (HMM) codes.
We have modified the algorithms extensively to provide greatly improved
throughput across the entire range of query characteristics.
SPSpfam is identical to Hmmpfam in its command-line interface and
input/output format. It is designed to replace Hmmpfam both in stand-alone
mode and in embedded applications. It is intended for users who run
Hmmpfam on a regular basis for hundreds or thousands of queries.
All of the documentation that pertains to Hmmpfam also pertains to SPSpfam.
There are only two differences between the applications as far as the
user is concerned. These are:
1) SPSpfam requires a binary HMM library as input.
A special pre-processing formatter, SPSconvert, is supplied
to convert textual HMMER 2.0 HMM libraries and prepare them for
use by SPSpfam. One advantage of the binary library is that it is
about half the size of the equivalent HMMER 2.0 text-based HMM library.
SPSconvert is fast, requiring less than 2 minutes to convert Pfam_ls
8.0 or Pfam_fs 8.0 (~5100 models) from text to SPS binary format on a 2
GHz Pentium IV processor.
2) The --cpu option must be used to ensure that multiple threads
will execute on machines with more than one processor.
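As an illustration of the second point, the following Python sketch builds a
command line that always passes --cpu explicitly. It is only a sketch under
stated assumptions: the executable name "SPSpfam" and the file names in the
usage note are hypothetical, and the argument order (options, then HMM
library, then query file) is taken from Hmmpfam on the basis that the two
command-line interfaces are identical.

```python
import os


def build_pfam_command(hmm_library, query_file, executable="SPSpfam", cpus=None):
    """Return an argv list that runs the search with an explicit --cpu value.

    The executable name is an assumption; adjust it to match your installation.
    """
    if cpus is None:
        # os.cpu_count() can return None when the count cannot be determined,
        # so fall back to a single thread in that case.
        cpus = os.cpu_count() or 1
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    # Argument order follows Hmmpfam: options, HMM library, then query file.
    return [executable, "--cpu", str(cpus), hmm_library, query_file]
```

The returned list could then be handed to subprocess.run, for example
subprocess.run(build_pfam_command("Pfam_ls.bin", "queries.fa"), check=True),
where both file names are placeholders.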
The detailed documentation for HMMER, including Hmmpfam, can
be obtained from the Washington University Pfam website at hmmer.wustl.edu.
The performance gains of SPSpfam over Hmmpfam can vary widely depending
on the nature of the input queries. You should thoroughly understand
the performance profile of SPSpfam before you consider its use. However,
if your processing requirements fit the SPSpfam performance profile well,
you will see substantial performance benefits.
First, SPSpfam is designed to process several queries in one
pass. It is not nearly as profitable to use SPSpfam as a single-query
processor. Below is a graph of the performance speedup of SPSpfam
versus Hmmpfam. We use a collection of 100 amino acid sequences as
an example of the performance improvements as a function of the number
of queries. The benchmark details can be found in Appendix A.
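Because the benefit comes from processing many queries in one pass, it can
help to gather separate query files into a single input before running the
search. The Python sketch below shows one way to do that. It assumes the
queries are stored as FASTA files and that, as with Hmmpfam, several
sequences can be supplied in one sequence file; the function name and file
names are hypothetical.

```python
from pathlib import Path


def merge_fasta_queries(query_paths, merged_path):
    """Concatenate FASTA files into one multi-sequence file.

    Returns the number of sequence records (header lines) written.
    """
    count = 0
    with open(merged_path, "w") as out:
        for path in query_paths:
            for line in Path(path).read_text().splitlines():
                line = line.rstrip()
                if not line:
                    continue  # drop blank lines between records
                if line.startswith(">"):
                    count += 1  # each '>' header starts a new query
                out.write(line + "\n")
    return count
```

For example, merge_fasta_queries(["q1.fa", "q2.fa"], "all_queries.fa") would
write both queries to all_queries.fa, which can then be searched in a single
run instead of one run per query.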
Copyright © 2003 Southwest Parallel Software, Inc.
All Rights Reserved.