Substitution scoring matrices for proteins - An overview
Document Type
Article
Publication Title
Protein Science
Abstract
Sequence analysis is the primary and simplest approach for discovering structural, functional and evolutionary details of related proteins. All alignment-based approaches to sequence analysis make use of amino acid substitution matrices, and the accuracy of the results depends largely on the type of scoring matrix used to perform the alignment. An amino acid substitution matrix is a 20 × 20 matrix in which the individual elements encapsulate the rates at which each of the 20 amino acid residues in proteins is substituted by other amino acid residues over time. In contrast to most globular/ordered proteins, whose amino acid composition is considered standard, there are several classes of proteins (e.g., transmembrane proteins) in which certain types of amino acids (e.g., hydrophobic residues) are enriched. These compositional differences among various classes of proteins are manifested in their underlying residue substitution frequencies. Therefore, each compositionally distinct class of proteins or protein segments should be studied using specific scoring matrices that reflect its distinct residue substitution pattern. In this review, we describe the development and application of various substitution scoring matrices specific to proteins with standard and biased compositions. Along with the most commonly used standard matrices (PAM, BLOSUM, MD and VTML), which act as default parameters in various homolog search and alignment tools, different substitution scoring matrices specific to compositionally distinct classes of proteins are discussed in detail.
First Page
2150
Last Page
2163
DOI
10.1002/pro.3954
Publication Date
11-1-2020
Recommended Citation
Trivedi, Rakesh and Nagarajaram, Hampapathalu Adimurthy, "Substitution scoring matrices for proteins - An overview" (2020). Open Access archive. 2042.
https://impressions.manipal.edu/open-access-archive/2042