This function calculates the Grantham distance for pairs of amino acids.
Amino acid identities should be provided as three-letter codes in x and y. Amino acids identified in x and y are matched element-wise, i.e. the first element of x is paired with the first element of y, and so on.
The Grantham distance attempts to provide a proxy for the evolutionary distance between two amino acids based on three key chemical properties: composition, polarity and molecular volume. In turn, evolutionary distance is used as a proxy for the impact of missense substitutions: the higher the distance, the more deleterious the substitution is expected to be.
Two methods are provided for the distance calculation. The so-called original method uses the amino acid distances given by Grantham in Table 2 of his original publication. This is the default method. Alternatively, you may choose the exact method, which uses the chemical properties given in Grantham's Table 1 to compute the amino acid distances anew. The distances calculated with the exact method are not rounded to the nearest integer and, for some amino acid pairs, differ from those of the original method by ~1 unit.
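For reference, the equation behind the exact method can be written as below, with the scaling factor \(\rho\) made explicit. The subscripts \(i\) and \(j\) (the two amino acids) and the symbols \(c\), \(p\) and \(v\) (composition, polarity and molecular volume) are notation assumed here for illustration; the constants \(\alpha\), \(\beta\), \(\gamma\) and \(\rho\) correspond to the arguments of the same names.

\[
D_{ij} = \rho \sqrt{\alpha (c_i - c_j)^2 + \beta (p_i - p_j)^2 + \gamma (v_i - v_j)^2}
\]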
If you want to calculate Grantham's distance by providing the values of the amino acid properties explicitly, then use grantham_equation() instead.
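As a rough illustration of what the exact method computes, the following base-R sketch evaluates Grantham's equation for a single pair. The helper grantham_sketch() is hypothetical and not part of the package, and the Ser and Glu property values are quoted from memory of Grantham's Table 1, so they should be checked against the paper before being relied upon.

```r
# Hypothetical helper (not part of the package): a base-R sketch of
# Grantham's equation, scaled by rho.
grantham_sketch <- function(c_i, c_j, p_i, p_j, v_i, v_j,
                            alpha = 1.833, beta = 0.1018,
                            gamma = 0.000399, rho = 50.723) {
  rho * sqrt(alpha * (c_i - c_j)^2 +
             beta  * (p_i - p_j)^2 +
             gamma * (v_i - v_j)^2)
}

# Assumed property values for Ser and Glu (composition c, polarity p,
# molecular volume v), as recalled from Grantham's Table 1.
ser <- c(c = 1.42, p = 9.2, v = 32)
glu <- c(c = 0.92, p = 12.3, v = 83)

d_exact <- grantham_sketch(ser[["c"]], glu[["c"]],
                           ser[["p"]], glu[["p"]],
                           ser[["v"]], glu[["v"]])

round(d_exact, 1)  # unrounded value, shown here to one decimal place
#> [1] 79.8
round(d_exact)     # rounded to the nearest integer
#> [1] 80
```

Under these assumptions, the unrounded value agrees with the "exact" result for Ser and Glu shown in the examples, and rounding it gives the value returned by the "original" method for the same pair.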
Usage
grantham_distance(
  x,
  y,
  method = c("original", "exact"),
  alpha = 1.833,
  beta = 0.1018,
  gamma = 0.000399,
  rho = 50.723
)
Arguments
- x
A character vector of amino acid three-letter codes.
- y
A character vector of amino acid three-letter codes.
- method
Either "original" (default) or "exact"; see the description for more details.
- alpha
The constant \(\alpha\) in the equation on page 863 of Grantham's paper.
- beta
The constant \(\beta\) in the equation on page 863 of Grantham's paper.
- gamma
The constant \(\gamma\) in the equation on page 863 of Grantham's paper.
- rho
Grantham's distances, reported in Table 2 of R. Grantham, Science (1974) 185(4154): 862-4, are scaled by a factor (here named \(\rho\)) such that the mean value of all distances is 100. The rho parameter allows this factor \(\rho\) to be changed. By default \(\rho=50.723\), the same value used by Grantham. This value is mentioned in the caption of Table 2 of the aforementioned paper.
Value
A tibble of Grantham's distances for each amino acid pair.
Examples
# Grantham's distance between Serine (Ser) and Glutamate (Glu)
grantham_distance('Ser', 'Glu')
#> # A tibble: 1 × 3
#> x y d
#> <chr> <chr> <dbl>
#> 1 Ser Glu 80
# The same pair, Serine (Ser) and Glutamate (Glu),
# with the "exact" method (the result is not rounded)
grantham_distance('Ser', 'Glu', method = 'exact')
#> # A tibble: 1 × 3
#> x y d
#> <chr> <chr> <dbl>
#> 1 Ser Glu 79.8
# `grantham_distance()` is vectorised
# amino acids are paired element-wise between `x` and `y`
grantham_distance(x = c('Pro', 'Gly'), y = c('Glu', 'Arg'))
#> # A tibble: 2 × 3
#> x y d
#> <chr> <chr> <dbl>
#> 1 Pro Glu 93
#> 2 Gly Arg 125
# Use `amino_acid_pairs()` to generate pairs (by default generates all pairs)
aa_pairs <- amino_acid_pairs()
grantham_distance(x = aa_pairs$x, y = aa_pairs$y)
#> # A tibble: 400 × 3
#> x y d
#> <chr> <chr> <dbl>
#> 1 Ser Ser 0
#> 2 Ser Arg 110
#> 3 Ser Leu 145
#> 4 Ser Pro 74
#> 5 Ser Thr 58
#> 6 Ser Ala 99
#> 7 Ser Val 124
#> 8 Ser Gly 56
#> 9 Ser Ile 142
#> 10 Ser Phe 155
#> # … with 390 more rows