This function calculates the Grantham distance for pairs of amino acids. Amino acid identities should be provided as three-letter codes in x and y. Amino acids identified in x and y are matched element-wise, i.e. the first element of x is paired with the first element of y, and so on.

The Grantham distance is a proxy for the evolutionary distance between two amino acids, based on three chemical properties: composition, polarity and molecular volume. Evolutionary distance is in turn used as a proxy for the impact of missense substitutions: the higher the distance, the more deleterious the substitution is expected to be.

Two methods are available for the distance calculation. The original method, which is the default, uses the amino acid distances that Grantham reported in Table 2 of his original publication. The exact method instead uses the chemical properties in Grantham's Table 1 to compute the amino acid differences anew. Distances calculated with the exact method are not rounded to the nearest integer and differ from the original method by about 1 unit for some amino acid pairs.
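
For reference, the equation in which the constants alpha, beta, gamma and rho appear can be written as below. The symbols \(c\), \(p\) and \(v\) stand for the composition, polarity and molecular volume of amino acids \(i\) and \(j\); these symbol names are chosen here for illustration.

```latex
D_{ij} = \rho \sqrt{\alpha (c_i - c_j)^2 + \beta (p_i - p_j)^2 + \gamma (v_i - v_j)^2}
```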

If you want to calculate Grantham's distance by providing the values of the amino acid properties explicitly, then use grantham_equation() instead.
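
To make the role of the property values concrete, here is a minimal base-R sketch of the calculation. It assumes the formula from Grantham's paper and property values for Ser and Glu as given in his Table 1. The helper grantham_sketch() is hypothetical and only for illustration; it is not the package's grantham_equation().

```r
# Sketch of the distance formula in base R (no packages needed).
# `grantham_sketch()` is a hypothetical helper, not part of the package.
grantham_sketch <- function(c_i, c_j, p_i, p_j, v_i, v_j,
                            alpha = 1.833, beta = 0.1018,
                            gamma = 0.000399, rho = 50.723) {
  # Weighted Euclidean distance over the three properties, scaled by rho
  rho * sqrt(alpha * (c_i - c_j)^2 +
             beta  * (p_i - p_j)^2 +
             gamma * (v_i - v_j)^2)
}

# Property values assumed from Grantham's Table 1:
#   Ser: composition 1.42, polarity 9.2,  volume 32
#   Glu: composition 0.92, polarity 12.3, volume 83
d <- grantham_sketch(c_i = 1.42, c_j = 0.92,
                     p_i = 9.2,  p_j = 12.3,
                     v_i = 32,   v_j = 83)
round(d, 1)
#> [1] 79.8
```

Under these assumptions the result agrees with the value the exact method reports for the Ser/Glu pair in the examples.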

Usage

grantham_distance(
  x,
  y,
  method = c("original", "exact"),
  alpha = 1.833,
  beta = 0.1018,
  gamma = 0.000399,
  rho = 50.723
)

Arguments

x

A character vector of amino acid three-letter codes.

y

A character vector of amino acid three-letter codes.

method

Either "original" (default) or "exact"; see the description for details.

alpha

The constant \(\alpha\) in the equation on page 863 of Grantham's paper.

beta

The constant \(\beta\) in the equation on page 863 of Grantham's paper.

gamma

The constant \(\gamma\) in the equation on page 863 of Grantham's paper.

rho

Grantham's distances, reported in Table 2 of Grantham, R. (1974), Science 185(4154): 862–864, are scaled by a factor (here named \(\rho\)) such that the mean of all distances is 100. The rho parameter allows this factor to be changed. By default \(\rho=50.723\), the value used by Grantham and given in the caption of Table 2 of that paper.

Value

A tibble with one row per amino acid pair: columns x and y hold the three-letter codes and column d holds the Grantham distance.

Examples

# Grantham's distance between Serine (Ser) and Glutamate (Glu)
grantham_distance('Ser', 'Glu')
#> # A tibble: 1 × 3
#>   x     y         d
#>   <chr> <chr> <dbl>
#> 1 Ser   Glu      80

# Grantham's distance between Serine (Ser) and Glutamate (Glu)
# with the "exact" method
grantham_distance('Ser', 'Glu', method = 'exact')
#> # A tibble: 1 × 3
#>   x     y         d
#>   <chr> <chr> <dbl>
#> 1 Ser   Glu    79.8

# `grantham_distance()` is vectorised
# amino acids are paired element-wise between `x` and `y`
grantham_distance(x = c('Pro', 'Gly'), y = c('Glu', 'Arg'))
#> # A tibble: 2 × 3
#>   x     y         d
#>   <chr> <chr> <dbl>
#> 1 Pro   Glu      93
#> 2 Gly   Arg     125

# Use `amino_acid_pairs()` to generate pairs (by default generates all pairs)
aa_pairs <- amino_acid_pairs()
grantham_distance(x = aa_pairs$x, y = aa_pairs$y)
#> # A tibble: 400 × 3
#>    x     y         d
#>    <chr> <chr> <dbl>
#>  1 Ser   Ser       0
#>  2 Ser   Arg     110
#>  3 Ser   Leu     145
#>  4 Ser   Pro      74
#>  5 Ser   Thr      58
#>  6 Ser   Ala      99
#>  7 Ser   Val     124
#>  8 Ser   Gly      56
#>  9 Ser   Ile     142
#> 10 Ser   Phe     155
#> # … with 390 more rows