Getting PGS Publications
PGS Publications
PGS publications are those published works that underlie the data provided by the PGS Catalog.
To retrieve these publications, use the function get_publications(). You may use one of the following search criteria (or a combination thereof):
- pgp_id: the publication identifier assigned by the PGS Catalog;
- pgs_id: the polygenic score identifier;
- pubmed_id: the PubMed identifier;
- author: an author's last name.
If you do not pass any of the criteria above, then get_publications() will retrieve all publications in the Catalog.
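Combining criteria can be sketched as follows. This is a minimal example, not the definitive usage: it assumes that get_publications() accepts a set_operation argument taking 'union' (assumed default) or 'intersection', so please check ?get_publications before relying on it. The calls query the PGS Catalog and therefore need network access.

```r
library(quincunx)

# Assumed default (set_operation = "union"): publications that match
# either the polygenic score identifier or the author name.
either <- get_publications(pgs_id = "PGS000003", author = "Natarajan")

# Assumed alternative: keep only publications that match both criteria.
both <- get_publications(
  pgs_id = "PGS000003",
  author = "Natarajan",
  set_operation = "intersection"
)

# Compare the publication identifiers returned by each call.
either@publications$pgp_id
both@publications$pgp_id
```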
Getting PGS Publications
Getting PGS publications by their identifiers:
library(quincunx)
get_publications(pgp_id = c('PGP000001', 'PGP000002'))
#> An object of class "publications"
#> Slot "publications":
#> # A tibble: 2 × 8
#> pgp_id pubmed_id publication_date publication title autho…¹ doi authors
#> <chr> <chr> <date> <chr> <chr> <chr> <chr> <chr>
#> 1 PGP000001 25855707 2015-04-08 J Natl Cance… Pred… Mavadd… 10.1… Mavadd…
#> 2 PGP000002 30554720 2018-12-13 Am J Hum Gen… Poly… Mavadd… 10.1… Mavadd…
#> # … with abbreviated variable name ¹author_fullname
#>
#> Slot "pgs_ids":
#> # A tibble: 21 × 3
#> pgp_id pgs_id stage
#> <chr> <chr> <chr>
#> 1 PGP000001 PGS000001 gwas/dev
#> 2 PGP000001 PGS000002 gwas/dev
#> 3 PGP000001 PGS000003 gwas/dev
#> 4 PGP000001 PGS000001 eval
#> 5 PGP000001 PGS000002 eval
#> 6 PGP000001 PGS000003 eval
#> 7 PGP000002 PGS000004 gwas/dev
#> 8 PGP000002 PGS000005 gwas/dev
#> 9 PGP000002 PGS000006 gwas/dev
#> 10 PGP000002 PGS000007 gwas/dev
#> # … with 11 more rows
#> # ℹ Use `print(n = ...)` to see more rows
By pgs_id:
library(quincunx)
get_publications(pgs_id = 'PGS000003')
#> An object of class "publications"
#> Slot "publications":
#> # A tibble: 1 × 8
#> pgp_id pubmed_id publication_date publication title autho…¹ doi authors
#> <chr> <chr> <date> <chr> <chr> <chr> <chr> <chr>
#> 1 PGP000001 25855707 2015-04-08 J Natl Cance… Pred… Mavadd… 10.1… Mavadd…
#> # … with abbreviated variable name ¹author_fullname
#>
#> Slot "pgs_ids":
#> # A tibble: 6 × 3
#> pgp_id pgs_id stage
#> <chr> <chr> <chr>
#> 1 PGP000001 PGS000001 gwas/dev
#> 2 PGP000001 PGS000002 gwas/dev
#> 3 PGP000001 PGS000003 gwas/dev
#> 4 PGP000001 PGS000001 eval
#> 5 PGP000001 PGS000002 eval
#> 6 PGP000001 PGS000003 eval
By pubmed_id:
library(quincunx)
get_publications(pubmed_id = '30554720')
#> An object of class "publications"
#> Slot "publications":
#> # A tibble: 1 × 8
#> pgp_id pubmed_id publication_date publication title autho…¹ doi authors
#> <chr> <chr> <date> <chr> <chr> <chr> <chr> <chr>
#> 1 PGP000002 30554720 2018-12-13 Am J Hum Gen… Poly… Mavadd… 10.1… Mavadd…
#> # … with abbreviated variable name ¹author_fullname
#>
#> Slot "pgs_ids":
#> # A tibble: 15 × 3
#> pgp_id pgs_id stage
#> <chr> <chr> <chr>
#> 1 PGP000002 PGS000004 gwas/dev
#> 2 PGP000002 PGS000005 gwas/dev
#> 3 PGP000002 PGS000006 gwas/dev
#> 4 PGP000002 PGS000007 gwas/dev
#> 5 PGP000002 PGS000008 gwas/dev
#> 6 PGP000002 PGS000009 gwas/dev
#> 7 PGP000002 PGS000001 eval
#> 8 PGP000002 PGS000002 eval
#> 9 PGP000002 PGS000003 eval
#> 10 PGP000002 PGS000004 eval
#> 11 PGP000002 PGS000005 eval
#> 12 PGP000002 PGS000006 eval
#> 13 PGP000002 PGS000007 eval
#> 14 PGP000002 PGS000008 eval
#> 15 PGP000002 PGS000009 eval
By author:
library(quincunx)
get_publications(author = 'Natarajan')
#> An object of class "publications"
#> Slot "publications":
#> # A tibble: 12 × 8
#> pgp_id pubmed_id publication_date publication title autho…¹ doi authors
#> <chr> <chr> <date> <chr> <chr> <chr> <chr> <chr>
#> 1 PGP000006 30104762 2018-08-13 Nat Genet Geno… Khera … 10.1… Khera …
#> 2 PGP000030 31676865 2019-11-01 Nat Genet Geno… Klarin… 10.1… Klarin…
#> 3 PGP000042 28223407 2017-02-21 Circulation Poly… Natara… 10.1… Natara…
#> 4 PGP000060 30586733 2019-03-01 Circulation Whol… Khera … 10.1… Khera …
#> 5 PGP000076 27959714 2016-11-13 N Engl J Med Gene… Khera … 10.1… Khera …
#> 6 PGP000116 32498804 2020-06-01 J Am Coll C… Limi… Aragam… 10.1… Aragam…
#> 7 PGP000127 33021622 2020-10-06 JAMA Cardiol Clin… Trinde… 10.1… Trinde…
#> 8 PGP000159 32981348 2020-09-28 Circulation Gene… Klarin… 10.1… Klarin…
#> 9 PGP000230 34887591 2021-12-09 Nature The … Graham… http… Graham…
#> 10 PGP000252 33433237 2021-01-12 Circ Genom … Inte… Ye Y 10.1… Ye Y, …
#> 11 PGP000299 35149155 2022-02-08 Ophthalmolo… Phot… Zekava… 10.1… Zekava…
#> 12 PGP000307 35278270 2022-03-12 Eur J Heart… Gene… Forres… 10.1… Forres…
#> # … with abbreviated variable name ¹author_fullname
#>
#> Slot "pgs_ids":
#> # A tibble: 57 × 3
#> pgp_id pgs_id stage
#> <chr> <chr> <chr>
#> 1 PGP000006 PGS000013 gwas/dev
#> 2 PGP000006 PGS000014 gwas/dev
#> 3 PGP000006 PGS000015 gwas/dev
#> 4 PGP000006 PGS000016 gwas/dev
#> 5 PGP000006 PGS000017 gwas/dev
#> 6 PGP000006 PGS000013 eval
#> 7 PGP000006 PGS000014 eval
#> 8 PGP000006 PGS000015 eval
#> 9 PGP000006 PGS000016 eval
#> 10 PGP000006 PGS000017 eval
#> # … with 47 more rows
#> # ℹ Use `print(n = ...)` to see more rows
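Each result above holds two tables, in the slots publications and pgs_ids, and both carry a pgp_id column. The sketch below shows one way to link them with dplyr. To stay runnable offline it uses hand-built stand-ins rather than a live query: pubs_tbl and pgs_tbl are illustrative names (not part of quincunx), and their values are copied from the first example's printed output.

```r
library(dplyr, warn.conflicts = FALSE)

# Stand-in for the "publications" slot (three of its eight columns).
pubs_tbl <- tibble::tibble(
  pgp_id = c("PGP000001", "PGP000002"),
  pubmed_id = c("25855707", "30554720"),
  publication_date = as.Date(c("2015-04-08", "2018-12-13"))
)

# Stand-in for the "pgs_ids" slot: the ten rows printed in the first example.
pgs_tbl <- tibble::tibble(
  pgp_id = c(rep("PGP000001", 6), rep("PGP000002", 4)),
  pgs_id = c(
    "PGS000001", "PGS000002", "PGS000003",
    "PGS000001", "PGS000002", "PGS000003",
    "PGS000004", "PGS000005", "PGS000006", "PGS000007"
  ),
  stage = c(rep("gwas/dev", 3), rep("eval", 3), rep("gwas/dev", 4))
)

# Count scores per publication and stage, then attach publication details.
score_counts <- pgs_tbl %>%
  count(pgp_id, stage, name = "n_scores") %>%
  left_join(pubs_tbl, by = "pgp_id")

score_counts
```

With a real result, the two slots of the returned object would take the place of the stand-in tables.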
Getting PGS Publications by other criteria
The PGS Catalog REST API only supports searches by those criteria mentioned above. If you would like to get results by other criteria, e.g., publication_date, then you need to retrieve all publications and filter them afterwards.
As an example, let’s download all publications and then keep only those publications published in 2021:
all_pub <- get_publications(interactive = FALSE)
all_pub@publications
#> # A tibble: 350 × 8
#> pgp_id pubmed_id publication_date publication title autho…¹ doi authors
#> <chr> <chr> <date> <chr> <chr> <chr> <chr> <chr>
#> 1 PGP000001 25855707 2015-04-08 J Natl Canc… Pred… Mavadd… 10.1… Mavadd…
#> 2 PGP000002 30554720 2018-12-13 Am J Hum Ge… Poly… Mavadd… 10.1… Mavadd…
#> 3 PGP000003 25748612 2015-03-04 Lancet Gene… Mega JL 10.1… Mega J…
#> 4 PGP000004 26392438 2015-09-20 Eur Heart J Risk… Tada H 10.1… Tada H…
#> 5 PGP000005 27655226 2016-09-21 Eur Heart J Geno… Abraha… 10.1… Abraha…
#> 6 PGP000006 30104762 2018-08-13 Nat Genet Geno… Khera … 10.1… Khera …
#> 7 PGP000007 30309464 2018-10-01 J Am Coll C… Geno… Inouye… 10.1… Inouye…
#> 8 PGP000008 31184202 2019-06-11 Circ Genom … Vali… Wünnem… 10.1… Wünnem…
#> 9 PGP000009 28456682 2017-04-06 J Clin Lipi… Poly… Paquet… 10.1… Paquet…
#> 10 PGP000010 27513194 2016-08-11 Genet Med Pers… Läll K 10.1… Läll K…
#> # … with 340 more rows, and abbreviated variable name ¹author_fullname
#> # ℹ Use `print(n = ...)` to see more rows
Filtering based on the year 2021:
library(dplyr, warn.conflicts = FALSE)
# Determine the PGP ids whose publication date falls within 2021.
pgp_ids_2021 <-
  filter(
    all_pub@publications,
    publication_date >= '2021-01-01' &
      publication_date <= '2021-12-31'
  ) %>%
  pull('pgp_id')
# Subset the publications object by those PGP ids.
pub_2021 <- all_pub[pgp_ids_2021]
# Print the 2021 publications (a tibble prints only its first 10 rows).
pub_2021@publications
#> # A tibble: 79 × 8
#> pgp_id pubmed_id publication_date publication title autho…¹ doi authors
#> <chr> <chr> <date> <chr> <chr> <chr> <chr> <chr>
#> 1 PGP000050 33579919 2021-02-12 Nat Commun Cross… Graff … 10.1… Graff …
#> 2 PGP000121 33608049 2021-02-19 Genome Med Devel… Tam CHT 10.1… Tam CH…
#> 3 PGP000122 33398198 2021-01-04 Nat Genet Trans… Conti … 10.1… Conti …
#> 4 PGP000128 33462484 2021-01-18 Nat Genet Genet… Sinnot… 10.1… Sinnot…
#> 5 PGP000137 34750571 2021-11-08 Nat Metab Integ… Ritchi… 10.1… Ritchi…
#> 6 PGP000138 33420020 2021-01-08 Nat Commun Disea… Fontan… 10.1… Fontan…
#> 7 PGP000146 33495597 2021-01-25 Nat Genet Commo… Harper… 10.1… Harper…
#> 8 PGP000147 33623009 2021-02-23 Nat Commun Whole… Tharej… 10.1… Tharej…
#> 9 PGP000148 33472890 2021-01-20 Cancer Res Asses… Hung RJ 10.1… Hung R…
#> 10 PGP000149 33623038 2021-02-23 Nat Commun Polyg… Huynh-… 10.1… Huynh-…
#> # … with 69 more rows, and abbreviated variable name ¹author_fullname
#> # ℹ Use `print(n = ...)` to see more rows
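The same year selection can also be written by extracting the year from the date, which avoids spelling out both range boundaries. The following is a sketch in base R under stated assumptions: pubs_tbl is a small stand-in for all_pub@publications built from dates printed in this vignette, and pgp_ids_in_year() is a hypothetical helper, not a quincunx function.

```r
# Stand-in for all_pub@publications: two columns, six rows copied from
# the outputs shown in this vignette.
pubs_tbl <- data.frame(
  pgp_id = c("PGP000001", "PGP000050", "PGP000121",
             "PGP000137", "PGP000230", "PGP000307"),
  publication_date = as.Date(c("2015-04-08", "2021-02-12", "2021-02-19",
                               "2021-11-08", "2021-12-09", "2022-03-12")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: PGP ids of publications dated in a given year.
pgp_ids_in_year <- function(tbl, year) {
  in_year <- format(tbl$publication_date, "%Y") == as.character(year)
  tbl$pgp_id[in_year]
}

pgp_ids_2021 <- pgp_ids_in_year(pubs_tbl, 2021)
pgp_ids_2021
```

The resulting character vector of PGP ids can then be used to subset the full publications object, as in all_pub[pgp_ids_2021].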