This function retrieves toplevel sequences. These sequences correspond to genomic regions in the genome assembly that are not a component of another sequence region. Thus, toplevel sequences will be chromosomes and any unlocalised or unplaced scaffolds.
get_toplevel_sequences( species_name = "homo_sapiens", verbose = FALSE, warnings = TRUE, progress_bar = TRUE )
Argument | Description |
---|---|
species_name | The species name, i.e., the scientific name, in all lowercase letters and with spaces replaced by underscores. Examples: |
verbose | Whether to be chatty. |
warnings | Whether to print warnings. |
progress_bar | Whether to show a progress bar. |
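The `species_name` format described above can be derived from an ordinary scientific name. The following is a minimal sketch in base R; `to_species_name()` is a hypothetical helper introduced here for illustration and is not part of the package.

```r
# Hypothetical helper (not part of the package): convert a scientific name
# into the lowercase, underscore-separated form expected by `species_name`.
to_species_name <- function(scientific_name) {
  # Trim surrounding whitespace, replace each run of whitespace with a single
  # underscore, then lowercase the result.
  tolower(gsub("\\s+", "_", trimws(scientific_name)))
}

to_species_name("Homo sapiens")
#> [1] "homo_sapiens"
to_species_name("Caenorhabditis elegans")
#> [1] "caenorhabditis_elegans"
```

The resulting string could then be passed as the `species_name` argument, assuming Ensembl recognises that species.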
A tibble of 4 variables, each row being a toplevel sequence:
species_name
Ensembl species name: the name used internally by Ensembl to uniquely identify a species. It is the scientific name in lowercase with spaces replaced by underscores, e.g., 'homo_sapiens'.
coord_system
Coordinate system type.
toplevel_sequence
Name of the toplevel sequence.
length
Genomic length of the toplevel sequence in base pairs.
# Get toplevel sequences for the human genome (default)
get_toplevel_sequences()
#> # A tibble: 194 × 4
#>    species_name coord_system toplevel_sequence length
#>    <chr>        <chr>        <chr>              <int>
#>  1 homo_sapiens scaffold     KI270757.1         71251
#>  2 homo_sapiens scaffold     KI270741.1        157432
#>  3 homo_sapiens scaffold     KI270756.1         79590
#>  4 homo_sapiens scaffold     KI270730.1        112551
#>  5 homo_sapiens scaffold     KI270739.1         73985
#>  6 homo_sapiens scaffold     KI270738.1         99375
#>  7 homo_sapiens scaffold     KI270737.1        103838
#>  8 homo_sapiens scaffold     KI270312.1           998
#>  9 homo_sapiens scaffold     KI270591.1          5796
#> 10 homo_sapiens scaffold     KI270371.1          2805
#> # … with 184 more rows

# Get toplevel sequences for Caenorhabditis elegans
get_toplevel_sequences('caenorhabditis_elegans')
#> # A tibble: 7 × 4
#>   species_name           coord_system toplevel_sequence   length
#>   <chr>                  <chr>        <chr>                <int>
#> 1 caenorhabditis_elegans chromosome   I                 15072434
#> 2 caenorhabditis_elegans chromosome   II                15279421
#> 3 caenorhabditis_elegans chromosome   III               13783801
#> 4 caenorhabditis_elegans chromosome   IV                17493829
#> 5 caenorhabditis_elegans chromosome   V                 20924180
#> 6 caenorhabditis_elegans chromosome   X                 17718942
#> 7 caenorhabditis_elegans chromosome   MtDNA                13794
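
Because the result is an ordinary tibble, it can be summarised with standard tools. The sketch below uses base R only and rebuilds the Caenorhabditis elegans rows shown above by hand as a plain data frame, so that it runs without querying Ensembl; in practice the object would come from `get_toplevel_sequences()`. The variable names (`toplevel`, `total_length`, `longest`) are illustrative.

```r
# Hand-built copy of the C. elegans output above (assumption: stands in for
# the tibble returned by get_toplevel_sequences('caenorhabditis_elegans')).
toplevel <- data.frame(
  species_name = "caenorhabditis_elegans",
  coord_system = "chromosome",
  toplevel_sequence = c("I", "II", "III", "IV", "V", "X", "MtDNA"),
  length = c(15072434L, 15279421L, 13783801L, 17493829L,
             20924180L, 17718942L, 13794L),
  stringsAsFactors = FALSE
)

# Total length of all toplevel sequences in base pairs. Converting to double
# first avoids integer overflow when summing the lengths of larger genomes.
total_length <- sum(as.numeric(toplevel$length))

# Name of the longest toplevel sequence.
longest <- toplevel$toplevel_sequence[which.max(toplevel$length)]
longest
#> [1] "V"

# Number of toplevel sequences per coordinate system.
table(toplevel$coord_system)
```

The conversion to double matters because `length` is an integer column and R returns `NA` with a warning when an integer sum exceeds the 32-bit range.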