extract_imgt_genes {TCRconvertR}    R Documentation

Extract all gene names from a folder of FASTAs

Description

extract_imgt_genes() runs parse_imgt_fasta() on every FASTA file in a given folder to pull out the gene names, then returns those names in an alphabetically sorted data frame.

Usage

extract_imgt_genes(data_dir)

Arguments

data_dir

A string, the path to the directory containing the FASTA files.

Value

A data frame of gene names, sorted alphabetically.

Examples

# Given a folder with FASTA files containing these headers:
#   >SomeText|TRAC*01|MoreText|
#   >SomeText|TRAV1-1*01|MoreText|
#   >SomeText|TRAV1-1*02|MoreText|
#   >SomeText|TRAV1-2*01|MoreText|
#   >SomeText|TRAV14/DV4*01|MoreText|
#   >SomeText|TRAV38-1*01|MoreText|
#   >SomeText|TRAV38-2/DV8*01|MoreText|
#   >SomeText|TRBV29-1*01|MoreText|
#   >SomeText|TRBV29-1*02|MoreText|
#   >SomeText|TRBV29/OR9-2*01|MoreText|

fastadir <- get_example_path("fasta_dir/")
extract_imgt_genes(fastadir)
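
For readers who want to see the idea without the bundled example data, the following is a minimal base-R sketch of the steps in the Description: read every file in a folder, take the name from each FASTA header, and return the sorted names in a data frame. It is not the package's implementation. sketch_extract_genes() and the column name gene are hypothetical, the header parsing uses plain base R in place of parse_imgt_fasta(), and the sketch assumes the name is the second "|"-separated field, as in the headers above. It keeps that field as-is, allele suffix included, and drops duplicates, which may differ from what extract_imgt_genes() does.

```r
# Hypothetical helper, NOT part of TCRconvertR: a base-R sketch only.
sketch_extract_genes <- function(data_dir) {
  # Assumption: every file in the folder is a FASTA file.
  fasta_files <- list.files(data_dir, full.names = TRUE)

  # Collect the header lines (those starting with ">") from all files.
  headers <- as.character(unlist(lapply(fasta_files, function(f) {
    lines <- readLines(f, warn = FALSE)
    lines[startsWith(lines, ">")]
  })))

  # Assumption: the name is the second "|"-separated field of each header.
  fields <- strsplit(headers, "|", fixed = TRUE)
  genes <- vapply(fields, function(x) x[2], character(1))

  # Hypothetical column name; duplicates are dropped before sorting.
  data.frame(gene = sort(unique(genes)), stringsAsFactors = FALSE)
}

# Build a small throwaway folder so the sketch runs on its own.
demo_dir <- tempfile("fasta_demo")
dir.create(demo_dir)
writeLines(
  c(">SomeText|TRAV1-1*01|MoreText|", "acgt",
    ">SomeText|TRAC*01|MoreText|", "acgt"),
  file.path(demo_dir, "demo.fasta")
)

result <- sketch_extract_genes(demo_dir)
print(result)
```

The sketch returns a one-column data frame so that the result can be filtered or joined like any other table.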

[Package TCRconvertR version 1.0 Index]