Publication 4342

Title:	BGData - A Suite of R Packages for Genomic Analysis with Big Data
Journal:	G3: Genes, Genomes, Genetics
Published:	1 May 2019
Pubmed:	https://pubmed.ncbi.nlm.nih.gov/30894453/
DOI:	https://doi.org/10.1534/g3.119.400018
URL:	https://www.g3journal.org/content/ggg/9/5/1377.full.pdf
Citations:	28 (5 in last 2 years) as of 8 Aug 2024

Abstract

We created a suite of packages to enable analysis of extremely large genomic data sets (potentially millions of individuals and millions of molecular markers) within the R environment. The package offers: a matrix-like interface for .bed files (PLINK's binary format for genotype data), a novel class of linked arrays that allows linking data stored in multiple files to form a single array accessible from the R computing environment, methods for parallel computing capabilities that can carry out computations on very large data sets without loading the entire data into memory and a basic set of methods for statistical genetic analyses. The package is accessible through CRAN and GitHub. In this note, we describe the classes and methods implemented in each of the packages that make the suite and illustrate the use of the packages using data from the UK Biobank.
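
The two ideas named in the abstract, linked arrays and chunked computation, can be illustrated with a minimal Python sketch. This is not the suite's R API: the class `ColumnLinkedArray`, the function `chunked_column_means`, and the toy data are hypothetical names invented for illustration, and small in-memory lists stand in for the genotype files.

```python
from bisect import bisect_right


class ColumnLinkedArray:
    """Hypothetical stand-in: presents several column blocks as one matrix."""

    def __init__(self, blocks):
        # Each block is a list of rows; all blocks must share the row count.
        n_rows = len(blocks[0])
        if any(len(b) != n_rows for b in blocks):
            raise ValueError("all blocks need the same number of rows")
        self.blocks = blocks
        self.n_rows = n_rows
        # offsets[i] is the global index of the first column of block i.
        self.offsets = [0]
        for b in blocks:
            self.offsets.append(self.offsets[-1] + len(b[0]))
        self.n_cols = self.offsets[-1]

    def column(self, j):
        """Return global column j, touching only the block that holds it."""
        if not 0 <= j < self.n_cols:
            raise IndexError(j)
        i = bisect_right(self.offsets, j) - 1
        local = j - self.offsets[i]
        return [row[local] for row in self.blocks[i]]


def chunked_column_means(arr, chunk_size):
    """Compute column means while holding only one chunk of columns at a time."""
    means = []
    for start in range(0, arr.n_cols, chunk_size):
        stop = min(start + chunk_size, arr.n_cols)
        chunk = [arr.column(j) for j in range(start, stop)]
        means.extend(sum(col) / len(col) for col in chunk)
    return means


# Two toy "files" of genotype dosages (0, 1, 2) for three individuals.
chr1 = [[0, 1], [1, 1], [2, 1]]
chr2 = [[2, 0, 1], [2, 0, 1], [2, 0, 1]]
linked = ColumnLinkedArray([chr1, chr2])
means = chunked_column_means(linked, chunk_size=2)
```

In this sketch the caller indexes columns globally and never needs to know which block holds them, which is the role the abstract assigns to linked arrays. Memory use in the chunked loop is bounded by `chunk_size` columns rather than by the full matrix.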

5 Keywords

Algorithms
Big Data
Computational Biology
Genomics
Software

2 Authors

Alexander Grueneberg
Gustavo de los Campos

1 Application

Application ID	Title
15326	Compressed Sensing and high-dimensional statistical methods in complex trait genomics
