Package 'GWASpoly' reference manual

Title:	Genome-wide Association Studies for Autopolyploids
Description:	Designed for genome-wide association studies in autopolyploids.
Authors:	Jeffrey B. Endelman, Umesh R. Rosyara
Maintainer:	Jeffrey Endelman <[email protected]>
License:	GPL-3
Version:	2.13
Built:	2025-02-08 05:39:08 UTC
Source:	https://github.com/jendelman/GWASpoly

Test markers as QTL under backward elimination

Description

Test markers as QTL under backward elimination

Usage

fit.QTL(data, trait, qtl, fixed = NULL)
fit.QTL(data, trait, qtl, fixed = NULL)

Arguments

`data`	variable inheriting from class `GWASpoly.K`
`trait`	name of trait
`qtl`	data frame to specify the multi-QTL model (see Details)
`fixed`	data frame to specify the fixed effects (see Details)

Details

qtl is a data frame with columns "Marker" and "Model", where each row corresponds to a QTL. fixed is a data frame with columns "Effect" and "Type": the first column is the name of the effect, which must match a column in the phenotype input file, and the second column is either "factor" or "numeric". The p-value and R2 for each marker are based on the likelihood ratio test under backward elimination, comparing the deviance to the chi-squared distribution.

Value

data frame with partial r2 and p-values

Extract significant QTL

Description

Output a table with significant markers

Usage

get.QTL(data, traits = NULL, models = NULL, bp.window = NULL)
get.QTL(data, traits = NULL, models = NULL, bp.window = NULL)

Arguments

`data`	Output from `set.threshold`
`traits`	Vector of trait names (by default, all traits)
`models`	Vector of model names (by default, all models)
`bp.window`	prune output to return only the most significant marker within this window size

Details

To return all significant markers (original behavior of the function), use bp.window=NULL. Assumes input map position in bp.

Value

Data frame with results. Score = -log10(p). Effect = marker effect (not available for the general and diplo-general models because there are multiple effects).

Compute marker significance scores

Description

Compute marker significance scores

Usage

GWASpoly(data, models, traits = NULL, params = NULL, n.core = 1, quiet = F)
GWASpoly(data, models, traits = NULL, params = NULL, n.core = 1, quiet = F)

Arguments

`data`	Output from `set.K`
`models`	Vector of model names
`traits`	Vector trait names (by default, all traits)
`params`	Optional list of params created by `set.params`
`n.core`	Number of cores for parallel computing
`quiet`	TRUE/FALSE whether to suppress output charting progress

Details

The following marker-effect models are available:

"additive": Indicates the marker effect is proportional to the dosage of the alternate allele
"X-dom": where X can be any integer between 1 and ploidy/2 and refers to the allele dosage needed for complete dominance (e.g., "1-dom" = simplex dominance, "2-dom" = duplex dominance). The software tries both dominance patterns for a given dosage model, e.g., whether the reference or alternate allele is dominant
"diplo-general": All heterozygotes have the same effect
"diplo-additive": All heterozygotes have the same effect, constrained to be halfway between the homozygous effects
"general": There are no constraints on the effects of the different dosage levels

To specify additional model parameters, such as the inclusion of fixed effects (Q matrix) and the minimum minor allele frequency, use set.params

Value

Variable of class GWASpoly.fitted

S4 class with genotype and phenotype data

Description

S4 class with genotype and phenotype data

Slots

map: data frame with columns Marker,Chrom,Position,Ref,Alt
pheno: data frame of phenotypes
geno: matrix (individuals x markers) of allele dosages (0,1,2,...ploidy)
fixed: data frame of fixed effects
ploidy: ploidy

S4 class with results from genome-wide scan

Description

S4 class with results from genome-wide scan

Slots

map: data frame with columns Marker,Chrom,Position,Ref,Alt
pheno: data frame of phenotypes
geno: matrix with allele dosages
fixed: data frame of fixed effects
ploidy: ploidy
K: list of covariance matrices
scores: -log10(p) results
effects: estimated marker effects
params: parameters used for the analysis

S4 class with genotypes, phenotypes, and polygenic covariance

Description

S4 class with genotypes, phenotypes, and polygenic covariance

Slots

map: data frame with columns Marker,Chrom,Position,Ref,Alt
pheno: data frame of phenotypes
geno: matrix with allele dosages
fixed: data frame of fixed effects
ploidy: ploidy
K: list of covariance matrices (one for each chromosome)

S4 class with results from genome-wide scan and detection threshold

Description

S4 class with results from genome-wide scan and detection threshold

Slots

map: data frame with columns Marker,Chrom,Position,Ref,Alt
pheno: data frame of phenotypes
geno: matrix with allele dosages
fixed: data frame of fixed effects
ploidy: ploidy
K: list of covariance matrices
scores: -log10(p) results
effects: estimated marker effects
params: parameters used for the analysis
threshold: thresholds for significance

Plot LD vs distance

Description

Plot LD vs distance

Usage

LD.plot(data, max.pair = 10000, dof = 8, max.loci = NULL, position = "bp")
LD.plot(data, max.pair = 10000, dof = 8, max.loci = NULL, position = "bp")

Arguments

`data`	variable inheriting from class `GWASpoly`
`max.pair`	maximum number of r2 pairs for the spline
`dof`	degrees of freedom for the spline
`max.loci`	maximum number of markers to use per chromosome
`position`	"bp" or "cM"

Details

A monotone decreasing, convex spline is fit using R package scam.

Value

ggplot2 object

Create Manhattan plot

Description

Create Manhattan plot

Usage

manhattan.plot(data, traits = NULL, models = NULL, chrom = NULL)
manhattan.plot(data, traits = NULL, models = NULL, chrom = NULL)

Arguments

`data`	Variable of class `GWASpoly.fitted`
`traits`	Vector of trait names (by default, all traits plotted)
`models`	Vector of model names (by default, all models plotted)
`chrom`	optional, to plot only one chromosome

Details

Results for the ref and alt versions of the dominance model are combined. If data is the output from set.threshold, then the threshold is displayed as a horizontal dashed line when models contains a single model. Because the threshold varies between models, it is not drawn when multiple models are included. Although the ref and alt versions of each dominance model are slightly different (as seen with qq.plot), they are treated as a single model for the Manhattan plot, and the average threshold is shown.

Value

ggplot2 object

Quantile-Quantile (QQ) Plot

Description

Inspect p-value inflation using a QQ plot

Usage

qq.plot(data, trait, models = NULL)
qq.plot(data, trait, models = NULL)

Arguments

`data`	Variable of class `GWASpoly.fitted`
`trait`	Trait name
`models`	Vector of model names (by default, all models plotted)

Value

ggplot2 object

Read in marker and phenotype data

Description

Read in marker and phenotype data

Usage

read.GWASpoly(ploidy, pheno.file, geno.file, format, n.traits, delim = ",")
read.GWASpoly(ploidy, pheno.file, geno.file, format, n.traits, delim = ",")

Arguments

`ploidy`	Ploidy (e.g., 2 for diploid, 4 for tetraploid)
`pheno.file`	Name of the phenotype file
`geno.file`	Name of the genotype file
`format`	Format for the marker data. See details.
`n.traits`	Number of traits
`delim`	Character to indicate the delimiter in the data files (e.g., "," for csv, "\t" for tab-delimited)

Details

The first column of the phenotype file contains the genotype identifier, columns 2 through (n.traits + 1) contain trait values, and subsequent columns contain the levels (for factors) or numeric values (for covariates) of any fixed effects. The first three columns of the genotype file are (1) marker name, (2) chromosome, and (3) position. Optionally, columns 4 and 5 can be REF and ALT, respectively. Subsequent columns contain the marker data for each individual in the population. Marker data can be coded in one of three formats:

"numeric": markers are coded based on the dosage of the alternate allele, taking on values between 0 and ploidy
"AB": e.g., AAAB, ABBB for tetraploids
"ACGT": e.g., AAAT, GGCC for tetraploids

Only bi-allelic markers are allowed. As of version 2.02 of the package, fractional values of dosage are allowed for the "numeric" format, with missing values imputed by the population mean for each marker. The fractional values are only used for the additive genetic model; for the other models, dosages are rounded to the nearest whole number. If the input allele dosages are whole numbers, then missing values are imputed with the population mode (most frequent value) for each marker.

Value

Variable of class GWASpoly.data

Set covariance matrix for polygenic effect

Description

Set covariance matrix for polygenic effect

Usage

set.K(data, K = NULL, n.core = 1, LOCO = NULL)
set.K(data, K = NULL, n.core = 1, LOCO = NULL)

Arguments

`data`	Output from `read.GWASpoly`
`K`	Optional: user-supplied matrix
`n.core`	Number of cores for parallel computing
`LOCO`	TRUE/FALSE, whether to use leave-one-chromosome-out

Details

When LOCO = TRUE, K is computed for each chromosome as $K=MM'$, where M is the centered genotype matrix (lines x markers), and scaled to have unit diagonal (the overall scaling is not important for GWAS). When LOCO = FALSE, a single K matrix is computed for all markers (this was the original behavior of the function). Alternatively, the user can supply their own positive semidefinite K, with row.names that match the genotype identifiers (this option cannot be used with LOCO).

Value

Variable of class GWASpoly.K

Set parameters

Description

Set parameters

Usage

set.params(
  fixed = NULL,
  fixed.type = NULL,
  n.PC = 0,
  MAF = NULL,
  geno.freq = NULL,
  P3D = TRUE
)
set.params(
  fixed = NULL,
  fixed.type = NULL,
  n.PC = 0,
  MAF = NULL,
  geno.freq = NULL,
  P3D = TRUE
)

Arguments

`fixed`	Vector of names of fixed effects
`fixed.type`	Vector of effect types ("numeric" or "factor"), corresponding to the effects listed in "fixed"
`n.PC`	Number of principal components to include as covariates
`MAF`	Minimum minor allele frequency
`geno.freq`	Maximum genotype frequency (after applying dominance relations)
`P3D`	TRUE/FALSE whether to use the P3D approximation (variance components not re-estimated for every marker)

Details

The list returned by the function should be passed to GWASpoly function.

Value

A list with the following components

`fixed`	Names of fixed effects
`fixed.type`	Types of fixed effects
`n.PC`	Number of principal components to include as covariates
`min.MAF`	Minimum minor allele frequency
`max.geno.freq`	Maximum genotype frequency (after applying dominance relations)
`P3D`	TRUE/FALSE whether to use the P3D approximation

Set the significance threshold

Description

Set the significance threshold

Usage

set.threshold(
  data,
  method = "M.eff",
  level = 0.05,
  n.permute = 1000,
  n.core = 1
)
set.threshold(
  data,
  method = "M.eff",
  level = 0.05,
  n.permute = 1000,
  n.core = 1
)

Arguments

`data`	Variable of class `GWASpoly.fitted`
`method`	One of the following: "M.eff","Bonferroni","FDR","permute"
`level`	Genome-wide false positive or false discovery rate (depending on `method`).
`n.permute`	Number of permutations for method "permute"
`n.core`	Number of cores to use for multicore processing

Details

The default method, "M.eff", is a Bonferroni-type correction but using an effective number of markers that accounts for LD between markers (Moskvina and Schmidt, 2008). The FDR method is based on version 1.30.0 of the qvalue package.

Value

Variable of class GWASpoly.thresh

References

Moskvina V, Schmidt KM (2008) On multiple-testing correction in genome-wide association studies. Genetic Epidemiology 32:567-573. doi:10.1002/gepi.20331

Convert VCF to dosage file

Description

Convert VCF to dosage file

Usage

VCF2dosage(
  VCF.file,
  dosage.file,
  geno.code,
  ploidy,
  samples = NULL,
  min.DP = 1,
  max.missing,
  min.minor = 5
)
VCF2dosage(
  VCF.file,
  dosage.file,
  geno.code,
  ploidy,
  samples = NULL,
  min.DP = 1,
  max.missing,
  min.minor = 5
)

Arguments

`VCF.file`	VCF filename (can be gzipped)
`dosage.file`	CSV filename to output with allele dosage
`geno.code`	genotype code in the FORMAT field: "GT" of "DS"
`ploidy`	ploidy
`samples`	optional vector of sample names, to export subset of the population
`min.DP`	minimum per sample depth (DP) to export genotype. Default is 1, for no filtering.
`max.missing`	threshold for missing data per marker, as a proportion.
`min.minor`	minimum number of samples with the minor allele. Default is 5.

Details

Only bi-allelic variants supported. The "GT" option for geno.code is the posterior maximum genotype (e.g., 0/0/1/1). "DS" represents the posterior mean dosage of the alternate allele. VCF file must conform to 4.1 or later.

Write results to file

Description

Write results to file

Usage

write.GWASpoly(data, trait, filename, what = "scores", delim = ",")
write.GWASpoly(data, trait, filename, what = "scores", delim = ",")

Arguments

`data`	Variable of class `GWASpoly.fitted`
`trait`	Trait name
`filename`	Filename
`what`	Either "scores" or "effects"
`delim`	Delimiter to use in the output file (default is comma)

Details

Score = -log10(p). Effect = marker effect (not available for the general and diplo-general models).

Package 'GWASpoly'

Help Index

Test markers as QTL under backward elimination

Description

Usage

Arguments

Details

Value

Extract significant QTL

Description

Usage

Arguments

Details

Value

Compute marker significance scores

Description

Usage

Arguments

Details

Value

S4 class with genotype and phenotype data

Description

Slots

S4 class with results from genome-wide scan

Description

Slots

S4 class with genotypes, phenotypes, and polygenic covariance

Description

Slots

S4 class with results from genome-wide scan and detection threshold

Description

Slots

Plot LD vs distance

Description

Usage

Arguments

Details

Value

Create Manhattan plot

Description

Usage

Arguments

Details

Value

Quantile-Quantile (QQ) Plot

Description

Usage

Arguments

Value

Read in marker and phenotype data

Description

Usage

Arguments

Details

Value

Set covariance matrix for polygenic effect

Description

Usage

Arguments

Details

Value

Set parameters

Description

Usage

Arguments

Details

Value

Set the significance threshold

Description

Usage

Arguments

Details

Value

References

Convert VCF to dosage file

Description

Usage

Arguments

Details

Write results to file