Perform multi-omics analysis using WebGestaltR

Usage

WebGestaltRMultiOmics(
  analyteLists = NULL,
  analyteListFiles = NULL,
  analyteTypes = NULL,
  enrichMethod = "ORA",
  organism = "hsapiens",
  enrichDatabase = NULL,
  enrichDatabaseFile = NULL,
  enrichDatabaseType = NULL,
  enrichDatabaseDescriptionFile = NULL,
  collapseMethod = "mean",
  minNum = 10,
  maxNum = 500,
  fdrMethod = "BH",
  sigMethod = "fdr",
  fdrThr = 0.05,
  topThr = 10,
  reportNum = 100,
  setCoverNum = 10,
  perNum = 1000,
  gseaP = 1,
  isOutput = TRUE,
  outputDirectory = getwd(),
  projectName = NULL,
  dagColor = "binary",
  saveRawGseaResult = FALSE,
  gseaPlotFormat = "png",
  nThreads = 1,
  cache = NULL,
  hostName = "https://www.webgestalt.org/",
  useWeightedSetCover = TRUE,
  useAffinityPropagation = FALSE,
  usekMedoid = FALSE,
  kMedoid_k = 25,
  isMetaAnalysis = TRUE,
  mergeMethod = "mean",
  normalizationMethod = "rank",
  referenceLists = NULL,
  referenceListFiles = NULL,
  referenceTypes = NULL,
  referenceSets = NULL,
  listNames = NULL
)

Arguments

analyteLists

vector of the ID type of the corresponding interesting analyte list. The supported ID types of WebGestaltR for the selected organism can be found by the function listIdType. If the organism is others, users do not need to set this parameter. The length of analyteLists should be the same as the length of analyteListFiles or analyteLists.

analyteListFiles

If enrichMethod is ORA, the extension of the analyteListFiles should be txt and each file can only contain one column: the interesting analyte list. If enrichMethod is GSEA, the extension of the analyteListFiles should be rnk and the files should contain two columns separated by tab: the analyte list and the corresponding scores.
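The two input formats described above can be sketched in base R. The analyte IDs, scores, and file names below are hypothetical, and the absence of a header row in the rnk file is an assumption based on the usual rnk convention rather than something stated here.

```r
# Hypothetical analyte IDs and scores, for illustration only.
genes  <- c("TP53", "BRCA1", "EGFR")
scores <- c(2.1, -0.7, 1.3)

out_dir <- tempdir()

# ORA input: a .txt file with a single column of analyte IDs.
ora_file <- file.path(out_dir, "example_list.txt")
writeLines(genes, ora_file)

# GSEA input: a .rnk file with two tab-separated columns (ID, score).
# Writing it without a header row is an assumption.
rnk_file <- file.path(out_dir, "example_list.rnk")
write.table(
  data.frame(id = genes, score = scores),
  file = rnk_file, sep = "\t",
  quote = FALSE, row.names = FALSE, col.names = FALSE
)

# Read the ranked file back to confirm its shape.
rnk <- read.delim(rnk_file, header = FALSE, col.names = c("id", "score"),
                  stringsAsFactors = FALSE)
```

The resulting paths could then be passed as elements of analyteListFiles, one file per analyte list.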

analyteTypes

A vector containing the ID types of the analyte lists, one entry per list.

enrichMethod

Enrichment method: ORA or GSEA.

organism

Currently, WebGestaltR supports 12 organisms. Users can use the function listOrganism to check available organisms. Users can also input others to perform the enrichment analysis for other organisms not supported by WebGestaltR. For other organisms, users need to provide the functional categories, interesting list and reference list (for ORA method). Because WebGestaltR does not perform the ID mapping for the other organisms, the above data should have the same ID type.

enrichDatabase

The functional categories for the enrichment analysis. Users can use the function listGeneSet to check the available functional databases for the selected organism. Multiple databases in a vector are supported for ORA and GSEA.

enrichDatabaseFile

Users can provide one or more GMT files as the functional category for enrichment analysis. The extension of the file should be gmt and the first column of the file is the category ID, the second one is the external link for the category. Genes annotated to the category are from the third column. All columns are separated by tabs. The GMT files will be combined with enrichDatabase.
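As a rough illustration of the GMT layout described above (category ID, external link, then member genes, all tab-separated), the following base R sketch writes a small file and parses it back. The set names, links, and genes are made up.

```r
# Hypothetical gene sets: category ID, external link, then member genes.
gene_sets <- list(
  list(id = "SET_A", link = "http://example.org/SET_A",
       genes = c("TP53", "BRCA1", "ATM")),
  list(id = "SET_B", link = "http://example.org/SET_B",
       genes = c("EGFR", "KRAS"))
)

# One line per category, fields separated by tabs.
gmt_lines <- vapply(
  gene_sets,
  function(s) paste(c(s$id, s$link, s$genes), collapse = "\t"),
  character(1)
)
gmt_file <- file.path(tempdir(), "example_sets.gmt")
writeLines(gmt_lines, gmt_file)

# Parse the file back: genes start at the third column.
fields <- strsplit(readLines(gmt_file), "\t", fixed = TRUE)
parsed <- lapply(fields, function(f) f[-(1:2)])
names(parsed) <- vapply(fields, function(f) f[1], character(1))
```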

enrichDatabaseType

The ID type of the genes in the enrichDatabaseFile. If users set organism as others, users do not need to set this ID type because WebGestaltR will not perform ID mapping for other organisms. The supported ID types of WebGestaltR for the selected organism can be found by the function listIdType.

enrichDatabaseDescriptionFile

Users can also provide description files for the custom enrichDatabaseFile. The extension of the description file should be des. The description file contains two columns: the first column is the category ID that should be exactly the same as the category ID in the custom enrichDatabaseFile and the second column is the description of the category. All columns are separated by tabs.
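A description file matching this layout might be produced as follows. The category IDs and descriptions are hypothetical, and the final check simply confirms that every category ID from the custom GMT file has a description.

```r
# Hypothetical category IDs, as they would appear in the custom GMT file.
gmt_ids <- c("SET_A", "SET_B")

# Two tab-separated columns: category ID, then description.
des <- data.frame(
  id = gmt_ids,
  description = c("Example DNA repair set", "Example growth signaling set"),
  stringsAsFactors = FALSE
)
des_file <- file.path(tempdir(), "example_sets.des")
write.table(des, des_file, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Read it back and look for GMT categories without a description.
des_back <- read.delim(des_file, header = FALSE,
                       col.names = c("id", "description"),
                       stringsAsFactors = FALSE)
missing_ids <- setdiff(gmt_ids, des_back$id)
```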

collapseMethod

The method to collapse duplicate IDs with scores. mean, median, min and max represent the mean, median, minimum and maximum of scores for the duplicate IDs.
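The effect of each collapse option can be illustrated with base R. This is a sketch of the idea only, not WebGestaltR's internal code; collapse_scores is a hypothetical helper and the IDs and scores are made up.

```r
# Hypothetical ranked list with duplicate IDs.
ranked <- data.frame(
  id    = c("A", "B", "A", "C", "B"),
  score = c(1, 4, 3, 2, 8)
)

# Collapse duplicate IDs to one score each using the chosen summary.
collapse_scores <- function(df, method = c("mean", "median", "min", "max")) {
  method <- match.arg(method)
  aggregate(score ~ id, data = df, FUN = match.fun(method))
}

by_mean <- collapse_scores(ranked, "mean")
by_max  <- collapse_scores(ranked, "max")
```

With these inputs, ID A has scores 1 and 3, so mean gives 2 and max gives 3.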

minNum

WebGestaltR will exclude the categories with the number of annotated genes less than minNum for enrichment analysis. The default is 10.

maxNum

WebGestaltR will exclude the categories with the number of annotated genes larger than maxNum for enrichment analysis. The default is 500.
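Taken together, minNum and maxNum define an inclusive size window. A minimal sketch with hypothetical category sizes:

```r
# Hypothetical numbers of annotated genes per category.
set_sizes <- c(SET_A = 5, SET_B = 10, SET_C = 250, SET_D = 500, SET_E = 501)

minNum <- 10
maxNum <- 500

# Categories smaller than minNum or larger than maxNum are excluded,
# so sizes equal to either bound are kept.
kept <- names(set_sizes)[set_sizes >= minNum & set_sizes <= maxNum]
```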

fdrMethod

For the ORA method, WebGestaltR supports six multiple-testing correction methods: holm, hochberg, hommel, bonferroni, BH and BY. The default is BH.
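The listed method names coincide with those accepted by base R's p.adjust, so their behavior can be compared side by side. Whether WebGestaltR calls p.adjust internally is an assumption; the p-values are made up.

```r
# Hypothetical raw p-values from an ORA run.
p <- c(0.001, 0.01, 0.02, 0.04, 0.2)

methods <- c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY")

# One column of adjusted p-values per correction method.
adjusted <- sapply(methods, function(m) p.adjust(p, method = m))
round(adjusted, 4)
```

BH and BY control the false discovery rate, while holm, hochberg, hommel and bonferroni control the family-wise error rate and are typically more conservative.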

sigMethod

Two significance methods are available in WebGestaltR: fdr and top. With fdr, enriched categories are identified based on the FDR threshold. With top, all categories are ranked by FDR and the top-ranked categories are selected as enriched. The default is fdr.

fdrThr

The significance threshold for the fdr method. The default is 0.05.

topThr

The threshold for the top method. The default is 10.
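The difference between the two significance methods can be sketched on a small result table. select_enriched is a hypothetical helper, the table is made up, and the strict comparison against fdrThr is an assumption.

```r
# Hypothetical enrichment results.
results <- data.frame(
  geneSet = c("S1", "S2", "S3", "S4"),
  FDR     = c(0.20, 0.01, 0.04, 0.60),
  stringsAsFactors = FALSE
)

select_enriched <- function(res, sigMethod = c("fdr", "top"),
                            fdrThr = 0.05, topThr = 10) {
  sigMethod <- match.arg(sigMethod)
  res <- res[order(res$FDR), , drop = FALSE]
  if (sigMethod == "fdr") {
    # Keep categories below the FDR threshold (strict "<" is an assumption).
    res[res$FDR < fdrThr, , drop = FALSE]
  } else {
    # Keep the topThr best-ranked categories regardless of their FDR.
    head(res, topThr)
  }
}

by_fdr <- select_enriched(results, "fdr", fdrThr = 0.05)
by_top <- select_enriched(results, "top", topThr = 3)
```

Note that top can return categories with a large FDR, since it only ranks.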

reportNum

The number of enriched categories visualized in the final report. The default is 100. A larger reportNum may make the report slow to render.

setCoverNum

The number of expected gene sets after set cover to reduce redundancy. It could get fewer sets if the coverage reaches 100%. The default is 10.

perNum

The number of permutations for the GSEA method. The default is 1000.

gseaP

The exponential scaling factor of the phenotype score. The default is 1. When p=0, ES reduces to standard K-S statistics (See original paper for more details).
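The role of the exponent can be illustrated with a small running-sum calculation in the style of the published GSEA statistic. This is a sketch, not WebGestaltR's implementation; enrichment_score is a hypothetical helper and the scores and set membership are made up.

```r
# Weighted running-sum enrichment score for one gene set.
# scores: per-gene ranking scores; in_set: logical membership; p: exponent.
enrichment_score <- function(scores, in_set, p = 1) {
  ord    <- order(scores, decreasing = TRUE)
  scores <- scores[ord]
  in_set <- in_set[ord]

  # Hits are weighted by |score|^p; with p = 0 every hit has weight 1.
  hit_weight <- abs(scores)^p * in_set
  hit  <- cumsum(hit_weight) / sum(hit_weight)
  miss <- cumsum(!in_set) / sum(!in_set)

  running <- hit - miss
  # The score is the running-sum value with the largest absolute deviation.
  running[which.max(abs(running))]
}

scores <- c(5, 4, 3, 2, 1)
in_set <- c(TRUE, FALSE, FALSE, TRUE, FALSE)

es_weighted <- enrichment_score(scores, in_set, p = 1)
es_ks       <- enrichment_score(scores, in_set, p = 0)
```

With p = 1 the top-ranked hit dominates the running sum, whereas with p = 0 both hits contribute equally, which is the unweighted Kolmogorov-Smirnov-like case.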

isOutput

If isOutput is TRUE, WebGestaltR will create a folder named after projectName and save the results in it. Otherwise, WebGestaltR will only return an R data.frame object containing the enrichment results. If hundreds of gene lists need to be analyzed simultaneously, it is better to set isOutput to FALSE. The default is TRUE.

outputDirectory

The output directory for the results.

projectName

The name of the project. If projectName is NULL, WebGestaltR will use time stamp as the project name.

dagColor

If dagColor is binary, the significant terms in the DAG structure will be colored steel blue for the ORA method, or steel blue (positively related) and dark orange (negatively related) for the GSEA method. If dagColor is continuous, the significant terms in the DAG structure will be colored with a gradient based on the corresponding FDRs.

saveRawGseaResult

Whether the raw result from GSEA is saved as an RDS file, which can be used for plotting. Defaults to FALSE. The saved list includes:

Enrichment_Results

A data frame of GSEA results with statistics

Running_Sums

A matrix of running sum of scores for each gene set

Items_in_Set

A list with ranks of genes for each gene set

gseaPlotFormat

The graphic format of GSEA enrichment plots: svg, png, or c("png", "svg"). The default is png.

nThreads

The number of cores to use for GSEA and set cover, and in batch function.

cache

A directory to save data cache for reuse. Defaults to NULL and disabled.

hostName

The server URL for accessing data. Mostly for development purposes.

useWeightedSetCover

Use weighted set cover for ORA. Defaults to TRUE.

useAffinityPropagation

Use affinity propagation for ORA. Defaults to FALSE.

usekMedoid

Use k-medoid for ORA. Defaults to FALSE.

kMedoid_k

The number of clusters for k-medoid. Defaults to 25.

isMetaAnalysis

Whether to perform meta-analysis. Defaults to TRUE. FALSE is not currently implemented.

mergeMethod

The method to merge the results from multiple omics (options: mean, max). Only used if isMetaAnalysis = FALSE. Defaults to mean.

normalizationMethod

The method to normalize the results from multiple omics (options: rank, median, mean). Only used if isMetaAnalysis = FALSE.

referenceLists

For the ORA method, users can also use an R object as the reference gene list. referenceLists should be an R vector object containing the reference gene list.

referenceListFiles

For the ORA method, users need to provide the reference gene lists. The extension of each file in referenceListFiles should be txt and each file can only contain one column: the reference gene list.

referenceTypes

Vector of the ID types of the reference lists. The supported ID types of WebGestaltR for the selected organism can be found by the function listIdType. If the organism is others, users do not need to set this parameter.

referenceSets

Users can directly select the reference sets from existing platforms in WebGestaltR and do not need to provide the reference set through referenceListFiles. All existing platforms supported in WebGestaltR can be found by the function listReferenceSets. If referenceListFiles and referenceLists are NULL, WebGestaltR will use the referenceSets as the reference analyte sets. Otherwise, WebGestaltR will use the user-supplied reference set for enrichment analysis. Must be a vector with length matching the number of input analyte lists (e.g. c('genome', 'genome', 'KEGG')).

listNames

The names of the analyte lists.
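Putting several of the arguments together, a two-list ORA call might be assembled as sketched below. The file paths, list names, ID types, and database name are assumptions for illustration and should be checked against listIdType, listGeneSet, and listReferenceSets for the chosen organism. The call itself is guarded, since it needs the WebGestaltR package, existing input files, and network access.

```r
# Hypothetical arguments for a two-list ORA run.
args <- list(
  analyteListFiles = c("rna_hits.txt", "protein_hits.txt"),  # hypothetical paths
  analyteTypes     = c("genesymbol", "genesymbol"),  # assumed; see listIdType()
  enrichMethod     = "ORA",
  organism         = "hsapiens",
  enrichDatabase   = "pathway_KEGG",                 # assumed; see listGeneSet()
  referenceSets    = c("genome", "genome"),
  listNames        = c("RNA", "Protein"),            # hypothetical names
  isOutput         = FALSE
)

# Per-list arguments should all have one entry per analyte list.
per_list <- c("analyteListFiles", "analyteTypes", "referenceSets", "listNames")
stopifnot(length(unique(lengths(args[per_list]))) == 1)

# Set to TRUE once the input files exist and WebGestaltR is installed.
run_it <- FALSE
if (run_it && requireNamespace("WebGestaltR", quietly = TRUE)) {
  res <- do.call(WebGestaltR::WebGestaltRMultiOmics, args)
}
```

Keeping the arguments in a list makes it easy to verify their lengths before the analysis is started.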