Functional enrichment

Single nucleotide polymorphisms (SNPs) are the most common type of genetic variation among people. Each SNP is a difference in a single base pair, located either within a gene or between genes. SNPs act as a “digital code” for genetic patterns, so they provide a very fine-grained but noisy measurement of genetic variability.

Functional enrichment analysis uses meta-analysis to associate genes or SNPs with high-level properties such as disease risk, high expression within given cell types, or other phenotypes. This vignette uses an open dataset (Nalls 2019) and a gene ontology query tool (g:Profiler) to identify clusters of SNPs associated with pleiotropic properties. See this review for conceptual background on functional genomics.
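One common way to score whether a set of genes is enriched for an annotation is an over-representation test based on the hypergeometric distribution. The sketch below is a minimal illustration of that idea, not the vignette's own pipeline; the function name and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_annotated, n_query, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: genes in the background set
    n_annotated:  background genes carrying the annotation
    n_query:      genes in the query set
    n_overlap:    query genes carrying the annotation
    """
    total = comb(n_background, n_query)
    upper = min(n_annotated, n_query)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 of them annotated to a term,
# and a query of 4 genes of which 3 carry the annotation.
p = enrichment_p_value(20, 5, 4, 3)
print(f"{p:.4f}")
```

A small p-value suggests the query overlaps the annotated set more than chance would predict. A real analysis would also correct for testing many terms at once.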

First, we generate the association data.
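As a rough sketch of what association data could look like, the snippet below turns a list of (SNP, term) pairs into a binary SNP-by-term matrix. The identifiers are placeholders invented for illustration; in practice the pairs would come from the dataset and the enrichment query.

```python
# Hypothetical (SNP, term) pairs standing in for real query results.
records = [
    ("snp_a", "term_x"),
    ("snp_a", "term_y"),
    ("snp_b", "term_x"),
    ("snp_b", "term_y"),
    ("snp_c", "term_z"),
    ("snp_d", "term_z"),
]

# Fix a stable row and column order.
snps = sorted({snp for snp, _ in records})
terms = sorted({term for _, term in records})
pairs = set(records)

# Rows are SNPs, columns are terms; 1 marks a reported association.
matrix = [[1 if (snp, term) in pairs else 0 for term in terms] for snp in snps]

for snp, row in zip(snps, matrix):
    print(snp, row)
```

Keeping the row and column labels next to the matrix makes it easy to map cluster assignments back to SNP and term names later.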

Next, we perform the clustering explicitly. First, we cluster based on SNPs.

Second, we cluster based on function.
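The two clustering steps above can be sketched with a single routine applied to each axis of the association data. This example assumes one possible reading: SNPs are grouped by the terms they share, and terms are grouped by the SNPs they share. It uses Jaccard similarity with single-linkage grouping, which is a stand-in technique chosen for brevity. The data, the function names, and the 0.5 threshold are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_overlap(profiles, threshold=0.5):
    """Single-linkage grouping: link items whose similarity >= threshold."""
    parent = {name: name for name in profiles}

    def find(name):
        # Follow parent links to the group representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for a, b in combinations(profiles, 2):
        if jaccard(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical mapping from each SNP to its associated terms.
snp_terms = {
    "snp_a": {"term_x", "term_y"},
    "snp_b": {"term_x", "term_y", "term_w"},
    "snp_c": {"term_z"},
    "snp_d": {"term_z"},
}

# Invert the mapping so each term lists the SNPs associated with it.
term_snps = {}
for snp, snp_term_set in snp_terms.items():
    for term in snp_term_set:
        term_snps.setdefault(term, set()).add(snp)

snp_clusters = cluster_by_overlap(snp_terms)
term_clusters = cluster_by_overlap(term_snps)
print(snp_clusters)
print(term_clusters)
```

Single linkage can chain loosely related items into one group, so the threshold deserves some care on real data.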

Finally, we perform bi-clustering, both via joint clustering and via non-negative matrix factorization (NMF):

  • tabulate the pairs of clusters that exist

  • provide evidence that joint clusters are valid (TODO)
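To make the NMF route and the tabulation of cluster pairs concrete, here is a minimal pure-Python sketch. It factors a small hypothetical SNP-by-term matrix with the standard multiplicative updates for the Frobenius loss, labels each SNP and each term by its dominant component, and counts how many associations fall into each (SNP cluster, term cluster) pair. The matrix, the rank of 2, the iteration count, and the helper names are assumptions for illustration. A real analysis would more likely use an established NMF implementation.

```python
import random
from collections import Counter


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, n_iter=500, seed=0, eps=1e-9):
    """Factor v ~ w @ h using multiplicative updates (Frobenius loss)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(n_iter):
        # h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
    return w, h


# Hypothetical binary matrix: rows are SNPs, columns are terms.
v = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
]
rank = 2
w, h = nmf(v, rank)

# Label each SNP (row) and each term (column) by its dominant component.
snp_labels = [max(range(rank), key=lambda k: row[k]) for row in w]
term_labels = [max(range(rank), key=lambda k: h[k][j]) for j in range(len(v[0]))]

# Count associations per (SNP cluster, term cluster) pair.
pair_counts = Counter(
    (snp_labels[i], term_labels[j])
    for i, row in enumerate(v)
    for j, value in enumerate(row)
    if value
)
for pair, count in sorted(pair_counts.items()):
    print(pair, count)
```

Pairs that never appear in the table correspond to combinations of SNP cluster and term cluster with no observed association. The numbering of the components depends on the random initialization, so only the grouping is meaningful.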