Associates predictors (features) with subtypes, which may be diagnoses or cluster assignments. The function uses regression to return data frames intended for visualizing the most important relationships between features and subtypes. Two approaches are available, and it is worth using both. The results can be combined with bootstrapping to give distributional visualizations.

featureImportanceForSubtypes(
  dataframein,
  subtypeLabels,
  featureMatrix,
  associationType = c("features2subtypes", "subtypes2features", "subjects"),
  covariates = "1",
  transform = "effect_sizes",
  significance_level = 0.001,
  visualize = FALSE
)

Arguments

dataframein

Input dataframe with all relevant data

subtypeLabels

Input subtype assignments.

featureMatrix

Matrix or data frame whose columns define the features.

associationType

Either predict features from subtypes or predict subtypes from features. The two directions produce related but complementary results. In some cases, depending on the subtypes and the degrees of freedom, only one will be appropriate. The third option (subjects) reports the row names of the data frame that best fit the related subtype.
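The two regression directions can be illustrated with a minimal base R sketch. This is not the package's implementation: the simulated data, the variable names, and the choice of lm and glm are all assumptions made for illustration, and the sketch does not say which associationType value corresponds to which direction.

```r
# Simulated data; every name below is illustrative, not part of the package.
set.seed(1)
n <- 120
subtype <- factor(rep(c("A", "B", "C"), each = n / 3))
feat <- data.frame(
  f1 = rnorm(n) + 2 * (subtype == "B"),  # elevated in subtype B
  f2 = rnorm(n)                          # simulated independently of subtype
)

# Direction 1: predict each feature from the subtype label (one lm per feature).
# The F-test p-value asks whether the feature mean differs across subtypes.
p_feature_from_subtype <- sapply(names(feat), function(f) {
  fit <- lm(feat[[f]] ~ subtype)
  anova(fit)[["Pr(>F)"]][1]
})

# Direction 2: predict membership in one subtype from all features jointly.
is_b <- as.integer(subtype == "B")
fit_b <- glm(is_b ~ f1 + f2, data = feat, family = binomial)
z_subtype_from_feature <- coef(summary(fit_b))[c("f1", "f2"), "z value"]
```

The first direction fits one model per feature, so it scales to many features even with few subjects per subtype. The second fits all features jointly, which is one reason degrees of freedom can rule it out.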

covariates

Optional string of covariates (default "1").

transform

Optional transform applied to the results (default "effect_sizes").

significance_level

Significance level used to threshold effects (default 0.001).

visualize

Logical; whether to visualize the results (default FALSE).

Value

Data frames for visualization that show feature-to-subtype importance, e.g. via pheatmap.

Author

Avants BB

Examples

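# simulate data for 100 subjects (used below: a DX column and "Random" feature columns)
mydf <- generateSubtyperData( 100 )
# select the feature columns whose names contain "Random"
rbfnames <- names(mydf)[grep("Random", names(mydf))]
fimp <- featureImportanceForSubtypes( mydf, mydf$DX, mydf[, rbfnames], "subtypes2features" )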
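The description mentions combining the results with bootstrapping. The following is a minimal sketch of that idea in base R, under stated assumptions: the structure of the returned data frames is not documented here, so a hypothetical stand-in statistic (effect_size, a standardized mean difference) takes the place of a call to featureImportanceForSubtypes on each resample.

```r
set.seed(42)
n <- 100
dx <- factor(rep(c("A", "B"), each = n / 2))  # simulated two-level subtype label
x <- rnorm(n) + 1.5 * (dx == "B")             # one simulated feature

# Hypothetical stand-in for an importance measure: difference in group means
# divided by the root of the average within-group variance.
effect_size <- function(x, g) {
  m <- tapply(x, g, mean)
  s <- sqrt(mean(tapply(x, g, var)))
  unname((m[2] - m[1]) / s)
}

# Resample subjects with replacement and recompute the statistic each time.
boot <- replicate(200, {
  idx <- sample(n, replace = TRUE)
  effect_size(x[idx], dx[idx])
})

# Summarize the bootstrap distribution; hist(boot) would plot it.
ci <- quantile(boot, c(0.025, 0.975))
```

In practice the body of the replicate call would be replaced by a call on the resampled rows of the input data frame, keeping whichever column of the result is of interest.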