Select top features via linear regression on an outcome

regressionBasedFeatureSelection(
  dataframein,
  subtypeLabels,
  featureMatrix,
  covariates = "1",
  n_features = 25,
  associationType = "features2subtypes"
)

Arguments

dataframein

Input data frame containing all relevant data.

subtypeLabels

Input subtype assignments.

featureMatrix

Matrix or data frame whose columns are the candidate features.

covariates

Optional string of covariates; defaults to "1".

n_features

Number of features to select per level; defaults to 25.

associationType

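Direction of the association model, either "features2subtypes" (the default) or "subtypes2features": predict subtypes from features, or predict features from subtypes. The two directions produce related but complementary results. In some cases, depending on the subtypes and the available degrees of freedom, only one direction will be appropriate.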
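
The two association directions can be illustrated with a minimal base R sketch. This is not the package's implementation. The toy data, the helper `topPerLevel`, the use of absolute t-statistics as the ranking score, and the way the `covariates` string is pasted into the model formula are all assumptions made for illustration.

```r
# Hypothetical toy data: three subtypes and six candidate features.
# None of this comes from generateSubtyperData().
set.seed(1)
n <- 300
toy <- data.frame(
  subtype = factor(rep(c("A", "B", "C"), each = n / 3)),
  matrix(rnorm(n * 6), nrow = n, dimnames = list(NULL, paste0("f", 1:6)))
)
# Make f1 informative for subtype B and f2 informative for subtype C.
toy$f1[toy$subtype == "B"] <- toy$f1[toy$subtype == "B"] + 2
toy$f2[toy$subtype == "C"] <- toy$f2[toy$subtype == "C"] + 2

featNames <- paste0("f", 1:6)
covariates <- "1" # assumed to be appended to the right-hand side of the formula

# Direction 1: predict each feature from the subtype labels.
# Score = absolute t-statistic of each non-reference subtype coefficient.
scoreFeaturesFromSubtypes <- sapply(featNames, function(f) {
  fit <- lm(as.formula(paste(f, "~ subtype +", covariates)), data = toy)
  abs(coef(summary(fit))[paste0("subtype", c("B", "C")), "t value"])
})

# Direction 2: predict a 0/1 indicator for each subtype level from each feature.
# Score = absolute t-statistic of the feature coefficient.
scoreSubtypesFromFeatures <- sapply(featNames, function(f) {
  sapply(levels(toy$subtype), function(lev) {
    toy$isLevel <- as.numeric(toy$subtype == lev)
    fit <- lm(as.formula(paste("isLevel ~", f, "+", covariates)), data = toy)
    abs(coef(summary(fit))[f, "t value"])
  })
})

# Keep the k best-scoring features for each row (level) of a score matrix
# and return the union of their names.
topPerLevel <- function(scores, k = 1) {
  picks <- lapply(seq_len(nrow(scores)), function(i) {
    names(sort(scores[i, ], decreasing = TRUE))[seq_len(k)]
  })
  unique(unlist(picks))
}

selectedFromSubtypes <- topPerLevel(scoreFeaturesFromSubtypes, k = 1)
selectedFromFeatures <- topPerLevel(scoreSubtypesFromFeatures, k = 1)
```

On this toy data both directions should pick out `f1` and `f2`. With real data the two rankings can differ, which is why the choice of direction matters.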

Value

Vector of selected feature names.

Author

Avants BB

Examples

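# Generate example data
mydf <- generateSubtyperData(100)
# Use the columns whose names contain "Random" as candidate features
rbfnames <- names(mydf)[grep("Random", names(mydf))]
# Select features, modeling the association in the "subtypes2features" direction
fimp <- regressionBasedFeatureSelection(mydf, mydf$DX, mydf[, rbfnames],
  associationType = "subtypes2features")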