R/subtyper.R
regressionBasedFeatureSelection.Rd
Select top features via linear regression on an outcome
regressionBasedFeatureSelection(
  dataframein,
  subtypeLabels,
  featureMatrix,
  covariates = "1",
  n_features = 25,
  associationType = "features2subtypes"
)
dataframein: input data frame with all relevant data.
subtypeLabels: input subtype assignments.
featureMatrix: matrix or data frame whose columns are the candidate features.
covariates: optional string of covariates (default "1").
n_features: number of features to select per level.
associationType: either "features2subtypes" or "subtypes2features"; whether to predict features from subtypes or subtypes from features. The two directions produce related but complementary results. In some cases, depending on the subtypes and the available degrees of freedom, only one will be appropriate.
Vector of selected feature names.
# Simulate a small example dataset
mydf <- generateSubtyperData(100)
# Candidate features: columns whose names contain "Random"
rbfnames <- grep("Random", names(mydf), value = TRUE)
fimp <- regressionBasedFeatureSelection(mydf, mydf$DX, mydf[, rbfnames],
  associationType = "subtypes2features")
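
The two directions described under associationType can be illustrated with a minimal base R sketch. This is not the package's implementation: the simulated data, the helper functions score_feature_from_subtype and score_subtype_from_feature, and the choice of F and t statistics as scores are all assumptions made for illustration, and which option name corresponds to which direction is not confirmed here.

```r
# Hypothetical illustration only; not the subtyper implementation.
set.seed(1)
n <- 120
subtype <- factor(rep(c("A", "B", "C"), each = n / 3))
feats <- data.frame(
  f1 = rnorm(n) + 2 * (subtype == "B"),  # shifted in subtype B
  f2 = rnorm(n) + 2 * (subtype == "C"),  # shifted in subtype C
  f3 = rnorm(n)                          # pure noise
)

# Direction 1: model each feature as a function of subtype (feature ~ subtype)
# and score the feature by the overall F statistic of that fit.
score_feature_from_subtype <- function(x, g) {
  fit <- summary(lm(x ~ g))
  unname(fit$fstatistic["value"])
}
f_scores <- sapply(feats, score_feature_from_subtype, g = subtype)

# Direction 2: model a 0/1 indicator for one subtype level as a function of
# a single feature (indicator ~ feature) and score by the absolute t statistic.
score_subtype_from_feature <- function(x, g, level) {
  y <- as.numeric(g == level)
  fit <- summary(lm(y ~ x))
  abs(fit$coefficients["x", "t value"])
}
# Rows are features, columns are subtype levels.
t_scores <- sapply(levels(subtype), function(lv) {
  sapply(feats, score_subtype_from_feature, g = subtype, level = lv)
})

# Best-scoring feature for each subtype level under direction 2.
top_per_level <- apply(t_scores, 2, function(col) names(which.max(col)))
```

Direction 1 yields one score per feature, while direction 2 yields one score per feature and level, which is one way a "per level" selection could arise. With few samples per subtype or many levels, one of the two fits may have too few degrees of freedom to be useful.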