Inference module for subtype definition based on a matrix. It currently supports only clustering based on Gaussian mixtures via the ClusterR package.
predictSubtypeClusterMulti(
  mxdfin,
  measureColumns,
  clusteringObject,
  clustername = "GMMClusters",
  idvar,
  visitName,
  baselineVisit,
  reorderingDataframe,
  distance_metric = "pearson_correlation"
)
mxdfin: input data frame.
measureColumns: vector defining the data columns to be used for clustering.
clusteringObject: a clustering object used to predict clusters.
clustername: column name for the identified clusters.
idvar: variable name for the unique subject identifier column.
visitName: the column name defining the visit variables.
baselineVisit: the string naming the baseline visit.
reorderingDataframe: reorder the cluster names based on this data frame, which maps original to new names.
distance_metric: distance metric; see the medoid methods in ClusterR.
Returns the input data frame with the predicted clusters attached; membership probabilities are also returned.
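# Generate example data
mydf = generateSubtyperData( 100 )
# Select the measurement columns whose names contain "Random"
rbfnames = names(mydf)[grep("Random",names(mydf))]
# Train the clustering object on those columns (maxk = 4)
gmmcl = trainSubtypeClusterMulti( mydf, rbfnames, maxk=4 )
# Predict clusters for the same data with the trained object
gmmclp = predictSubtypeClusterMulti( mydf, rbfnames, gmmcl )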
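For orientation, a minimal sketch of Gaussian mixture prediction using ClusterR directly. It is not this function's implementation: the toy matrix `x`, the output name `out`, and the `prob` column prefix are assumptions made for illustration, and only `GMM` and `predict_GMM` from ClusterR are used. It shows how cluster labels and per-component membership probabilities can be attached to a data frame.

```r
library(ClusterR)

set.seed(1)
# Hypothetical toy matrix: two well-separated groups, 50 rows each, 2 columns
x <- rbind(
  matrix(rnorm(100, mean = 0), ncol = 2),
  matrix(rnorm(100, mean = 5), ncol = 2)
)

# Fit a two-component Gaussian mixture
gmm <- GMM(x, gaussian_comps = 2, dist_mode = "eucl_dist",
           seed_mode = "random_subset", km_iter = 10, em_iter = 10)

# Predict cluster labels and membership probabilities from the fitted parameters
pr <- predict_GMM(x, gmm$centroids, gmm$covariance_matrices, gmm$weights)

# Attach labels and probabilities to a data frame (column names are illustrative)
out <- data.frame(x, GMMClusters = pr$cluster_labels, prob = pr$cluster_proba)
head(out)
```

Each row of `pr$cluster_proba` holds one probability per mixture component, so `out` gains one `prob` column per component in addition to the label column.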