Inference module for subtype definition: predicts subtype (cluster) membership from a matrix of measurements using a previously trained clustering object. Currently, only clustering based on Gaussian mixtures via the ClusterR package is supported.

predictSubtypeClusterMulti(
  mxdfin,
  measureColumns,
  clusteringObject,
  clustername = "GMMClusters",
  idvar,
  visitName,
  baselineVisit,
  reorderingDataframe,
  distance_metric = "pearson_correlation"
)

Arguments

mxdfin

Input data frame

measureColumns

Vector of column names defining the data columns to be used for clustering.

clusteringObject

A trained clustering object used to predict cluster membership, e.g. as returned by trainSubtypeClusterMulti.

clustername

Name of the output column that holds the identified clusters.

idvar

Name of the column containing the unique subject identifier.

visitName

Name of the column defining the visit.

baselineVisit

String naming the baseline visit.

reorderingDataframe

Data frame mapping original cluster names to new names; the predicted clusters are reordered (renamed) according to this mapping.

distance_metric

Distance metric; see the medoid methods in ClusterR for the available options.
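
The relabeling described for reorderingDataframe can be pictured with a minimal base-R sketch. The column names `original` and `new`, the object name `reorderingExample`, and the label values are all assumptions made for illustration; the layout this function actually expects is not specified on this page.

```r
# Hypothetical mapping of original cluster labels to new labels.
# The column names 'original' and 'new' are assumptions, not the package's API.
reorderingExample <- data.frame(
  original = c("1", "2", "3"),
  new      = c("3", "1", "2"),
  stringsAsFactors = FALSE
)

# Predicted labels as they might appear in the cluster column.
predicted <- c("2", "1", "3", "2")

# Look up each predicted label in the mapping and substitute the new name.
relabeled <- reorderingExample$new[match(predicted, reorderingExample$original)]
print(relabeled)
```

Using `match` keeps the lookup vectorized and returns `NA` for any label missing from the mapping, which makes incomplete mappings easy to spot.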

Value

The input data frame with the predicted clusters attached, together with the cluster membership probabilities.
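
As a rough illustration of what membership probabilities mean for a Gaussian mixture, the following base-R sketch computes them for a one-dimensional, two-component mixture. The mixture parameters and data values are invented for the example, and this is not the ClusterR or package implementation, only the general Bayes-rule calculation.

```r
# Two-component 1-D Gaussian mixture with assumed (illustrative) parameters.
weights <- c(0.5, 0.5)
means   <- c(0, 4)
sds     <- c(1, 1)

# Illustrative observations to classify.
x <- c(-0.5, 1.5, 4.2)

# Weighted component densities: one row per observation, one column per component.
dens <- sapply(seq_along(weights), function(k) {
  weights[k] * dnorm(x, mean = means[k], sd = sds[k])
})

# Normalize each row so an observation's membership probabilities sum to one.
proba <- dens / rowSums(dens)

# Hard assignment: the component with the highest membership probability.
labels <- max.col(proba, ties.method = "first")
print(round(proba, 3))
print(labels)
```

Each row of `proba` plays the role of the membership probabilities mentioned above, and `labels` corresponds to the attached cluster column.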

Author

Avants BB

Examples

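# simulate example data
mydf <- generateSubtyperData( 100 )
# select the measurement columns whose names contain "Random"
rbfnames <- grep( "Random", names(mydf), value = TRUE )
# train the Gaussian mixture clustering with maxk = 4
gmmcl <- trainSubtypeClusterMulti( mydf, rbfnames, maxk = 4 )
# predict clusters (and membership probabilities) with the trained object
gmmclp <- predictSubtypeClusterMulti( mydf, rbfnames, gmmcl )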