R/subtyper.R
harmonize_sites.Rd
Adjusts the specified features within each site so that the site's control-group mean matches the control-group mean of a reference site. The adjustment is derived from the controls only but is applied to every row in the site, regardless of diagnosis, and each feature is handled separately.
harmonize_sites(
  data,
  site_col,
  diagnosis_col,
  control_label,
  feature_cols,
  reference_site
)
data
: A data frame containing the data.
site_col
: A string indicating the column name for site identifiers.
diagnosis_col
: A string indicating the column name for diagnosis identifiers.
control_label
: The label in the diagnosis column identifying the control group.
feature_cols
: A vector of strings specifying the feature columns to be harmonized.
reference_site
: The site identifier to use as the reference site for control means.
A list containing:
harmonized_data
: the data frame with features adjusted across sites
summary_stats
: a data frame with original control means by site and reference means for each feature
# harmonize_sites(df, site_col = "Site", diagnosis_col = "Diagnosis",
# control_label = "Control", feature_cols = c("Feature1", "Feature2"), reference_site = "SiteA")
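The description above does not say whether the adjustment is additive or multiplicative. The following is a minimal sketch of the idea, assuming a simple additive shift: subtract the site's control mean and add the reference site's control mean. The function name harmonize_sketch and the toy data are hypothetical and are not part of the package. The sketch also assumes there are no missing values in the site or diagnosis columns and that every site contains at least one control.

```r
# Hypothetical illustration only; the additive shift is an assumption,
# not the package's confirmed method.
harmonize_sketch <- function(data, site_col, diagnosis_col, control_label,
                             feature_cols, reference_site) {
  is_control <- data[[diagnosis_col]] == control_label
  is_reference <- data[[site_col]] == reference_site

  for (feature in feature_cols) {
    # Control mean of this feature in the reference site
    ref_mean <- mean(data[[feature]][is_control & is_reference], na.rm = TRUE)

    for (site in unique(data[[site_col]])) {
      in_site <- data[[site_col]] == site
      # Control mean of this feature in the current site
      site_mean <- mean(data[[feature]][is_control & in_site], na.rm = TRUE)
      # Shift every row in the site, controls and non-controls alike,
      # by the same offset
      data[[feature]][in_site] <- data[[feature]][in_site] - site_mean + ref_mean
    }
  }
  data
}

# Toy data: SiteB controls sit 2 units above SiteA controls on Feature1
df <- data.frame(
  Site = c("SiteA", "SiteA", "SiteA", "SiteB", "SiteB", "SiteB"),
  Diagnosis = c("Control", "Control", "Case", "Control", "Control", "Case"),
  Feature1 = c(1, 3, 5, 3, 5, 9)
)

out <- harmonize_sketch(df, site_col = "Site", diagnosis_col = "Diagnosis",
                        control_label = "Control", feature_cols = "Feature1",
                        reference_site = "SiteA")

# Control means per site after the shift
control_means <- tapply(out$Feature1[out$Diagnosis == "Control"],
                        out$Site[out$Diagnosis == "Control"], mean)
```

In this toy case the SiteA rows are unchanged and every SiteB row moves down by 2, so both sites end with a control mean of 2 while the SiteB case keeps its distance from the SiteB controls.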