Remove Perfectly Correlated Columns from a Data Frame
Source: R/simlr_nhanes.R
This function identifies pairs of columns in a data frame that are perfectly correlated (correlation coefficient of exactly 1 or -1) and removes the second column of each such pair, so that no two columns in the returned data frame are perfectly correlated.
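The behaviour described above can be illustrated with a minimal sketch in base R. This is not the package's source code: the function name `drop_perfectly_correlated_sketch`, the `tol` argument, and the choices to skip non-numeric columns and to treat `NA` correlations as not perfect are all assumptions made for illustration.

```r
# Hypothetical re-implementation for illustration only; the real
# remove_perfectly_correlated() may differ in its details.
drop_perfectly_correlated_sketch <- function(df, tol = 1e-12) {
  # Assumption: only numeric columns are compared; other columns are kept.
  is_num <- vapply(df, is.numeric, logical(1))
  num_names <- names(df)[is_num]
  if (length(num_names) < 2) return(df)

  cors <- stats::cor(df[num_names], use = "pairwise.complete.obs")
  # Keep only entries above the diagonal so each pair is checked once
  # and it is the later column of a pair that gets flagged.
  cors[lower.tri(cors, diag = TRUE)] <- 0
  # Assumption: NA correlations (e.g. constant columns) count as not perfect.
  perfect <- !is.na(cors) & abs(abs(cors) - 1) < tol
  drop <- num_names[colSums(perfect) > 0]
  df[, setdiff(names(df), drop), drop = FALSE]
}

# Usage: b duplicates a, and e is a perfectly negatively correlated with a.
toy <- data.frame(a = 1:5, b = 1:5, c = c(5.5, 4.1, 1.0, 2.8, 0.7),
                  d = c(2, 1, 6, 8, 10), e = 5:1)
names(drop_perfectly_correlated_sketch(toy))  # columns kept: a, c, d
```

Zeroing the lower triangle is what makes the sketch keep the first column of each pair and drop the second, matching the description; the small tolerance only guards against floating-point rounding in the correlation estimate.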
Examples
# Column c includes random noise, so its values differ between runs.
df <- data.frame(a = 1:5, b = 1:5, c = 5:1 + rnorm(5), d = c(2, 1, 6, 8, 10))
remove_perfectly_correlated(df)
#>   a         c  d
#> 1 1 5.5777091  2
#> 2 2 4.1181949  1
#> 3 3 1.0882795  6
#> 4 4 2.8620865  8
#> 5 5 0.7567633 10