Identifies pairs of columns in a data frame that are perfectly correlated (correlation coefficient of 1 or -1) and removes the second column of each pair, so that no two columns in the returned data frame are perfectly correlated.

Usage

remove_perfectly_correlated(df, tolerance = 1e-06)

Arguments

df

A data frame containing numeric columns to be checked for perfect correlation.

tolerance

Numeric tolerance used when testing whether a correlation coefficient equals 1 or -1. Defaults to 1e-06.

Value

A data frame with redundant perfectly correlated columns removed.

Examples

# Column b duplicates a; column c includes random noise, so its values vary between runs
df <- data.frame(a = 1:5, b = 1:5, c = 5:1 + rnorm(5), d = c(2, 1, 6, 8, 10))
remove_perfectly_correlated(df)
#>   a         c  d
#> 1 1 5.5777091  2
#> 2 2 4.1181949  1
#> 3 3 1.0882795  6
#> 4 4 2.8620865  8
#> 5 5 0.7567633 10
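
The behaviour described above can be approximated with the following sketch. It is not the package's source code: the name `remove_perfectly_correlated_sketch` is hypothetical, and the use of Pearson correlation and the comparison `abs(abs(r) - 1) < tolerance` are assumptions about how the tolerance might be applied.

```r
# Hypothetical re-implementation for illustration only; not the package's source.
remove_perfectly_correlated_sketch <- function(df, tolerance = 1e-06) {
  # Only numeric columns can be checked for correlation
  is_num <- vapply(df, is.numeric, logical(1))
  num_names <- names(df)[is_num]
  if (length(num_names) < 2) return(df)

  # Pairwise Pearson correlations between the numeric columns (assumed method)
  cors <- stats::cor(df[num_names])

  drop <- character(0)
  for (j in seq_along(num_names)[-1]) {
    for (i in seq_len(j - 1)) {
      # Ignore earlier columns that have already been marked for removal
      if (num_names[i] %in% drop) next
      r <- cors[i, j]
      # Assumed test: |r| is within `tolerance` of 1 (covers both 1 and -1)
      if (!is.na(r) && abs(abs(r) - 1) < tolerance) {
        drop <- c(drop, num_names[j])  # remove the second column of the pair
        break
      }
    }
  }
  df[, !(names(df) %in% drop), drop = FALSE]
}

# b duplicates a (r = 1) and d is a reversed (r = -1); c is unrelated
df <- data.frame(a = 1:5, b = 1:5, c = c(2, 1, 6, 8, 10), d = 5:1)
names(remove_perfectly_correlated_sketch(df))
#> [1] "a" "c"
```

In this sketch the earlier column of each pair is always kept, so column order determines which of two redundant columns survives.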