This function selects the top k rows of a data frame according to a specified column and criterion (either maximizing or minimizing the values). It returns a logical vector indicating which rows are among the top k.
topk(df, column, k, maximize = TRUE)
A logical vector of the same length as the number of rows in df. TRUE indicates that the row is one of the top k.
df <- data.frame(
  id = 1:10,
  value = c(5, 2, 9, 4, 7, 3, 6, 10, 8, 1)
)
# Select the top 3 rows based on maximizing the 'value' column
topk(df, "value", 3, maximize = TRUE)
#> [1] FALSE FALSE TRUE FALSE FALSE FALSE FALSE TRUE TRUE FALSE
# Select the top 3 rows based on minimizing the 'value' column
topk(df, "value", 3, maximize = FALSE)
#> [1] FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE
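Because the result is a logical vector, it can be used directly to subset the data frame. The sketch below illustrates this with a hypothetical stand-in, topk_sketch(), which is not the package's implementation. It assumes that rows are ranked with order(), so ties and NA values follow order()'s defaults; the real topk() may treat those cases differently.

```r
# Hypothetical stand-in for topk(); the actual implementation may differ,
# in particular in how it handles ties and NA values.
topk_sketch <- function(df, column, k, maximize = TRUE) {
  stopifnot(is.data.frame(df), column %in% names(df), k >= 0)
  values <- df[[column]]
  # Row indices sorted from best to worst under the chosen criterion.
  ranked <- order(values, decreasing = maximize)
  # Start with all FALSE, then flag the first k ranked rows.
  selected <- logical(nrow(df))
  selected[head(ranked, k)] <- TRUE
  selected
}

df <- data.frame(
  id = 1:10,
  value = c(5, 2, 9, 4, 7, 3, 6, 10, 8, 1)
)

# Keep only the rows flagged TRUE; rows stay in their original order.
keep <- topk_sketch(df, "value", 3, maximize = TRUE)
top_rows <- df[keep, ]
```

Here top_rows holds the rows with id 3, 8, and 9, in their original order rather than sorted by value. If a ranked result is needed, sort the subset afterwards.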