R/subtyper.R
merge_ADNI_antspymm_by_closest_date.Rd
This function merges two data frames on a common patient ID and the closest date, so that each row in the first data frame (dfA) is matched with the row from the second data frame (dfB) that has the closest date for the same patient ID. The merged data frame includes all columns from both dfA and dfB, excluding the patient ID and date columns from dfB to avoid duplication. Date columns are handled as Date objects, and a progress bar shows how far the matching has progressed. dfA is assumed to contain an EXAMDATE column and dfB a date column, where date is formatted numerically as YYYYMMDD.
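The matching rule described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name `closest_date_merge_sketch`, the toy columns `age` and `volume`, and the choice to fill unmatched rows with NA are all assumptions made for illustration. It also assumes every date parses cleanly.

```r
# Hypothetical helper illustrating closest-date matching; not the package code.
closest_date_merge_sketch <- function(dfA, dfB, patientidcol = "subjectID") {
  # Parse dates: EXAMDATE as "YYYY-MM-DD", date as numeric YYYYMMDD.
  dateA <- as.Date(as.character(dfA$EXAMDATE))
  dateB <- as.Date(as.character(dfB$date), format = "%Y%m%d")
  # Columns of dfB to carry over; the ID and date columns are dropped.
  keepB <- setdiff(names(dfB), c(patientidcol, "date"))
  # For each row of dfA, find the dfB row with the same ID and the smallest
  # absolute date gap (NA when the ID does not occur in dfB).
  idx <- vapply(seq_len(nrow(dfA)), function(i) {
    candidates <- which(dfB[[patientidcol]] == dfA[[patientidcol]][i])
    if (length(candidates) == 0) return(NA_integer_)
    gaps <- abs(as.numeric(dateB[candidates] - dateA[i]))
    candidates[which.min(gaps)]
  }, integer(1))
  matched <- dfB[idx, keepB, drop = FALSE]
  rownames(matched) <- NULL
  cbind(dfA, matched)
}

# Toy data (invented for this sketch).
toyA <- data.frame(
  subjectID = c("s1", "s1", "s2", "s3"),
  EXAMDATE  = c("2020-01-15", "2021-06-01", "2020-03-10", "2020-05-05"),
  age       = c(70, 71, 65, 80)
)
toyB <- data.frame(
  subjectID = c("s1", "s1", "s2"),
  date      = c(20200110, 20210701, 20190101),
  volume    = c(1.0, 2.0, 3.0)
)
merged_sketch <- closest_date_merge_sketch(toyA, toyB)
```

The result has one row per row of `toyA`. How the real function treats IDs that are missing from dfB is not stated here, so the NA fill is only a property of this sketch.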
merge_ADNI_antspymm_by_closest_date(
  dfA,
  dfB,
  patientidcol = "subjectID",
  verbose = TRUE
)
The first data frame to be merged, expected to contain columns for patient ID and date. If the patient ID is stored in PTID, you may first need to set dfA$subjectID = dfA$PTID.
The second data frame to be merged, expected to contain columns for patient ID and date. Rows from dfB are matched to dfA based on the closest date for each patient ID.
Character string specifying the column name in both data frames that contains the patient ID. Default is 'subjectID'.
Logical indicating whether to print additional information during processing, such as the number of common IDs found and the dimensions of the data frames before and after merging. Default is TRUE. Setting verbose to 2 provides more feedback.
Returns a merged data frame with the same number of rows as dfA that includes all columns from both dfA and dfB, with dfB columns matched based on the closest date for each patient ID. The date columns from dfB are excluded to avoid duplication.
The function assumes that the date column in dfB is formatted as 'YYYYMMDD' and converts it to Date objects for processing. dfA's EXAMDATE is expected in the form YYYY-MM-DD. The progress bar uses base R's txtProgressBar and is displayed in the console.
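As a rough illustration of the two date formats and the console progress bar mentioned above, the following base R sketch parses toy values and updates a `txtProgressBar` inside a loop. The variable names and values are invented for the example, and the loop body is a placeholder rather than the function's actual matching step.

```r
# Toy values only; real ADNI / ANTsPyMM tables may differ.
dates_b <- c(20200110, 20210701)          # numeric YYYYMMDD, as assumed for dfB$date
dates_a <- c("2020-01-15", "2021-06-01")  # YYYY-MM-DD, as assumed for dfA$EXAMDATE

# Convert both representations to Date objects.
parsed_b <- as.Date(as.character(dates_b), format = "%Y%m%d")
parsed_a <- as.Date(dates_a)

# Loop with a console progress bar, computing the absolute gap in days.
pb <- utils::txtProgressBar(min = 0, max = length(parsed_a), style = 3)
gap_days <- numeric(length(parsed_a))
for (i in seq_along(parsed_a)) {
  gap_days[i] <- abs(as.numeric(parsed_a[i] - parsed_b[i]))
  utils::setTxtProgressBar(pb, i)
}
close(pb)
```

Once both columns are Date objects, subtracting them gives a difference in days, which is what makes a "closest date" comparison straightforward.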
# Assuming dfA (with 'subjectID' and 'EXAMDATE' columns) and dfB (with 'subjectID' and 'date' columns) are already defined
# merged_df <- merge_ADNI_antspymm_by_closest_date(dfA, dfB)