I've got a REDCap database of >5000 records, and I would like to return just a few variables from each record using the REDCap API and the REDCapR package in R. Normally, this can be accomplished very easily with redcap_read(), but the function has been returning the error:

    ── Column specification ───────────────────────────────────────────────────────
    cols(
      record_id = col_double(),
      redcap_repeat_instrument = col_logical(),
      redcap_repeat_instance = col_logical(),
      first_name = col_character(),
      last_name = col_character(),
      `NA` = col_logical()
    )
    Error in redcap_read(redcap_uri = retrieve_credential_local(my_file_path, :
      There are 100 subject(s) that are missing rows in the returned dataset. REDCap's PHP code is likely trying to process too much text in one bite.

    Common solutions this problem are:
    - specifying only the records you need (w/ `records`)
    - specifying only the fields you need (w/ `fields`)
    - specifying only the forms you need (w/ `forms`)
    - specifying a subset w/ `filter_logic`
    - reduce `batch_size`

By default, the batch size is 100, and the number of missing subjects reported in the error always equals whatever the batch_size parameter is set to.
The following code has worked before for much smaller REDCap projects (~200-300 records):

    library(REDCapR)

    df <- redcap_read(
      redcap_uri = my_redcap_uri,
      token = my_token,
      fields = c("record_id", "first_name", "last_name"),
      raw_or_label_headers = 'raw',
      verbose = TRUE
    )$data

I've set verbose = TRUE in this example because, while troubleshooting, I wanted to see if I could identify what was going wrong, but every individual batch returned an HTTP status code of 200. From what I understand of the documentation and source code of redcap_read(), the function loops through the batches and then stacks them together using dplyr::bind_rows(). It then checks whether any rows are missing and, if so, raises the error above.
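To make that description concrete, here is a minimal base-R sketch of the batch, stack, and check pattern as I understand it. It is not REDCapR's actual source: `fetch_batch()` is a hypothetical stand-in for the per-batch API call, `read_in_batches()` and `drop_ids` are illustrative names, and `do.call(rbind, ...)` is swapped in for dplyr::bind_rows() so the example has no dependencies.

```r
# Hypothetical stand-in for one batched API call: returns one row per
# requested ID, optionally dropping some IDs to simulate missing rows.
fetch_batch <- function(ids, drop_ids = integer(0)) {
  kept <- setdiff(ids, drop_ids)
  data.frame(
    record_id = kept,
    first_name = sprintf("name_%d", kept),
    stringsAsFactors = FALSE
  )
}

read_in_batches <- function(all_ids, batch_size = 100, drop_ids = integer(0)) {
  # Split the full ID list into consecutive chunks of at most batch_size.
  batches <- split(all_ids, ceiling(seq_along(all_ids) / batch_size))
  # Fetch each batch, then stack the pieces into one data frame.
  pieces <- lapply(batches, fetch_batch, drop_ids = drop_ids)
  stacked <- do.call(rbind, pieces)
  # Compare the IDs that came back against the IDs that were requested.
  missing_ids <- setdiff(all_ids, stacked$record_id)
  if (length(missing_ids) > 0) {
    stop(sprintf(
      "There are %d subject(s) that are missing rows in the returned dataset.",
      length(missing_ids)
    ))
  }
  stacked
}

# All 250 simulated records come back, so no error is raised.
ok <- read_in_batches(1:250, batch_size = 100)
```

In this sketch the check only fires when a requested ID is absent from the stacked result, which is why an HTTP 200 on every batch does not by itself rule out the error.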
The strange part is that when executing the same request inside REDCap's API Playground, there are no missing rows. All record IDs, first names, and last names are returned as requested. Additionally, requesting only record_id is successful. I'm also not sure where the column NA of type logical comes from whenever I include first_name and last_name, as it's not among the variables requested in the fields parameter and does not appear when only record_id is requested.
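One way to narrow down where the extra column comes from might be to inspect the header line of the raw CSV response (for example, the text returned by the API Playground) and list any names that were not expected. The sketch below is only a diagnostic aid under stated assumptions: `unexpected_columns()` is a hypothetical helper, and `raw_csv` is made-up sample text, not output from my project.

```r
# Columns expected back: the requested fields plus the two repeat columns
# that appear in the column specification above.
expected <- c("record_id", "redcap_repeat_instrument", "redcap_repeat_instance",
              "first_name", "last_name")

# Made-up sample response; in practice, paste the raw CSV text here.
raw_csv <- paste0(
  "record_id,redcap_repeat_instrument,redcap_repeat_instance,first_name,last_name\n",
  "1,,,Ann,Lee\n",
  "2,,,Bo,Kim\n"
)

# Hypothetical helper: return header names that are not in `expected`.
unexpected_columns <- function(csv_text, expected) {
  # Take only the header line of the response.
  header_line <- strsplit(csv_text, "\r?\n")[[1]][1]
  # Split on commas; na.strings = character(0) keeps a literal "NA" as text.
  found <- scan(text = header_line, what = character(), sep = ",",
                quiet = TRUE, na.strings = character(0))
  setdiff(found, expected)
}

unexpected_columns(raw_csv, expected)
```

If the raw header contains only the expected names, the extra column is presumably introduced while the response is parsed in R rather than by REDCap itself.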

    df <- redcap_read(
      redcap_uri = my_redcap_uri,
      token = my_token,
      fields = c("record_id"),
      raw_or_label_headers = 'raw',
      verbose = TRUE
    )$data
    head(df)
    ############
    ── Column specification ───────────────────────────────────────────────────────
    cols(
      record_id = col_double(),
      redcap_repeat_instrument = col_logical(),
      redcap_repeat_instance = col_logical()
    )
    record_id redcap_repeat_instrument redcap_repeat_instance
            1 NA                       NA
            2 NA                       NA
            3 NA                       NA
            4 NA                       NA
            5 NA                       NA
            6 NA                       NA
            7 NA                       NA
            8 NA                       NA
            9 NA                       NA
           10 NA                       NA

If I'm missing something obvious, please let me know!