This is the workhorse function of wranglEHR that transcribes 2d data from CC-HIC to a table with 1 column per dataitem (and any metadata if relevant) and 1 row per time per patient.
a CC-HIC database connection.
an integer vector of episode_ids or NULL. If NULL (the default) then all episodes are extracted.
a string vector of CC-HIC codes names to be extracted.
a character vector, of the same length as code_names
,
with names to relabel extracted CC-HIC dataitems, or NULL (the default) to
retain the original code names. Given in the same order as
code_names
.
a function vector of summary functions to summarise data
that is contributed at a higher frequency than the set cadence
. Must
be the same length, and in the same order as code_names
.
an integer scalar. Chunks the extraction process by this
many episodes to help manage memory constraints. The default (5000) works
well for most desktop computers. If RAM is not a major limitation, setting
this to Inf
may improve performance.
a numerical scalar >= 0 or the string "timestamp". If a numerical scalar is used, it will describe the base time unit to build each row of the extracted table, in divisions of an hour. For example: 1 = 1 hour, 0.5 = 30 mins, 2 = 2 hourly. If cadence = "timestamp", then the precise datetime will be used to generate the time column. This is likely to generate a large table, so use cautiously.
a numeric vector of length 2 containing the start and
end times (in hours) relative to the ICU admission time, for which the data
extraction should occur. For example, c(0, 24)
will return the first
24 hours of data after admission. The default c(-Inf, Inf)
will
return all data.
sparse tibble with an hourly cadence as rows, and unique data items as columns. Data items that contain metadata are reallocated to their own columns.
The time unit is user definable, and set by the cadence
argument. The
default behaviour is to produce a table with 1 row per hour per patient. If
there are duplicates/conflicts (e.g more than 1 event for a given hour), then
only the first result for that hour is returned. If extracting at a lower
cadence than is naturally recorded in the database, one can specify a vector
of summary function to the coalesce_rows
argument. These summary
functions must *always* return a vector of length 1, in the same data type
and must be able to handle vectors entirely of NAs
.
Many events inside CC-HIC occur on a greater than hourly basis. Depending upon the chosen analysis, one may which to modify the cadence. 0.5 for example will produce a table with 1 row per 30 minutes per patient.
Choose what variables one wishes to extract wisely. This function is quite efficient considering what it needs to do, but it can take a very long time if extracting lots of data. It is a strong recommendation that the database is optimised with indexes prior to using this function. It is sensible to test the extraction with 100 or so patients before committing to a full extraction.
It is possible for this function to produce negative time rows (e.g. rows that occurred prior to ICU admission). If, for example a patient had a measure taken in the hours before they were admitted, then this would be added to the table with a negative time value. As a concrete example, if a patient had a sodium measured at 08:00, and they were admitted to the ICU at 20:00 the same day, then the sodium would be displayed at time = -12. This is normal behaviour and it is left to the end user to determine how best they wish to account for this.
con <- setup_dummy_db()
#> Error in (function (cond) .Internal(C_tryCatchHelper(addr, 1L, cond)))(structure(list(message = "Problem while computing `time = strftime(time, format = \"%H:%M:%S\")`.", trace = structure(list(call = list(tryCatch(withCallingHandlers({ NULL saveRDS(do.call(do.call, c(readRDS("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-fun-3db3e4c70d1"), list(envir = .GlobalEnv, quote = TRUE)), envir = .GlobalEnv, quote = TRUE), file = "/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b") flush(stdout()) flush(stderr()) NULL invisible() }, error = function(e) { { callr_data <- as.environment("tools:callr")$`__callr_data__` err <- callr_data$err assign(".Traceback", .traceback(4), envir = callr_data) dump.frames("__callr_dump__") assign(".Last.dump", .GlobalEnv$`__callr_dump__`, envir = callr_data) rm("__callr_dump__", envir = .GlobalEnv) e$call <- deparse(conditionCall(e), nlines = 6) e2 <- err$new_error(conditionMessage(e), call. = conditionCall(e)) class(e2) <- c("callr_remote_error", class(e2)) e2$error <- e calls <- sys.calls() dcframe <- which(vapply(calls, function(x) length(x) >= 1 && identical(x[[1]], quote(do.call)), logical(1)))[1] if (!is.na(dcframe)) e2$`_ignore` <- list(c(1, dcframe + 1L)) e2$`_pid` <- Sys.getpid() e2$`_timestamp` <- Sys.time() if (inherits(e, "rlib_error_2_0")) e2$parent <- e$parent e2 <- err$add_trace_back(e2, embed = FALSE) saveRDS(list("error", e2), file = paste0("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b", ".error")) } }, interrupt = function(e) { { callr_data <- as.environment("tools:callr")$`__callr_data__` err <- callr_data$err assign(".Traceback", .traceback(4), envir = callr_data) dump.frames("__callr_dump__") assign(".Last.dump", .GlobalEnv$`__callr_dump__`, envir = callr_data) rm("__callr_dump__", envir = .GlobalEnv) e$call <- deparse(conditionCall(e), nlines = 6) e2 <- err$new_error(conditionMessage(e), call. = conditionCall(e)) class(e2) <- c("callr_remote_error", class(e2)) e2$error <- e calls <- sys.calls() dcframe <- which(vapply(calls, function(x) length(x) >= 1 && identical(x[[1]], quote(do.call)), logical(1)))[1] if (!is.na(dcframe)) e2$`_ignore` <- list(c(1, dcframe + 1L)) e2$`_pid` <- Sys.getpid() e2$`_timestamp` <- Sys.time() if (inherits(e, "rlib_error_2_0")) e2$parent <- e$parent e2 <- err$add_trace_back(e2, embed = FALSE) saveRDS(list("error", e2), file = paste0("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b", ".error")) } }, callr_message = function(e) { try(signalCondition(e)) }), error = function(e) { NULL try(stop(e)) }, interrupt = function(e) { NULL e }), tryCatchList(expr, classes, parentenv, handlers), tryCatchOne(tryCatchList(expr, names[-nh], parentenv, handlers[-nh]), names[nh], parentenv, handlers[[nh]]), doTryCatch(return(expr), name, parentenv, handler), tryCatchList(expr, names[-nh], parentenv, handlers[-nh]), tryCatchOne(expr, names, parentenv, handlers[[1L]]), doTryCatch(return(expr), name, parentenv, handler), withCallingHandlers({ NULL saveRDS(do.call(do.call, c(readRDS("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-fun-3db3e4c70d1"), list(envir = .GlobalEnv, quote = TRUE)), envir = .GlobalEnv, quote = TRUE), file = "/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b") flush(stdout()) flush(stderr()) NULL invisible() }, error = function(e) { { callr_data <- as.environment("tools:callr")$`__callr_data__` err <- callr_data$err assign(".Traceback", .traceback(4), envir = callr_data) dump.frames("__callr_dump__") assign(".Last.dump", .GlobalEnv$`__callr_dump__`, envir = callr_data) rm("__callr_dump__", envir = .GlobalEnv) e$call <- deparse(conditionCall(e), nlines = 6) e2 <- err$new_error(conditionMessage(e), call. = conditionCall(e)) class(e2) <- c("callr_remote_error", class(e2)) e2$error <- e calls <- sys.calls() dcframe <- which(vapply(calls, function(x) length(x) >= 1 && identical(x[[1]], quote(do.call)), logical(1)))[1] if (!is.na(dcframe)) e2$`_ignore` <- list(c(1, dcframe + 1L)) e2$`_pid` <- Sys.getpid() e2$`_timestamp` <- Sys.time() if (inherits(e, "rlib_error_2_0")) e2$parent <- e$parent e2 <- err$add_trace_back(e2, embed = FALSE) saveRDS(list("error", e2), file = paste0("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b", ".error")) } }, interrupt = function(e) { { callr_data <- as.environment("tools:callr")$`__callr_data__` err <- callr_data$err assign(".Traceback", .traceback(4), envir = callr_data) dump.frames("__callr_dump__") assign(".Last.dump", .GlobalEnv$`__callr_dump__`, envir = callr_data) rm("__callr_dump__", envir = .GlobalEnv) e$call <- deparse(conditionCall(e), nlines = 6) e2 <- err$new_error(conditionMessage(e), call. = conditionCall(e)) class(e2) <- c("callr_remote_error", class(e2)) e2$error <- e calls <- sys.calls() dcframe <- which(vapply(calls, function(x) length(x) >= 1 && identical(x[[1]], quote(do.call)), logical(1)))[1] if (!is.na(dcframe)) e2$`_ignore` <- list(c(1, dcframe + 1L)) e2$`_pid` <- Sys.getpid() e2$`_timestamp` <- Sys.time() if (inherits(e, "rlib_error_2_0")) e2$parent <- e$parent e2 <- err$add_trace_back(e2, embed = FALSE) saveRDS(list("error", e2), file = paste0("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b", ".error")) } }, callr_message = function(e) { try(signalCondition(e)) }), saveRDS(do.call(do.call, c(readRDS("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-fun-3db3e4c70d1"), list(envir = .GlobalEnv, quote = TRUE)), envir = .GlobalEnv, quote = TRUE), file = "/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b"), do.call(do.call, c(readRDS("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-fun-3db3e4c70d1"), list(envir = .GlobalEnv, quote = TRUE)), envir = .GlobalEnv, quote = TRUE), `<fn>`(base::quote(`<fn>`), base::quote(`<named list>`), envir = base::quote(`<env>`), quote = base::quote(TRUE)), `<fn>`(pkg = base::quote(`<pkgdown>`), examples = base::quote(TRUE), run_dont_run = base::quote(FALSE), seed = base::quote(1014), lazy = base::quote(FALSE), override = base::quote(`<list>`), install = base::quote(FALSE), preview = base::quote(FALSE), new_process = base::quote(FALSE), devel = base::quote(FALSE), crayon_enabled = base::quote(FALSE), crayon_colors = base::quote(1L), pkgdown_internet = base::quote(TRUE)), pkgdown::build_site(...), build_site_local(pkg = pkg, examples = examples, run_dont_run = run_dont_run, seed = seed, lazy = lazy, override = override, preview = preview, devel = devel), build_reference(pkg, lazy = lazy, examples = examples, run_dont_run = run_dont_run, seed = seed, override = override, preview = FALSE, devel = devel), purrr::map(topics, build_reference_topic, pkg = pkg, lazy = lazy, examples_env = examples_env, run_dont_run = run_dont_run), .f(.x[[i]], ...), withCallingHandlers(data_reference_topic(topic, pkg, examples_env = examples_env, run_dont_run = run_dont_run), error = function(err) { msg <- c(paste0("Failed to parse Rd in ", topic$file_in), i = err$message) abort(msg, parent = err) }), data_reference_topic(topic, pkg, examples_env = examples_env, run_dont_run = run_dont_run), run_examples(tags$tag_examples[[1]], env = if (is.null(examples_env)) NULL else new.env(parent = examples_env), topic = tools::file_path_sans_ext(topic$file_in), run_dont_run = run_dont_run), highlight_examples(code, topic, env = env), downlit::evaluate_and_highlight(code, fig_save = fig_save_topic, env = child_env(env), output_handler = evaluate::new_output_handler(value = pkgdown_print)), evaluate::evaluate(code, child_env(env), new_device = TRUE, output_handler = output_handler), evaluate_call(expr, parsed$src[[i]], envir = envir, enclos = enclos, debug = debug, last = i == length(out), use_try = stop_on_error != 2L, keep_warning = keep_warning, keep_message = keep_message, output_handler = output_handler, include_timing = include_timing), timing_fn(handle(ev <- withCallingHandlers(withVisible(eval_with_user_handlers(expr, envir, enclos, user_handlers)), warning = wHandler, error = eHandler, message = mHandler))), handle(ev <- withCallingHandlers(withVisible(eval_with_user_handlers(expr, envir, enclos, user_handlers)), warning = wHandler, error = eHandler, message = mHandler)), try(f, silent = TRUE), tryCatch(expr, error = function(e) { call <- conditionCall(e) if (!is.null(call)) { if (identical(call[[1L]], quote(doTryCatch))) call <- sys.call(-4L) dcall <- deparse(call, nlines = 1L) prefix <- paste("Error in", dcall, ": ") LONG <- 75L sm <- strsplit(conditionMessage(e), "\n")[[1L]] w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w") if (is.na(w)) w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L], type = "b") if (w > LONG) prefix <- paste0(prefix, "\n ") } else prefix <- "Error : " msg <- paste0(prefix, conditionMessage(e), "\n") .Internal(seterrmessage(msg[1L])) if (!silent && isTRUE(getOption("show.error.messages"))) { cat(msg, file = outFile) .Internal(printDeferredWarnings()) } invisible(structure(msg, class = "try-error", condition = e)) }), tryCatchList(expr, classes, parentenv, handlers), tryCatchOne(expr, names, parentenv, handlers[[1L]]), doTryCatch(return(expr), name, parentenv, handler), withCallingHandlers(withVisible(eval_with_user_handlers(expr, envir, enclos, user_handlers)), warning = wHandler, error = eHandler, message = mHandler), withVisible(eval_with_user_handlers(expr, envir, enclos, user_handlers)), eval_with_user_handlers(expr, envir, enclos, user_handlers), eval(expr, envir, enclos), eval(expr, envir, enclos), setup_dummy_db(), .events %>% dplyr::mutate(datetime = strftime(datetime), date = strftime(date, format = "%Y-%m-%d"), time = strftime(time, format = "%H:%M:%S")) %>% DBI::dbWriteTable(conn, "events", .), DBI::dbWriteTable(conn, "events", .), dplyr::mutate(., datetime = strftime(datetime), date = strftime(date, format = "%Y-%m-%d"), time = strftime(time, format = "%H:%M:%S")), mutate.data.frame(., datetime = strftime(datetime), date = strftime(date, format = "%Y-%m-%d"), time = strftime(time, format = "%H:%M:%S")), mutate_cols(.data, dplyr_quosures(...), caller_env = caller_env()), withCallingHandlers({ for (i in seq_along(dots)) { context_poke("column", old_current_column) quosures <- expand_across(dots[[i]]) quosures_results <- vector(mode = "list", length = length(quosures)) for (k in seq_along(quosures)) { quo <- quosures[[k]] quo_data <- attr(quo, "dplyr:::data") if (!is.null(quo_data$column)) { context_poke("column", quo_data$column) } chunks <- NULL result <- NULL if (quo_is_symbol(quo)) { name <- as_string(quo_get_expr(quo)) if (name %in% names(new_columns)) { result <- new_columns[[name]] chunks <- mask$resolve(name) } else if (name %in% names(.data)) { result <- .data[[name]] chunks <- mask$resolve(name) } if (inherits(.data, "rowwise_df") && vec_is_list(result)) { sizes <- list_sizes(result) wrong <- which(sizes != 1) if (length(wrong)) { group <- wrong[1L] mask$set_current_group(group) abort(class = c("dplyr:::mutate_incompatible_size", "dplyr:::internal_error"), dplyr_error_data = list(result_size = sizes[group], expected_size = 1)) } } } else if (!quo_is_symbolic(quo) && !is.null(quo_get_expr(quo))) { result <- quo_get_expr(quo) result <- withCallingHandlers(vec_recycle(result, vec_size(.data)), error = function(cnd) { abort(class = c("dplyr:::mutate_constant_recycle_error", "dplyr:::internal_error"), constant_size = vec_size(result), data_size = vec_size(.data)) }) chunks <- vec_chop(result, rows) } if (is.null(chunks)) { if (is.null(quo_data$column)) { chunks <- mask$eval_all_mutate(quo) } else { chunks <- withCallingHandlers(mask$eval_all_mutate(quo), error = function(cnd) { msg <- glue("Problem while computing column `{quo_data$name_auto}`.") abort(msg, call = call("across"), parent = cnd) }) } } if (is.null(chunks)) { next } if (is.null(result)) { if (length(rows) == 1) { result <- chunks[[1]] } else { chunks <- dplyr_vec_cast_common(chunks, quo_data$name_auto) result <- vec_unchop(chunks, rows) } } quosures_results[[k]] <- list(result = result, chunks = chunks) } for (k in seq_along(quosures)) { quo <- quosures[[k]] quo_data <- attr(quo, "dplyr:::data") quo_result <- quosures_results[[k]] if (is.null(quo_result)) { if (quo_data$is_named) { name <- quo_data$name_given new_columns[[name]] <- zap() mask$remove(name) } next } result <- quo_result$result chunks <- quo_result$chunks if (!quo_data$is_named && is.data.frame(result)) { types <- vec_ptype(result) types_names <- names(types) chunks_extracted <- .Call(dplyr_extract_chunks, chunks, types) for (j in seq_along(types)) { mask$add_one(types_names[j], chunks_extracted[[j]], result = result[[j]]) } new_columns[types_names] <- result } else { name <- quo_data$name_auto mask$add_one(name = name, chunks = chunks, result = result) new_columns[[name]] <- result } } } }, error = function(e) { local_error_context(dots = dots, .index = i, mask = mask) bullets <- c(cnd_bullet_header("computing"), mutate_bullets(e)) abort(bullets, class = "dplyr:::mutate_error", parent = skip_internal_condition(e), bullets = bullets, call = error_call) }, warning = function(w) { if (check_muffled_warning(w)) { maybe_restart("muffleWarning") } local_error_context(dots = dots, .index = i, mask = mask) warn(c(cnd_bullet_header("computing"), i = cnd_header(w), i = cnd_bullet_cur_group_label(what = "warning"))) maybe_restart("muffleWarning") }), mask$eval_all_mutate(quo), strftime(time, format = "%H:%M:%S"), format(as.POSIXlt(x, tz = tz), format = format, usetz = usetz, ...), as.POSIXlt(x, tz = tz), as.POSIXlt.default(x, tz = tz), stop(gettextf("do not know how to convert '%s' to class %s", deparse1(substitute(x)), dQuote("POSIXlt")), domain = NA), .handleSimpleError(`<fn>`, "do not know how to convert 'x' to class “POSIXlt”", base::quote(as.POSIXlt.default(x, tz = tz))), h(simpleError(msg, call)), abort(bullets, class = "dplyr:::mutate_error", parent = skip_internal_condition(e), bullets = bullets, call = error_call)), parent = c(0L, 1L, 2L, 3L, 2L, 5L, 6L, 0L, 0L, 0L, 0L, 0L, 12L, 13L, 14L, 15L, 16L, 17L, 17L, 19L, 20L, 21L, 22L, 23L, 24L, 24L, 26L, 27L, 28L, 29L, 30L, 24L, 24L, 24L, 34L, 35L, 36L, 37L, 0L, 0L, 0L, 41L, 42L, 42L, 0L, 45L, 45L, 45L, 48L, 0L, 50L, 51L), visible = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE), namespace = c("base", "base", "base", "base", "base", "base", "base", "base", "base", "base", "base", NA, "pkgdown", "pkgdown", "pkgdown", "purrr", "pkgdown", "base", "pkgdown", "pkgdown", "pkgdown", "downlit", "evaluate", "evaluate", "evaluate", "evaluate", "base", "base", "base", "base", "base", "base", "base", "evaluate", "base", "base", "wranglEHR", NA, "DBI", "dplyr", "dplyr", "dplyr", "base", NA, "base", "base", "base", "base", "base", "base", "dplyr", "rlang"), scope = c("::", "local", "local", "local", "local", "local", "local", "::", "::", "::", "local", "global", "::", ":::", "::", "::", "local", "::", ":::", ":::", ":::", "::", "::", ":::", "local", "local", "::", "::", "local", "local", "local", "::", "::", ":::", "::", "::", "::", NA, "::", "::", ":::", ":::", "::", NA, "::", "::", "::", "::", "::", "::", "local", "::")), row.names = c(NA, -52L), version = 2L, class = c("rlang_trace", "rlib_trace", "tbl", "data.frame")), parent = structure(list( message = "do not know how to convert 'x' to class “POSIXlt”", call = as.POSIXlt.default(x, tz = tz)), class = c("simpleError", "error", "condition")), bullets = "Problem while computing `time = strftime(time, format = \"%H:%M:%S\")`.", call = dplyr::mutate(., datetime = strftime(datetime), date = strftime(date, format = "%Y-%m-%d"), time = strftime(time, format = "%H:%M:%S")), use_cli_format = TRUE), class = c("dplyr:::mutate_error", "rlang_error", "error", "condition"))): error in evaluating the argument 'value' in selecting a method for function 'dbWriteTable': Problem while computing `time = strftime(time, format = "%H:%M:%S")`.
#> Caused by error in `as.POSIXlt.default()`:
#> ! do not know how to convert 'x' to class “POSIXlt”
df <- extract_timevarying(
connection = con,
episode_ids = 1:10,
code_names = "NIHR_HIC_ICU_0108"
)
#> Error in extract_timevarying(connection = con, episode_ids = 1:10, code_names = "NIHR_HIC_ICU_0108"): object 'con' not found
head(df)
#>
#> 1 function (x, df1, df2, ncp, log = FALSE)
#> 2 {
#> 3 if (missing(ncp))
#> 4 .Call(C_df, x, df1, df2, log)
#> 5 else .Call(C_dnf, x, df1, df2, ncp, log)
#> 6 }