This is the workhorse function of wranglEHR that transcribes 2d data from CC-HIC to a table with 1 column per dataitem (and any metadata if relevant) and 1 row per time per patient.

extract_timevarying(
  connection = NULL,
  episode_ids = NA_integer_,
  code_names = NA_character_,
  rename = NA_character_,
  coalesce_rows = dplyr::first,
  chunk_size = 5000,
  cadence = 1,
  time_boundaries = c(-Inf, Inf)
)

Arguments

connection

a CC-HIC database connection.

episode_ids

an integer vector of episode_ids or NULL. If NULL (the default) then all episodes are extracted.

code_names

a string vector of CC-HIC codes names to be extracted.

rename

a character vector, of the same length as code_names, with names to relabel extracted CC-HIC dataitems, or NULL (the default) to retain the original code names. Given in the same order as code_names.

coalesce_rows

a function vector of summary functions to summarise data that is contributed at a higher frequency than the set cadence. Must be the same length, and in the same order as code_names.

chunk_size

an integer scalar. Chunks the extraction process by this many episodes to help manage memory constraints. The default (5000) works well for most desktop computers. If RAM is not a major limitation, setting this to Inf may improve performance.

cadence

a numerical scalar >= 0 or the string "timestamp". If a numerical scalar is used, it will describe the base time unit to build each row of the extracted table, in divisions of an hour. For example: 1 = 1 hour, 0.5 = 30 mins, 2 = 2 hourly. If cadence = "timestamp", then the precise datetime will be used to generate the time column. This is likely to generate a large table, so use cautiously.

time_boundaries

a numeric vector of length 2 containing the start and end times (in hours) relative to the ICU admission time, for which the data extraction should occur. For example, c(0, 24) will return the first 24 hours of data after admission. The default c(-Inf, Inf) will return all data.

Value

sparse tibble with an hourly cadence as rows, and unique data items as columns. Data items that contain metadata are reallocated to their own columns.

Details

The time unit is user definable, and set by the cadence argument. The default behaviour is to produce a table with 1 row per hour per patient. If there are duplicates/conflicts (e.g more than 1 event for a given hour), then only the first result for that hour is returned. If extracting at a lower cadence than is naturally recorded in the database, one can specify a vector of summary function to the coalesce_rows argument. These summary functions must *always* return a vector of length 1, in the same data type and must be able to handle vectors entirely of NAs.

Many events inside CC-HIC occur on a greater than hourly basis. Depending upon the chosen analysis, one may which to modify the cadence. 0.5 for example will produce a table with 1 row per 30 minutes per patient.

Choose what variables one wishes to extract wisely. This function is quite efficient considering what it needs to do, but it can take a very long time if extracting lots of data. It is a strong recommendation that the database is optimised with indexes prior to using this function. It is sensible to test the extraction with 100 or so patients before committing to a full extraction.

It is possible for this function to produce negative time rows (e.g. rows that occurred prior to ICU admission). If, for example a patient had a measure taken in the hours before they were admitted, then this would be added to the table with a negative time value. As a concrete example, if a patient had a sodium measured at 08:00, and they were admitted to the ICU at 20:00 the same day, then the sodium would be displayed at time = -12. This is normal behaviour and it is left to the end user to determine how best they wish to account for this.

Examples

con <- setup_dummy_db()
#> Error in (function (cond) .Internal(C_tryCatchHelper(addr, 1L, cond)))(structure(list(message = "Problem while computing `time = strftime(time, format = \"%H:%M:%S\")`.",     trace = structure(list(call = list(tryCatch(withCallingHandlers({        NULL        saveRDS(do.call(do.call, c(readRDS("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-fun-3db3e4c70d1"),             list(envir = .GlobalEnv, quote = TRUE)), envir = .GlobalEnv,             quote = TRUE), file = "/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b")        flush(stdout())        flush(stderr())        NULL        invisible()    }, error = function(e) {        {            callr_data <- as.environment("tools:callr")$`__callr_data__`            err <- callr_data$err            assign(".Traceback", .traceback(4), envir = callr_data)            dump.frames("__callr_dump__")            assign(".Last.dump", .GlobalEnv$`__callr_dump__`,                 envir = callr_data)            rm("__callr_dump__", envir = .GlobalEnv)            e$call <- deparse(conditionCall(e), nlines = 6)            e2 <- err$new_error(conditionMessage(e), call. = conditionCall(e))            class(e2) <- c("callr_remote_error", class(e2))            e2$error <- e            calls <- sys.calls()            dcframe <- which(vapply(calls, function(x) length(x) >=                 1 && identical(x[[1]], quote(do.call)), logical(1)))[1]            if (!is.na(dcframe))                 e2$`_ignore` <- list(c(1, dcframe + 1L))            e2$`_pid` <- Sys.getpid()            e2$`_timestamp` <- Sys.time()            if (inherits(e, "rlib_error_2_0"))                 e2$parent <- e$parent            e2 <- err$add_trace_back(e2, embed = FALSE)            saveRDS(list("error", e2), file = paste0("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b",                 ".error"))        }    }, interrupt = function(e) {        {            callr_data <- as.environment("tools:callr")$`__callr_data__`            err <- callr_data$err            assign(".Traceback", .traceback(4), envir = callr_data)            dump.frames("__callr_dump__")            assign(".Last.dump", .GlobalEnv$`__callr_dump__`,                 envir = callr_data)            rm("__callr_dump__", envir = .GlobalEnv)            e$call <- deparse(conditionCall(e), nlines = 6)            e2 <- err$new_error(conditionMessage(e), call. = conditionCall(e))            class(e2) <- c("callr_remote_error", class(e2))            e2$error <- e            calls <- sys.calls()            dcframe <- which(vapply(calls, function(x) length(x) >=                 1 && identical(x[[1]], quote(do.call)), logical(1)))[1]            if (!is.na(dcframe))                 e2$`_ignore` <- list(c(1, dcframe + 1L))            e2$`_pid` <- Sys.getpid()            e2$`_timestamp` <- Sys.time()            if (inherits(e, "rlib_error_2_0"))                 e2$parent <- e$parent            e2 <- err$add_trace_back(e2, embed = FALSE)            saveRDS(list("error", e2), file = paste0("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b",                 ".error"))        }    }, callr_message = function(e) {        try(signalCondition(e))    }), error = function(e) {        NULL        try(stop(e))    }, interrupt = function(e) {        NULL        e    }), tryCatchList(expr, classes, parentenv, handlers), tryCatchOne(tryCatchList(expr,         names[-nh], parentenv, handlers[-nh]), names[nh], parentenv,         handlers[[nh]]), doTryCatch(return(expr), name, parentenv,         handler), tryCatchList(expr, names[-nh], parentenv, handlers[-nh]),         tryCatchOne(expr, names, parentenv, handlers[[1L]]),         doTryCatch(return(expr), name, parentenv, handler), withCallingHandlers({            NULL            saveRDS(do.call(do.call, c(readRDS("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-fun-3db3e4c70d1"),                 list(envir = .GlobalEnv, quote = TRUE)), envir = .GlobalEnv,                 quote = TRUE), file = "/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b")            flush(stdout())            flush(stderr())            NULL            invisible()        }, error = function(e) {            {                callr_data <- as.environment("tools:callr")$`__callr_data__`                err <- callr_data$err                assign(".Traceback", .traceback(4), envir = callr_data)                dump.frames("__callr_dump__")                assign(".Last.dump", .GlobalEnv$`__callr_dump__`,                   envir = callr_data)                rm("__callr_dump__", envir = .GlobalEnv)                e$call <- deparse(conditionCall(e), nlines = 6)                e2 <- err$new_error(conditionMessage(e), call. = conditionCall(e))                class(e2) <- c("callr_remote_error", class(e2))                e2$error <- e                calls <- sys.calls()                dcframe <- which(vapply(calls, function(x) length(x) >=                   1 && identical(x[[1]], quote(do.call)), logical(1)))[1]                if (!is.na(dcframe))                   e2$`_ignore` <- list(c(1, dcframe + 1L))                e2$`_pid` <- Sys.getpid()                e2$`_timestamp` <- Sys.time()                if (inherits(e, "rlib_error_2_0"))                   e2$parent <- e$parent                e2 <- err$add_trace_back(e2, embed = FALSE)                saveRDS(list("error", e2), file = paste0("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b",                   ".error"))            }        }, interrupt = function(e) {            {                callr_data <- as.environment("tools:callr")$`__callr_data__`                err <- callr_data$err                assign(".Traceback", .traceback(4), envir = callr_data)                dump.frames("__callr_dump__")                assign(".Last.dump", .GlobalEnv$`__callr_dump__`,                   envir = callr_data)                rm("__callr_dump__", envir = .GlobalEnv)                e$call <- deparse(conditionCall(e), nlines = 6)                e2 <- err$new_error(conditionMessage(e), call. = conditionCall(e))                class(e2) <- c("callr_remote_error", class(e2))                e2$error <- e                calls <- sys.calls()                dcframe <- which(vapply(calls, function(x) length(x) >=                   1 && identical(x[[1]], quote(do.call)), logical(1)))[1]                if (!is.na(dcframe))                   e2$`_ignore` <- list(c(1, dcframe + 1L))                e2$`_pid` <- Sys.getpid()                e2$`_timestamp` <- Sys.time()                if (inherits(e, "rlib_error_2_0"))                   e2$parent <- e$parent                e2 <- err$add_trace_back(e2, embed = FALSE)                saveRDS(list("error", e2), file = paste0("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b",                   ".error"))            }        }, callr_message = function(e) {            try(signalCondition(e))        }), saveRDS(do.call(do.call, c(readRDS("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-fun-3db3e4c70d1"),             list(envir = .GlobalEnv, quote = TRUE)), envir = .GlobalEnv,             quote = TRUE), file = "/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-res-3dbc82c94b"),         do.call(do.call, c(readRDS("/var/folders/7r/d3tp4w4171vc57vmhwy6fv940000gn/T//Rtmp5xMCHY/callr-fun-3db3e4c70d1"),             list(envir = .GlobalEnv, quote = TRUE)), envir = .GlobalEnv,             quote = TRUE), `<fn>`(base::quote(`<fn>`), base::quote(`<named list>`),             envir = base::quote(`<env>`), quote = base::quote(TRUE)),         `<fn>`(pkg = base::quote(`<pkgdown>`), examples = base::quote(TRUE),             run_dont_run = base::quote(FALSE), seed = base::quote(1014),             lazy = base::quote(FALSE), override = base::quote(`<list>`),             install = base::quote(FALSE), preview = base::quote(FALSE),             new_process = base::quote(FALSE), devel = base::quote(FALSE),             crayon_enabled = base::quote(FALSE), crayon_colors = base::quote(1L),             pkgdown_internet = base::quote(TRUE)), pkgdown::build_site(...),         build_site_local(pkg = pkg, examples = examples, run_dont_run = run_dont_run,             seed = seed, lazy = lazy, override = override, preview = preview,             devel = devel), build_reference(pkg, lazy = lazy,             examples = examples, run_dont_run = run_dont_run,             seed = seed, override = override, preview = FALSE,             devel = devel), purrr::map(topics, build_reference_topic,             pkg = pkg, lazy = lazy, examples_env = examples_env,             run_dont_run = run_dont_run), .f(.x[[i]], ...), withCallingHandlers(data_reference_topic(topic,             pkg, examples_env = examples_env, run_dont_run = run_dont_run),             error = function(err) {                msg <- c(paste0("Failed to parse Rd in ", topic$file_in),                   i = err$message)                abort(msg, parent = err)            }), data_reference_topic(topic, pkg, examples_env = examples_env,             run_dont_run = run_dont_run), run_examples(tags$tag_examples[[1]],             env = if (is.null(examples_env)) NULL else new.env(parent = examples_env),             topic = tools::file_path_sans_ext(topic$file_in),             run_dont_run = run_dont_run), highlight_examples(code,             topic, env = env), downlit::evaluate_and_highlight(code,             fig_save = fig_save_topic, env = child_env(env),             output_handler = evaluate::new_output_handler(value = pkgdown_print)),         evaluate::evaluate(code, child_env(env), new_device = TRUE,             output_handler = output_handler), evaluate_call(expr,             parsed$src[[i]], envir = envir, enclos = enclos,             debug = debug, last = i == length(out), use_try = stop_on_error !=                 2L, keep_warning = keep_warning, keep_message = keep_message,             output_handler = output_handler, include_timing = include_timing),         timing_fn(handle(ev <- withCallingHandlers(withVisible(eval_with_user_handlers(expr,             envir, enclos, user_handlers)), warning = wHandler,             error = eHandler, message = mHandler))), handle(ev <- withCallingHandlers(withVisible(eval_with_user_handlers(expr,             envir, enclos, user_handlers)), warning = wHandler,             error = eHandler, message = mHandler)), try(f, silent = TRUE),         tryCatch(expr, error = function(e) {            call <- conditionCall(e)            if (!is.null(call)) {                if (identical(call[[1L]], quote(doTryCatch)))                   call <- sys.call(-4L)                dcall <- deparse(call, nlines = 1L)                prefix <- paste("Error in", dcall, ": ")                LONG <- 75L                sm <- strsplit(conditionMessage(e), "\n")[[1L]]                w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L],                   type = "w")                if (is.na(w))                   w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L],                     type = "b")                if (w > LONG)                   prefix <- paste0(prefix, "\n  ")            }            else prefix <- "Error : "            msg <- paste0(prefix, conditionMessage(e), "\n")            .Internal(seterrmessage(msg[1L]))            if (!silent && isTRUE(getOption("show.error.messages"))) {                cat(msg, file = outFile)                .Internal(printDeferredWarnings())            }            invisible(structure(msg, class = "try-error", condition = e))        }), tryCatchList(expr, classes, parentenv, handlers),         tryCatchOne(expr, names, parentenv, handlers[[1L]]),         doTryCatch(return(expr), name, parentenv, handler), withCallingHandlers(withVisible(eval_with_user_handlers(expr,             envir, enclos, user_handlers)), warning = wHandler,             error = eHandler, message = mHandler), withVisible(eval_with_user_handlers(expr,             envir, enclos, user_handlers)), eval_with_user_handlers(expr,             envir, enclos, user_handlers), eval(expr, envir,             enclos), eval(expr, envir, enclos), setup_dummy_db(),         .events %>% dplyr::mutate(datetime = strftime(datetime),             date = strftime(date, format = "%Y-%m-%d"), time = strftime(time,                 format = "%H:%M:%S")) %>% DBI::dbWriteTable(conn,             "events", .), DBI::dbWriteTable(conn, "events", .),         dplyr::mutate(., datetime = strftime(datetime), date = strftime(date,             format = "%Y-%m-%d"), time = strftime(time, format = "%H:%M:%S")),         mutate.data.frame(., datetime = strftime(datetime), date = strftime(date,             format = "%Y-%m-%d"), time = strftime(time, format = "%H:%M:%S")),         mutate_cols(.data, dplyr_quosures(...), caller_env = caller_env()),         withCallingHandlers({            for (i in seq_along(dots)) {                context_poke("column", old_current_column)                quosures <- expand_across(dots[[i]])                quosures_results <- vector(mode = "list", length = length(quosures))                for (k in seq_along(quosures)) {                  quo <- quosures[[k]]                  quo_data <- attr(quo, "dplyr:::data")                  if (!is.null(quo_data$column)) {                    context_poke("column", quo_data$column)                  }                  chunks <- NULL                  result <- NULL                  if (quo_is_symbol(quo)) {                    name <- as_string(quo_get_expr(quo))                    if (name %in% names(new_columns)) {                      result <- new_columns[[name]]                      chunks <- mask$resolve(name)                    }                    else if (name %in% names(.data)) {                      result <- .data[[name]]                      chunks <- mask$resolve(name)                    }                    if (inherits(.data, "rowwise_df") && vec_is_list(result)) {                      sizes <- list_sizes(result)                      wrong <- which(sizes != 1)                      if (length(wrong)) {                        group <- wrong[1L]                        mask$set_current_group(group)                        abort(class = c("dplyr:::mutate_incompatible_size",                           "dplyr:::internal_error"), dplyr_error_data = list(result_size = sizes[group],                           expected_size = 1))                      }                    }                  }                  else if (!quo_is_symbolic(quo) && !is.null(quo_get_expr(quo))) {                    result <- quo_get_expr(quo)                    result <- withCallingHandlers(vec_recycle(result,                       vec_size(.data)), error = function(cnd) {                      abort(class = c("dplyr:::mutate_constant_recycle_error",                         "dplyr:::internal_error"), constant_size = vec_size(result),                         data_size = vec_size(.data))                    })                    chunks <- vec_chop(result, rows)                  }                  if (is.null(chunks)) {                    if (is.null(quo_data$column)) {                      chunks <- mask$eval_all_mutate(quo)                    }                    else {                      chunks <- withCallingHandlers(mask$eval_all_mutate(quo),                         error = function(cnd) {                          msg <- glue("Problem while computing column `{quo_data$name_auto}`.")                          abort(msg, call = call("across"), parent = cnd)                        })                    }                  }                  if (is.null(chunks)) {                    next                  }                  if (is.null(result)) {                    if (length(rows) == 1) {                      result <- chunks[[1]]                    }                    else {                      chunks <- dplyr_vec_cast_common(chunks,                         quo_data$name_auto)                      result <- vec_unchop(chunks, rows)                    }                  }                  quosures_results[[k]] <- list(result = result,                     chunks = chunks)                }                for (k in seq_along(quosures)) {                  quo <- quosures[[k]]                  quo_data <- attr(quo, "dplyr:::data")                  quo_result <- quosures_results[[k]]                  if (is.null(quo_result)) {                    if (quo_data$is_named) {                      name <- quo_data$name_given                      new_columns[[name]] <- zap()                      mask$remove(name)                    }                    next                  }                  result <- quo_result$result                  chunks <- quo_result$chunks                  if (!quo_data$is_named && is.data.frame(result)) {                    types <- vec_ptype(result)                    types_names <- names(types)                    chunks_extracted <- .Call(dplyr_extract_chunks,                       chunks, types)                    for (j in seq_along(types)) {                      mask$add_one(types_names[j], chunks_extracted[[j]],                         result = result[[j]])                    }                    new_columns[types_names] <- result                  }                  else {                    name <- quo_data$name_auto                    mask$add_one(name = name, chunks = chunks,                       result = result)                    new_columns[[name]] <- result                  }                }            }        }, error = function(e) {            local_error_context(dots = dots, .index = i, mask = mask)            bullets <- c(cnd_bullet_header("computing"), mutate_bullets(e))            abort(bullets, class = "dplyr:::mutate_error", parent = skip_internal_condition(e),                 bullets = bullets, call = error_call)        }, warning = function(w) {            if (check_muffled_warning(w)) {                maybe_restart("muffleWarning")            }            local_error_context(dots = dots, .index = i, mask = mask)            warn(c(cnd_bullet_header("computing"), i = cnd_header(w),                 i = cnd_bullet_cur_group_label(what = "warning")))            maybe_restart("muffleWarning")        }), mask$eval_all_mutate(quo), strftime(time, format = "%H:%M:%S"),         format(as.POSIXlt(x, tz = tz), format = format, usetz = usetz,             ...), as.POSIXlt(x, tz = tz), as.POSIXlt.default(x,             tz = tz), stop(gettextf("do not know how to convert '%s' to class %s",             deparse1(substitute(x)), dQuote("POSIXlt")), domain = NA),         .handleSimpleError(`<fn>`, "do not know how to convert 'x' to class “POSIXlt”",             base::quote(as.POSIXlt.default(x, tz = tz))), h(simpleError(msg,             call)), abort(bullets, class = "dplyr:::mutate_error",             parent = skip_internal_condition(e), bullets = bullets,             call = error_call)), parent = c(0L, 1L, 2L, 3L, 2L,     5L, 6L, 0L, 0L, 0L, 0L, 0L, 12L, 13L, 14L, 15L, 16L, 17L,     17L, 19L, 20L, 21L, 22L, 23L, 24L, 24L, 26L, 27L, 28L, 29L,     30L, 24L, 24L, 24L, 34L, 35L, 36L, 37L, 0L, 0L, 0L, 41L,     42L, 42L, 0L, 45L, 45L, 45L, 48L, 0L, 50L, 51L), visible = c(TRUE,     TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,     TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,     TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,     TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,     TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE,     FALSE), namespace = c("base", "base", "base", "base", "base",     "base", "base", "base", "base", "base", "base", NA, "pkgdown",     "pkgdown", "pkgdown", "purrr", "pkgdown", "base", "pkgdown",     "pkgdown", "pkgdown", "downlit", "evaluate", "evaluate",     "evaluate", "evaluate", "base", "base", "base", "base", "base",     "base", "base", "evaluate", "base", "base", "wranglEHR",     NA, "DBI", "dplyr", "dplyr", "dplyr", "base", NA, "base",     "base", "base", "base", "base", "base", "dplyr", "rlang"),         scope = c("::", "local", "local", "local", "local", "local",         "local", "::", "::", "::", "local", "global", "::", ":::",         "::", "::", "local", "::", ":::", ":::", ":::", "::",         "::", ":::", "local", "local", "::", "::", "local", "local",         "local", "::", "::", ":::", "::", "::", "::", NA, "::",         "::", ":::", ":::", "::", NA, "::", "::", "::", "::",         "::", "::", "local", "::")), row.names = c(NA, -52L), version = 2L, class = c("rlang_trace",     "rlib_trace", "tbl", "data.frame")), parent = structure(list(        message = "do not know how to convert 'x' to class “POSIXlt”",         call = as.POSIXlt.default(x, tz = tz)), class = c("simpleError",     "error", "condition")), bullets = "Problem while computing `time = strftime(time, format = \"%H:%M:%S\")`.",     call = dplyr::mutate(., datetime = strftime(datetime), date = strftime(date,         format = "%Y-%m-%d"), time = strftime(time, format = "%H:%M:%S")),     use_cli_format = TRUE), class = c("dplyr:::mutate_error", "rlang_error", "error", "condition"))): error in evaluating the argument 'value' in selecting a method for function 'dbWriteTable': Problem while computing `time = strftime(time, format = "%H:%M:%S")`.
#> Caused by error in `as.POSIXlt.default()`:
#> ! do not know how to convert 'x' to class “POSIXlt”
df <- extract_timevarying(
  connection = con,
  episode_ids = 1:10,
  code_names = "NIHR_HIC_ICU_0108"
  )
#> Error in extract_timevarying(connection = con, episode_ids = 1:10, code_names = "NIHR_HIC_ICU_0108"): object 'con' not found
head(df)
#>                                               
#> 1 function (x, df1, df2, ncp, log = FALSE)    
#> 2 {                                           
#> 3     if (missing(ncp))                       
#> 4         .Call(C_df, x, df1, df2, log)       
#> 5     else .Call(C_dnf, x, df1, df2, ncp, log)
#> 6 }