Working with cohorts

Adding a cohort

First, we’ll load packages and create a cdm reference. In this case we’ll be using the Eunomia “GI Bleed” dataset.

library(CDMConnector)
library(dplyr)

con <- DBI::dbConnect(duckdb::duckdb(), eunomia_dir())

cdm <- CDMConnector::cdm_from_con(
  con = con,
  cdm_schema = "main",
  write_schema = "main"
)

Using the Capr package, we can define a cohort for GI bleeding that excludes anyone with a record of rheumatoid arthritis at any time.

# devtools::install_github("OHDSI/Capr")
library(Capr)

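gibleed_cohort_definition <- cohort(
  # entry event: a GI bleeding condition record (concept 192671 and its descendants)
  entry = condition(cs(descendants(192671))),
  attrition = attrition(
    # keep only people with exactly zero rheumatoid arthritis records
    # (concept 80809 and its descendants) at any time relative to entry
    "no RA" = withAll(
      exactly(0,
              condition(cs(descendants(80809))),
              duringInterval(eventStarts(-Inf, Inf))))
  )
)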

# requires CirceR optional dependency
cdm <- generateCohortSet(
  cdm,
  cohortSet = list(gibleed = gibleed_cohort_definition),
  name = "gibleed",
  computeAttrition = TRUE
)

The cohort is now instantiated in the database, and a reference to it has been added to the cdm reference.

cdm$gibleed %>% 
  glimpse()
#> Rows: 476
#> Columns: 4
#> $ cohort_definition_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ subject_id           <dbl> 35, 80, 99, 115, 116, 133, 160, 163, 164, 187, 18…
#> $ cohort_start_date    <date> 1997-07-25, 1974-10-27, 2000-03-11, 2001-04-15, …
#> $ cohort_end_date      <date> 2018-12-25, 2019-04-15, 2019-04-27, 2019-05-05, …

Cohort attributes

In addition to the cohort table itself, the cohort has a number of attributes. The first is a count of records and participants in each cohort, which we can get with cohortCount.

cohortCount(cdm$gibleed) %>% 
  glimpse()
#> Rows: 1
#> Columns: 3
#> $ cohort_definition_id <int> 1
#> $ number_records       <dbl> 476
#> $ number_subjects      <dbl> 476
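
To make the two count columns concrete, here is a minimal sketch of how they relate to the cohort table: number_records counts cohort entries, while number_subjects counts distinct people. It uses a small made-up tibble (toy_cohort is a hypothetical name, and its values are invented) rather than the database table, and it is not necessarily how cohortCount computes its result internally.

```r
library(dplyr)

# Hypothetical cohort table with the same four columns as cdm$gibleed;
# every value here is made up for illustration.
toy_cohort <- tibble(
  cohort_definition_id = c(1L, 1L, 1L, 2L),
  subject_id = c(10, 10, 11, 12),
  cohort_start_date = as.Date(c("2000-01-01", "2005-06-01", "2001-03-15", "2002-07-04")),
  cohort_end_date = as.Date(c("2000-12-31", "2006-01-01", "2010-03-15", "2003-07-04"))
)

# One row per cohort: total entries and number of distinct people.
toy_counts <- toy_cohort %>%
  group_by(cohort_definition_id) %>%
  summarise(
    number_records = n(),
    number_subjects = n_distinct(subject_id),
    .groups = "drop"
  )

toy_counts
```

In the toy data, subject 10 enters cohort 1 twice, so that cohort has more records than subjects. In the GI bleed cohort above the two counts are equal, which means each person contributes a single entry.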

We also have the attrition associated with entry into the cohort available via cohortAttrition.

cohortAttrition(cdm$gibleed) %>% 
  glimpse()
#> Rows: 1
#> Columns: 7
#> $ cohort_definition_id <int> 1
#> $ number_records       <dbl> 476
#> $ number_subjects      <dbl> 476
#> $ reason_id            <dbl> 1
#> $ reason               <chr> "Qualifying initial records"
#> $ excluded_records     <dbl> 0
#> $ excluded_subjects    <dbl> 0
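
One way to read the attrition columns is that each row reports what remains after a step, together with how much was dropped relative to the previous step. The sketch below illustrates that reading on a made-up two-step table. The name toy_steps, the counts, and the assumption that excluded_records equals the drop from the preceding row are all illustrative and are not taken from the cohortAttrition output above.

```r
library(dplyr)

# Hypothetical attrition steps; the counts are invented for illustration.
toy_steps <- tibble(
  reason_id = 1:2,
  reason = c("Qualifying initial records", "no RA"),
  number_records = c(100, 90)
)

# Assumption: the number excluded at a step is the difference between the
# previous step's count and the current one; the first step excludes nothing.
toy_attrition <- toy_steps %>%
  arrange(reason_id) %>%
  mutate(
    excluded_records = coalesce(lag(number_records) - number_records, 0)
  )

toy_attrition
```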

Lastly, we can access the settings associated with the cohort using cohortSet.

cohortSet(cdm$gibleed) %>% 
  glimpse()
#> Rows: 1
#> Columns: 2
#> $ cohort_definition_id <int> 1
#> $ cohort_name          <chr> "gibleed"

DBI::dbDisconnect(con, shutdown = TRUE)
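
The settings table shares the cohort_definition_id column with the cohort table, so cohort names can be attached to cohort rows with a join. The following is a minimal sketch on small in-memory tibbles: toy_cohort and toy_settings are hypothetical stand-ins for the cohort table and the cohortSet output, and "other_cohort" is an invented second cohort.

```r
library(dplyr)

# Hypothetical stand-in for a cohort table (date columns omitted for brevity).
toy_cohort <- tibble(
  cohort_definition_id = c(1L, 1L, 2L),
  subject_id = c(10, 11, 12)
)

# Hypothetical stand-in for the settings table.
toy_settings <- tibble(
  cohort_definition_id = c(1L, 2L),
  cohort_name = c("gibleed", "other_cohort")
)

# Attach the cohort name to every cohort row via the shared id column.
toy_named <- toy_cohort %>%
  left_join(toy_settings, by = "cohort_definition_id")

toy_named
```

Because this runs entirely on local tibbles, it does not need the database connection. When joining against a table that still lives in the database, the settings would need to be available there as well, or the cohort rows collected first.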