You are now in the main GGIR vignette. See also the complementary vignettes on: Cut-points, Day segment analyses, GGIR parameters, Embedding external functions (pdf), and Reading ad-hoc csv file formats.
GGIR is an R-package to process multi-day raw accelerometer data for physical activity and sleep research. The term raw refers to data being expressed in m/s2 or gravitational acceleration, as opposed to the previous generation of accelerometers, which stored data in brand-specific units. The signal processing includes automatic calibration, detection of sustained abnormally high values, detection of non-wear, and calculation of the average magnitude of dynamic acceleration based on a variety of metrics. Next, GGIR uses this information to describe the data per recording, per day of measurement, and (optionally) per segment of a day of measurement, including estimates of physical activity, inactivity and sleep. We published an overview paper of GGIR in 2019 (link).
This vignette provides a general introduction to how to use GGIR and how to interpret its output; additionally, you can find an introduction video and a mini-tutorial on YouTube. If you want to use your own algorithms for raw data, GGIR facilitates this with its external function embedding feature, documented in a separate vignette: Embedding external functions in GGIR. GGIR is increasingly being used by research groups across the world. A non-exhaustive overview of academic publications related to GGIR can be found here. R package GGIR would not have been possible without the support of the contributors listed in the author list at GGIR, with specific code contributions over time since April 2016 (when GGIR development moved to GitHub) shown here.
Cite GGIR:
When you use GGIR in publications, do not forget to cite it properly, as that makes your research more reproducible and gives credit to its developers. See the paragraph on Citing GGIR for details.
How to contribute to the code?
The development version of GGIR can be found on github, which is also where you will find guidance on how to contribute.
How can I get service and support?
GGIR is open-source software and does not come with service or support guarantees. However, as a user community you can help each other via the GGIR google group or the GitHub issue tracker. Please use these public platforms rather than private e-mails, such that other users can learn from the conversations.
If you need dedicated support with the use of GGIR or need someone to adapt GGIR to your needs then Vincent van Hees is available as independent consultant.
Training in R essentials and GGIR
We offer frequent online GGIR training courses. Check our dedicated training website for more details and the option to book your training. Do you have questions about the training or the booking process? Do not hesitate to contact us via: training@accelting.com.
Also of interest may be the brief free R introduction tutorial.
Change log
Our log of main changes to GGIR over time can be found here.
Install GGIR with its dependencies from CRAN. You can do this with one command from the R console:
install.packages("GGIR", dependencies = TRUE)
Alternatively, to install the latest development version with the latest bug fixes use instead:
install.packages("remotes")
remotes::install_github("wadpac/GGIR")
Additionally, in some use cases you will need to install one or multiple additional packages:
- If you are working with Axivity, GENEActiv, or GENEA files, install the GGIRread package with install.packages("GGIRread")
- If you are working with ActiGraph .gt3x files, install the read.gt3x package with install.packages("read.gt3x")
- If you want to derive Neishabouri counts (with do.neishabouricounts = TRUE), install the actilifecounts package with install.packages("actilifecounts")
- If you want to run the cosinor analysis (with cosinor = TRUE), install the ActCR package with install.packages("ActCR")
For reading ad-hoc csv file formats, see the documentation for function read.myacc.csv and argument rmc.noise in the GGIR function documentation (pdf). Note that functionality for the following file formats was part of GGIR but has been deprecated, as it required a significant maintenance effort without a clear use case or community support: (1) .bin files for the Genea monitor by Unilever Discover, an accelerometer that was used for some studies between 2007 and 2012, and (2) .wav files as can be exported by the Axivity Ltd OMGUI software. Please contact us if you think these data formats should be facilitated by GGIR again and if you are interested in supporting their ongoing maintenance.
GGIR comes with a large number of functions and optional settings (arguments) per function. To ease interacting with GGIR there is one central function, named GGIR, to talk to all the other functions. In the past this function was called g.shell.GGIR, but we decided to shorten it to GGIR for convenience. You can still use g.shell.GGIR, because g.shell.GGIR has become a wrapper function around GGIR, passing on all arguments to GGIR and by that providing identical functionality.
In this paragraph we will guide you through the main arguments to GGIR that are relevant for 99% of research. First of all, it is important to understand that the GGIR package is structured in two ways.
Firstly, it has a computational structure of five parts which are applied sequentially to the data, with function GGIR controlling each of these parts:
The reason why it is split up in parts is that this avoids having to re-do all analyses if you only want to make a small change in one of the more downstream parts. The specific order and content of the parts has grown for historical and computational reasons.
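As an illustration (a minimal sketch with hypothetical paths, not taken from the vignette): if you have already run all five parts and only want to redo part 5 with a different setting, you can restrict mode to part 5 so that the milestone output of parts 1-4 is reused:
library(GGIR)
GGIR(mode = 5,                        # only run part 5
     datadir = "C:/mystudy/mydata",   # hypothetical paths
     outputdir = "D:/myresults",
     threshold.mod = 120,             # example of a changed part 5 parameter
     overwrite = TRUE)                # overwrite the previously stored part 5 milestone output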
Secondly, the function arguments, which we will refer to as input parameters, are structured thematically, independently of the five parts they are used in:
This structure was introduced in GGIR version 2.5-6 to make the GGIR code and documentation easier to navigate.
To see the parameters in each parameter category and their default values do:
library(GGIR)
print(load_params())
If you are only interested in one specific category like sleep:
library(GGIR)
print(load_params()$params_sleep)
If you are only interested in parameter “HASIB.algo” from the params_sleep object:
library(GGIR)
print(load_params()$params_sleep[["HASIB.algo"]])
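To list the names of all parameter categories at once (a small additional snippet; load_params() returns a list of the params_ objects):
library(GGIR)
# names() works here because load_params() returns a list of params_ objects
print(names(load_params()))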
Documentation for all arguments in the parameter objects can be found in the vignette: GGIR configuration parameters.
All of these arguments are accepted as arguments to function GGIR, because GGIR is a shell around all GGIR functionality. However, the params_ objects themselves cannot be provided as input to GGIR.
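So, rather than passing a params_ object, you provide the individual parameters it contains as regular arguments. A minimal sketch (paths are hypothetical; HASIB.algo is used here only as an example of a sleep parameter):
library(GGIR)
GGIR(datadir = "C:/mystudy/mydata",
     outputdir = "D:/myresults",
     HASIB.algo = "vanHees2015")   # passed directly; GGIR assigns it to params_sleep internally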
You will probably never need to think about most of the arguments listed above, because a lot of arguments are only included to facilitate methodological studies where researchers want to have control over every little detail. See previous paragraph for links to the documentation and how to find the default value of each parameter.
The bare minimum input needed for GGIR is:
library(GGIR)
GGIR(datadir="C:/mystudy/mydata",
outputdir="D:/myresults")
Argument datadir allows you to specify where you have stored your accelerometer data, and outputdir allows you to specify where you would like the output of the analyses to be stored; the output directory cannot be equal to datadir. If you copy-paste the above code into a new R script (a file ending in .R) and Source it in R(Studio), then the dataset will be processed and the output will be stored in the specified output directory.
Below we have highlighted the key arguments you may want to be aware of. We are not giving a detailed explanation here; please see the package manual for that.
- mode - which parts of GGIR to run; GGIR is constructed in five parts.
- overwrite - whether to overwrite previously produced milestone output. Between each GGIR part, GGIR stores milestone output to ease re-running parts of the pipeline.
- idloc - tells GGIR where to find the participant ID (default: inside the file header).
- strategy - informs GGIR how to consider the design of the experiment.
  - If strategy is set to value 1, then check out arguments hrs.del.start and hrs.del.end.
  - If strategy is set to value 3 or 5, then check out arguments ndayswindow, hrs.del.start and hrs.del.end.
- maxdur - maximum number of days you expect in a data file based on the study protocol.
- desiredtz - time zone of the experiment.
- chunksize - a way to tell GGIR to use less memory, which can be useful on machines with limited memory.
- includedaycrit - tells GGIR how many hours of valid data per day (midnight-midnight) are acceptable.
- includenightcrit - tells GGIR how many hours of a valid night (noon-noon) are acceptable.
- qwindow - tells GGIR whether and how to segment the day for day-segment specific analyses.
- mvpathreshold and boutcriter - acceleration threshold and bout criteria used for calculating time spent in MVPA (only used in GGIR part 2).
- epochvalues2csv - to export epoch-level magnitude of acceleration to csv files (in addition to already being stored as RData files).
- dayborder - to decide whether the edge of a day should be other than midnight.
- iglevels - argument related to the intensity gradient method proposed by A. Rowlands.
- do.report - specify the reports that need to be generated.
- viewingwindow and visualreport - to create a visual report; this only works when all five parts of GGIR have run successfully. Note that the visual report was initially developed to provide something to show to study participants and not for data quality checking purposes. Over time we have improved the visual report to also be useful for QC-ing the data. However, some of the scorings shown in the visual report are created for the visual report only and may not reflect the scorings in the main GGIR analyses as reported in the quantitative csv-reports. Most of our effort in the past 10 years has gone into making sure that the csv-reports are correct, while the visual report has mostly been a side project. This is unfortunate and we hope to find funding in the future to design a new report specifically for the purpose of QC-ing the analyses done by GGIR.
- maxRecordingInterval - if specified, controls whether neighboring or overlapping recordings with the same participant ID and brand are appended at epoch level. This can be useful when the intention is to monitor behaviour over longer periods of time but accelerometers only allow for a few weeks of data collection. GGIR will never append or alter the raw input file; this operation is performed on the derived data.

This section has been rewritten and moved. Please visit the vignette Published cut-points and how to use them in GGIR for more details on the cut-points available, how to use them, and some additional reflections on the use of cut-points in GGIR.
If you consider all the arguments above, you may end up with a call to GGIR that could look as follows.
library(GGIR)
GGIR(mode=c(1,2,3,4,5),
datadir="C:/mystudy/mydata",
outputdir="D:/myresults",
do.report=c(2,4,5),
#=====================
# Part 2
#=====================
strategy = 1,
hrs.del.start = 0, hrs.del.end = 0,
maxdur = 9, includedaycrit = 16,
qwindow=c(0,24),
mvpathreshold =c(100),
excludefirstlast = FALSE,
includenightcrit = 16,
#=====================
# Part 3 + 4
#=====================
def.noc.sleep = 1,
outliers.only = TRUE,
criterror = 4,
do.visual = TRUE,
#=====================
# Part 5
#=====================
threshold.lig = c(30), threshold.mod = c(100), threshold.vig = c(400),
boutcriter = 0.8, boutcriter.in = 0.9, boutcriter.lig = 0.8,
boutcriter.mvpa = 0.8, boutdur.in = c(1,10,30), boutdur.lig = c(1,10),
boutdur.mvpa = c(1),
includedaycrit.part5 = 2/3,
#=====================
# Visual report
#=====================
timewindow = c("WW"),
visualreport=TRUE)
Once you have used GGIR, the output directory (outputdir) will be filled with milestone data and results. Function GGIR stores all the explicitly entered argument values, and the default values for the arguments that are not explicitly provided, in a csv-file named config.csv in the root of the output folder. The config.csv file is accepted as input to GGIR with argument configfile, to replace the specification of all the arguments except datadir and outputdir; see the example below.
library(GGIR)
GGIR(datadir="C:/mystudy/mydata",
outputdir="D:/myresults", configfile = "D:/myconfigfiles/config.csv")
The practical value of this is that it eases the replication of analyses, because instead of having to share your R script, sharing your config.csv file will be sufficient. Further, the config.csv file contributes to the reproducibility of your data analysis.
Note 1: When combining a configuration file with explicitly provided argument values, the explicitly provided argument values will overrule the argument values in the configuration file. Note 2: The config.csv file in the root of the output folder will be overwritten every time you use GGIR. So, if you would like to add annotations in the file, e.g. in the fourth column, then you will need to store it somewhere outside the output folder and explicitly point to it with the configfile argument.
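For example, the call below (a sketch with hypothetical paths) reads a configuration file stored outside the output folder, as suggested in Note 2, while the explicitly provided idloc overrules whatever value the configuration file holds, as described in Note 1:
library(GGIR)
GGIR(datadir = "C:/mystudy/mydata",
     outputdir = "D:/myresults",
     configfile = "D:/myconfigfiles/config.csv",
     idloc = 2)   # explicitly provided, so it overrules the value in config.csv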
Create an R-script and put the GGIR call in it. Next, you can source
the R-script with the source
function in R:
source("pathtoscript/myshellscript.R")
or use the Source button if you work in RStudio.
GGIR by default supports multi-thread processing, which can be turned off by setting argument do.parallel = FALSE. If this is still not fast enough, then we advise running GGIR on a computing cluster. The way we did it on a Sun Grid Engine cluster is shown below; please note that some of these commands are specific to the computing cluster you are working on. Also, you may actually want to use an R package like clustermq or snowfall, which avoids having to write bash scripts. Please consult your local cluster specialist to tailor this to your situation. In our case, we had three files for the SGE setting:
submit.sh
for i in {1..707}; do
    n=1                             # number of files processed per job
    s=$(( (n * (i - 1)) + 1 ))      # index of the first file for this job
    e=$(( i * n ))                  # index of the last file for this job
    qsub /home/nvhv/WORKING_DATA/bashscripts/run-mainscript.sh $s $e
done
run-mainscript.sh
#! /bin/bash
#$ -cwd -V
#$ -l h_vmem=12G
/usr/bin/R --vanilla --args f0=$1 f1=$2 < /home/nvhv/WORKING_DATA/test/myshellscript.R
myshellscript.R
options(echo = TRUE)
args = commandArgs(TRUE)
if (length(args) > 0) {
  # evaluate the arguments passed on from the bash script, e.g. f0=1 f1=20
  for (i in 1:length(args)) {
    eval(parse(text = args[[i]]))
  }
}
library(GGIR)
GGIR(f0 = f0, f1 = f1, ...)
You will need to replace the ... in the last line with the arguments you used for GGIR. Note that f0 = f0, f1 = f1 is essential for this to work: the values of f0 and f1 are passed on from the bash script. Once this is all set up, you will need to call bash submit.sh from the command line.
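As mentioned above, an R package such as clustermq can replace the bash scripts. The sketch below is only an illustration under assumptions: it presumes clustermq is installed and configured for your scheduler, GGIR is installed on the compute nodes, and the paths, file count and block size are hypothetical. It mirrors the f0/f1 chunking of the bash example.
library(clustermq)
# Hypothetical sketch: each cluster job processes a block of 20 files via GGIR's f0/f1 arguments.
runGGIRblock <- function(f0, f1) {
  GGIR::GGIR(f0 = f0, f1 = f1,
             datadir = "/home/nvhv/WORKING_DATA/mydata",      # hypothetical paths
             outputdir = "/home/nvhv/WORKING_DATA/myresults")
}
starts <- seq(1, 707, by = 20)   # first file index of each block
ends <- pmin(starts + 19, 707)   # last file index of each block
Q(runGGIRblock, f0 = starts, f1 = ends, n_jobs = length(starts))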
With the help of computing clusters, GGIR has successfully been run on some of the world's largest accelerometer data sets, such as the UK Biobank and the German NAKO study.
The time to process a typical seven-day recording should be anywhere between 3 and 10 minutes, depending on the sample frequency of the recording, the sensor brand, the data format, the exact configuration of GGIR, and the specifications of your computer. If you are observing processing times of 20 minutes or longer for a seven-day recording, then you are probably slowed down by other factors.
Some tips on how you may be able to address this:
GGIR generates the following types of output:
- csv-spreadsheets with all the variables you need for physical activity, sleep and circadian rhythm research
- Pdfs with on each page a low-resolution plot of the data per file and quality indicators
- R objects with milestone data
- Pdfs with a visual summary of the physical activity and sleep patterns as identified (see example below)
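To give an impression of where these end up (a sketch, assuming your data directory is called "mydata", so that GGIR creates a folder named output_mydata inside the output directory):
# The csv and pdf reports are written to the "results" subfolder,
# the milestone R objects to the "meta" subfolder.
list.files("D:/myresults/output_mydata/results")
list.files("D:/myresults/output_mydata/meta")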