The following cohort causal graphs (CCGs) are based on 2- to 16-year-old European children and adolescents from the IDEFICS/I.family cohort. The data set contains N = 5,112 children born between 1997 and 2006 who participated in all three waves of the study.
We used the temporal order of the variables as prior knowledge for the analysis by distributing the variables into different tiers. The analysis data set consisted of 51 variables that were distributed over five tiers:
We assume that variables in a lower-numbered tier can affect variables in higher-numbered tiers, but not vice versa. In addition, we forbid certain edges, such as any edge pointing to age and edges from individual child variables (e.g. physical activity) to ISCED or parental income.
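The forbidden edges described above can be encoded as a logical matrix. The following is a minimal sketch with hypothetical node names (not the 51 variables of the analysis). It assumes the convention that entry `[i, j] = TRUE` forbids the directed edge from node `i` to node `j`, which should be checked against the tpc documentation before use.

```r
# Hypothetical node names, for illustration only
nodes <- c("age", "isced", "income", "pa", "zbmi")

# Start with no forbidden edges
fg <- matrix(FALSE, nrow = length(nodes), ncol = length(nodes),
             dimnames = list(nodes, nodes))

# Forbid every edge pointing to age (assumed convention: row -> column)
fg[setdiff(nodes, "age"), "age"] <- TRUE

# Forbid edges from a child-level variable to parental education or income
fg["pa", c("isced", "income")] <- TRUE

fg
```

Naming the rows and columns keeps the matrix readable and makes it easy to verify that its dimension matches the number of variables in the data.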
All CCGs of childhood obesity were estimated with the temporal PC algorithm (tPC) on multiply imputed data sets using the R packages tpc and micd. The tpc package makes it possible to use prior knowledge about the temporal order of the cohort data, and the micd package makes it possible to run the PC algorithm on multiply imputed data sets with mixed variable scales. The PC and tPC algorithms are both constraint-based structure learning algorithms. Both tpc and micd rely on the PC algorithm as implemented in pcalg.
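Constraint-based algorithms decide whether to keep an edge by running conditional independence tests. As a rough illustration of that idea, the sketch below implements a Fisher z test of a partial correlation in base R. It assumes complete, jointly Gaussian data, which is simpler than the mixed-scale, multiply imputed setting handled by flexMItest; the function name `fisher_z_test` and the simulated data are hypothetical.

```r
# Fisher z test of the partial correlation between x and y given the set z.
# Assumes complete, jointly Gaussian data; `data` is a numeric data frame.
fisher_z_test <- function(data, x, y, z = character(0)) {
  vars <- c(x, y, z)
  prec <- solve(cor(data[, vars, drop = FALSE]))     # inverse correlation matrix
  r <- -prec[1, 2] / sqrt(prec[1, 1] * prec[2, 2])   # partial correlation of x and y
  stat <- sqrt(nrow(data) - length(z) - 3) * atanh(r)
  list(estimate = r, p.value = 2 * pnorm(-abs(stat)))
}

# Simulated chain X -> Z -> Y (hypothetical data, not the cohort)
set.seed(42)
n <- 2000
X <- rnorm(n)
Z <- X + rnorm(n)
Y <- Z + rnorm(n)
sim <- data.frame(X, Y, Z)

marginal    <- fisher_z_test(sim, "X", "Y")        # X and Y are marginally dependent
conditional <- fisher_z_test(sim, "X", "Y", "Z")   # partial correlation should be near zero
```

In a chain like this, a constraint-based algorithm would remove the edge between X and Y once the test given Z fails to reject independence at the chosen alpha.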
Tiers | Variable/Node | Unit | Comments |
---|---|---|---|
Context | Sex | female/male | Sex of child |
Context | Region | North/Central/South | Place of residence |
Context | Migrant | no/yes | Children were assumed to have a migrant background if they usually speak with their parents in a language other than the national language of the corresponding country |
Early life | Mother's age at birth | years | |
Early life | Total breastfeeding | months | incl. breastfeeding combinations before the child's diet was fully integrated into the usual household diet |
Early life | Birthweight | grams | |
Early life | Weeks of pregnancy | weeks | |
Early life | Formula milk | no/yes | Type of feeding before the child's diet was fully integrated into the usual household diet |
Early life | HH diet | months | Month when the child was introduced into the household's diet |
Early life | Smoking during pregnancy | no/yes | Mother consumed tobacco during pregnancy |
Context: B, FU1, FU2 | Age | months | |
Context: B, FU1, FU2 | School | kindergarten/school/neither one | |
Context: B, FU1, FU2 | Income | low/middle/high | Country-specific parental income |
Context: B, FU1, FU2 | ISCED | low/middle/high | International Standard Classification of Education, highest parental education |
B, FU1, FU2 | AVM | h/day | Audio-visual media consumption |
B, FU1, FU2 | zBMI | z-score | Body mass index |
B, FU1, FU2 | Mother's BMI | kg/m^2 | Body mass index of the child's mother |
B, FU1, FU2 | Daily family meals | no/yes | |
B, FU1, FU2 | PA | h/day | Physical activity measured by questionnaire |
B, FU1, FU2 | Sleep | h/day | Total sleep |
B, FU1, FU2 | Well-being | % | Sum score based on the KINDL-R quality of life questionnaire |
B, FU1, FU2 | YHEI | % | Youth healthy eating score |
B, FU1, FU2 | HOMA | z-score | HOmeostatic Model Assessment |
FU2 | Alcohol | no/yes | Ever drank alcohol in the teen's lifetime |
FU2 | Puberty | pre- or early pubertal/pubertal | Pubertal status |
FU2 | Smoking | no/yes | Ever smoked tobacco in the teen's lifetime |
library(tpc)
library(micd)
library(parallel)  # provides detectCores()

## sufficient statistic
suff.all <- getSuff(my.mids.data, test = "flexMItest")

## CCG
graph <- tpc(suffStat = suff.all,
             indepTest = flexMItest,
             skel.method = "stable.parallel",
             labels = V.pa,
             alpha = 0.05,
             tiers = c(rep(1, 3), rep(2, 7), rep(3, 13), rep(4, 13), rep(5, 15)),
             forbEdges = fg,  # a matrix of size
                              # ncol(my.mids.data$data) x ncol(my.mids.data$data)
             numCores = detectCores() - 1)
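As a quick sanity check on the `tiers` argument, the sketch below rebuilds the tier vector from the call above and confirms that it has one entry per variable. The `backward` matrix is only an illustration of which directions the tier ordering rules out; tpc presumably applies this restriction internally, so the matrix is not something the analysis needs to pass in.

```r
# Tier vector as used in the tpc() call: one entry per variable, in column order
tiers <- c(rep(1, 3), rep(2, 7), rep(3, 13), rep(4, 13), rep(5, 15))

length(tiers)   # should equal the number of variables (51)
table(tiers)    # number of variables per tier

# TRUE at [i, j] when variable i sits in a later tier than variable j,
# i.e. the direction i -> j that the tier ordering rules out
backward <- outer(tiers, tiers, ">")
sum(backward)   # number of ordered pairs excluded by the tier ordering
```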
Note: nodes are coloured with respect to their appearance in the life course. Edges without arrowheads could not be orientated by the algorithm.
Graph characteristics | Main graph |
---|---|
Number of selected edges | 104 |
Number of undirected edges | 12 |
Avg. number of outgoing edges | 2.4 |
g.alpha <- tpc(suffStat = suff.all,
               indepTest = flexMItest,
               skel.method = "stable.parallel",
               labels = V.pa,
               alpha = 0.1,
               tiers = c(rep(1, 3), rep(2, 7), rep(3, 13), rep(4, 13), rep(5, 15)),
               forbEdges = fg,
               numCores = detectCores() - 1)
Graph characteristics | Main graph | MI, α = 0.1 |
---|---|---|
Number of selected edges | 104 | 113 |
Number of undirected edges | 12 | 13 |
Avg. number of outgoing edges | 2.4 | 2.5 |
Hamming distance | - | 19 |
Structural Hamming distance | - | 34 |
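The two distance rows compare each alternative graph with the main graph. The source does not spell out the exact definitions, so the following base-R sketch uses common ones as an assumption: the Hamming distance counts node pairs whose adjacency differs between the two skeletons, and a simplified structural Hamming distance additionally counts pairs whose edge orientation differs. The function names and the three-node toy graphs are hypothetical.

```r
# Adjacency matrix convention (assumed): amat[i, j] = 1 means i -> j;
# amat[i, j] = amat[j, i] = 1 means an undirected edge between i and j.
skeleton <- function(amat) (amat + t(amat)) > 0

# Number of node pairs that are adjacent in one graph but not in the other
hamming_skeleton <- function(a, b) {
  differs <- skeleton(a) != skeleton(b)
  sum(differs[upper.tri(differs)])
}

# Number of node pairs whose edge (presence or orientation) differs
shd_simple <- function(a, b) {
  differs <- (a != b) | (t(a) != t(b))
  sum(differs[upper.tri(differs)])
}

nodes <- c("A", "B", "C")
g1 <- matrix(0, 3, 3, dimnames = list(nodes, nodes))
g2 <- g1
g1["A", "B"] <- 1   # A -> B
g1["B", "C"] <- 1   # B -> C
g2["B", "A"] <- 1   # B -> A (reversed relative to g1)
g2["A", "C"] <- 1   # A -> C

hamming_skeleton(g1, g2)   # B-C and A-C differ in adjacency: 2
shd_simple(g1, g2)         # plus the reversed A-B edge: 3
```

For graph objects, pcalg also provides `shd()`, which may differ in detail from this simplified count.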
g.twd <- tpc(suffStat = data.with.missing.values,
             indepTest = flexCItwd,
             skel.method = "stable.parallel",
             alpha = 0.05,
             forbEdges = fg,
             labels = colnames(fg),
             tiers = c(rep(1, 3), rep(2, 7), rep(3, 13), rep(4, 13), rep(5, 15)),
             numCores = detectCores() - 1)
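The call above works on the incomplete data directly. Assuming TWD stands for test-wise deletion, each conditional independence test uses only the rows that are complete for the variables involved in that test, rather than discarding every row with any missing value. A minimal base-R sketch of the idea, with a hypothetical helper `testwise_rows` and toy data:

```r
# Keep only the rows that are complete for the variables in one test
testwise_rows <- function(data, vars) {
  data[complete.cases(data[, vars, drop = FALSE]), vars, drop = FALSE]
}

# Toy data with missing values (not the cohort data)
d <- data.frame(x = c(1, NA, 3, 4),
                y = c(1, 2, NA, 4),
                z = c(NA, 2, 3, 4))

nrow(testwise_rows(d, c("x", "y")))   # rows usable for a test of x and y
nrow(testwise_rows(d, c("y", "z")))   # rows usable for a test of y and z
nrow(na.omit(d))                      # rows left under list-wise deletion
```

Different tests can therefore run on different subsets of the sample, which retains more data than list-wise deletion but is typically justified only under stronger assumptions about the missingness mechanism.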
Graph characteristics | Main graph | MI, α = 0.1 | TWD |
---|---|---|---|
Number of selected edges | 104 | 113 | 138 |
Number of undirected edges | 12 | 13 | 5 |
Avg. number of outgoing edges | 2.4 | 2.5 | 2.8 |
Hamming distance | - | 19 | 96 |
Structural Hamming distance | - | 34 | 110 |
library(bnlearn)
sem <- structural.em(data.with.missing.values,
                     maximize = "hc",
                     maximize.args = list(blacklist = bl))
# bl is a matrix of forbidden directed edges of dimension
# "number of forbidden arrows" x 2
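A blacklist of this shape can be derived from a tier ordering. The sketch below is one possible construction in base R, using hypothetical node names and tiers rather than the variables of the analysis: for every tier it forbids arrows pointing back to any earlier tier. Pair-specific restrictions, such as those on age or ISCED, would have to be appended as extra rows.

```r
# Hypothetical node names grouped by tier, earliest tier first
tier_nodes <- list(
  context    = c("sex", "region"),
  early_life = c("birthweight"),
  baseline   = c("zbmi_b", "pa_b")
)

# For every tier k > 1, forbid arrows from its nodes to all earlier-tier nodes
bl <- do.call(rbind, lapply(seq_along(tier_nodes)[-1], function(k) {
  earlier <- unlist(tier_nodes[seq_len(k - 1)], use.names = FALSE)
  expand.grid(from = tier_nodes[[k]], to = earlier, stringsAsFactors = FALSE)
}))
bl <- as.matrix(bl)   # character matrix with columns "from" and "to"

bl
```

bnlearn also ships a helper, `tiers2blacklist()`, intended for this kind of tiered ordering; its exact behaviour should be checked in the package documentation.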
Graph characteristics | Main graph | MI, α = 0.1 | TWD | SEM |
---|---|---|---|---|
Number of selected edges | 104 | 113 | 138 | 157 |
Number of undirected edges | 12 | 13 | 5 | 0 |
Avg. number of outgoing edges | 2.4 | 2.5 | 2.8 | 3.1 |
Hamming distance | - | 19 | 96 | 117 |
Structural Hamming distance | - | 34 | 110 | 131 |
For each bootstrap sample, the data were imputed once using mice with random forest imputation. The following CCGs are based on 100 bootstrap replications.
Graph characteristics | Main graph | MI, α = 0.1 | TWD | SEM | MI, BG44 | MI, BG75 |
---|---|---|---|---|---|---|
Number of selected edges | 104 | 113 | 138 | 157 | 104 | 46 |
Number of undirected edges | 12 | 13 | 5 | 0 | 3 | 0 |
Avg. number of outgoing edges | 2.4 | 2.5 | 2.8 | 3.1 | 2.1 | 0.9 |
Hamming distance | - | 19 | 96 | 117 | 56 | 70 |
Structural Hamming distance | - | 34 | 110 | 131 | 73 | 86 |
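The source does not define the labels BG44 and BG75. One plausible reading, stated here purely as an assumption, is that they denote bootstrap graphs that keep only edges selected in at least 44% or 75% of the bootstrap replications. Under that assumption, the aggregation step could be sketched in base R as follows, using four hypothetical toy replications instead of the 100 used in the analysis:

```r
nodes <- c("A", "B", "C")
empty <- matrix(0, 3, 3, dimnames = list(nodes, nodes))

# Build an adjacency matrix from a list of c(from, to) pairs
make_graph <- function(edges) {
  m <- empty
  for (e in edges) m[e[1], e[2]] <- 1
  m
}

# Hypothetical graphs estimated on four bootstrap samples
boot_amats <- list(
  make_graph(list(c("A", "B"), c("B", "C"))),
  make_graph(list(c("A", "B"))),
  make_graph(list(c("A", "B"), c("A", "C"))),
  make_graph(list(c("A", "B"), c("B", "C")))
)

# Relative frequency with which each directed edge was selected
edge_freq <- Reduce(`+`, boot_amats) / length(boot_amats)

# Keep edges whose selection frequency reaches the threshold
aggregate_graph <- function(freq, threshold) (freq >= threshold) * 1

g50 <- aggregate_graph(edge_freq, 0.50)   # keeps A -> B and B -> C
g75 <- aggregate_graph(edge_freq, 0.75)   # keeps only A -> B
```

A higher threshold yields a sparser graph, which is consistent with the drop in selected edges between the BG44 and BG75 columns.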