Skip to contents

This vignette demonstrates the use of GAMs for statistical analysis of tract profile data. The data we will use here contains tract profiles from diffusion MRI measurements in a group of patients with Amyotrophic Lateral Sclerosis (ALS) and a group of matched controls (Sarica et al. 2017).

We start by loading the tractable library:

Next, we will use a function that is included in tractable to read this dataset directly into memory.

Importantly, both the group (“ALS” or “CTRL”) and the subject identifier (“subjectID”) need to be factors for subsequent analysis to work properly.

df_sarica <- read_afq_sarica(na_omit = TRUE)
df_sarica
#> # A tibble: 93,377 × 8
#>    subjectID   tractID                 nodeID    fa    md   age group gender
#>    <fct>       <chr>                    <dbl> <dbl> <dbl> <dbl> <fct> <fct> 
#>  1 subject_000 Left Thalamic Radiation      0 0.174 0.965    54 ALS   F     
#>  2 subject_000 Left Thalamic Radiation      1 0.227 0.912    54 ALS   F     
#>  3 subject_000 Left Thalamic Radiation      2 0.282 0.874    54 ALS   F     
#>  4 subject_000 Left Thalamic Radiation      3 0.318 0.851    54 ALS   F     
#>  5 subject_000 Left Thalamic Radiation      4 0.341 0.830    54 ALS   F     
#>  6 subject_000 Left Thalamic Radiation      5 0.353 0.812    54 ALS   F     
#>  7 subject_000 Left Thalamic Radiation      6 0.357 0.802    54 ALS   F     
#>  8 subject_000 Left Thalamic Radiation      7 0.357 0.795    54 ALS   F     
#>  9 subject_000 Left Thalamic Radiation      8 0.359 0.790    54 ALS   F     
#> 10 subject_000 Left Thalamic Radiation      9 0.371 0.784    54 ALS   F     
#> # ℹ 93,367 more rows

First, let’s visualize the data. We use the plot_tract_profiles function, selecting to view both fractional anisotropy (FA) and mean diffusivity profiles in two tracts: the right corticospinal tract (CST) and the right superior longitudinal fasciculus (SLF), which are identified in the “tractID” column of this dataset.

plot_handles <- plot_tract_profiles(
  df          = df_sarica,
  y           = c("fa", "md"),
  tracts      = c("Right Corticospinal", "Right SLF"),
  group_col   = "group",
  save_figure = FALSE
)
plot_handles$fa

plot_handles$md

We can already see that ALS has a profound effect on the tract profiles of the CST, but does not affect SLF as much. We will use GAMs in order to quantify this in statistical terms. We start by fitting a GAM model to the data from the CST. Using the tractable_single_tract function, we select the Right CST data, and focus here only on FA. We use “group” and “age” as relevant covariates. Comparing group as a main effect, that will also be used to fit separate smooth functions for each category of subjects. The mgcv GAM functions use a parameter k to determine how many spline functions to use in fitting the smooth change of FA over the length of the tract. We use an automated strategy to find k.

cst_fit <- tractable_single_tract(
  df         = df_sarica,
  tract      = "Right Corticospinal",
  target     = "fa",
  regressors = c("age", "group"),
  node_group = "group",
  k          = "auto"
)

cst_summary <- summary(cst_fit)
cst_summary
#> 
#> Family: gaussian 
#> Link function: identity 
#> 
#> Formula:
#> fa ~ age + group + s(nodeID, by = group, bs = "fs", k = 9) + 
#>     s(subjectID, bs = "re")
#> 
#> Parametric coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 0.422545   0.056591    7.47  9.8e-14 ***
#> age         0.000858   0.000915    0.94    0.348    
#> groupCTRL   0.035936   0.015961    2.25    0.024 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Approximate significance of smooth terms:
#>                       edf Ref.df     F p-value    
#> s(nodeID):groupALS   7.97      8 445.5  <2e-16 ***
#> s(nodeID):groupCTRL  7.96      8 534.8  <2e-16 ***
#> s(subjectID)        42.42     46  29.9  <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> R-sq.(adj) =  0.562   Deviance explained = 56.7%
#> fREML = -13623  Scale est. = 0.0012706  n = 4734

Examining the summary of the resulting GAM fit object shows us that the k = 16 is sufficiently large to describe the spatial variation of tract profile data. In addition, we see that there is a statistically significant effect of group (with a p-value of 0.0244) and no statistically significant effect of age (p = 0.3483).

Running the same analysis on the data from SLF, we see that there is no significant difference between the groups in this tract, indicating that the effect observed in CST is rather specific to this tract.

slf_fit <- tractable_single_tract(
  df         = df_sarica,
  tract      = "Right SLF",
  target     = "fa",
  regressors = c("age", "group"),
  node_group = "group",
  k          = "auto"
)

slf_summary <- summary(slf_fit)
slf_summary
#> 
#> Family: gaussian 
#> Link function: identity 
#> 
#> Formula:
#> fa ~ age + group + s(nodeID, by = group, bs = "fs", k = 9) + 
#>     s(subjectID, bs = "re")
#> 
#> Parametric coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 0.4126030  0.0249891   16.51   <2e-16 ***
#> age         0.0000153  0.0004033    0.04     0.97    
#> groupCTRL   0.0063865  0.0073081    0.87     0.38    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Approximate significance of smooth terms:
#>                         edf Ref.df   F p-value    
#> s(nodeID):groupALS  7.98115      8 468  <2e-16 ***
#> s(nodeID):groupCTRL 7.98013      8 493  <2e-16 ***
#> s(subjectID)        0.00012     45   0       1    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> R-sq.(adj) =  0.624   Deviance explained = 62.5%
#> fREML = -15118  Scale est. = 0.0015159  n = 4785

This is in line with other work that we have conducted with this dataset using other methods (Richie-Halford et al. 2021).

If you are interested in exploring the ALS dataset even more, you can also see this data in an AFQ-Browser here (Yeatman et al. 2018).

References

Richie-Halford, Adam, Jason D Yeatman, Noah Simon, and Ariel Rokem. 2021. “Multidimensional Analysis and Detection of Informative Features in Human Brain White Matter.” PLoS Comput. Biol. 17 (6): e1009136.
Sarica, Alessia, Antonio Cerasa, Paola Valentino, Jason Yeatman, Maria Trotta, Stefania Barone, Alfredo Granata, et al. 2017. “The Corticospinal Tract Profile in Amyotrophic Lateral Sclerosis.” Hum. Brain Mapp. 38 (2): 727–39.
Yeatman, Jason D, Adam Richie-Halford, Josh K Smith, Anisha Keshavan, and Ariel Rokem. 2018. “A Browser-Based Tool for Visualization and Analysis of Diffusion MRI Data.” Nat. Commun. 9 (1): 940.