A recent Nature Microbiology study conducted a metagenomic examination of the mouse microbiome across the lifespan. The authors found that the microbiome accumulates stochastic events that are largely irreversible. This means there is a microbial signal associated with aging, which they leveraged to build a machine learning-based age predictor. They also evaluated the impact of fasting and caloric restriction on the microbiome in these mice.

Their data are available here.

I was most interested in their Figure 5I, where they tested the hypothesis that caloric restriction/fasting “de-ages” the microbiome. They built a random forest regressor using mice fed ad libitum to predict the ages of mice on calorie restriction/fasting regimens. Surprisingly, the predictor estimated these mice to be older than they actually were, which contradicts the hypothesis.

Let’s see if we can validate these results, and then test other models to see if this was an artefact of the algorithm they chose.

Note: I have to mention that this dataset was very easy to work with. The authors clearly put real effort into making their data as user-friendly as possible.

Validating the analysis

Loading data

Let’s load their genus-level data processed using kraken2.

## [1] "100 genera by 2997 samples"
##           genus DO_1D_3001_038w DO_1D_3001_044w DO_1D_3001_069w DO_1D_3001_097w
## 1 Schaedlerella     0.324345379     1.148270492      0.24210585       0.3213002
## 2   Bacteroides     0.046388330     0.044294182      0.10526272       0.1011501
## 3 Lactobacillus     0.037230830     0.127716986      1.60870176       0.3545356
## 4   Escherichia     0.005165769     0.005365897      0.01007851       0.0168542
## 5         Dorea     0.668607974     3.278830068      1.64581981       0.9003676

It looks like all of the data we need are pre-packed in the feature table. (It’s a lot of samples!)

Processing metadata

Let’s unpack the metadata from the column names, which appear to follow the format: Genotype_Diet_Mouse_Time. Of note, we need to subset to these timepoints: 5, 10, 16, 22 and 28 months. Then we’ll check the distribution of samples.

##         diet
## genotype  1D  20  2D  40  AL
##       DO 523 545 512 510 528

Out of the 2618 samples, 528 are “AL”, which corresponds to ad libitum fed mice, our control group. All of the others refer to some degree of calorie restriction or fasting.
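Here is a minimal base R sketch of how that unpacking could look. Only the first sample name below comes from the table above; the other three are made up for illustration, and I’m assuming the trailing “w” on the last field means weeks.

```r
# Sample names in the Genotype_Diet_Mouse_Time format.
# The first is from the feature table; the rest are hypothetical.
sample_ids <- c("DO_1D_3001_038w", "DO_AL_3002_044w",
                "DO_40_3003_069w", "DO_AL_3002_097w")

# Split each name on "_" and stack the pieces into a 4-column matrix
parts <- do.call(rbind, strsplit(sample_ids, "_", fixed = TRUE))

meta <- data.frame(
  sample   = sample_ids,
  genotype = parts[, 1],
  diet     = parts[, 2],
  mouse    = parts[, 3],
  # Assumption: "038w" means 38 weeks; strip the suffix and convert
  age_wk   = as.integer(sub("w$", "", parts[, 4])),
  stringsAsFactors = FALSE
)

# Cross-tabulate genotype by diet
print(table(genotype = meta$genotype, diet = meta$diet))
```

Keeping the mouse ID as its own column matters later, because it is what lets us tell repeated samples from the same animal apart.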

Data processing

Now let’s check the metagenomic data, and see if any additional processing is needed.

The distributions look much better after log-transforming with a pseudocount added.
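As a rough illustration of that transform, here is a base R sketch on a toy matrix. The abundance values are made up, and the pseudocount rule (half the smallest non-zero value) is my assumption for the example, not necessarily what was used for the plots.

```r
# Toy abundances: genera in rows, samples in columns (values are made up)
abund <- matrix(
  c(0.32, 1.15, 0.00,
    0.05, 0.00, 0.11),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("genus_a", "genus_b"), c("s1", "s2", "s3"))
)

# Assumed pseudocount: half the smallest non-zero value in the table
pseudocount <- min(abund[abund > 0]) / 2

# log10 transform; zeros map to log10(pseudocount) instead of -Inf
log_abund <- log10(abund + pseudocount)
print(round(log_abund, 2))
```

The pseudocount only exists to keep zeros finite, so it should be small relative to the observed non-zero values.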

Building model

Now let’s try to validate the analysis in their Figure 5I: build a random forest regressor using the mice fed ad libitum, then predict the ages of the calorie restriction/fasting mice. Remember, our goal isn’t to achieve high accuracy; it’s to see whether the predicted ages of the restricted mice are shifted relative to their true ages.

I’ve added regression lines, and lines indicating the median predicted age per diet group.

It’s quite clear that mice from the DR (dietary restriction) groups are being predicted as older than they really are. So, we’re able to reproduce the authors’ findings.
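For readers who want to try the train-on-AL, predict-on-DR setup themselves, here is a small sketch. It assumes the ranger package is installed and runs on synthetic data with hypothetical feature names, so the numbers it prints mean nothing biologically. The per-diet summary (median of predicted minus actual age) is my own choice for the example.

```r
library(ranger)  # assumption: the ranger package is installed

set.seed(1)
n <- 200
genera <- paste0("genus_", 1:5)  # hypothetical feature names

# Synthetic stand-in for a log-transformed feature table (samples in rows)
dat <- as.data.frame(matrix(rnorm(n * 5), ncol = 5,
                            dimnames = list(NULL, genera)))
dat$age_wk <- round(runif(n, 20, 120))
dat$diet   <- sample(c("AL", "1D", "2D", "20", "40"), n, replace = TRUE)

# Train only on ad libitum mice; everything else is held out
train <- dat[dat$diet == "AL", c(genera, "age_wk")]
test  <- dat[dat$diet != "AL", ]

fit  <- ranger(age_wk ~ ., data = train, num.trees = 500, seed = 1)
pred <- predict(fit, data = test[, genera])$predictions

# Median signed error (predicted minus actual age) per diet group;
# positive values mean the group is predicted older than it is
shift <- tapply(pred - test$age_wk, test$diet, median)
print(shift)
```

On random features like these the shifts are just noise; the point is only the shape of the workflow.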

Extending the analysis

Part 1: Fixing pseudoreplicates

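If we examine the data closely, we’ll recognize that the authors committed a common error in machine learning: ignoring pseudoreplicates. Each mouse was sampled at several timepoints, so samples from the same mouse are not independent observations.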

Since they were trying to make a point about a previous hypothesis, rather than build a generalizable model, I wouldn’t consider this a grievous case of overfitting. However, I would like to see how a properly engineered random forest would perform. For instance, was their upshift in predictions actually an artefact of overfitting?

So, let’s build a stratified random forest such that only 1 sample per mouse is selected per iteration. We’ll repeat 15 times and take the mean predictions across the stratifications.
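The sampling step can be sketched in base R as follows. The metadata here is a toy table with made-up mice and ages, and `one_per_mouse` is a hypothetical helper name, not something from the original analysis.

```r
set.seed(42)

# Toy metadata: 4 mice, 3 timepoints each (values are made up)
meta <- data.frame(
  mouse  = rep(c("m1", "m2", "m3", "m4"), each = 3),
  age_wk = rep(c(22, 44, 69), times = 4)
)

# Hypothetical helper: draw one row index per mouse.
# sample.int() on the group size avoids the sample(x, 1) pitfall
# when a mouse has only a single sample.
one_per_mouse <- function(mouse) {
  idx_by_mouse <- split(seq_along(mouse), mouse)
  vapply(idx_by_mouse, function(i) i[sample.int(length(i), 1)], integer(1))
}

# 15 independent draws, each containing exactly one sample per mouse
n_iter <- 15
draws <- replicate(n_iter, one_per_mouse(meta$mouse), simplify = FALSE)

print(meta[draws[[1]], ])
```

Each element of `draws` would serve as the training rows for one model, and the per-iteration predictions on the held-out mice are then averaged.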

Surprisingly, removing pseudoreplicates appears to have had a negligible impact on the results.

Let’s test other models and see if they reveal different patterns.

Part 2: Additional models

We’ll build these models:

ml.models = c("gbm","glmnet","ranger","svmLinear","svmRadial","xgbTree")

We’ll consider each of the 15 stratifications as an iteration and take their mean predictions.

There is some model-specificity in predictive patterns, but overall, these results are consistent with the original findings of the study.

We can also examine the MAE (mean absolute error, in weeks) of each of the models, and infer which of these models is telling a more accurate story:

GBM, elastic net, and XGBoost beat random forest by ~1 week of MAE. An interesting detail is that these three models expanded the prediction range, such that we can now see ages upwards of 28 weeks across the groups.
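To make the metric concrete, here is a tiny base R sketch of averaging predictions across iterations and then scoring them. The ages and predictions are made up, and I use three iterations instead of 15 to keep it readable. I also compute the signed error, since MAE alone does not say whether a model predicts too old or too young.

```r
# Toy example: true ages (weeks) and predictions from 3 iterations of one model
truth <- c(40, 60, 80, 100)
preds <- cbind(iter1 = c(44, 58, 86, 95),
               iter2 = c(46, 62, 84, 97),
               iter3 = c(42, 60, 88, 99))

# Average the per-iteration predictions for each sample
mean_pred <- rowMeans(preds)

# MAE: average size of the error, ignoring direction
mae <- mean(abs(mean_pred - truth))

# Signed error: positive means samples are predicted older on average
bias <- mean(mean_pred - truth)

print(c(mae = mae, bias = bias))
```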

Nonetheless, I don’t think we can draw a conclusion that differs from the original study.

Conclusions

To conclude, we were able to reproduce the authors’ findings for this task. The microbiomes of calorie-restricted or fasting mice carry a signal consistent with an older microbiome. We also showed that this conclusion is robust to pseudoreplicates and model choice.