
to facilitate thymic health interpretation and potential clinical transla-
tion. By doing so, thymic health values range from 0 to 100 and can be
easily interpreted, such as a value of 50 ranking a participant exactly
in the median of the study population distribution. Further, using the
first and third quartiles in each respective study population, thymic
health was ranked into low (less than or equal to 25), average (greater than 25 and up to 75)
and high (greater than 75).
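The ranking and grouping described above can be illustrated with a minimal sketch. It assumes a plain percentile-rank transform (the share of the study population with a raw score at or below a participant's score); the exact transform used in the study is not restated here, and the function names and raw values below are hypothetical.

```python
from bisect import bisect_right

def percentile_rank(values):
    """Map each raw score to a 0-100 scale: percentage of the population
    whose raw score is less than or equal to that score (assumed transform)."""
    ordered = sorted(values)
    n = len(ordered)
    return [100.0 * bisect_right(ordered, v) / n for v in values]

def categorize(score):
    """Group a ranked score using the first and third quartiles (25 and 75)."""
    if score <= 25:
        return "low"
    if score <= 75:
        return "average"
    return "high"

# Hypothetical raw model outputs for eight participants.
raw = [0.12, 0.40, 0.33, 0.90, 0.75, 0.05, 0.61, 0.58]
ranks = percentile_rank(raw)
groups = [categorize(r) for r in ranks]

for value, rank, group in zip(raw, ranks, groups):
    print(f"raw={value:.2f} rank={rank:5.1f} group={group}")
```

Because the ranks are computed within one population, the cut-offs at 25 and 75 coincide with that population's first and third quartiles.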
Summary statistics used mean, median or range for continuous vari-
ables and proportions or percentages for categorical variables. Overall
survival was calculated from the date of the CT examination (FHS) or
from the date of randomization (NLST) until the date of death. For
time-to-event analyses, we used a cutoff at 12 years of follow-up for both
FHS and NLST for all survival and incidence analyses except for lung
cancer incidence, for which data were only available for a follow-up of 6
years after randomization. If participants were alive or did not have an
event after 12 or 6 years of follow-up, respectively, they were censored
at that time. Association analyses used Wilcoxon rank sum test or lin-
ear regression as appropriate. For linear regression analyses, clinical
variables were z-score normalized and treated as predictors for thymic
health. The beta coefficients represent the change in thymic health
per one s.d. increase in each predictor. Time-to-event distributions
were analysed with the Kaplan–Meier estimator and Cox proportional
hazards models as appropriate for univariate and multivariate testing.
Schoenfeld residuals were inspected to assess the proportional
hazards assumption. Initial Cox proportional hazards models
used thymic health as predictor, with adjustments for sex and age,
which violated the proportional hazards assumptions. We addressed
this by implementing sex- and age-stratified analyses in which age was
categorized into distinct bins using the age distribution of the studied
population (FHS, younger than 45 years, 45–48 years, 48–51 years, 51–54
years, 54–57 years, 57–60 years, 60–63 years, 63–66 years, 66–69 years,
69–72 years, 72–75 years and 75 years or older; NLST, 55–59 years, 60–64
years, 65–69 years and 70–75 years) and this stratification approach
was used in all sex- and age-adjusted analyses throughout the article if
not described otherwise. Both the Akaike Information Criterion and
Bayesian Information Criterion were inspected, and the model fit and
parsimony were improved using the implemented stratified models.
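As an illustration of the censoring rule and the Kaplan–Meier estimator described above, the following is a minimal sketch in plain Python. The study's analyses were run in R; this is not the authors' code, and the function names and follow-up records are hypothetical.

```python
def censor_at(time_years, event, horizon):
    """Administrative censoring: follow-up beyond `horizon` is truncated
    to the horizon and recorded as event-free."""
    if time_years > horizon:
        return horizon, 0
    return time_years, event

def kaplan_meier(records):
    """Return [(time, survival probability)] at each distinct event time.

    records: iterable of (time, event), event = 1 for death, 0 for censored.
    """
    records = sorted(records)
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(records) and records[i][0] == t:
            deaths += records[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical (years of follow-up, death indicator) pairs.
follow_up = [(3.0, 1), (14.5, 1), (8.0, 0), (12.0, 1), (20.0, 0), (6.5, 1)]
censored = [censor_at(t, e, horizon=12.0) for t, e in follow_up]
curve = kaplan_meier(censored)

for t, s in curve:
    print(f"t={t:4.1f} years  S(t)={s:.3f}")
```

In this sketch a death recorded exactly at the horizon is kept as an event, whereas a death after the horizon is treated as censored at 12 years; the same function would apply with a horizon of 6 years for lung cancer incidence.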
Unless explicitly stated otherwise, the following apply: box plots
show the median (centre line), interquartile range (25th–75th percen-
tiles; box), and whiskers extending to the minimum and maximum
values within 1.5× the interquartile range. Statistical comparisons
between groups were performed using two-sided Wilcoxon rank sum
tests with no adjustment for multiple comparisons. P values less than
0.05 were considered statistically significant. Cox
proportional hazards regressions were used to estimate HRs, and the
corresponding 95% CI are provided for all statistical values. In the for-
est plots, the centre of each box represents the estimated HR, and the
whiskers denote the corresponding 95% CI; arrowheads indicate that
the 95% CI extends beyond the visualized limits; shaded box size is for
visualization only and does not encode statistical weight. The overall
contribution of thymic health to uni- or multivariable models was evalu-
ated using likelihood ratio tests (χ² tests) comparing full models with
nested models excluding thymic health (type III test, two-sided). Statis-
tical significance of individual covariate coefficients was assessed using
two-sided Wald z-tests with no adjustments for multiple comparisons.
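The quantities reported from the Cox models can be related to one another with a short sketch. It assumes a fitted coefficient, its standard error and the log partial likelihoods of the full and nested models are already available; all numbers below are hypothetical, and the likelihood ratio helper covers only the one-degree-of-freedom case (a single thymic health term).

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def hazard_ratio_summary(beta, se, z_crit=1.959964):
    """HR, 95% CI, Wald z statistic and two-sided P value for one coefficient."""
    hr = math.exp(beta)
    lower = math.exp(beta - z_crit * se)
    upper = math.exp(beta + z_crit * se)
    z = beta / se
    p = 2.0 * (1.0 - normal_cdf(abs(z)))
    return hr, lower, upper, z, p

def likelihood_ratio_test_1df(loglik_full, loglik_nested):
    """Chi-squared likelihood ratio test for one dropped parameter (1 d.f.)."""
    stat = max(2.0 * (loglik_full - loglik_nested), 0.0)
    # For 1 d.f., P(chi2 > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p

# Hypothetical coefficient and standard error for thymic health.
hr, lower, upper, z, p_wald = hazard_ratio_summary(beta=-0.2, se=0.05)
# Hypothetical log partial likelihoods with and without thymic health.
lr_stat, p_lr = likelihood_ratio_test_1df(-1000.0, -1001.92)

print(f"HR={hr:.3f} (95% CI {lower:.3f}-{upper:.3f}), z={z:.2f}, P={p_wald:.2e}")
print(f"LR chi2={lr_stat:.2f}, P={p_lr:.4f}")
```

The Wald test assesses a single coefficient, whereas the likelihood ratio test compares the fit of the full model with that of the nested model excluding thymic health.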
There were no missing values in any time-to-event analysis unless
otherwise described. In the case of missing values in the association
analyses, values were assumed to be missing completely at random,
and the analyses handled missing data by row-wise exclusion. No
statistical methods were used to predetermine sample size. Statistical
analyses were done in R v.4.2.2 (R Project for Statistical Computing).
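Row-wise exclusion of missing values (complete-case analysis) can be sketched as follows. The column names and records are hypothetical and serve only to show which rows would enter an association analysis.

```python
def complete_cases(rows, columns):
    """Keep only rows with an observed (non-None) value in every listed column."""
    return [row for row in rows if all(row.get(col) is not None for col in columns)]

# Hypothetical records; None marks a missing measurement.
participants = [
    {"id": 1, "thymic_health": 62.0, "bmi": 27.1},
    {"id": 2, "thymic_health": 18.5, "bmi": None},
    {"id": 3, "thymic_health": None, "bmi": 31.4},
    {"id": 4, "thymic_health": 80.2, "bmi": 24.0},
]
analysed = complete_cases(participants, ["thymic_health", "bmi"])
print([row["id"] for row in analysed])
```

Only the columns used in a given analysis are checked, so a participant excluded from one association analysis can still contribute to another.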
Inclusion and ethics statement
This retrospective secondary analysis of NLST data was reviewed by the
corresponding review boards and approved under NLST-367 and
NLST-374. The FHS analysis was reviewed and approved by the
Boston University Medical Center Institutional Review Board, protocol
number H-32132.
Reporting summary
Further information on research design is available in the Nature
Portfolio Reporting Summary linked to this article.
Data availability
NLST data may be requested from the National Cancer Institute
(https://biometry.nci.nih.gov/cdas/nlst/), imaging data are available from the
Imaging Data Commons portal (https://portal.imaging.datacommons.cancer.gov/)
and the Thymic Health scores for NLST are available at Zenodo
(https://doi.org/10.5281/zenodo.18306999; ref. 40). Data from the FHS are available
from the online repositories BioLINCC and dbGaP, or by submission of a
research proposal via FHS ResApp
(https://www.framinghamheartstudy.org/fhs-for-researchers/research-application/). An
overview of public imaging data collections used for the development
of the deep learning model, which were downloaded from the Imaging
Data Commons (https://portal.imaging.datacommons.cancer.gov/),
can be found in Supplementary Table 9.
Code availability
The software used in this publication is available on GitHub for
academic, non-commercial use
(https://github.com/AIM-Harvard/thymus_health_deeplearning_system.git). Additional
technical details about both the development and evaluation of our
deep learning framework can also be found in the Supplementary
Information. The models’ weights are subject to intellectual property
obligations and cannot be shared publicly, but may be made available
through academic collaboration. For more details, please contact the
corresponding author.
39. Vasan, R. S., Enserro, D. M., Xanthakis, V., Beiser, A. S. & Seshadri, S. Temporal trends in
the remaining lifetime risk of cardiovascular disease among middle-aged adults across
6 decades: the Framingham Study. Circulation 145, 1324–1338 (2022).
40. Prudente, V. C. G. et al. Resources “Thymic health consequences in adults”. Zenodo
https://doi.org/10.5281/zenodo.18306999 (2026).
Acknowledgements S.B., V.P. and S.P. contributed equally as leading authors to this work.
K.L.L., N.J.B. and H.J.W.L.A. acted jointly as a supervisory team. We thank the National Cancer
Institute for collecting and making the data from the NLST accessible, and The Cancer Imaging
Archive and the Imaging Data Commons for making this and other imaging collections used
for developing our deep learning model available on their platforms. H.J.W.L.A. acknowledges
financial support from NIH (HA: NIH-USA U24CA194354, NIH-USA U01CA190234, NIH-USA
U01CA209414 and NIH-USA R35CA22052; BHK: NIH-USA K08DE030216-01) and the European
Union–European Research Council (HA: 866504). K.L.L., J.M.M., Y.C. and J.C. received financial
support from NIH-USA R01AG067457. S.B. acknowledges funding from the Deutsche
Forschungsgemeinschaft (DFG, German Research Foundation; 502050303). K.B.
acknowledges funding from Bayern Innovativ, German Federal Ministry of Research,
Technology and Space, Max Kade Foundation and Wilhelm-Sander Foundation. N.J.B.
acknowledges funding from the Lundbeck Foundation (R272-2017-4040), the Novo Nordisk
Foundation (NNF21OC0071483 and NNF23OC0085954) and Savvaerksejer Jeppe Juhl og
Hustru Ovita Juhl Research Stipend. M.J.-H. has received funding from CRUK, NIH National
Cancer Institute, IASLC International Lung Cancer Foundation, Lung Cancer Research
Foundation, Rosetrees Trust, UKI NETs and NIHR. The FHS is funded by a contract from the
National Heart, Lung, and Blood Institute (contract number
75N92019D0031).
Author contributions Conceptualization: S.B., V.P., S.P., H.J.W.L.A. and N.J.B. Methodology:
S.B., V.P., S.P., A.K.A., Y.C., J.C., L.N., M.J.-H., C.A., C.S., K.L.L., J.M.M., H.J.W.L.A. and N.J.B.
Software: V.P. and S.P. Validation: S.B., V.P., S.P., Y.C., J.C., K.B., K.L.L., A.L., H.J.W.L.A. and N.J.B.
Formal analysis: S.B., V.P., S.P., Y.C., J.C., K.L.L. and A.L. Resources: H.J.W.L.A., K.L.L. and J.M.M.
Data curation: S.B., V.P., S.P., Y.C., J.C., K.L.L., J.M.M., B.F., M.T.L. and A.L. Writing—original draft:
S.B., V.P., S.P., H.J.W.L.A. and N.J.B. Writing—review and editing: all authors. Visualization: S.B.,
V.P., S.P., A.K.A., L.N., H.J.W.L.A. and N.J.B. Supervision: H.J.W.L.A., N.J.B., K.L.L. and J.M.M. Project
administration: H.J.W.L.A. and N.J.B. Funding acquisition: S.B., H.J.W.L.A., K.L.L. and J.M.M.
Competing interests M.J.-H. has consulted for Astex Pharmaceuticals and Achilles
Therapeutics, is a member of the Achilles Therapeutics Scientific Advisory Board and Steering
Committee, and has received speaker honoraria from Pfizer, Astex Pharmaceuticals, Oslo
Cancer Cluster, Bristol Myers Squibb and Genentech. M.J.-H. is listed as a co-inventor on a