Cross-Study Heterogeneity in T2D Blood Transcriptomics
Cross-study heterogeneity in type 2 diabetes (T2D) blood transcriptomics refers to the instability of differential-expression signals across independent blood RNA-seq cohorts.
Key Ideas
- Tkachenko et al. 2025 found very low concordance in T2D-associated effect sizes across eight blood RNA-seq datasets.
- Four of the eight analyzed datasets yielded no significant differentially expressed genes (DEGs), while three yielded substantial DEG counts.
- Only five genes, FBLN2, TPCN1, PC, SHANK1, and PLD4, were differentially expressed in the same direction across all three datasets that yielded substantial DEG counts.
- Principal component analysis showed strong dataset-level separation, indicating pronounced batch or cohort effects.
- Batch correction reduced dataset separation but did not create clear case-control separation, suggesting high inter-individual variability.
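The two concordance checks described above (same-direction DEGs shared across datasets, and agreement of effect sizes between dataset pairs) can be sketched as follows. This is a minimal illustration, not the procedure used by Tkachenko et al.: the input layout (`{dataset: {gene: (log2fc, padj)}}`), the function names, the 0.05 threshold, and the toy genes and values are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def same_direction_degs(results, alpha=0.05):
    """Return genes significant (padj < alpha) in every dataset with the same sign.

    results: {dataset_name: {gene: (log2_fold_change, adjusted_p_value)}}
    (hypothetical layout chosen for this sketch)
    """
    shared = None
    for table in results.values():
        # Map each significant gene to True (up) or False (down); skip zero effects.
        sig = {g: lfc > 0 for g, (lfc, padj) in table.items()
               if padj < alpha and lfc != 0}
        if shared is None:
            shared = sig
        else:
            # Keep only genes that are significant here with the same direction.
            shared = {g: up for g, up in shared.items() if sig.get(g) == up}
    return sorted(shared or {})


def pearson(x, y):
    """Plain Pearson correlation; returns None if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def pairwise_effect_correlation(results):
    """Correlate log2 fold changes over the genes shared by each dataset pair."""
    out = {}
    for a, b in combinations(sorted(results), 2):
        genes = sorted(set(results[a]) & set(results[b]))
        if len(genes) < 2:
            continue
        x = [results[a][g][0] for g in genes]
        y = [results[b][g][0] for g in genes]
        out[(a, b)] = pearson(x, y)
    return out


# Toy input with placeholder gene names and made-up values (not real results).
toy = {
    "ds1": {"GENE_A": (1.2, 0.01), "GENE_B": (-0.8, 0.03), "GENE_C": (0.5, 0.20)},
    "ds2": {"GENE_A": (0.9, 0.02), "GENE_B": (0.7, 0.04), "GENE_C": (0.4, 0.01)},
    "ds3": {"GENE_A": (1.5, 0.001), "GENE_B": (-1.1, 0.01), "GENE_C": (-0.2, 0.60)},
}
concordant = same_direction_degs(toy)
correlations = pairwise_effect_correlation(toy)
```

In the toy input, `GENE_B` is significant in all three datasets but flips direction in `ds2`, so it is excluded from the concordant set. Direction flips of this kind are what the strict same-direction criterion is meant to catch.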
Sources of Heterogeneity To Track
- Blood cell-type proportions differ across individuals and datasets.
- Whole blood and peripheral blood mononuclear cell (PBMC) samples differ in cellular composition: whole blood contains granulocytes, which PBMC isolation largely removes.
- Globin transcript abundance can add noise to whole-blood RNA-seq when globin depletion is not used.
- Library preparation, sequencing protocol, infection status, tuberculosis status, site, timepoint, sex, BMI, and other covariates can affect observed expression.
- Population structure and ancestry may contribute to expression differences, but Tkachenko et al. do not directly test ancestry-stratified effects.
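One simple way to track which of these sources dominates a given axis of variation is to compute, for a per-sample score such as a principal-component coordinate, the fraction of variance explained by each categorical covariate (the between-group share of the total sum of squares). The sketch below assumes the scores have already been computed elsewhere; the function name, covariate labels, and numbers are hypothetical and serve only to illustrate the calculation.

```python
from collections import defaultdict


def variance_explained(scores, labels):
    """Fraction of the variance in `scores` explained by grouping on `labels`.

    This is the between-group sum of squares divided by the total sum of
    squares (eta squared from a one-way layout). Returns 0.0 if the scores
    have no variance.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("scores must not be empty")

    grand_mean = sum(scores) / len(scores)
    total_ss = sum((s - grand_mean) ** 2 for s in scores)
    if total_ss == 0:
        return 0.0

    groups = defaultdict(list)
    for score, label in zip(scores, labels):
        groups[label].append(score)

    between_ss = sum(
        len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
        for vals in groups.values()
    )
    return between_ss / total_ss


# Made-up PC1 coordinates for six samples (illustrative values only).
pc1 = [-2.1, -1.9, -2.0, 2.0, 1.9, 2.1]
covariates = {
    "dataset": ["A", "A", "A", "B", "B", "B"],
    "status": ["T2D", "control", "T2D", "control", "T2D", "control"],
}
explained = {name: variance_explained(pc1, labels)
             for name, labels in covariates.items()}
```

In this toy case `dataset` accounts for nearly all of the PC1 variance while `status` accounts for little, which mirrors the pattern of dataset-level separation without case-control separation. The measure is univariate and does not disentangle correlated covariates, so it is a screening aid, not a substitute for a joint model.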
Paper-Relevant Use
- This page supports a conservative interpretation of any single-cohort T2D PBMC signature.
- It should be linked when manuscript prose discusses replication, cohort confounding, sample-type differences, or ancestry-aware interpretation.
- It pairs with the evidence map, serving as the limitations and claim-discipline page.
Open Questions
- Which heterogeneity sources can this project directly model versus only acknowledge?
- Are ancestry-associated immune differences robust after accounting for sample type, site, and batch?
- Do this project’s strongest signals align with the five cross-study concordant genes or the meta-analysis-only pathways?