Cell Differentiation Trajectory Analysis

This page documents the trajectory analysis pipeline. See Methods for the manuscript prose.

Cell Types Analyzed

Each of the following cell types was analyzed independently: CD4⁺ T cells, CD8⁺ T cells, B cells, NK cells, and monocytes.

Pre-processing

Excluded populations (per cell type):

  • CD4⁺ T: Tregs, “Mixed” ethnicity cells
  • CD8⁺ T: MAIT cells, “Mixed” ethnicity cells
  • B cells: atypical/plasma/transitional ZEB2ʰⁱ, “Mixed” ethnicity cells
  • NK cells: “Mixed” ethnicity cells
  • Monocytes: CD14ʰⁱ activated/ISGʰⁱ, “Mixed” ethnicity cells
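The exclusions above amount to a per-cell filter, which can be sketched as follows. This is a minimal illustration only: the label strings, the `keep_cell` helper, and the toy `cells` records are hypothetical, and the split of the B cell and monocyte entries into separate populations is one possible reading of the list, not the pipeline's confirmed annotation scheme.

```python
# Hypothetical subtype labels; the real annotation strings may differ.
EXCLUDED_SUBTYPES = {
    "CD4 T": {"Treg"},
    "CD8 T": {"MAIT"},
    "B": {"atypical", "plasma", "transitional ZEB2hi"},
    "NK": set(),
    "Monocyte": {"CD14hi activated", "ISGhi"},
}


def keep_cell(cell_type, subtype, ethnicity):
    """Return True if a cell survives the pre-processing exclusions."""
    # "Mixed" ethnicity cells are dropped for every cell type.
    if ethnicity == "Mixed":
        return False
    # Otherwise drop only the subtypes listed for this cell type.
    return subtype not in EXCLUDED_SUBTYPES.get(cell_type, set())


# Toy records (hypothetical) to show the filter in use.
cells = [
    {"cell_type": "CD4 T", "subtype": "Treg", "ethnicity": "A"},
    {"cell_type": "CD4 T", "subtype": "naive core", "ethnicity": "A"},
    {"cell_type": "NK", "subtype": "CD56hi", "ethnicity": "Mixed"},
]
kept = [c for c in cells if keep_cell(c["cell_type"], c["subtype"], c["ethnicity"])]
```

Keeping the exclusions in one lookup table makes it easy to check them against the list above when annotations change.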

HVG selection: top 4,000 highly variable genes (HVGs) per cell type, ranked by normalized dispersion and restricted to genes expressed in ≥ 10 healthy and ≥ 10 T2D cells (Scanpy normalize_total, log1p, highly_variable_genes).
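The expression requirement can be sketched as a simple per-cohort count. This is a dependency-free illustration rather than the Scanpy code actually used; it assumes that "expressed" means a raw count above zero and that the threshold of 10 cells applies to each cohort separately. The function name, the cohort labels, and the toy data are hypothetical.

```python
def genes_passing_filter(counts, cohorts, min_cells=10):
    """Return genes expressed in at least `min_cells` cells of each cohort.

    counts:  dict mapping gene name -> list of per-cell counts
    cohorts: list of per-cell cohort labels ("healthy" or "T2D")
    """
    passing = []
    for gene, values in counts.items():
        n_expressing = {"healthy": 0, "T2D": 0}
        for value, cohort in zip(values, cohorts):
            # Assumption: a cell expresses the gene if its count is > 0.
            if value > 0 and cohort in n_expressing:
                n_expressing[cohort] += 1
        if all(n >= min_cells for n in n_expressing.values()):
            passing.append(gene)
    return passing


# Toy data: 12 healthy cells followed by 12 T2D cells.
cohorts = ["healthy"] * 12 + ["T2D"] * 12
counts = {
    "GENE_A": [1] * 24,                        # expressed everywhere
    "GENE_B": [1] * 12 + [0] * 12,             # absent from T2D cells
    "GENE_C": [1] * 9 + [0] * 3 + [2] * 12,    # only 9 healthy cells
}
passing = genes_passing_filter(counts, cohorts)
```

In this toy example only GENE_A meets the threshold in both cohorts, so it alone would be eligible for HVG ranking.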

Trajectory Inference (Monocle3)

  • learn_graph on the first 2 UMAP components, using Level 4 cell annotations
  • Root nodes manually selected:
    • CD4⁺ T: CD4⁺ naive core
    • CD8⁺ T: CD8⁺ naive
    • B cells: naive resting B
    • Monocytes: CD14ʰⁱ homeostasis
    • NK cells: CD56ʰⁱ

Differential Expression along Trajectory (tradeSeq)

  • fitGAM with 5 knots (nknots = 5); ethnicity as the conditions factor; all cell weights set to 1; covariates: Z-scaled age, sex, virtual batch
  • Fitted separately for healthy and T2D cohorts
  • conditionTest (pairwise = TRUE, l2fc = 1) with FDR correction
  • Significance: FDR < 0.05
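The significance step can be illustrated with a small FDR adjustment. The page does not say which FDR procedure was applied; the sketch below assumes the Benjamini-Hochberg method, and the function name, gene names, and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the rank decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy conditionTest-style p-values (hypothetical).
genes = ["g1", "g2", "g3", "g4"]
pvalues = [0.001, 0.02, 0.03, 0.8]
fdr = benjamini_hochberg(pvalues)

# Apply the significance threshold stated above.
significant = [g for g, q in zip(genes, fdr) if q < 0.05]
```

Genes in `significant` are the ones that would be carried forward to expression profile clustering.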

Expression Profile Clustering

  • predictSmooth (nPoints = 100) for significant genes
  • Gene-by-pseudotime matrix, row-wise Z-score normalized
  • Dissimilarity: 1 − Spearman’s ρ
  • Clustering: Ward’s D2 hierarchical
  • Visualization: ComplexHeatmap
  • Functional annotation: clusterProfiler GO enrichment per cluster
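The normalization and dissimilarity steps above can be sketched without external dependencies. This is an illustration only: the helper names and toy profiles are hypothetical, the toy rows use 5 points instead of 100, and the sample standard deviation (n − 1) is assumed for the Z-score. Rows are assumed to be non-constant, since a flat profile has zero variance. The Ward's D2 clustering itself is left to a library and is not reproduced here.

```python
from statistics import mean, stdev


def zscore_row(row):
    """Z-score one gene's smoothed profile (sample SD, n - 1)."""
    mu, sd = mean(row), stdev(row)
    return [(x - mu) / sd for x in row]


def average_ranks(values):
    """1-based ranks, with ties given the average rank of their group."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_dissimilarity(matrix):
    """Pairwise 1 - Spearman's rho between the rows of `matrix`."""
    ranked = [average_ranks(row) for row in matrix]
    n = len(ranked)
    return [[1.0 - pearson(ranked[i], ranked[j]) for j in range(n)]
            for i in range(n)]


# Toy gene-by-pseudotime matrix (hypothetical profiles).
smoothed = [
    [1, 2, 3, 4, 5],      # increasing
    [2, 4, 8, 16, 32],    # increasing, different shape
    [5, 4, 3, 2, 1],      # decreasing
]
z = [zscore_row(row) for row in smoothed]
dissimilarity = spearman_dissimilarity(z)
```

Because Spearman's ρ depends only on ranks, the two increasing profiles have dissimilarity 0 despite their different shapes, while the increasing and decreasing profiles reach the maximum of 2.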

Software

  • Monocle3, tradeSeq, Scanpy, ComplexHeatmap, clusterProfiler