Ryan O Emerson, William S DeWitt, Marissa Vignali, Jenna Gravley, Joyce K Hu, Edward J Osborne, Cindy Desmarais, Mark Klinger, Christopher S Carlson, John A Hansen, Mark Rieder & Harlan S Robins
An individual’s T cell repertoire dynamically encodes their pathogen exposure history. To determine whether pathogen exposure signatures can be identified by documenting public T cell receptors (TCRs), we profiled the T cell repertoire of 666 subjects with known cytomegalovirus (CMV) serostatus by immunosequencing. We developed a statistical classification framework that could diagnose CMV status from the resulting catalog of TCRβ sequences with high specificity and sensitivity in both the original cohort and a validation cohort of 120 different subjects. We also confirmed that three of the identified CMV-associated TCRβ molecules bind CMV in vitro, and, moreover, we used this approach to accurately predict the HLA-A and HLA-B alleles of most subjects in the first cohort. As all memory T cell responses are encoded in the common format of somatic TCR recombination, our approach could potentially be generalized to a wide variety of disease states, as well as other immunological phenotypes, as a highly parallelizable diagnostic strategy.
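As a rough illustration of how public TCRβ sequences might be tied to a serostatus and then used for classification, the sketch below tests each sequence for enrichment among seropositive subjects and counts how many enriched sequences a new repertoire carries. The abstract does not specify the statistical test or the classifier, so the one-sided Fisher's exact test, the threshold `alpha`, the function names, and the toy sequence labels are all assumptions made for illustration, not the authors' framework.

```python
from math import comb


def fisher_one_sided(pos_with, pos_total, neg_with, neg_total):
    """One-sided Fisher's exact test p-value (hypergeometric upper tail).

    Probability of seeing at least `pos_with` carriers among the positive
    subjects if the sequence were unrelated to status.
    """
    with_total = pos_with + neg_with
    n = pos_total + neg_total
    upper = min(pos_total, with_total)
    tail = sum(
        comb(pos_total, k) * comb(neg_total, with_total - k)
        for k in range(pos_with, upper + 1)
    )
    return tail / comb(n, with_total)


def associated_tcrs(repertoires, status, alpha):
    """Return sequences enriched in positive subjects at p <= alpha.

    repertoires: dict subject -> set of TCR sequence labels (hypothetical)
    status:      dict subject -> True (seropositive) / False (seronegative)
    """
    pos = [s for s in repertoires if status[s]]
    neg = [s for s in repertoires if not status[s]]
    all_tcrs = set().union(*repertoires.values())
    hits = set()
    for tcr in all_tcrs:
        pos_with = sum(tcr in repertoires[s] for s in pos)
        neg_with = sum(tcr in repertoires[s] for s in neg)
        if fisher_one_sided(pos_with, len(pos), neg_with, len(neg)) <= alpha:
            hits.add(tcr)
    return hits


def burden(repertoire, associated):
    """Number of status-associated sequences present in one repertoire."""
    return len(repertoire & associated)


# Toy data with made-up labels; real input would be TCRβ sequences.
toy_repertoires = {
    "p1": {"tcr_A", "tcr_B"}, "p2": {"tcr_A", "tcr_B"}, "p3": {"tcr_A", "tcr_B"},
    "n1": {"tcr_B"}, "n2": {"tcr_B"}, "n3": {"tcr_B"},
}
toy_status = {"p1": True, "p2": True, "p3": True,
              "n1": False, "n2": False, "n3": False}

# alpha is deliberately loose because the toy cohort has only six subjects.
toy_associated = associated_tcrs(toy_repertoires, toy_status, alpha=0.1)
toy_burden = burden({"tcr_A", "tcr_C"}, toy_associated)
```

In this sketch the per-subject burden would still need to be turned into a call, for example by comparing it with a threshold or fitting a probabilistic model that accounts for repertoire size. With hundreds of subjects and many candidate sequences, the significance threshold would also need to account for multiple testing.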