Publications

Select recent publications

For the comprehensive list of publications, patents and scholarly works, visit PubMed, BioRxiv and Google Scholar.

  • 2012


  • 2013


  • 2014


  • 2015


  • 2016


  • 2017


  • 2018


  • 2019


  • We determined a significant fraction of the genome sequence of a representative of Thiovulum, the uncultivated genus of colorless sulfur Epsilonproteobacteria, by analyzing the genome sequences of four individual cells collected from phototrophic mats from Elkhorn Slough, California. These cells were isolated utilizing a microfluidic laser-tweezing system, and their genomes were amplified by multiple-displacement amplification prior to sequencing. Thiovulum is a gradient bacterium found at oxic-anoxic marine interfaces and noted for its distinctive morphology and rapid swimming motility. The genomic sequences of the four individual cells were assembled into a composite genome consisting of 221 contigs covering 2.083 Mb including 2,162 genes. This single-cell genome represents a genomic view of the physiological capabilities of isolated Thiovulum cells. Thiovulum is the second-fastest bacterium ever observed, swimming at 615 μm/s, and this genome shows that this rapid swimming motility is a result of a standard flagellar machinery that has been extensively characterized in other bacteria. This suggests that standard flagella are capable of propelling bacterial cells at speeds much faster than typically thought. Analysis of the genome suggests that naturally occurring Thiovulum populations are more diverse than previously recognized and that studies performed in the past probably address a wide range of unrecognized genotypic and phenotypic diversities of Thiovulum. The genome presented in this article provides a basis for future isolation-independent studies of Thiovulum, where single-cell and metagenomic tools can be used to differentiate between different Thiovulum genotypes.

    Journal Manuscript


  • Segmented filamentous bacteria (SFB) are host-specific intestinal symbionts that comprise a distinct clade within the Clostridiaceae, designated Candidatus Arthromitus. SFB display a unique life cycle within the host, involving differentiation into multiple cell types. The latter include filaments that attach intimately to intestinal epithelial cells, and from which "holdfasts" and spores develop. SFB induce a multifaceted immune response, leading to host protection from intestinal pathogens. Cultivation resistance has hindered characterization of these enigmatic bacteria. In the present study, we isolated five SFB filaments from a mouse using a microfluidic device equipped with laser tweezers, generated genome sequences from each, and compared these sequences with each other, as well as to recently published SFB genome sequences. Based on the resulting analyses, SFB appear to be dependent on the host for a variety of essential nutrients. SFB have a relatively high abundance of predicted proteins devoted to cell cycle control and to envelope biogenesis, and have a group of SFB-specific autolysins and a dynamin-like protein. Among the five filament genomes, an average of 8.6% of predicted proteins were novel, including a family of secreted SFB-specific proteins. Four ADP-ribosyltransferase (ADPRT) sequence types, and a myosin-cross-reactive antigen (MCRA) protein were discovered; we hypothesize that they are involved in modulation of host responses. The presence of polymorphisms among mouse SFB genomes suggests the evolution of distinct SFB lineages. Overall, our results reveal several aspects of SFB adaptation to the mammalian intestinal tract.

    Genome Research Manuscript


  • Interest in the expanding catalog of uncultivated microorganisms, increasing recognition of heterogeneity among seemingly similar cells, and technological advances in whole-genome amplification and single-cell manipulation are driving considerable progress in single-cell genomics. Here, the spectrum of applications for single-cell genomics, key advances in the development of the field, and emerging methodology for single-cell genome sequencing are reviewed by example with attention to the diversity of approaches and their unique characteristics. Experimental strategies transcending specific methodologies are identified and organized as a road map for future studies in single-cell genomics of environmental microorganisms. Over the next decade, increasingly powerful tools for single-cell genome sequencing and analysis will play key roles in accessing the genomes of uncultivated organisms, determining the basis of microbial community functions, and fundamental aspects of microbial population biology.

    Journal Manuscript


  • Genetic analysis of single cells is emerging as a powerful approach for studies of heterogeneous cell populations. Indeed, the notion of homogeneous cell populations is receding as approaches to resolve genetic and phenotypic variation between single cells are applied throughout the life sciences. A key step in single-cell genomic analysis today is the physical isolation of individual cells from heterogeneous populations, particularly microbial populations, which often exhibit high diversity. Here, we detail the construction and use of instrumentation for optical trapping inside microfluidic devices to select individual cells for analysis by methods including nucleic acid sequencing. This approach has unique advantages for analyses of rare community members, cells with irregular morphologies, small quantity samples, and studies that employ advanced optical microscopy.

    Methods in Enzymology Manuscript


  • How does one optimally determine the diffusion coefficient of a diffusing particle from a single time-lapse recorded trajectory of the particle? We answer this question with an explicit, unbiased, and practically optimal covariance-based estimator (CVE). This estimator is regression-free and is far superior to commonly used methods based on measured mean squared displacements. In experimentally relevant parameter ranges, it also outperforms the analytically intractable and computationally more demanding maximum likelihood estimator (MLE). For the case of diffusion on a flexible and fluctuating substrate, the CVE is biased by substrate motion. However, given some long time series and a substrate under some tension, an extended MLE can separate particle diffusion on the substrate from substrate motion in the laboratory frame. This provides benchmarks that allow removal of bias caused by substrate fluctuations in CVE. The resulting unbiased CVE is optimal also for short time series on a fluctuating substrate. We have applied our estimators to human 8-oxoguanine DNA glycosylase proteins diffusing on flow-stretched DNA, a fluctuating substrate, and found that diffusion coefficients are severely overestimated if substrate fluctuations are not accounted for.

    Physical Review E. Manuscript
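
As a rough illustration of what a covariance-based estimator looks like, the sketch below estimates a diffusion coefficient from the mean squared single-step displacement and the covariance of consecutive displacements. It is a minimal sketch, not the authors' code: the function name `cve_diffusion`, the `blur` parameter (a motion-blur coefficient, assumed 0 for instantaneous exposure), and the simulated track in the usage note are assumptions made for this example, and the substrate-fluctuation correction described in the abstract is not included.

```python
import numpy as np


def cve_diffusion(x, dt, blur=0.0):
    """Covariance-based estimate of D and localization variance for a 1D track.

    x    : positions sampled at a fixed interval dt
    blur : motion-blur coefficient (assumed; 0 means instantaneous exposure)
    """
    dx = np.diff(np.asarray(x, dtype=float))   # single-step displacements
    mean_sq = np.mean(dx ** 2)                 # <dx_n^2>
    cov = np.mean(dx[:-1] * dx[1:])            # <dx_n * dx_{n+1}>
    # Localization noise makes consecutive displacements anti-correlated;
    # adding the covariance term cancels its contribution to <dx^2>.
    d_hat = mean_sq / (2.0 * dt) + cov / dt
    sigma2_hat = blur * mean_sq + (2.0 * blur - 1.0) * cov
    return d_hat, sigma2_hat
```

A quick check on synthetic data: simulate free diffusion with Gaussian steps of variance `2 * D * dt`, add independent Gaussian localization noise to each position, and confirm that `cve_diffusion` recovers values close to the simulated `D` and noise variance. No regression or fitting step is involved, which is the sense in which the estimator is regression-free.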


  • Emerging technologies are bringing single-cell genome sequencing into the mainstream; this field has already yielded insights into the genetic architecture and variability between cells that highlight the dynamic nature of the genome.

    Nature Methods Manuscript


  • Complex tissues such as the lung are composed of structural hierarchies such as alveoli, alveolar ducts, and lobules. Some structural units, such as the alveolar duct, appear to participate in tissue repair as well as the development of bronchioalveolar carcinoma. Here, we demonstrate an approach to conduct laser microdissection of the lung alveolar duct for single-cell PCR analysis. Our approach involved three steps. (1) The initial preparation used mechanical sectioning of the lung tissue with sufficient thickness to encompass the structure of interest. In the case of the alveolar duct, the precision-cut lung slices were 200 μm thick; the slices were processed using near-physiologic conditions to preserve the state of viable cells. (2) The lung slices were examined by transmission light microscopy to target the alveolar duct. The air-filled lung was sufficiently accessible by light microscopy that counterstains or fluorescent labels were unnecessary to identify the alveolar duct. (3) The enzymatic and microfluidic isolation of single cells allowed for the harvest of as few as several thousand cells for PCR analysis. Microfluidics based arrays were used to measure the expression of selected marker genes in individual cells to characterize different cell populations. Preliminary work suggests the unique value of this approach to understand the intra- and intercellular interactions within the regenerating alveolar duct.

    Frontiers in Oncology Manuscript


  • The bootstrap can be used to assess uncertainty of sample estimates.

    Nature Methods Manuscript
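
The idea can be sketched in a few lines: resample the data with replacement many times, recompute the statistic on each resample, and use the spread of those values as an uncertainty estimate. This is a minimal nonparametric bootstrap written for illustration only; the function name `bootstrap_se`, its defaults, and the choice of a percentile interval are assumptions, not taken from the article.

```python
import numpy as np


def bootstrap_se(sample, statistic=np.median, n_boot=2000, seed=0):
    """Return the bootstrap standard error and a 95% percentile interval."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    n = sample.size
    # Each replicate is the statistic computed on n draws with replacement.
    reps = np.array([
        statistic(rng.choice(sample, size=n, replace=True))
        for _ in range(n_boot)
    ])
    interval = np.percentile(reps, [2.5, 97.5])
    return reps.std(ddof=1), interval
```

For the sample mean, the bootstrap standard error should land close to the familiar `s / sqrt(n)`, which makes a convenient sanity check before applying the same function to statistics that have no simple formula, such as the median.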


  • Digital assays are powerful methods that enable detection of rare cells and counting of individual nucleic acid molecules. However, digital assays are still not routinely applied, due to the cost and specific equipment associated with commercially available methods. Here we present a simplified method for readout of digital droplet assays using a conventional real-time PCR instrument to measure bulk fluorescence of droplet-based digital assays. We characterize the performance of the bulk readout assay using synthetic droplet mixtures and a droplet digital multiple displacement amplification (MDA) assay. Quantitative MDA particularly benefits from a digital reaction format, but our new method applies to any digital assay. For established digital assay protocols such as digital PCR, this method serves to speed up and simplify assay readout. Our bulk readout methodology brings the advantages of partitioned assays without the need for specialized readout instrumentation. The principal limitations of the bulk readout methodology are reduced dynamic range compared with droplet-counting platforms and the need for a standard sample, although the requirements for this standard are less demanding than for a conventional real-time experiment. Quantitative whole genome amplification (WGA) is used to test for contaminants in WGA reactions and is the most sensitive way to detect the presence of DNA fragments with unknown sequences, giving the method great promise in diverse application areas including pharmaceutical quality control and astrobiology.

    JoVE Link
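
For context, conventional partition-counting digital assays convert the fraction of positive partitions into a template concentration with a Poisson correction, since a positive partition may hold more than one molecule. The sketch below shows that standard calculation only; it is not the bulk-fluorescence readout described in the abstract, and the function names and the nanoliter volume unit are assumptions made for this example.

```python
import math


def copies_per_partition(n_positive, n_total):
    """Mean template copies per partition, from Poisson statistics."""
    if not 0 <= n_positive < n_total:
        # With every partition positive the estimate diverges.
        raise ValueError("need 0 <= n_positive < n_total")
    fraction_negative = 1.0 - n_positive / n_total
    # P(partition is empty) = exp(-lam), so lam = -ln(fraction negative).
    return -math.log(fraction_negative)


def copies_per_nanoliter(n_positive, n_total, partition_volume_nl):
    """Template concentration, given an assumed partition volume in nL."""
    return copies_per_partition(n_positive, n_total) / partition_volume_nl
```

When half of the partitions are positive, the estimate is ln 2, about 0.69 copies per partition, rather than the naive 0.5. The correction matters more as the positive fraction rises.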


  • Recent work has underscored the importance of the microbiome in human health, and has largely attributed differences in phenotype to differences in the species present among individuals. However, mobile genes can confer profoundly different phenotypes on different strains of the same species. Little is known about the function and distribution of mobile genes in the human microbiome, and in particular whether the gene pool is globally homogenous or constrained by human population structure. Here, we investigate this question by comparing the mobile genes found in the microbiomes of 81 metropolitan North Americans with those of 172 agrarian Fiji islanders using a combination of single-cell genomics and metagenomics. We find large differences in mobile gene content between the Fijian and North American microbiomes, with functional variation that mirrors known dietary differences such as the excess of plant-based starch degradation genes found in Fijian individuals. Notably, we also observed differences between the mobile gene pools of neighbouring Fijian villages, even though microbiome composition across villages is similar. Finally, we observe high rates of recombination leading to individual-specific mobile elements, suggesting that the abundance of some genes may reflect environmental selection rather than dispersal limitation. Together, these data support the hypothesis that human activities and behaviours provide selective pressures that shape mobile gene pools, and that acquisition of mobile genes is important for colonizing specific human populations.

    Nature Manuscript


  • Many DNA binding proteins utilize one‐dimensional (1D) diffusion along DNA to accelerate their DNA target recognition. Although 1D diffusion of proteins along DNA has been studied for decades, a quantitative understanding is only beginning to emerge and few chemical tools are available to apply 1D diffusion as a design principle. Recently, we discovered that peptides can bind and slide along DNA—even transporting cargo along DNA. Such molecules are known as molecular sleds. Here, to advance our understanding of structure–function relationships governing sequence nonspecific DNA interaction of natural molecular sleds and to explore the potential for controlling sliding activity, we test the DNA binding and sliding activities of chemically modified peptides and analogs, and show that synthetic small molecules can slide on DNA. We found new ways to control molecular sled activity, novel small‐molecule synthetic sleds, and molecular sled activity in N‐methylpyrrole/N‐methylimidazole polyamides that helps explain how these molecules locate rare target sites.

    Journal Manuscript


  • Recent work revealed a new class of molecular machines called molecular sleds, which are small basic molecules that bind and slide along DNA with the ability to carry cargo along DNA. Here, we performed biochemical and single-molecule flow stretching assays to investigate the basis of sliding activity in molecular sleds. In particular, we identified the functional core of pVIc, the first molecular sled characterized; peptide functional groups that control sliding activity; and propose a model for the sliding activity of molecular sleds. We also observed widespread DNA binding and sliding activity among basic polypeptide sequences that implicate mammalian nuclear localization sequences and many cell penetrating peptides as molecular sleds. These basic protein motifs exhibit weak but physiologically relevant sequence-nonspecific DNA affinity. Our findings indicate that many mammalian proteins contain molecular sled sequences and suggest the possibility that substantial undiscovered sliding activity exists among nuclear mammalian proteins.

    Nucleic Acids Research Manuscript


  • We have developed hydrogel-based virtual microfluidics as a simple and robust alternative to complex engineered microfluidic systems for the compartmentalization of nucleic acid amplification reactions. We applied in-gel digital multiple displacement amplification (dMDA) to purified DNA templates, cultured bacterial cells and human microbiome samples in the virtual microfluidics system, and demonstrated whole-genome sequencing of single-cell MDA products with excellent coverage uniformity and markedly reduced chimerism compared with products of liquid MDA reactions.

    Nature Methods Manuscript


  • Recently, we showed the adenovirus proteinase interacts productively with its protein substrates in vitro and in vivo in nascent virus particles via one-dimensional diffusion along the viral DNA. The mechanism by which this occurs has heretofore been unknown. We show sliding of these proteins along DNA occurs on a new vehicle in molecular biology, a ‘molecular sled’ named pVIc. This 11-amino acid viral peptide binds to DNA independent of sequence. pVIc slides on DNA, exhibiting the fastest one-dimensional diffusion constant, 26±1.8 × 10⁶ (bp)² s⁻¹. pVIc is a ‘molecular sled,’ because it can slide heterologous cargos along DNA, for example, a streptavidin tetramer. Similar peptides, for example, from the C terminus of β-actin or NLSIII of the p53 protein, slide along DNA. Characteristics of the ‘molecular sled’ in its milieu (virion, nucleus) have implications for how proteins in the nucleus of cells interact and imply a new form of biochemistry, one-dimensional biochemistry.

    Nature Communications Manuscript


  • Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell.

    eLife Manuscript


  • We describe a simple, robust and high throughput single molecule flow-stretching assay for studying 1D diffusion of molecules along DNA. In this assay, glass coverslips are functionalized in a one-step reaction with silane-PEG-biotin. Flow cells are constructed by sandwiching an adhesive tape with pre-cut channels between a functionalized coverslip and a PDMS slab containing inlet and outlet holes. Multiple channels are integrated into one flow cell and the flow of reagents into each channel can be fully automated, which significantly increases the assay throughput and reduces hands-on time per assay. Inside each channel, biotin-λ-DNAs are immobilized on the surface and a laminar flow is applied to flow-stretch the DNAs. The DNA molecules are stretched to >80% of their contour length and serve as spatially extended templates for studying the binding and transport activity of fluorescently labeled molecules. The trajectories of single molecules are tracked by time-lapse Total Internal Reflection Fluorescence (TIRF) imaging. Raw images are analyzed using streamlined custom single particle tracking software to automatically identify trajectories of single molecules diffusing along DNA and estimate their 1D diffusion constants.

    JoVE Link


  • In many mammals, including humans, removal of one lung (pneumonectomy) results in the compensatory growth of the remaining lung. Compensatory growth involves not only an increase in lung size, but also an increase in the number of alveoli in the peripheral lung; however, the process of compensatory neoalveolarization remains poorly understood. Here, we show that the expression of α-smooth muscle actin (SMA)—a cytoplasmic protein characteristic of myofibroblasts—is induced in the pleura following pneumonectomy. SMA induction appears to be dependent on pleural deformation (stretch) as induction is prevented by plombage or phrenic nerve transection (P < 0.001). Within 3 days of pneumonectomy, the frequency of SMA+ cells in subpleural alveolar ducts was significantly increased (P < 0.01). To determine the functional activity of these SMA+ cells, we isolated regenerating alveolar ducts by laser microdissection and analyzed individual cells using microfluidic single-cell quantitative PCR. Single cells expressing the SMA (Acta2) gene demonstrated significantly greater transcriptional activity than endothelial cells or other discrete cell populations in the alveolar duct (P < 0.05). The transcriptional activity of the Acta2+ cells, including expression of TGF signaling as well as repair-related genes, suggests that these myofibroblast-like cells contribute to compensatory lung growth.

    Journal Manuscript


  • Low-cost shotgun DNA sequencing is transforming the microbial sciences. Sequencing instruments are so effective that sample preparation is now the key limiting factor. Here, we introduce a microfluidic sample preparation platform that integrates the key steps in cells to sequence library sample preparation for up to 96 samples and reduces DNA input requirements 100-fold while maintaining or improving data quality. The general-purpose microarchitecture we demonstrate supports workflows with arbitrary numbers of reaction and clean-up or capture steps. By reducing the sample quantity requirements, we enabled low-input (∼10,000 cells) whole-genome shotgun (WGS) sequencing of Mycobacterium tuberculosis and soil micro-colonies with superior results. We also leveraged the enhanced throughput to sequence ∼400 clinical Pseudomonas aeruginosa libraries and demonstrate excellent single-nucleotide polymorphism detection performance that explained phenotypically observed antibiotic resistance. Fully-integrated lab-on-chip sample preparation overcomes technical barriers to enable broader deployment of genomics across many basic research and translational applications.

    Nature Communications Manuscript

    Nature Methods News Article


  • Neuronal synapses contain dozens of protein species whose expression levels and localizations are key determinants of synaptic transmission and plasticity. The spectral properties of fluorophores used in conventional microscopy limit the number of measured proteins to four species within a given sample. The ability to perform high-throughput confocal or super-resolution imaging of many proteins simultaneously without limitation in target number imposed by this spectral limit would enable large-scale characterization of synaptic protein networks in situ. Here, we introduce PRISM: Probe-based Imaging for Sequential Multiplexing, a method that sequentially utilizes either high affinity Locked Nucleic Acid (LNA) or low affinity DNA probes to enable diffraction-limited confocal and PAINT-based super-resolution imaging. High-affinity LNA probes offer high-throughput, confocal-based imaging compared with PAINT, which uses low affinity probes to realize localization-based super-resolution imaging. Simultaneous immunostaining of all targets is performed prior to imaging, followed by sequential LNA/DNA probe exchange that requires only minutes under mild wash conditions. We apply PRISM to quantify the co-expression levels and nanometer-scale organization of one dozen cytoskeletal and synaptic proteins within individual neuronal synapses. Our approach is scalable to dozens of target proteins and is compatible with high-content screening platforms commonly used to interrogate phenotypic changes associated with genetic and drug perturbations in a variety of cell types.

    BioRxiv Manuscript


  • Combinatorial drug treatment strategies perturb biological networks synergistically to achieve therapeutic effects and represent major opportunities to develop advanced treatments across a variety of human disease areas. However, the discovery of new combinatorial treatments is challenged by the sheer scale of combinatorial chemical space. Here, we report a high-throughput system for nanoliter-scale phenotypic screening that formulates a chemical library in nanoliter droplet emulsions and automates the construction of chemical combinations en masse using parallel droplet processing. We applied this system to predict synergy between more than 4,000 investigational and approved drugs and a panel of 10 antibiotics against Escherichia coli, a model gram-negative pathogen. We found a range of drugs not previously indicated for infectious disease that synergize with antibiotics. Our validated hits include drugs that synergize with the antibiotics vancomycin, erythromycin, and novobiocin, which are used against gram-positive bacteria but are not effective by themselves to resolve gram-negative infections.

    PNAS Publication

    Science Drug Discovery News Article

    Nature Reviews Drug Discovery News Article


  • We reanalyze trajectories of hOGG1 repair proteins diffusing on DNA. A previous analysis of these trajectories with the popular mean-squared-displacement approach revealed only simple diffusion. Here, a new optimal estimator of diffusion coefficients reveals two-state kinetics of the protein. A simple, solvable model, in which the protein randomly switches between a loosely bound, highly mobile state and a tightly bound, less mobile state is the simplest possible dynamic model consistent with the data. It yields accurate estimates of hOGG1’s (i) diffusivity in each state, uncorrupted by experimental errors arising from shot noise, motion blur and thermal fluctuations of the DNA; (ii) rates of switching between states and (iii) rate of detachment from the DNA. The protein spends roughly equal time in each state. It detaches only from the loosely bound state, with a rate that depends on pH and the salt concentration in solution, while its rates for switching between states are insensitive to both. The diffusivity in the loosely bound state depends primarily on pH and is three to ten times higher than in the tightly bound state. We propose and discuss some new experiments that take full advantage of the new tools of analysis presented here.

    Nucleic Acids Research Manuscript


  • Lentiviral vectors are widely used for functional genomic screens, enabling efficient and stable transduction of target cells with libraries of genetic elements. Unfortunately, designs that rely on integrating multiple variable sequences, such as combinatorial perturbations or perturbations linked to barcodes, may be compromised by unintended consequences of lentiviral packaging. Intermolecular recombination between library elements and integration of multiple perturbations (even at limiting virus dilution) can negatively impact the sensitivity of pooled screens. Here, we describe a simple approach to prevent recombination between lentiviral vectors containing multiple linked variable elements, such as the recently reported CRISP-seq, Perturb-seq, and Mosaic-seq designs. We show that modifying the packaging protocol to dilute the perturbation library with a carrier plasmid increases the fraction of correct, single integrations from <60% to >90%, at the cost of reducing titer by 100-fold.

    BioRxiv Manuscript


  • Large-scale genetic screens play a key role in the systematic discovery of genes underlying cellular phenotypes. Pooling of genetic perturbations greatly increases screening throughput, but has so far been limited to screens of enrichments defined by cell fitness and flow cytometry, or to comparatively low-throughput single cell gene expression profiles. Although microscopy is a rich source of spatial and temporal information about mammalian cells, high-content imaging screens have been restricted to much less efficient arrayed formats. Here, we introduce an optical method to link perturbations and their phenotypic outcomes at the single-cell level in a pooled setting. Barcoded perturbations are read out by targeted in situ sequencing following image-based phenotyping. We apply this technology to screen a focused set of 952 genes across >3 million cells for involvement in NF-κB activation by imaging the translocation of RelA (p65) to the nucleus, recovering 20 known pathway components and 3 novel candidate positive regulators of IL-1β and TNFα-stimulated immune responses.

    BioRxiv Manuscript


  • The human immune system consists of many specialized cell subsets that simultaneously carry out a diverse range of functions using overlapping pathways and signals. Subset-specific immune profiling can resolve immune activity in autoimmune disease, cancer immunity, and infectious disease that may not be discoverable or detectable in analyses of crude blood samples. The activity of specific subsets may help predict the course of disease and response to therapy in certain patient populations. Here, we present a low-input microfluidic system for sorting immune cells into subsets and profiling their cellular states by gene expression analysis using full-length RNA-seq. Our system is robust and has the potential to make multiplexed subset-specific analysis routine in many research laboratories and clinical settings. We validate the device's technical performance by benchmarking its subset enrichment and genomic profiling performance against standard protocols. We make the added value of subset-resolved profiling over crude samples clear through ex vivo experiments that show subset-specific stimulated responses. Finally, we demonstrate the scalability of our device by profiling four immune cell subsets in blood from systemic lupus erythematosus (SLE) patients and matched controls enrolled in a clinical study. The results from our initial cohort confirm the role of type I interferons in lupus pathogenesis and further show that the canonical interferon signature for SLE is prominent in B cells, demonstrating the ability of our integrated analytical platform to identify cell-specific disease signatures.

    BioRxiv Manuscript


  • Transcriptional profiling of thousands of single cells in parallel by RNA-seq is now routine. However, due to reliance on pooled library preparation, targeting analysis to particular cells of interest is difficult. Here, we present a multiplexed PCR method for targeted sequencing of select cells from pooled single-cell sequence libraries. We demonstrated this molecular enrichment method on multiple cell types within pooled single-cell RNA-seq libraries produced from primary human blood cells. We show how molecular enrichment can be combined with FACS to efficiently target ultra-rare cell types, such as the recently identified AXL+SIGLEC6+ dendritic cell (AS DC) subset, in order to reduce the required sequencing effort to profile single cells by 100-fold. Our results demonstrate that DNA barcodes identifying cells within pooled sequencing libraries can be used as targets to enrich for specific molecules of interest, for example reads from a set of target cells.

    Nucleic Acids Research Manuscript


  • The rate of infection by methicillin-resistant Staphylococcus aureus (MRSA) has declined over the past decade, but it is unclear whether this represents a decline in S. aureus infections overall. To evaluate the trends in the annual rates of infection by S. aureus subtypes and mean antibiotic resistance, we conducted a 15-year retrospective observational study at two tertiary care institutions in Boston, MA, of 31,753 adult inpatients with S. aureus isolated from clinical specimens. We inferred the gain and loss of methicillin resistance through genome sequencing of 180 isolates from 2016. The annual rates of infection by S. aureus declined from 2003 to 2014 by 4.2% (2.7% to 5.6%), attributable to an annual decline in MRSA of 10.9% (9.3% to 12.6%). Penicillin-susceptible S. aureus (PSSA) increased by 6.1% (4.2% to 8.1%) annually, and rates of methicillin-susceptible penicillin-resistant S. aureus (MSSA) did not change. Resistance in S. aureus decreased from 2000 to 2014 by 0.8 antibiotics (0.7 to 0.8). Within common MRSA clonal complexes, 3/14 MSSA and 2/21 PSSA isolates arose from the loss of resistance-conferring genes. Overall, in two tertiary care institutions in Boston, MA, a decline in S. aureus infections has been accompanied by a shift toward increased antibiotic susceptibility. The rise in PSSA makes penicillin an increasingly viable treatment option.

    Journal of Clinical Microbiology Manuscript


  • Mutation data reveal the dynamic equilibrium between DNA damage and repair processes in cells and are indispensable to the understanding of age-related diseases, tumor evolution, and the acquisition of drug resistance. However, available genome-wide methods have a limited ability to resolve rare somatic variants and the relationships between these variants. Here, we present lineage sequencing, a new genome sequencing approach that enables somatic event reconstruction by providing quality somatic mutation call sets with resolution as high as the single-cell level in subject lineages. Lineage sequencing entails sampling single cells from a population and sequencing subclonal sample sets derived from these cells such that knowledge of relationships among the cells can be used to jointly call variants across the sample set. This approach integrates data from multiple sequence libraries to support each variant and precisely assigns mutations to lineage segments. We applied lineage sequencing to a human colon cancer cell line with a DNA polymerase epsilon (POLE) proofreading deficiency (HT115) and a human retinal epithelial cell line immortalized by constitutive telomerase expression (RPE1). Cells were cultured under continuous observation to link observed single-cell phenotypes with single-cell mutation data. The high sensitivity, specificity, and resolution of the data provide a unique opportunity for quantitative analysis of variation in mutation rate, spectrum, and correlations among variants. Our data show that mutations arrive with nonuniform probability across sublineages and that DNA lesion dynamics may cause strong correlations between certain mutations.

    Genome Research Manuscript


  • Genetic screens are critical for the systematic identification of genes underlying cellular phenotypes. Pooling gene perturbations greatly improves scalability, but is not compatible with imaging of complex and dynamic cellular phenotypes. Here, we introduce a pooled approach for optical genetic screens in mammalian cells. We use targeted in situ sequencing to demultiplex a library of genetic perturbations following image-based phenotyping. We screened a set of 952 genes across millions of cells for involvement in NF-κB signaling by imaging the translocation of RelA (p65) to the nucleus. Screening at a single time point across 3 cell lines recovered 15 known pathway components, while repeating the screen with live-cell imaging revealed a new role for Mediator complex subunits in regulating the duration of p65 nuclear retention. These results establish a highly multiplexed approach to image-based screens of spatially and temporally defined phenotypes with pooled libraries.

  • Microbial communities have numerous potential applications in biotechnology, agriculture, and medicine. Nevertheless, the limited accuracy with which we can predict interspecies interactions and environmental dependencies hinders efforts to rationally engineer beneficial consortia. Empirical screening is a complementary approach wherein synthetic communities are combinatorially constructed and assayed in high throughput. However, assembling many combinations of microbes is logistically complex and difficult to achieve on a timescale commensurate with microbial growth. Here we introduce the kChip, a droplets-based platform that performs rapid, massively parallel, bottom-up construction and screening of synthetic microbial communities. We first show that the kChip enables phenotypic characterization of microbes across environmental conditions. Next, in a screen of ~100,000 multi-species communities comprising up to 19 soil isolates, we identified sets that promote the growth of the model plant symbiont Herbaspirillum frisingense in a manner robust to carbon source variation and the presence of additional species. Broadly, kChip screening can identify multi-species consortia possessing any optically assayable function, including facilitation of biocontrol agents, suppression of pathogens, degradation of recalcitrant substrates, and robustness of these functions to perturbation, with many applications across basic and applied microbial ecology.

    PNAS Manuscript