Machine Learning for Healthcare 2016
Saban Research Institute
Program:
8:00 Breakfast
8:45 Welcome
9:00 Machine Learning Opportunities in the Explosion of Personalized Precision Medicine
We have reached the takeoff point in the generation of massive datasets from individuals and across populations, both of which are necessary for personalized precision medicine. I will give an example of my N=1 self-study, in which I have my human genome as well as multi-year time series of my gut microbiome genomics and over one hundred blood biomarkers. This is now being augmented with time series of my metabolome and immunome. These are then compared with hundreds of healthy people's gut microbiomes, revealing major shifts between health and disease. Multiple companies and organizations will soon be carrying out similar levels of analysis on hundreds of thousands of individuals. Machine learning techniques will be essential to bring the patterns out of these exponentially growing datasets.
9:45 Machine learning that matters in healthcare: breaking down the silos
Quality of care, as would be reflected by the universal provision of standardized, evidence-based, and truly indicated care, has not improved to the degree one would have hoped. Similarly, while patient safety and medical errors have come into public awareness, advances in these areas have been slow, hard won, and unsupported by the kinds of smart, data-driven engineering designs that have gone into other domains. Interest in applying machine learning to clinical practice is increasing, yet the practical application of these techniques has been less than desirable. Clinicians continue to make determinations in a technically unsupported and unmonitored manner due to a lack of high-quality evidence or tools to support most day-to-day decisions. There is a persistent gap between the clinicians required to understand the context of the data and the engineers who are critical to extracting usable information from the increasing amount of healthcare data that is being generated. This talk focuses on the divide between the data science and healthcare silos, and posits that the lack of integration is the primary barrier to a data revolution in healthcare. I first discuss literature that supports the existence of this divide, and then I present recommendations on how to bridge the gap between practicing clinicians and data scientists.
10:30 Coffee Break and Discussion
10:45 Image-based Biomarkers and Prediction in Large Clinical Cohorts
To take full advantage of clinically relevant information implicitly captured in medical images, we develop robust algorithms for quantifying disease burden from patient scans. We then demonstrate how genetic and clinical variables can be used to predict anatomy and anatomical change through a semi-parametric generative model. Joint modeling of image and genetic data promises to provide insights into genetic factors and anatomical effects of the disease. We demonstrate the promise of this approach on large collections of brain scans of different patient cohorts.
11:30 Comprehensive predictive modeling at the bedside
Early Warning Scores and other forms of predictive modeling present clinicians with real-time estimates of the risks of imminent untoward events based on statistical models trained on legacy data sets. Nearly all such tools are based on static and intermittent data elements such as demographics, diagnoses, notes, vital sign measurements, and lab test results. Continuous physiological monitoring such as EKG telemetry is another potential source, and has the potential advantage of higher data coverage. It introduces a new step in the modeling process, though: time series analysis of cardiorespiratory dynamics to detect signatures of illness. The University of Virginia group has investigated comprehensive approaches to predictive modeling that use static, intermittent, and continuous data streams for early detection of subacute, potentially catastrophic illness in infants and adults, in ICUs and on hospital floors.
Saban Research Patio
12:15 LUNCH
13:30 Spotlight Talks A
14:20 Posters A
15:20 Spotlight Talks B
16:10 Posters B
17:15 Improving the design and discovery of dynamic treatment strategies using reinforcement learning
Reinforcement learning offers a powerful paradigm for automatically discovering and optimizing sequential treatments for chronic and life-threatening diseases. This talk will introduce basics of reinforcement learning and then discuss several aspects of this work, including: How should we collect data to learn good sequential treatment strategies? How can we learn a representation of the data that allows generalization across patients? How can we use the data collected to discover sequential treatment strategies that are tailored to patient characteristics and time-dependent outcomes? The methods presented will be illustrated using results of our work on learning adaptive neurostimulation policies for the treatment of epilepsy.
Saban Research Patio
18:00 Dinner and Discussion
Day 2
Saban Research Institute
8:30 Breakfast
9:15 Processed data to derive clinically useful information
Michael Pinsky, MD and Artur Dubrawski, PhD
It is often difficult to accurately predict which patients will develop shock, when, and why, because signs of shock often appear late, when organ injury is already present. Three levels of aggregation of information can aid the bedside clinician in this task: analyzing derived parameters of existing measured physiologic variables using simple bedside calculations (Functional Hemodynamic Monitoring); using prior physiologic data of similar subjects during periods of stability and disease to define quantitative metrics of severity; and using libraries of responses across large and comprehensive collections of records of diverse subjects whose diagnoses, therapies, and courses of treatment are already known to predict not only disease severity, but also the subsequent behavior of the subject if left untreated or treated with one of the many therapeutic options. A major pre-analysis problem is cleaning the data to remove non-physiologic artifacts due to technical errors, which account for >70% of all clinical alerts. We have been developing algorithms that effectively isolate ~85% of all artifacts among alerts generated from physiologic time series of vital sign data. The next problem is to define the minimal monitoring data set needed to initially identify patients at risk across all possible processes and then specifically monitor their response to targeted therapies known to improve outcomes. To address these issues, we represented the vital sign data with highly multivariate feature sets and used machine learning algorithms to infer parsimonious predictive models for cardiorespiratory insufficiency. We describe the nature of the required data sets and modeling approaches used to detect, forecast, and track the evolution of risk for this severe condition. These approaches jointly enable earlier identification of cardiorespiratory insufficiency and direct focused patient-specific management.
To validate our methodology, we used both a porcine model of hemorrhage and human vital sign data collected in a trauma step-down unit. Our results show the value of a truly multivariate fused approach over more traditional single vital sign thresholding for detection, and how it can also allow for reliable forecasting of cardiorespiratory insufficiency before its overt signs become apparent. Also, increasing the resolution of signal processing from mean data collected at regular intervals to beat-to-beat and waveform analysis progressively improves the predictive value of the fused parameters. In addition, we show that using personalized reference data can further improve detectability and predictability of cardiorespiratory insufficiency, if such data is available. Finally, we demonstrate that the temporal evolution of risk for cardiorespiratory insufficiency is a heterogeneous yet systematic process. Most patients who develop this condition follow one of only a handful of typical risk evolution trajectories, and they can be assigned to their most likely trajectory type well ahead of the onset, enabling further gains in predictability.
10:00 Clinical Abstract Talks and Software Demos
10:45 Clinical Abstract Posters
11:45 Culture Trumps Data
Why is it so hard to drive change in healthcare? The data, technology, and insights exist, yet it remains hard to move the needle in the right direction. What's the point of developing a technology if it is never going to be used due to business, cultural, or human behavior challenges? Understanding these issues can help you have greater impact. Learn how to ask the right questions that will yield the greatest impact.
12:30 LUNCH
14:00 Panel Discussion
Randall Wetzel
Lee Hartsell
Suchi Saria
John Guttag
Nigam Shah
14:45 Electronic Health Record Analysis via Deep Poisson Factor Models
Electronic Health Record (EHR) phenotyping utilizes patient data captured through normal medical practice to identify features that may represent computational medical phenotypes. These features may be used to identify at-risk patients and improve prediction of patient morbidity and mortality. We present a novel deep multi-modality architecture for EHR analysis (applicable to joint analysis of multiple forms of EHR data), based on Poisson Factor Analysis (PFA) modules. Each modality, composed of observed counts, is represented as a Poisson distribution, parameterized in terms of hidden binary units. Information from different modalities is shared via a deep hierarchy of common hidden units. To explore the utility of these models, we apply them to a subset of patients from the Duke-Durham patient cohort. We identified a cohort of over 12,000 patients with Type 2 Diabetes Mellitus (T2DM) based on diagnosis codes and laboratory tests out of our patient population of over 240,000. Examining the common hidden units uniting the PFA modules, we identify patient features that represent medical concepts. Experiments indicate that our learned features are better able to predict mortality and morbidity than clinical features identified previously in a large-scale clinical trial.
15:30 A perspective on Machine Learning in Pediatric Intensive Care
The Laura P. and Leland K. Whittier Virtual PICU
The Laura P. and Leland K. Whittier Virtual Pediatric Intensive Care Unit (VPICU) is a team of doctors, machine learners, and engineers committed to developing real-time clinical decision support for the pediatric ICU. We will discuss our perspective on what needs exist in the ICU and how machine learning can meet these needs. We will highlight some of our recent machine learning work that aims to enable solutions to those needs.
16:00 Closing Remarks
16:15 Feedback Discussion Session
Invited Speakers
Bassam Kadry, MD Clinical Assistant Professor, Anesthesiology, Perioperative and Pain Medicine Stanford Medicine
Larry Smarr, PhD Professor of Computer Science and Information Technologies University of California, San Diego
Lawrence Carin, PhD Professor of Electrical & Computer Engineering Duke University
Polina Golland, PhD Professor of Electrical Engineering and Computer Science Massachusetts Institute of Technology
Joelle Pineau, PhD Associate Professor, School of Computer Science McGill University
Randall Moorman, MD Professor of Medicine, Biomedical Engineering and Molecular Physiology and Biological Physics University of Virginia
Leo Celi, MD Assistant Professor of Medicine Beth Israel Deaconess Medical Center
Michael Pinsky, MD Professor of Critical Care Medicine University of Pittsburgh
Artur Dubrawski, PhD Senior Systems Scientist, Robotics Institute Carnegie Mellon University
Program Chairs
Finale Doshi, PhD Assistant Professor in Computer Science, Harvard School of Engineering and Applied Sciences
James C. Fackler, MD Associate Professor Departments of Anesthesiology/Critical Care Medicine and Pediatrics Johns Hopkins University School of Medicine
David Kale PhD Student, Computer Science, Viterbi Dean's Doctoral Fellow, and Alfred E. Mann Innovation in Engineering Fellow at the University of Southern California
Byron Wallace, PhD Assistant Professor at the University of Texas at Austin
Jenna Wiens, PhD Assistant Professor of Computer Science and Engineering (CSE) at the University of Michigan
Senior Advisory Committee:
Carla Brodley, PhD Dean of the College of Computer and Information Science, Northeastern University
Michael Brudno, PhD Associate Professor and Canada Research Chair in Computational Biology, University of Toronto
Gari D. Clifford, PhD Associate Professor, Biomedical Informatics Emory University
Noémie Elhadad, PhD Associate Professor of Biomedical Informatics, Affiliated with Computer Science, Columbia University
Deborah Estrin, PhD Professor of Computer Science at Cornell Tech in New York City and a Professor of Public Health at Weill Cornell Medical College
Joydeep Ghosh, PhD Schlumberger Centennial Chair Professor of Electrical and Computer Engineering at The University of Texas at Austin
Russell Greiner, PhD Professor of Computer Science at the University of Alberta
John Guttag, PhD Dugald C. Jackson Professor MIT Department of Electrical Engineering and Computer Science
Milos Hauskrecht, PhD Professor of Computer Science, University of Pittsburgh
Eric Horvitz, PhD Technical Fellow and Managing Director, Microsoft Research
Isaac Kohane, MD, PhD Lawrence J. Henderson Professor of Pediatrics, Boston Children's Hospital
Roger Mark, MD, PhD HST Faculty, Distinguished Professor in Health Sciences and Technology and Electrical Engineering and Computer Science, Massachusetts Institute of Technology
J. Randall Moorman, MD Professor of Medicine, Biomedical Engineering and Molecular Physiology and Biological Physics
Raymond T. Ng, PhD Professor of Computer Science at the University of British Columbia
John Quinn, PhD Senior Lecturer in Computer Science at Makerere University
Christian Shelton, PhD Associate Professor at UC Riverside's Computer Science Department
Peter Szolovits, PhD Professor of Computer Science and Engineering in the MIT Department of Electrical Engineering and Computer Science
Nigam H. Shah, MBBS, PhD Associate Professor, Medicine - Biomedical Informatics Research, Stanford University
Mark S Wainwright, MD, PhD Founder’s Board Chair of Neurocritical Care, Professor in Pediatrics-Neurology, Neurology - Ken and Ruth Davee Department and Pharmacology, Northwestern
Randall Wetzel, MD Chairman, Department of Anesthesiology Critical Care Medicine - Children's Hospital Los Angeles
Chris Williams, PhD Professor of Machine Learning, School of Informatics, University of Edinburgh
Accepted Papers
Input-Output Non-Linear Dynamical Systems applied to Physiological Condition Monitoring
Konstantinos Georgatzis, Chris Williams, and Christopher Hawthorne, University of Edinburgh
Preterm Birth Prediction: Stable Selection of Interpretable Rules from High Dimensional Data
Truyen Tran, Wei Luo, and Dinh Phung, Deakin University; Jonathan Morris and Kristen Rickard, University of Sydney; Svetha Venkatesh, Deakin University
Mitochondria-based Renal Cell Carcinoma Subtyping: Learning from Deep vs. Flat Feature Representations
Peter Schüffler and Judy Sarungbam, Memorial Sloan Kettering Cancer Center; Hassan Muhammad, Weill Cornell Medical College; Ed Reznik, Satish Tickoo, and Thomas Fuchs, Memorial Sloan Kettering Cancer Center
Multi-task Learning with Weak Class Labels: Leveraging iEEG to Detect Cortical Lesions in Cryptogenic Epilepsy
Bilal Ahmed, Tufts; Thomas Thesen and Karen Blackmon, NYU; Carla Brodley, Northeastern
Doctor AI: Predicting Clinical Events via Recurrent Neural Networks
Edward Choi and Mohammad Taha Bahadori, Georgia Tech; Andy Schuetz and Walter Stewart, Sutter Health; Jimeng Sun, Georgia Tech
Diagnostic Prediction Using Discomfort Drawing with IBTM
Cheng Zhang, KTH Royal Institute of Technology; Hedvig Kjellström, KTH Sweden; Carl Henrik Ek, Bristol University; Bo Bertilson, KI Karolinska Institutet
Learning Robust Features using Deep Learning for Automatic Seizure Detection
Pierre Thodoroff and Joelle Pineau, McGill University
Using Kernel Methods and Model Selection for Prediction of Preterm Birth
Ansaf Salleb-Aouissi, Columbia University; Anita Raja, Cooper Union; Ronald Wapner, Columbia Medical Center
gLOP: the global and Local Penalty for Capturing Predictive Heterogeneity
Rhiannon Rose and Daniel Lizotte, Western University
Uncovering Voice Misuse Using Symbolic Mismatch
Marzyeh Ghassemi, MIT; Zeeshan Syed, University of Michigan; Daryush Mehta, Jarrad Van Stan, and Robert Hillman, Massachusetts General; John Guttag, MIT
Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization
Shalmali Joshi, Suriya Gunasekar, and Joydeep Ghosh, UT Austin; David Sontag, NYU
Transferring Knowledge from Text to Predict Disease Onset
Yun Liu, MIT; Kun-Ta Chuang, Fu-Wen Liang, and Huey-Jen Su, National Cheng Kung University; Collin Stultz and John Guttag, MIT
Scalable Modeling of Multivariate Longitudinal Data for Prediction of Chronic Kidney Disease Progression
Joseph Futoma, Blake Cameron, Mark Sendak, and Katherine Heller, Duke University
Directly Modeling Missing Data in Sequences with RNNs: Improved Classification of Clinical Time Series
Zachary Lipton, UC San Diego; David Kale, USC Information Sciences Institute; Randall Wetzel, Children's Hospital LA
Deep Survival Analysis
Rajesh Ranganath, Princeton University; Adler Perotte, Noémie Elhadad, and David Blei, Columbia University
Deep Convolutional Neural Networks for Microscopy-Based Point of Care Diagnostics
Alfred Andama, Pius Mugagga, Rose Nakasi, and John Quinn, Makerere University
Clinical Tagging with Joint Probabilistic Models
Yoni Halpern, NYU; Steven Horng, Beth Israel Deaconess Medical Center; David Sontag, NYU
Multi-task Prediction of Disease Onsets from Longitudinal Laboratory Tests
Narges Razavian, Jake Marcus, and David Sontag, NYU
A Non-parametric Bayesian Approach for Estimating Treatment-Response Curves from Sparse Time Series
Yanbo Xu, Suchi Saria, and Yanxun Xu, Johns Hopkins University
Accepted Clinical Podium Abstracts
Demonstration of a Chronic Kidney Disease Population Rounding Tool
Mark Sendak, Duke Institute for Health Innovation; Faraz Yashar and Lance Co Ting Keh, Ephori LLC; Blake Cameron, Joseph Futoma, Katherine Heller, and Uptal Patel, Duke
Precision Medicine in Point-of-Care Management of Surgical Complications
Zhifei Sun, Elizabeth Lorenzi, Ouwen Huang, Thomas Li, Christopher Mantyh, Katherine Heller, and Erich Huang, Duke
Performing an informatics consult
Nigam Shah, Stanford Center for Biomedical Informatics Research
MS Mosaic: Mobile technology and machine learning for multiple sclerosis research and patient care
Lee Hartsell and Katherine Heller, Duke University
Care Coordination using practice based evidences
Adrish Sannyasi, Splunk; Daniella Meeker, USC Keck School of Medicine
Same Decision Probability in Neurocritical Care
Fabien Scalzo, Arthur Choi, and Adnan Darwiche, UCLA
Patient Identification Using Plethysmography Structure Analysis
Jennifer Laine, Yale University
Real-time Detection and Exploratory Discovery of Anomalies for Pediatric Ventilator Management
Tanachat Nilanon and Yan Liu, USC; Justin Hotz and Robinder Khemani, Children's Hospital LA