Mining for Big Data in the Health Enterprise
Much has been said on the topic of Big Data and Healthcare IT. The Human Genome Project, completed in 2003, successfully mapped the human genome by sequencing over 3 billion base pairs. This massive effort took the better part of 13 years and was coordinated amongst twenty research institutes and universities. As the seminal Big Data in healthcare event, its achievement has been compared to putting a man on the moon. Now, with the widespread availability of open source and vendor tools, similar scale efforts can be applied to a wide variety of endeavors. The democratization of Big Data technology promises to revolutionize healthcare with new discoveries in 3D imaging, genomics and epidemiology. Big Data Analytics also promises to revolutionize population health as it is now possible to analyze risk cohorts and interventions at a granular level. While much of this is exciting and important, there is often a gap, and sometimes a chasm, between the research bench and the patient bedside. It is a perennial problem in all areas of research, yet fortunately it is being addressed at the policy level. Most of the federal mandate investments in Healthcare IT are focused not on new discovery, but on the application of Information Technology to consistently deliver quality care, share medical records and integrate ancillary services. Federal Meaningful Use incentives have driven technology adoption with near single-mindedness.
Yet demographics and quality measures do not tell the whole story. Healthcare has substrata of data that are only now being excavated, and these could provide an even greater benefit to organizations, clinicians and, ultimately, patients. Hospitals are bristling with network-connected devices. Our clinical hallways and rooms are clogged with workstations, monitors and devices. Clinicians carry all kinds of smartphones, pagers, tablets and devices. Patients carry smartphones connected to guest WiFi networks. CT scanners, mobile infusion pumps, mobile monitoring units, pulse-oximeters, wheelchairs, and patient beds are increasingly mobile and network-enabled. Data is flowing to and from devices and the scores of clinical applications that are common in hospitals. The data flow contains orders, physiological measurements, events, and transactions; each is interfaced using the HL7 messaging protocol and recorded into the Electronic Health Record (EHR). As the system of record for patients, modern and interconnected EHRs allow providers access to an unprecedented volume of patient data with relative ease. EHR vendors continue to develop new data connections and techniques for interpreting and presenting data.
“Text messages, emails, pages, phone calls and EHR communications leave behind digital metadata recorded in logs”
However, EHRs and care delivery organizations are largely ignoring another source of potentially valuable information: system logs. Nearly every network-connected device emits a log. These logs are effectively a distributed sensor network, containing time-stamped metadata about events of all types. Administrators can vary the level of logging from almost nothing to painfully verbose, although the general practice is to log only what is needed for occasional troubleshooting, which is usually not enough to support continuous analysis. Logs are a hidden resource IT leaders rarely tap. Individually, these logs tell us about the immediate system and are typically exploited only by those with technical responsibility. When log data is combined with information from the EHR, human resources, facilities and other information technologies, context is provided to data that would otherwise be mostly meaningless. Only by shifting to a program of promiscuous log collection and aggressive correlation can care delivery organizations hope to quantitatively answer complex usability and workflow questions.
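The first step in any such program is normalizing heterogeneous device logs into a common timestamped event stream. The sketch below shows one minimal way to do this in Python; the log format, device name, and field names are illustrative assumptions, not any particular vendor's schema.

```python
import re
from datetime import datetime

# A hypothetical syslog-style line format: ISO timestamp, device name, event text.
LINE_PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) "
    r"(?P<device>\S+) (?P<event>.+)$"
)

def parse_line(line):
    """Parse one log line into a timestamped event record, or None."""
    m = LINE_PATTERN.match(line.strip())
    if m is None:
        # Fragmentary, unparseable lines are common -- logs are "noisy"
        return None
    return {
        "ts": datetime.fromisoformat(m.group("ts")),
        "device": m.group("device"),
        "event": m.group("event"),
    }

sample = "2023-04-01T08:15:00 pump-3f occlusion-alarm cleared"
record = parse_line(sample)
print(record["device"], record["event"])
```

Once every source is reduced to `(timestamp, source, event)` records, correlation across systems becomes a matter of sorting and joining on time and identity.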
Google, Facebook, Amazon and the other leading e-commerce firms have created entire markets on the collection of metadata. The current technology boom is in large part built on metadata analytics and management, and has spawned new tools such as Hadoop and NoSQL databases. It is commonly understood that logs are “noisy,” that is, often fragmentary and ambiguous. Yet when combined with time-correlated data sources, a surprisingly complete activity picture can emerge from the noise. Most private-industry purveyors of big data use this picture to advance marketing and sales. The consumer privacy implications of these technologies are disconcerting; however, when modeled and applied in healthcare, this hyperawareness can be used to discern activity patterns that impact efficiency, patient safety and provider satisfaction.
From EHR access logs it is possible to infer which providers are involved with the care of a specific patient. Joined with wireless access point logs and device inventory data, we can arrive at an accurate chronology of one or more care episodes. Situational awareness speeds response in emergencies, improves workflow routing and can be used retrospectively. Issues with a patient could be detected in advance based on deviations from statistical norms, and personnel proactively alerted to an impending concern. This example demonstrates how seemingly disconnected data can be brought together, and collectively used to improve patient safety and care.
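Such a chronology is essentially a time-ordered merge of events keyed to the same patient and providers. The sketch below illustrates the idea with hand-built records; the field names (`patient_id`, `provider`, `ap_location`) and values are hypothetical examples, not a real EHR or access-point schema.

```python
from datetime import datetime

# Hypothetical EHR access-log events for one patient
ehr_access = [
    {"ts": datetime(2023, 4, 1, 8, 0), "provider": "rn_lopez",
     "patient_id": "P001", "action": "chart opened"},
    {"ts": datetime(2023, 4, 1, 8, 20), "provider": "dr_chen",
     "patient_id": "P001", "action": "order placed"},
]
# Hypothetical wireless access-point association events
ap_events = [
    {"ts": datetime(2023, 4, 1, 8, 5), "provider": "rn_lopez",
     "ap_location": "Room 412"},
]

def episode_timeline(patient_id, ehr_events, wifi_events):
    """Merge EHR and WiFi events into one time-ordered care chronology."""
    merged = [
        (e["ts"], e["provider"], e["action"])
        for e in ehr_events if e["patient_id"] == patient_id
    ] + [
        (w["ts"], w["provider"], "device seen at " + w["ap_location"])
        for w in wifi_events
    ]
    return sorted(merged)  # tuples sort by timestamp first

for ts, who, what in episode_timeline("P001", ehr_access, ap_events):
    print(ts.isoformat(), who, what)
```

In practice the join would run against millions of records in a log-analytics platform, but the logic is the same: filter to one patient, union the streams, sort by time.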
Good communication among the many providers and departments providing care is critical, too. Increasingly, these conversations are not in person, and in many cases individual providers may not be personally acquainted. Fortunately, text messages, emails, pages, phone calls and EHR communications leave behind digital metadata recorded in logs. Of course, one of the challenges is wading through the metadata and matching the appropriate parts with information from personnel, department and EHR systems. Yet out of these disparate data streams, the whole analysis often exceeds the sum of its parts. We can begin to identify not just the flows of information but of collaboration and shared practice. With an established baseline, it is possible to design interventions and deploy them simultaneously in trial runs. The ability to quickly measure the impact of an intervention against the baseline can indicate the best solution, and fosters an iterative design approach.
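One simple way to surface those collaboration flows is to reduce message metadata to a weighted who-talks-to-whom graph. The sketch below counts communication edges from sender/recipient pairs only, never message content; the provider names and record shape are illustrative assumptions.

```python
from collections import Counter

# Hypothetical message metadata: sender and recipient only, no content
messages = [
    {"from": "dr_chen", "to": "rn_lopez"},
    {"from": "rn_lopez", "to": "dr_chen"},
    {"from": "dr_chen", "to": "pharmacist_ng"},
]

def collaboration_edges(msgs):
    """Count undirected communication edges between providers."""
    edges = Counter()
    for m in msgs:
        # Sort the pair so A->B and B->A count as the same edge
        pair = tuple(sorted((m["from"], m["to"])))
        edges[pair] += 1
    return edges

baseline = collaboration_edges(messages)
# The heaviest edges indicate the closest working relationships
print(baseline.most_common(1))
```

Computed over a representative period, these edge counts become the baseline against which an intervention's effect on communication patterns can be measured.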
In all of the discussions of Big Data and healthcare, it is important not to ignore the data we already have but fail to exploit fully. Now is the time to review what data you have and what you could have, and to design a program to collect, analyze, monitor and report on it.
By George Evans, CIO, Singing River Health System