Healthcare Hit by Data Tsunami

Medical breakthroughs require advanced IT infrastructure

Healthcare organizations are waging battles against cancer, diabetes and heart disease while also struggling to beat the COVID-19 pandemic. Although these illnesses have different symptoms and causes, they have something in common: big data is being fed into models that can find potential treatments and cures.

Big Data for Better Health

Big data is an essential tool for monitoring and protecting public health. Medical data is collected from millions of people, anonymized, and then pooled as a basis for analysis. The sheer volume of data makes it possible to reach statistically significant findings with algorithmic models that can predict which populations are at risk for certain ailments.

There are several examples of how the US is leveraging big data to improve the quality of healthcare. The National Institutes of Health (NIH) is monitoring over a million people, collecting genetic information, biological samples, and other information about their medical condition to improve the prevention and treatment of diseases. The Centers for Disease Control and Prevention (CDC) is developing models to identify groups in the population that are at risk of obesity, thus allowing medical intervention at an early age. The US Army Veterans Medical Database has used big data to find that diabetics are at a higher risk of developing mental illness than other populations.

Medical data has become more abundant for research thanks to advances in the Internet of Medical Things (IoMT). Most medical devices are now connected to computer networks to collect and transmit data. Typically, devices that monitor patients, perform lab tests, or capture medical images upload their results to a centralized data center for analysis. In addition, more and more people wear devices that monitor their heart rate, blood sugar levels, UVA exposure and more. As a result, healthcare data is projected to grow faster than data in manufacturing, financial services, or media, with a compound annual growth rate (CAGR) of 36 percent through 2025.
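To make the 36 percent figure concrete, here is a minimal sketch of how compound annual growth works out over several years. The function names are illustrative, and because the article gives no starting volume, the code works only with growth multiples relative to an arbitrary baseline.

```python
import math


def growth_multiple(cagr: float, years: int) -> float:
    """Return how many times larger a quantity is after `years` of compounding.

    `cagr` is the annual rate as a fraction (0.36 for 36 percent).
    """
    return (1.0 + cagr) ** years


def doubling_time(cagr: float) -> float:
    """Return the number of years needed to double at a constant annual rate."""
    return math.log(2.0) / math.log(1.0 + cagr)


def project_volume(initial: float, cagr: float, years: int) -> float:
    """Project a volume forward from a baseline; `initial` is a hypothetical input."""
    return initial * growth_multiple(cagr, years)
```

At a constant 36 percent rate, `growth_multiple(0.36, 5)` comes to roughly 4.65, so a data store would be more than four and a half times its original size after five years, and `doubling_time(0.36)` shows the volume doubling in a little over two years.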

Technical Complexities of Mining Big Data

Algorithms that rely on huge data sets can require large amounts of computing power. The accuracy of a model is directly related to the amount of data it can ingest and process quickly and reliably. Legacy data warehouses and other data stores often run out of resources or simply take too long to access the data and run analytics on it. Adding more software nodes or hardware servers to process more data isn’t always practical or effective. More software increases the complexity of data management without guaranteeing results. On the hardware side, adding CPU-based servers can exhaust the budget as well as the finite space, power, and cooling available in data centers. It’s counterproductive if a growing share of medical research costs has to be allocated to expensive IT equipment just to accommodate a huge number of data points.

However, there are solutions that are specifically designed for segmenting and analyzing massive data stores. For example, data acceleration platforms that run on GPUs can use parallel processing to significantly increase the amount of data that can be analyzed, while speeding up analytics and minimizing the need for large numbers of expensive servers. This type of architecture can also reduce the time it takes to reach conclusions, which can save lives. One cancer researcher credited a GPU-based database accelerator with cutting years off cancer research by making data analytics run 100 times faster.
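The parallel processing idea mentioned above can be sketched as a split, aggregate, and combine pattern: the data is divided into chunks, each chunk is summarized independently, and the partial results are merged. The example below is only an illustration of that pattern using Python's standard thread pool. It is not how any particular GPU platform is implemented, it will not show GPU-style speedups, and the function names and sample readings are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_stats(chunk):
    """Summarize one chunk as (count, sum) so the partial results can be merged."""
    return (len(chunk), sum(chunk))


def parallel_mean(values, n_chunks=4):
    """Compute a mean by aggregating independent chunks and combining the results."""
    if not values:
        raise ValueError("values must not be empty")
    # Ceiling division so every value lands in exactly one chunk.
    size = -(-len(values) // n_chunks)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    # Each chunk is processed independently; a GPU applies the same idea
    # across thousands of cores instead of a handful of threads.
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(partial_stats, chunks))
    total_count = sum(count for count, _ in partials)
    total_sum = sum(subtotal for _, subtotal in partials)
    return total_sum / total_count


# Hypothetical heart-rate readings, made up for illustration only.
readings = [62, 70, 75, 81, 68, 90, 77, 73]
average = parallel_mean(readings)
```

Because each chunk only returns a count and a sum, the combine step stays cheap no matter how large the chunks are, which is what lets this kind of workload spread across many processors.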

Many legacy big data platforms such as Hadoop and Spark can become exceedingly expensive when scaled up to process the huge number of datasets collected for medical research. But it’s possible to decouple compute resources from storage so organizations can scale only storage, only compute, or both, thereby greatly improving performance while reducing resource requirements. This shared-data architecture lets medical researchers scale out more easily and cost-effectively to ingest and analyze larger volumes of data using the same resources.

Public policy and healthcare organizations are becoming increasingly data-driven. With the influx of new data points to support comprehensive and complex research, an organization’s ability to manage its data can directly affect the quality of its insights. Smarter architectures designed for faster, more resource-efficient processing and better medical data analytics can lead to quicker medical breakthroughs that reduce the spread of all types of disease and illness.

About the Author

David Leichner is CMO at SQream. David has over 25 years of marketing and sales executive management experience garnered from leading software vendors including Information Builders, Magic Software and BluePhoenix Solutions. At SQream, David is responsible for creating and executing its marketing strategy and managing the global marketing team that forms the foundation for SQream’s product and market penetration.
