Friday, August 2, 2013
BigData - Health Industry usages - Part 1
"Information is the oil of the 21st century, and analytics is the combustion engine."
Big data makes it practical to harness deep analytics on very large data sets and streams, in batch or in real time,
using parallel programming concepts and the cheap processing power of commodity servers. Big data is typically
identified by huge volumes of data, high velocity of data capture and high variety of data collected. Since there is a
limit on vertical scalability in terms of processing power, relying on a single very powerful server is no longer
practical. It also tends to be much more expensive to scale up than to scale out horizontally.
As an example, if you have to lift a weight of 300 lbs, it is very difficult and expensive to find and hire a single
person who can do that, compared with employing 6 people of average strength who can lift 50 lbs each. The heavier the
weight to be lifted, the more difficult and expensive it becomes to find a single person to do the task.
We hear a lot about how big data is going to make health services more personalized and more economical. We are already
implementing, or have implemented, Meaningful Use I & II procedures to streamline patient data management.
Several technologies come into play in applying the big data paradigm to the health industry. Listed below are some of
the important procedures and technologies that can make the health industry a pro-active, dynamic, economical,
personalized and self-aware entity, and change the way health related services are made available to communities across
the world.
In-Memory and CEP (complex event processing):
=============================================
Pro-active and Predictive Monitoring and Alert system - personalized:
=====================================================================
Currently when a patient is in the ICU, various monitoring systems are hooked up to him/her to measure heart rate, blood
pressure, oxygen saturation and so on. When any reading crosses a specific set of values, an alert is sent to the
monitoring control room and medical staff attend to that patient as needed. However, none of this stream of monitoring
data is collected or stored for further analysis, because doing so would need huge amounts of data storage and then
very high processing power to analyze it at a later time. With the arrival of big data (Hadoop with HDFS and MapReduce,
related ecosystem technologies like HBase, Hive, Sqoop, Pig, ZooKeeper, Mahout, R, CouchDB, MongoDB and Neo4j, the
Cloudera and Hortonworks platforms, Amazon Elastic cloud storage, Oracle Exadata and Exalytics, SQL Server 2014,
Attensity, OpenChorus and various NoSQL databases: column-oriented, document-oriented, graph-oriented and spatial), it
is now possible for organizations to store as well as process huge amounts of data in various formats, whether
structured, semi-structured or unstructured, in a batch or real-time fashion.
One can ask how it helps if we not only monitor the patient's vital information but also store it and analyze it in
real time. The system can store all these streams of monitoring data in an in-memory database and run analytics on them
to predict, and pro-actively advise the medical staff, that a serious condition may manifest in the near future even
though none of the monitored readings has crossed a threshold yet. This gives medical staff a much needed few seconds
or minutes of advance warning to provide better life saving services. The streams of monitoring data saved in the
in-memory database can then be persisted, either fully or in a filtered way, in an HDFS based database for further
analysis.
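To make the idea of warning before a threshold is crossed more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how any real ICU system works: the class name TrendMonitor, the window size, the projection horizon and the choice of a simple least-squares trend line are all hypothetical, and a real system would use clinically validated models.

```python
from collections import deque
from statistics import mean


class TrendMonitor:
    """Keeps a sliding window of recent readings and projects the trend forward.

    Hypothetical illustration: low_limit is the classic alarm threshold,
    horizon is how many samples ahead the trend line is extended.
    """

    def __init__(self, low_limit, window_size=30, horizon=60):
        self.low_limit = low_limit
        self.horizon = horizon
        self.window = deque(maxlen=window_size)

    def _slope(self):
        # Least-squares slope of the readings in the window, per sample.
        n = len(self.window)
        xs = range(n)
        x_bar, y_bar = mean(xs), mean(self.window)
        num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, self.window))
        den = sum((x - x_bar) ** 2 for x in xs)
        return num / den if den else 0.0

    def add(self, reading):
        """Store a reading and return 'alarm', 'early_warning' or 'ok'."""
        self.window.append(reading)
        if reading < self.low_limit:
            return "alarm"  # threshold already crossed: the classic alert
        if len(self.window) < self.window.maxlen:
            return "ok"  # not enough history yet to estimate a trend
        projected = reading + self._slope() * self.horizon
        return "early_warning" if projected < self.low_limit else "ok"
```

For example, with an oxygen saturation limit of 90, a monitor fed readings that drift down slowly from 98 reports an early warning as soon as its window is full, long before any single reading drops below 90.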
SmartPhones and Location Services:
==================================
The use of smart phones has become widespread across the world, and we can now utilize the location services data
collected by various communication service providers in a very pro-active way to predict and provide better health
services. Recently Snowden's disclosures about the NSA generated some public discussion on the various types of data
being collected and used by the US Government. This information is nothing new, and is rather trivial, to people
following the advancement of technologies and related application development. Back to the topic: how can we use the
location services provided by smart phones to develop better health systems?
Each smartphone with a built-in GPS receiver uses signals from at least four GPS satellites to determine its position
on the ground to within a few meters. Earlier cell phones could get a location by using algorithms based on cell tower
communication. We are focusing on smart phones for this topic.
Pro-Active Disease Transmission Alert - at personal level:
=========================================================
Let us say a family member or friend of a registered member/patient of a clinic/hospital travelled to China when the
SARS epidemic was at its peak, returned to the US, and either stayed at the same residence as the registered member or
had close interaction with him/her somewhere else. With the help of big data, the health system can pro-actively alert
the registered member just by monitoring the location data of the registered member's cell phone and of all other cell
phones that were physically close to it during any given time period.
One can ask how such information can be determined with high accuracy and where big data and location services come
into play. Let us say the registered member's smart phone number is on file with a local clinic/hospital. Based on
his/her daily location and movements, and on the proximity to and time spent near other cell phone users, the health
system can pro-actively estimate the likelihood of transmission for various viruses.
Each virus may have a different effective spread distance and exposure time period. The travel data of all the cell
phone users who were in close physical range can be gathered from travel organizations and the government. As you can
see, the amount of data needed for such an analysis is huge: you are literally storing and accessing the cell phone
location data of all the people who were in close proximity to the registered member over any given time period, say
1 day, 1 month, 1 year or multiple years. Storing and analyzing such huge data sets was not possible earlier. Now with
big data we can not only store huge amounts of highly varied data reliably but also process it economically and in
near real time.
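As a rough sketch of the proximity check described above, the Python below counts the minutes in which two phones were within a given distance of each other. Everything here is assumed for illustration: the function names, the track format (a dict mapping a minute timestamp to a latitude/longitude pair) and the idea of one location sample per minute are hypothetical, and the distance and time limits would have to come from medical guidance for each virus. At real scale the same comparison would run as a distributed job rather than in a single process.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def exposure_minutes(member_track, other_track, max_distance_m):
    """Count the minutes in which both phones were within max_distance_m.

    Each track is a dict mapping a minute timestamp to a (lat, lon) pair
    (a hypothetical format chosen for this sketch).
    """
    shared = member_track.keys() & other_track.keys()
    return sum(
        1
        for t in shared
        if haversine_m(*member_track[t], *other_track[t]) <= max_distance_m
    )


def is_exposed(member_track, other_track, max_distance_m, min_minutes):
    """True if the two phones were close for at least min_minutes in total."""
    return exposure_minutes(member_track, other_track, max_distance_m) >= min_minutes
```

Keeping the distance and the minimum exposure time as parameters reflects the point that each virus may have its own spread distance and exposure period.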
Pro-Active Food contamination Alerts - at personal level:
========================================================
We often see recalls of food items by various supermarket chains. We can combine registered members' food purchasing
habits with food contamination alerts to send personalized real-time alerts to the registered members. Let us say I
shop at Stop&Shop, I am a registered member with LIJ, and my or my parents' credit card information is held by LIJ (in
a secure fashion). Now let us say a particular brand of milk product is being recalled by Stop&Shop. With the help of
big data, the health system can pro-actively monitor for possible food poisoning by combining food recall alerts from
Stop&Shop with my household's food purchasing habits and recent purchase listings at Stop&Shop. It can then issue me a
pro-active alert in near real time to stop using the specific food item(s), along with possible remedies in case such
foods have already been consumed. The same can be applied to my eating out habits, local restaurants and related food
poisoning alerts.
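The matching step in this scenario can be sketched in a few lines of Python. This is only an illustration under assumptions: the Purchase and Recall record types, their fields and the product codes are hypothetical, and a real feed from a supermarket chain or health system would look different.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Purchase:
    member_id: str       # registered member who made the purchase
    product_code: str    # hypothetical product identifier
    purchased_on: date


@dataclass(frozen=True)
class Recall:
    product_code: str
    sold_from: date      # first sale date covered by the recall
    sold_until: date     # last sale date covered by the recall


def members_to_alert(purchases, recalls):
    """Return {member_id: set of recalled product codes that member bought}."""
    # Index recalls by product code so each purchase is checked quickly.
    recalls_by_code = {}
    for recall in recalls:
        recalls_by_code.setdefault(recall.product_code, []).append(recall)

    alerts = {}
    for purchase in purchases:
        for recall in recalls_by_code.get(purchase.product_code, []):
            if recall.sold_from <= purchase.purchased_on <= recall.sold_until:
                alerts.setdefault(purchase.member_id, set()).add(purchase.product_code)
    return alerts
```

Only members who bought a recalled product inside the recall's sale window end up in the result, so everyone else is spared an irrelevant alert.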
Evidence Based Medical Practices - personalized:
================================================
As more and more people around the world travel and societies become more and more diverse, it will become more and
more complex for a medical practitioner to identify and attend to the illnesses of his/her patients, who may be from
over 20 nationalities. The patients can be from different countries and religions, with different food habits, blood
related diseases, allergies, age groups, life styles and so on, which in turn may make diagnosis more complex for
medical practitioners. With the advent of huge data collection, storage and analytics, and now with big data, it is
possible to access and analyze such data, whether structured, semi-structured or unstructured, economically and in
near real time. We saw how Deep Blue defeated Garry Kasparov (chess: logical analysis) and, more recently, how Watson
won at Jeopardy! (hypothesis generation, massive evidence gathering, analysis and scoring), and we are going to see
this kind of system used in medical practice in a big way in the near future. Watson based medical analysis systems
are already set to go live this year for utilization management decisions in lung cancer treatment at Memorial
Sloan–Kettering Cancer Center, in conjunction with the health insurance company WellPoint. With the help of big data,
huge amounts of patient information can now be collected, processed and analyzed to identify patterns and to determine
the most relevant and economical treatment for a patient's specific condition. This way medical practice can become
highly personalized and at the same time more effective and economical (by eliminating non-relevant tests and by
providing effective treatment options based on each patient's condition).
Over time, as more data gets stored and more advanced analytics get implemented, a medical Watson system can become a
one stop answer for all medical practitioners and staff. As seen at Memorial Sloan–Kettering Cancer Center this year,
over 90% of nurses already follow the guidance given by the medical Watson system. The time is not far off when most
doctors will similarly follow the guidance given by medical Watson.
The practice of medical insurance can be extended further to areas like educational insurance: pro-actively monitoring
each individual from pre-K onward at a very granular level and predicting the most relevant areas for a successful
career path, thereby making it possible for each individual to become a higher contributor to society at a later
stage. This way evidence based courses/degrees can be offered to registered students in the educational insurance
system at a discounted price or for free.
Indirect usages: determining patient satisfaction and sentiment, patient behavior analysis, pro-active community
engagement for managing and controlling the spread of diseases, automated personalized travel medical advice and
nutrition suggestions, personalized medical advice based on family and community health data, and so on.
Big data ecosystem: using Hadoop one can eliminate the need for expensive clustering solutions offered by various
vendors to support higher availability, using Hive one can train employees who are good with SQL to work on big data,
using Sqoop one can import data into Hadoop from RDBMS systems, using Mahout one can execute various data mining
methods, using R one can develop various statistical analysis programs, using Pig system administrators can interact
with Hadoop, using Hortonworks and Cloudera solutions one can accelerate big data implementations, using HBase one can
implement a data warehousing solution on top of Hadoop, using Neo4j one can implement graph based relationships, using
Riak one can implement key-value pair analysis, using Amazon elastic cloud or Microsoft Azure one can go for cloud
based solutions, using Storm one can analyze streams of data in real time, and so on. Big data is not just one
technology but rather a set of technologies that work together to provide the needed solutions.
Labels:
big data,
bigdata,
CEP,
hadoop,
hbase,
health industry,
health systems,
in memory database,
obamacare,
patient,
personalized service