Knowledge Discovery on Health Records

What: Conceptualization, Full Stack Development, Algorithm Implementation
With: Ria Maheshwari, Kartik Moudgil
Where: Undergraduate Research

In India, medical data is currently trapped in discontinuous, fragmented storage systems, and much of it may not even be digitized. There is no centralized database for patient records. This will become a growing problem as smart technologies become more prevalent across sectors. A proper framework is therefore needed to integrate the varied sources of medical data, to store and process that data, and to allow remote access to it, creating a platform for healthcare providers and users.

The challenge for researchers is that medical datasets are often incomplete and inconsistent. This is an information-based challenge, especially in a country like India, where patient records are still maintained primarily on paper and only larger hospitals store them digitally. Artificial intelligence has disrupted multiple sectors, and healthcare is no exception. AI requires an extensive store of data, so a central EHR subsystem pairs naturally with a predictive subsystem.

Time Frame: August 2017 - April 2018
Associated Achievements: IEEE-indexed technical paper
Tools & Technologies: HTML5, CSS3, JavaScript, Chart.js, Ajax, Python, Web2py, Keras, TensorFlow


Steps of implementation:

1. Digitization
- Digitization of medical records onto one standardized electronic platform
- Consolidation, structuring and preprocessing of data

2. Complete system integration
- Building interfaces for data access/simple lookups based on specific parameters
- Setting up basic neural networks, analyzing and testing their accuracy and other parameters, and integrating them with the EMR subsystem
- Basic data visualization interfaces for users such as healthcare providers, doctors and research personnel

3. Prediction
- Using neural networks for predictions and knowledge discovery
- Novel/deep trend analysis and high-level data visualization

System Architecture

Demonstration of a sample dataset

Step 1:
Digitize, consolidate and visualize medical data. The data was acquired in CSV format. It contains medical details of female patients of Pima Indian heritage above the age of 21.
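
The consolidation and preprocessing in this step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names, the sample rows, and the rule that a zero in a physiological column means a missing reading are all assumptions made for the example.

```python
import csv
import io
from statistics import median

# Hypothetical column names; the source only says the data arrived as CSV.
COLUMNS = ["pregnancies", "glucose", "blood_pressure", "skin_thickness",
           "insulin", "bmi", "age", "diabetic"]
# Assumption: a zero in these physiological columns marks a missing reading.
ZERO_MEANS_MISSING = {"glucose", "blood_pressure", "skin_thickness",
                      "insulin", "bmi"}


def load_records(csv_text):
    """Parse headerless CSV text into a list of dicts with float values."""
    reader = csv.reader(io.StringIO(csv_text))
    return [dict(zip(COLUMNS, map(float, row))) for row in reader if row]


def impute_missing(records):
    """Replace assumed-missing zeros with the median of the observed values."""
    for col in ZERO_MEANS_MISSING:
        observed = [r[col] for r in records if r[col] != 0]
        if not observed:
            continue  # nothing to impute from
        fill = median(observed)
        for r in records:
            if r[col] == 0:
                r[col] = fill
    return records


# Made-up rows for illustration only, not real patient data.
SAMPLE = """2,120,70,30,0,31.5,45,1
1,90,65,0,80,25.0,29,0
5,160,0,33,150,36.2,52,1
0,100,60,22,100,22.4,23,0
"""

records = impute_missing(load_records(SAMPLE))
```

Median imputation is only one possible choice here; dropping incomplete rows or flagging them for manual review would fit the same structure.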

Model Training

Step 2:
Using neural networks for pattern analysis, we set up a convolutional neural network with the following input parameters:
1. Number of times pregnant
2. Plasma glucose concentration: 2 hours in an oral glucose tolerance test
3. Diastolic blood pressure (mm Hg)
4. Triceps skinfold thickness (mm)
5. 2-Hour serum insulin (mu U/ml)
6. Body mass index (weight in kg/(height in m)^2)
7. Age (years)

Output parameters:
1. Diabetic (yes or no)
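
To make the mapping from seven inputs to one yes/no output concrete, here is a small sketch. Note that it swaps in a plain fully connected network with one hidden layer, written in standard-library Python, rather than the convolutional Keras model described above; the layer size, learning rate, and training rows are made up for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNet:
    """One hidden layer, sigmoid activations, binary cross-entropy loss."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def predict(self, x):
        """Return the estimated probability that the patient is diabetic."""
        return self.forward(x)[1]

    def train_step(self, x, y, lr):
        """One stochastic gradient descent update on a single example."""
        hidden, out = self.forward(x)
        d_out = out - y  # loss gradient w.r.t. the output pre-activation
        for j, h in enumerate(hidden):
            d_hidden = d_out * self.w2[j] * h * (1.0 - h)
            self.w2[j] -= lr * d_out * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d_hidden * xi
            self.b1[j] -= lr * d_hidden
        self.b2 -= lr * d_out


def mean_loss(net, data):
    """Average binary cross-entropy over the dataset."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = net.predict(x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


# Made-up feature vectors, already scaled to [0, 1], in the order listed above.
DATA = [
    ([0.2, 0.9, 0.6, 0.5, 0.7, 0.8, 0.6], 1),
    ([0.1, 0.3, 0.5, 0.3, 0.2, 0.3, 0.1], 0),
    ([0.5, 0.8, 0.7, 0.4, 0.6, 0.7, 0.7], 1),
    ([0.0, 0.4, 0.4, 0.2, 0.3, 0.2, 0.2], 0),
]

net = TinyNet(n_inputs=7, n_hidden=4)
loss_before = mean_loss(net, DATA)
for _ in range(500):
    for x, y in DATA:
        net.train_step(x, y, lr=0.5)
loss_after = mean_loss(net, DATA)
```

In practice the same shape (seven inputs, one sigmoid output) would be expressed in a few lines of Keras and evaluated on held-out data rather than on the training rows.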

System UI

Step 3:
With the help of medical practitioners, find large-scale trends across parameters such as demographics, age, location, and time. We select parameters and graphically represent any discovered trends and results.
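
As one illustration of such a trend, the sketch below groups records into age bands and computes the share of diabetic patients in each band. The field names, the ten-year band width, and the labels-plus-data payload shape are assumptions for the example, and the records are made up.

```python
from collections import defaultdict


def prevalence_by_age_band(records, band_width=10):
    """Return {band_label: share of diabetic patients}, ordered by age band."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for rec in records:
        start = (rec["age"] // band_width) * band_width
        totals[start] += 1
        positives[start] += rec["diabetic"]
    return {f"{start}-{start + band_width - 1}": positives[start] / totals[start]
            for start in sorted(totals)}


def to_chart_payload(trend):
    """Shape the result as labels plus one data series for a charting front end."""
    return {"labels": list(trend), "data": [round(v, 2) for v in trend.values()]}


# Made-up records for illustration only.
records = [
    {"age": 23, "diabetic": 0},
    {"age": 27, "diabetic": 0},
    {"age": 34, "diabetic": 1},
    {"age": 38, "diabetic": 0},
    {"age": 45, "diabetic": 1},
]

trend = prevalence_by_age_band(records)
payload = to_chart_payload(trend)
```

A payload of this kind could be returned to the browser over Ajax and handed to a bar or line chart.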

System Architecture


A researcher or any other healthcare provider would be able to access the system remotely over the Internet and would be given a dashboard that allows queries according to their authorized access level. The basic functions are to upload, synchronize, and merge data generated at the user's end by existing EMR/EHR software and exported in common medical formats, whose specifications are already standardized.
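
The merge step can be pictured with a short sketch. It assumes, purely for illustration, that each record is keyed by a patient ID and visit date and carries an ISO 8601 `updated_at` timestamp, and that the newest version of a record wins; none of these details come from the project itself.

```python
def merge_records(central, uploaded):
    """Merge uploaded records into the central store, keeping the newest version.

    Both arguments map a (patient_id, visit_date) key to a record dict.
    Timestamps are ISO 8601 strings in one format, so string comparison
    orders them chronologically.
    """
    merged = dict(central)
    for key, rec in uploaded.items():
        current = merged.get(key)
        if current is None or rec["updated_at"] > current["updated_at"]:
            merged[key] = rec
    return merged


# Made-up records for illustration only.
central = {
    ("P001", "2018-01-10"): {"glucose": 110, "updated_at": "2018-01-10T09:00:00"},
    ("P002", "2018-01-12"): {"glucose": 95, "updated_at": "2018-01-15T14:30:00"},
}
uploaded = {
    ("P001", "2018-01-10"): {"glucose": 112, "updated_at": "2018-01-11T08:00:00"},
    ("P002", "2018-01-12"): {"glucose": 90, "updated_at": "2018-01-12T10:00:00"},
    ("P003", "2018-02-01"): {"glucose": 130, "updated_at": "2018-02-01T11:00:00"},
}

merged = merge_records(central, uploaded)
```

Here the newer upload for P001 replaces the central copy, the stale upload for P002 is ignored, and P003 is added as a new record.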

Additional functions include aggregative queries, which research trends by physiological parameters or by parameters such as location, for example to study the spread of a particular infectious disease. Another function is predictive queries, in which common neural networks defined for a particular scenario predict the value of a particular physiological parameter from the values or records provided.
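
One way to picture how the dashboard could gate these query types by access level is sketched below. The three access tiers, their ordering, the handler names, and the sample records are all hypothetical, and the predictive handler is a placeholder that stands in for a trained model.

```python
from collections import Counter
from enum import IntEnum


class Access(IntEnum):
    """Hypothetical access tiers; the source does not define specific levels."""
    LOOKUP = 1
    AGGREGATE = 2
    PREDICT = 3


REQUIRED_LEVEL = {
    "lookup": Access.LOOKUP,
    "aggregate": Access.AGGREGATE,
    "predict": Access.PREDICT,
}


def run_query(user_level, kind, handlers, **params):
    """Check the caller's access level, then dispatch to the matching handler."""
    required = REQUIRED_LEVEL.get(kind)
    if required is None:
        raise ValueError(f"unknown query type: {kind}")
    if user_level < required:
        raise PermissionError(f"{kind} queries need access level {required.name}")
    return handlers[kind](**params)


def cases_by_location(records, diagnosis):
    """Aggregative query: count records with a given diagnosis per location."""
    return Counter(r["location"] for r in records if r["diagnosis"] == diagnosis)


# Made-up records for illustration only.
RECORDS = [
    {"location": "District A", "diagnosis": "dengue"},
    {"location": "District A", "diagnosis": "dengue"},
    {"location": "District B", "diagnosis": "dengue"},
    {"location": "District B", "diagnosis": "malaria"},
]

HANDLERS = {
    "aggregate": lambda diagnosis: cases_by_location(RECORDS, diagnosis),
    # Placeholder for a trained model; returns a fixed value for illustration.
    "predict": lambda features: 0.5,
}

counts = run_query(Access.AGGREGATE, "aggregate", HANDLERS, diagnosis="dengue")
```

A caller holding only the `LOOKUP` level would get a `PermissionError` for the same aggregative query, which keeps the authorization check in one place instead of inside each handler.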