How Data Science Helped in the COVID-19 Crisis
Thanks to advances in data science and cloud computing, we now have a much richer understanding of how pandemics progress. First identified in Wuhan, in China's Hubei province, coronavirus disease 2019 (COVID-19) is a human infectious illness caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). As of 18th June 2021, 178 million cases of COVID-19 had been reported globally. We now understand that not only medical factors but also non-pharmaceutical measures can help reduce the severity of active COVID-19 cases. Measures such as social distancing, public closures and restricted mobility can help contain the pandemic. From data-driven decision making for public interventions to statistical modelling in COVID-19 research, the data science community has helped us understand COVID-19 better.
In the 21st century, our tools, including mathematics, statistics and computational resources, have advanced remarkably.
Our biological knowledge has also grown: we know much more about molecular interactions at the genomic level and can use predictive models to estimate the likely outcomes of modifying the cellular realm. Because the data vary so much, it can be difficult to predict every possible outcome from a set of internal and external features, even when a wide array of input factors is fed into the model. For crops, prediction accuracy is higher than for more complex biological systems such as humans. Even so, the pharma industry is a great example of biotechnology applied to human neurophysiology.
BioTech + Data Science
A biotechnologist is a research scientist who applies statistical tools and technologies to molecular biology. If you are thinking that biotechnologists are essentially data scientists within a highly specific sector, you’re correct. The ever-present trio of math, statistics, and programming comes into the picture here as well. On the statistics side, the focus is on biostatistics, a specialized form of statistics, but a data scientist with good skills in statistics and mathematics can move into biostatistics easily.
How Does Data Science Help in Biotechnology?
Data science contributes to goals that range from finding ways to cure cancer and other fatal diseases to creating products that are safer for the environment: oceans, land, and air. While data science continues to evolve, it is not straying from its fundamentals:
- Asking questions
- Collection of data
- Data Preparation
- Model Selection
- Model Testing and Fine Tuning
- Model deployment in a larger production environment
- Monitoring and optimizing the model
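The steps above can be sketched end to end on a toy problem. This is a minimal illustration in plain Python, not a biotechnology pipeline: the data are synthetic, and the helper names fit_mean, fit_line and mse are hypothetical, introduced only for this example. Deployment and monitoring are indicated in comments only.

```python
import random

# Question: can y be predicted from x?
# Data collection: hypothetical synthetic data, y is roughly 2*x + 1 plus noise.
random.seed(0)
data = [(x, 2.0 * x + 1.0 + random.gauss(0, 0.5)) for x in range(40)]

# Data preparation: shuffle, then split into training, validation and test sets.
random.shuffle(data)
train, valid, test = data[:24], data[24:32], data[32:]


def fit_mean(rows):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y


def fit_line(rows):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def mse(model, rows):
    """Mean squared error of a model on a list of (x, y) rows."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


# Model selection: fit each candidate on the training set and keep the one
# with the lowest error on the validation set.
candidates = {"mean": fit_mean(train), "line": fit_line(train)}
best_name = min(candidates, key=lambda name: mse(candidates[name], valid))
best_model = candidates[best_name]

# Model testing: report the error on data never used for fitting or selection.
test_error = mse(best_model, test)
print("selected model:", best_name, "test MSE:", round(test_error, 3))

# Deployment and monitoring (not shown): in production, best_model would be
# served to users and its error tracked on fresh data over time.
```

Keeping the test set separate from both fitting and model selection is what makes the final error an honest estimate of how the model might behave after deployment.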
Data Science in Genome Sequencing
Data scientists can contribute to medical biotechnology by using genome sequencing data to predict disease. A unique health profile can be generated from both genomic and lifestyle data. If such a model is deployed in a health and fitness application that alerts users when certain foods or activities increase their risk of a particular disease, early detection may reduce medical costs.
Fig.1 – The various initiatives for data collection and visualization undertaken by various firms during the COVID-19 crisis. Sources: Indian Chemical Engineer, Volume 62, 2020 – Issue 4. Taylor & Francis Online
The Data Science Footprint in accelerating COVID-19 Understanding
The global data science community has proven its mettle in understanding, accelerating and optimizing COVID-19 research. From data on people's mobility trends to the optimization of mRNA vaccine delivery, data science has provided valuable insights in these tough times.
2.1 Understanding the Spread of COVID-19
To deliver any solution, it was first necessary to know how far COVID-19 had spread. Figures on active cases, deaths, recoveries, tests, and vaccine doses administered have been key inputs for delivering COVID-19 care solutions. Sources such as Worldometer and the Johns Hopkins Coronavirus Resource Center provided data and dashboards on these metrics. These not only helped track the spread of COVID-19 but also contributed significantly to localized policy frameworks set by governments and policymakers.
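As a rough illustration of how such dashboard metrics can be derived, the sketch below computes daily new cases, a 7-day moving average, and active cases from cumulative counts. The numbers are hypothetical and do not come from any real dashboard, and the function names are assumptions made for this example. The definition of active cases as confirmed minus recovered minus deaths is a common convention, not necessarily the one used by every source.

```python
def daily_new(cumulative):
    """Convert cumulative counts into daily new counts."""
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]


def rolling_mean(values, window=7):
    """Trailing moving average; early days use the shorter available window."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def active_cases(confirmed, recovered, deaths):
    """Active cases per day, assuming active = confirmed - recovered - deaths."""
    return [c - r - d for c, r, d in zip(confirmed, recovered, deaths)]


# Hypothetical cumulative counts for eight consecutive days (illustrative only).
confirmed = [100, 150, 230, 340, 480, 650, 850, 1080]
recovered = [0, 5, 20, 45, 90, 150, 230, 330]
deaths = [0, 1, 3, 6, 10, 15, 21, 28]

new_cases = daily_new(confirmed)
smoothed = rolling_mean(new_cases, window=7)
active = active_cases(confirmed, recovered, deaths)

for day, (n, s, a) in enumerate(zip(new_cases, smoothed, active), start=1):
    print(f"day {day}: new={n}, 7-day avg={s:.1f}, active={a}")
```

The moving average smooths out reporting artefacts such as weekend dips, which is why dashboards often show it alongside the raw daily counts.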
Fig.2 – The COVID-19 Dashboard (Center for Systems Science and Engineering (CSSE) – Johns Hopkins University)
Sources: Coronavirus Resource Center, Johns Hopkins University (https://coronavirus.jhu.edu/map.html)
2.2 Statistical Analysis of COVID-19 Trends
Using descriptive statistics (summarizing and understanding the data) and inferential statistics (using sample data to draw conclusions or predict an outcome), researchers modelled COVID-19 testing and diagnostics, which in turn helped improve testing accuracy. With inputs such as the pre-existing medical conditions of COVID-19 patients, models could estimate recovery time. Another prominent use of statistics was testing hypotheses such as “Do factors like population demographics, temperature, humidity, and precipitation affect COVID-19?”. Such studies significantly helped in planning the COVID-19 strategies that we see in implementation now.
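To make the distinction concrete, here is a small sketch that first summarizes a dataset (descriptive statistics) and then tests whether temperature and case growth are associated (inferential statistics). The data are invented for illustration and do not reflect any real study. The permutation test is one possible choice of method assumed here, and pearson_r and permutation_p_value are hypothetical helper names.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided permutation test of the null hypothesis 'no association'."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical regional data: mean temperature (deg C) and daily case growth rate.
temperature = [5, 8, 12, 15, 18, 21, 24, 27, 30, 33]
growth_rate = [0.21, 0.19, 0.20, 0.16, 0.17, 0.13, 0.14, 0.11, 0.10, 0.09]

# Descriptive statistics: summarize the sample.
print("mean growth rate:", round(statistics.mean(growth_rate), 3))
print("std dev of growth rate:", round(statistics.stdev(growth_rate), 3))

# Inferential statistics: is the association stronger than chance would give?
r = pearson_r(temperature, growth_rate)
p = permutation_p_value(temperature, growth_rate)
print("correlation:", round(r, 3), "p-value:", round(p, 4))
```

A small p-value only indicates that the association is unlikely to be due to chance in this sample. It does not show that temperature causes the change, since confounders such as population density or testing rates are not controlled for here.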
2.3 Mathematical Modelling – The Decision to Impose Lockdowns
Lockdowns, among the most successful ways to contain the spread of COVID-19, were a direct result of “The Data Science Maths” in action. Forecasting patient numbers with the Auto-Regressive Integrated Moving Average (ARIMA) model, together with geospatial analysis of risk zones, helped governments across the globe identify areas where strict lockdowns would be required. SEIR (Susceptible-Exposed-Infectious-Recovered) models have been developed using inputs such as lockdowns, active cases, isolated patients, and vaccination rates. These models estimated the reproduction number of COVID-19 (the basic reproduction number R0 and its time-varying counterpart, the effective reproduction number Rt), helping governments assess how well lockdowns lowered the number of active cases.
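The sketch below shows a basic SEIR simulation to illustrate how such a model can compare scenarios. It is a simplified textbook version, not the model used by any government: a lockdown is represented only as a lower contact rate (beta), isolation and vaccination are left out, all parameter values are hypothetical, and the equations are integrated with a simple Euler step. In this formulation R0 = beta / gamma, and the effective reproduction number is R0 multiplied by the susceptible fraction of the population.

```python
def simulate_seir(population, beta, sigma, gamma, days,
                  exposed0=0.0, infectious0=10.0, dt=0.1):
    """Simulate a basic SEIR model and return one (S, E, I, R) tuple per day.

    beta  : contact (transmission) rate per day
    sigma : rate of moving from exposed to infectious (1 / incubation period)
    gamma : recovery rate (1 / infectious period)
    """
    s = population - exposed0 - infectious0
    e, i, r = exposed0, infectious0, 0.0
    history = [(s, e, i, r)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


def peak_infectious(history):
    """Largest number of simultaneously infectious people in a run."""
    return max(i for _, _, i, _ in history)


# Hypothetical parameters (illustrative values only).
population = 1_000_000
sigma = 1 / 5.0   # assumed 5-day incubation period
gamma = 1 / 5.0   # assumed 5-day infectious period

open_run = simulate_seir(population, beta=0.5, sigma=sigma, gamma=gamma, days=300)
lockdown_run = simulate_seir(population, beta=0.25, sigma=sigma, gamma=gamma, days=300)

r0_open = 0.5 / gamma
r0_lockdown = 0.25 / gamma
# Effective reproduction number at the end of the open scenario.
rt_open_end = r0_open * open_run[-1][0] / population

print("R0 without lockdown:", round(r0_open, 2))
print("R0 with lockdown:", round(r0_lockdown, 2))
print("peak infectious without lockdown:", round(peak_infectious(open_run)))
print("peak infectious with lockdown:", round(peak_infectious(lockdown_run)))
print("Rt at end of open scenario:", round(rt_open_end, 2))
```

Comparing the two peaks is the kind of output that supports a decision on restrictions: a lower contact rate flattens the curve and reduces the peak load on hospitals. A production model would typically use a proper ODE solver and fit its parameters to observed case data.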
2.4 AI-driven Radiological Studies – Predicting Covid Pneumonia in advance
“Flattening the Curve” was the most heard phrase during the COVID-19 peak. With the constant inflow of new COVID-19 patients, medical infrastructure and healthcare services were severely strained. Not only did the shortage of intensive care units emerge as a problem, but the speed of medical diagnostics also suffered. This is when image recognition came into action.
Deep learning systems based on neural networks quickly took on the job of distinguishing ordinary pneumonia from pneumonia caused by COVID-19. From a CT scan of the lungs, the progress of the disease in the body could be predicted in advance. This approach not only helped reduce COVID-19 deaths but also supported infrastructure management and the allocation of medical resources to keep up with demand in these tough times.
Fig.3 – Deep learning to predict advanced pneumonia. The image on the left shows the lung scan of a COVID-19 patient. The colored images on the right highlight potential areas where the algorithm detected pneumonia. Sources: BBC
Fig.4 – Automatic examination of Chest X-Rays at Royal Bolton Hospital, UK, using Artificial Intelligence. Sources: BBC
2.5 The COVID-19 Drug Discovery, Transportation & Data Science
The genomic sequence of SARS-CoV-2 was made available early in the pandemic. Using the available datasets on inhibitors and protein ligands, researchers modelled the predicted effect of drugs on the virus, their binding affinity, and their action in treating COVID-19. These AI methodologies helped identify the 10 most promising potential drugs out of a very large pool of samples. Not only did this speed up the testing process, it also provided an immediate pathway for the vaccines under development. GIS, data science, and machine learning also played a prominent role in decision making over the vaccine supply chain.
From tools that design highly stable messenger RNA (mRNA) molecules for vaccine improvement to modelling COVID-19 spread and helping governments with data-driven policy-making, the role of data science in the pandemic era has been unprecedented. Even with the limited amount of COVID-19 data available to feed data-hungry deep learning and machine learning models, the data science community has streamlined the way we are tackling COVID-19.