Six Types of Life Sciences Data and How to Process Them in MATLAB

By Rob Holt, MathWorks


Researchers worldwide use MATLAB® and Simulink® to synthesize genetic, image, time-series, microscopy, and other data for tasks and applications across the life sciences. See six examples of their work and learn how you can apply the same tools and techniques to your own projects.

Epidemiological Spread Analysis

Poliovirus virion.

Outbreaks of polio are still common in Africa and central Asia. A strong vaccination program both promotes herd immunity and enables poliovirus strains to be tracked and categorized as they spread. A group at the CDC, in collaboration with MathWorks consultants, has created a tool for labeling and tracking polio strains. The tool applies a combination of phylogenetic and cartographic data analysis, enabling researchers to study the evolution of any strain of polio. By combining this approach with spatial spread analysis, they can monitor polio spread and plan immunization campaigns.
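As a rough illustration of the phylogenetic half of such an analysis, the sketch below computes pairwise p-distances (the fraction of differing positions) between aligned sequences and reports each isolate's closest relative along with its sampling site. This is not the CDC tool. The isolate names, sites, and sequences are invented placeholders, and Python is used only as a compact, language-neutral notation.

```python
from itertools import combinations

# Hypothetical, already-aligned toy sequences: isolate -> (sampling site, sequence).
ISOLATES = {
    "iso_A": ("Site 1", "ACGTACGTAC"),
    "iso_B": ("Site 2", "ACGTACGTAT"),
    "iso_C": ("Site 3", "TCGAACGGAT"),
}


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def pairwise_distances(isolates):
    """Return {(name_1, name_2): p-distance} for every unordered pair."""
    return {
        (a, b): p_distance(isolates[a][1], isolates[b][1])
        for a, b in combinations(sorted(isolates), 2)
    }


def nearest_relative(name, isolates):
    """Return (closest isolate, its sampling site, distance) for one isolate."""
    query = isolates[name][1]
    others = [other for other in isolates if other != name]
    best = min(others, key=lambda other: p_distance(query, isolates[other][1]))
    return best, isolates[best][0], p_distance(query, isolates[best][1])


if __name__ == "__main__":
    for pair, dist in pairwise_distances(ISOLATES).items():
        print(pair, round(dist, 2))
    print(nearest_relative("iso_C", ISOLATES))
```

Linking each isolate's nearest genetic relative to a sampling site is the simplest way to pair sequence data with location data. A production pipeline would use an evolutionary distance model and real geographic coordinates rather than this raw mismatch count.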


High-Throughput Genotoxicity Analysis

Genotoxicity analysis based on high-throughput imaging flow cytometry can be a tedious and rate-limiting step in drug safety testing. Current approaches typically require an expert to identify cellular damage in individual images. This manual review is slow, and experts often disagree in their evaluations. Eulenberg et al. showed how the process can be accelerated significantly with DeepFlow, a deep neural network optimized for flow cytometry analysis and trained on thousands of manually labeled images to grade cellular damage automatically.

Mononucleated and binucleated cells. Left: bright-field images; right: nuclear fluorescence images.


Network-Based Identification of Candidate Drugs

Scanning electron microscope image showing the SARS-CoV-2 virus (yellow) emerging from the surface of cells (pink) cultured in the lab. (Image credit: NIAID-RML)

Many drugs fail at early-stage testing. One solution showing promise is the repurposing of drugs already demonstrated to be safe for human use. To identify candidate drugs, some researchers are exploring network analysis. For example, a group at the Cleveland Clinic used interactome network analysis to explore repurposable drugs and drug combinations to target SARS-CoV-2. They used phylogenetic analysis to determine that SARS-CoV-2 is most closely related to SARS-CoV, identifying targetable sequences in the process. They then combined these results with interactome data in a network proximity analysis to propose candidate drugs and drug combinations for possible treatment.
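To make the idea of network proximity concrete, the sketch below scores a drug by the average shortest-path distance from each of its targets to the nearest disease-associated protein in an interaction network. This "closest" measure is one commonly used definition and is an assumption here, not necessarily the formula the Cleveland Clinic group applied. The protein names and the toy network are hypothetical.

```python
from collections import deque

# Hypothetical toy interactome as an undirected adjacency list.
INTERACTOME = {
    "P1": {"P2"},
    "P2": {"P1", "P3", "P6"},
    "P3": {"P2", "P4"},
    "P4": {"P3", "P5"},
    "P5": {"P4"},
    "P6": {"P2"},
}


def shortest_path_lengths(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_proximity(graph, drug_targets, disease_proteins):
    """Mean, over drug targets, of the distance to the nearest disease protein."""
    if not drug_targets or not disease_proteins:
        raise ValueError("both protein sets must be non-empty")
    total = 0.0
    for target in drug_targets:
        dist = shortest_path_lengths(graph, target)
        # Unreachable disease proteins count as infinitely far away.
        total += min(dist.get(p, float("inf")) for p in disease_proteins)
    return total / len(drug_targets)


if __name__ == "__main__":
    disease = {"P1", "P6"}
    print(closest_proximity(INTERACTOME, {"P2"}, disease))  # target adjacent to disease proteins
    print(closest_proximity(INTERACTOME, {"P5"}, disease))  # target far from disease proteins
```

A lower score means the drug's targets sit closer to the disease proteins in the network. In practice such scores are usually compared against randomly chosen protein sets before a drug is treated as a candidate.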


Dose Calculation for Radiation Therapy Planning

Radiation therapy has been a mainstay of cancer treatment for decades, and treatment planning is vital to a successful clinical outcome. However, most commercial treatment planning packages are proprietary and closed-source, which limits their value for developing advanced treatment planning technology. A collaboration between the German Cancer Research Center and LMU Munich created matRad, an open-source toolkit that can be used to simulate a variety of beam geometries, modalities, and energies. matRad can also incorporate clinical constraints and objectives, enabling clinicians to optimize the intensity and distribution of the radiation dose.
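The optimization step can be pictured with a deliberately small model. Assume the dose in each voxel is a weighted sum of contributions from individual beamlets, and choose non-negative beamlet weights that bring the dose as close as possible to a prescription. The sketch below does this with projected gradient descent on a squared-deviation objective. It is not matRad's code or solver, and the influence matrix, prescription, and step size are made-up illustrative values.

```python
# Hypothetical dose-influence matrix: rows are voxels, columns are beamlets.
# Entry [i][j] is the dose delivered to voxel i per unit weight of beamlet j.
INFLUENCE = [
    [1.0, 0.2],  # target voxel
    [0.8, 0.9],  # target voxel
    [0.1, 0.5],  # organ-at-risk voxel
]
PRESCRIBED = [60.0, 60.0, 10.0]  # desired dose per voxel (illustrative units)


def dose(influence, weights):
    """Dose per voxel: matrix-vector product of influence and beamlet weights."""
    return [sum(d * w for d, w in zip(row, weights)) for row in influence]


def objective(influence, weights, prescribed):
    """Sum of squared deviations between delivered and prescribed dose."""
    delivered = dose(influence, weights)
    return sum((d - p) ** 2 for d, p in zip(delivered, prescribed))


def optimize_weights(influence, prescribed, step=0.1, iterations=500):
    """Projected gradient descent that keeps every beamlet weight >= 0."""
    n_voxels, n_beamlets = len(influence), len(influence[0])
    weights = [1.0] * n_beamlets
    for _ in range(iterations):
        delivered = dose(influence, weights)
        residual = [d - p for d, p in zip(delivered, prescribed)]
        gradient = [
            2.0 * sum(influence[i][j] * residual[i] for i in range(n_voxels))
            for j in range(n_beamlets)
        ]
        # Gradient step, then projection onto the non-negative orthant.
        weights = [max(0.0, w - step * g) for w, g in zip(weights, gradient)]
    return weights


if __name__ == "__main__":
    best = optimize_weights(INFLUENCE, PRESCRIBED)
    print([round(w, 2) for w in best])
    print([round(d, 1) for d in dose(INFLUENCE, best)])
```

The non-negativity projection reflects the physical fact that a beam cannot deliver negative fluence. Clinical objectives such as dose limits for organs at risk would add further penalty terms or constraints to the same framework.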

The matRad 2.10.0 interface, with workflow, plan, optimization, and visualization controls.


Medical Image Classification for Outcome Prediction

Tumor heatmap image. 

Manual evaluation of pathology can be slow and expensive: annotation of a complete pathological slide image can take several hours. Further, pathologists sometimes disagree about how a particular image should be classified. Deep learning has been applied in digital pathology to accelerate diagnosis and reduce human error. Kather et al. published a paper detailing the use of deep learning on gigapixel histology images to accurately predict microsatellite instability, a key metric for gastrointestinal cancer outcome prediction. The group used transfer learning with resnet18 and trained the network on gastrointestinal images from The Cancer Genome Atlas database.


FDA-Approved Software Development

To acquire regulatory approval, medical devices and software must be robust, and testing and documentation must be updated for every release. Medviso acquired FDA approval for its quantitative cardiac image analysis software, developed for clinical use. The software, Segment CMR, uses time-resolved MR images of a complete cardiac cycle to calculate health metrics such as myocardial mass and ejection fraction. The package includes regression tests and automatically generates the reports and documentation required for regulatory compliance.
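The two metrics named above follow from simple formulas once the images have been segmented. Ejection fraction is the share of the end-diastolic volume (the largest volume in the cycle) that is ejected by end-systole (the smallest volume), and myocardial mass is the muscle volume multiplied by an assumed tissue density. The sketch below shows only this arithmetic. It is not Segment CMR's implementation, the volume curve is invented, and the density of 1.05 g/mL is a commonly used assumption.

```python
MYOCARDIAL_DENSITY_G_PER_ML = 1.05  # commonly assumed density of heart muscle


def ejection_fraction(lv_volumes_ml):
    """Ejection fraction in percent from left-ventricular volumes over one cycle."""
    if not lv_volumes_ml:
        raise ValueError("at least one volume sample is required")
    end_diastolic = max(lv_volumes_ml)  # ventricle fully filled
    end_systolic = min(lv_volumes_ml)   # ventricle fully contracted
    if end_diastolic <= 0:
        raise ValueError("volumes must be positive")
    return 100.0 * (end_diastolic - end_systolic) / end_diastolic


def myocardial_mass(myocardial_volume_ml, density=MYOCARDIAL_DENSITY_G_PER_ML):
    """Myocardial mass in grams from segmented muscle volume in millilitres."""
    return myocardial_volume_ml * density


if __name__ == "__main__":
    # Hypothetical volume-time curve (mL) sampled across one cardiac cycle.
    volumes = [120.0, 95.0, 70.0, 50.0, 65.0, 90.0, 115.0]
    print(round(ejection_fraction(volumes), 1))
    print(myocardial_mass(100.0))
```

Because both numbers depend entirely on the segmented volumes, a regression test that pins their values for a fixed reference dataset is a natural way to detect unintended changes between releases.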

3D geometrical reconstruction of the human left ventricle from MR images. 

Published 2021
