Prediction of gene expression levels using deep learning tools on histopathological images with spatial transcriptomics data

Roy R.
2 min read · Oct 16, 2020
Visualization of mRNA expression data in a tissue sample. Credit: 10xGenomics, https://spatialtranscriptomics.com/technology/

Background

Image analysis has advanced considerably in recent years, and the tools developed in this field apply to many tasks. In this project I focus on a task from medicine and genetics.

One interesting area of bioengineering research, and the subject of this post, is the link between gene expression and the visual features of cell morphology.

“Spatial transcriptomics” (‘ST’ from here on) is an analysis method that measures gene expression at different spots within a biopsy sample, so the spatial information is preserved. Each spot, together with its gene expression levels, is linked to a patch of the biopsy image. Deep learning tools from computer vision can then be used to analyze these image patches together with their gene expression data.
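
To make the pairing of spots and image patches concrete, here is a minimal sketch of how a patch could be cropped around each spot. The function name `extract_spot_patches`, the (row, col) pixel coordinate format, and the choice to skip spots near the border are all assumptions made for illustration; they are not taken from the project's code.

```python
import numpy as np

def extract_spot_patches(image, spot_coords, patch_size):
    """Crop a square patch centered on each spot.

    image:       array of shape (H, W) or (H, W, C)
    spot_coords: iterable of (row, col) pixel positions (assumed format)
    patch_size:  side length of the square patch, in pixels

    Spots whose patch would cross the image border are skipped.
    Returns the stacked patches and the indices of the spots that were kept,
    so each patch can still be matched to its expression vector.
    """
    half = patch_size // 2
    height, width = image.shape[:2]
    patches, kept = [], []
    for idx, (row, col) in enumerate(spot_coords):
        top, left = row - half, col - half
        bottom, right = top + patch_size, left + patch_size
        if top < 0 or left < 0 or bottom > height or right > width:
            continue  # patch would fall outside the image
        patches.append(image[top:bottom, left:right])
        kept.append(idx)
    if not patches:
        empty_shape = (0, patch_size, patch_size) + image.shape[2:]
        return np.empty(empty_shape, dtype=image.dtype), np.array([], dtype=int)
    return np.stack(patches), np.array(kept)

# Toy usage: a random "biopsy image" and three spots, one too close to the border.
rng = np.random.default_rng(0)
image = rng.random((100, 100, 3))
spots = [(50, 50), (2, 2), (90, 90)]
patches, kept = extract_spot_patches(image, spots, patch_size=16)
```

The `kept` indices matter because the expression matrix has one row per spot; any spot dropped here has to be dropped from the expression data as well.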

Project

In this project, the task was to predict gene expression values from a histopathological biopsy image.

The main goal was to approximate, as accurately as possible, the expression levels of all genes from biopsy images. The secondary goal was to do so without relying on a large amount of ST data.

To achieve the main goal, several methods were tested, including non-negative matrix factorization and autoencoders. In all of the tested methods, deep learning networks were used for prediction and dimensionality reduction, as will be explained later. To achieve the secondary goal, dataset augmentation tools were used in the pre-processing phase of the project.
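
As a rough illustration of two of the ideas above, the sketch below shows non-negative matrix factorization used to compress a spots-by-genes expression matrix, and a simple flip/rotation augmentation of an image patch. It is a sketch under assumptions, not the project's implementation: the multiplicative-update algorithm, the function names `nmf` and `augment_patch`, and the specific augmentations are illustrative choices, and the toy data is random.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (spots x genes) as W @ H.

    Uses the classic multiplicative updates for the squared-error objective.
    W (spots x rank) is a low-dimensional representation of each spot;
    H (rank x genes) maps that representation back to gene space.
    """
    rng = np.random.default_rng(seed)
    n_spots, n_genes = V.shape
    W = rng.random((n_spots, rank)) + eps
    H = rng.random((rank, n_genes)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def augment_patch(patch):
    """Return the 8 rotation/flip variants of a square image patch."""
    variants = []
    for k in range(4):
        rotated = np.rot90(patch, k, axes=(0, 1))  # rotate by k * 90 degrees
        variants.append(rotated)
        variants.append(np.flip(rotated, axis=1))  # mirrored copy
    return variants

# Toy expression matrix: 20 spots, 30 genes, built from 3 hidden factors.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))
W, H = nmf(V, rank=3)
relative_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Toy patch: every variant keeps the expression label of the original spot.
patch = np.arange(12).reshape(2, 2, 3)
variants = augment_patch(patch)
```

One possible use of such a factorization is to have a network predict the low-dimensional rows of `W` from an image patch and then recover gene-level values through `H`. Rotations and flips suit this kind of data because a tissue patch has no preferred orientation, so the expression values attached to a spot stay valid for every variant.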

Code

Code for this project can be found here

Credits

Project supervisor — Leon Anavy (CS faculty @ Technion).

Datasets and more information on spatial transcriptomics — 10xGenomics
