(1193-B) A Comparative Analysis of Compound Clustering and HIT Prediction Variability in Cell Painting Assays Using Supervised and Unsupervised Deep Learning and CellProfiler Methods.

Monday, February 5, 2024

2:00 PM - 3:00 PM EST

Location: Exhibit Halls AB

Abstract: The Cell Painting assay is widely regarded as a highly informative first-line phenotypic assay for identifying bioactive substances in drug discovery. Its central feature is the analysis of multi-channel image data, which is essential for detecting biological activity and linking it to specific mechanisms of action.

For tasks such as predicting mechanisms of action or clustering compounds, image encoding is an essential initial step. The most commonly used methods for obtaining these encodings include CellProfiler software [1], supervised deep learning for classification [2], and unsupervised deep learning for image clustering [3].

The aim of this study is to demonstrate that the choice of method can lead to significantly different encodings of the phenotypic features of cells. This is evident from the notable variation in cluster similarity between methods. Even minor changes in image resolution can greatly affect predictions. The differences are particularly noteworthy when deep learning methods are compared with the non-deep-learning CellProfiler approach. Additionally, we show that the choice of dimensionality reduction method applied during visual analysis of assay results also substantially influences the formation of clusters in two- or three-dimensional space.
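One way to quantify how far two encoding methods disagree is to cluster the same compounds under each encoding and compare the resulting assignments. The sketch below uses the adjusted Rand index for that comparison; the abstract does not state which similarity measure was used, so the metric, the function name, and the toy label lists are all illustrative assumptions rather than the study's procedure.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two cluster assignments of the same compounds.

    1.0 means identical partitions (up to renaming of cluster ids);
    values near 0 mean agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same compounds")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no compound pairs to disagree on

    # Contingency counts: compounds in cluster i of A and cluster j of B.
    joint = Counter(zip(labels_a, labels_b))
    sum_joint = sum(comb(c, 2) for c in joint.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings form a single cluster
    return (sum_joint - expected) / (max_index - expected)


# Hypothetical cluster ids for eight compounds under three encodings.
cellprofiler_clusters = [0, 0, 0, 1, 1, 1, 2, 2]
supervised_clusters = [1, 1, 1, 0, 0, 0, 2, 2]    # same partition, ids renamed
unsupervised_clusters = [0, 0, 1, 1, 2, 2, 0, 1]  # a different partition

ari_same = adjusted_rand_index(cellprofiler_clusters, supervised_clusters)
ari_diff = adjusted_rand_index(cellprofiler_clusters, unsupervised_clusters)
```

Because the index ignores how cluster ids are named, it can be applied directly to labelings produced by independent clustering runs on each encoding.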

Given these findings, we aim to demonstrate that multi-model ensembles could be the most effective strategy for identifying HITs (high-throughput screening hits). This approach potentially improves prediction accuracy compared with using a single method in isolation. We also explore using the multi-model approach to identify the specific channel most strongly affected by a given compound.
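A minimal sketch of one possible multi-model ensemble is shown below: each encoding method yields a per-well activity score, scores are standardized against negative-control wells, and a well is called a hit when enough methods agree. The abstract does not describe the ensembling rule, so the majority vote, the z-score threshold, the function names, and the example scores are hypothetical choices made only for illustration.

```python
from statistics import mean, stdev


def z_against_controls(scores, control_scores):
    """Standardize per-well scores using the mean and SD of control wells."""
    mu = mean(control_scores)
    sigma = stdev(control_scores)
    return [(s - mu) / sigma for s in scores]


def ensemble_hits(scores_by_method, controls_by_method, z_threshold=3.0, min_votes=2):
    """Flag a well as a hit when at least `min_votes` methods exceed `z_threshold`."""
    if not scores_by_method:
        raise ValueError("at least one method is required")
    votes = None
    for method, scores in scores_by_method.items():
        z = z_against_controls(scores, controls_by_method[method])
        flags = [int(value > z_threshold) for value in z]
        # Accumulate one vote per method for every well.
        votes = flags if votes is None else [v + f for v, f in zip(votes, flags)]
    return [v >= min_votes for v in votes]


# Hypothetical activity scores for four wells under three encoding methods
# (for example, distance of a well's embedding from the control centroid).
controls = [1.0, 1.2, 0.8, 1.1, 0.9]
scores_by_method = {
    "cellprofiler": [0.9, 5.0, 1.1, 4.8],
    "supervised": [1.0, 6.2, 5.9, 1.2],
    "unsupervised": [1.1, 4.9, 1.0, 1.0],
}
controls_by_method = {name: controls for name in scores_by_method}

hits = ensemble_hits(scores_by_method, controls_by_method)
```

In this toy example only the second well is flagged by all three methods; the third and fourth wells are each flagged by a single method and therefore fall below the two-vote requirement.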

The study used 12 plates of 384 wells each for high-throughput screening (HTS). The screening involved 8 control substances and 2,000 lesser-known compounds tested across two distinct cell lines. The data for this part of the study was provided by IBCH. The study is currently being expanded to assess the analyzed methods on the JUMP Cell Painting dataset [4]. This extension covers 21 plates, which span two cell lines and encompass 56 mechanisms of action.

For both the training of the encoders and inference, the data was processed using the Biolify.AI CellFusion platform.

1. Carpenter, Anne E., et al. "CellProfiler: image analysis software for identifying and quantifying cell phenotypes." Genome biology 7 (2006): 1-11.
2. Wong, et al. "Deep representation learning determines drug mechanism of action from cell painting images." Digital Discovery 2.5 (2023): 1354-1367.
3. Doron, Michael, et al. "Unbiased single-cell morphology with self-supervised vision transformers." bioRxiv (2023): 2023-06.
4. Chandrasekaran, Srinivas Niranj, et al. "JUMP Cell Painting dataset: morphological impact of 136,000 chemical and genetic perturbations." bioRxiv (2023): 2023-03.

This project was performed in collaboration with the Centre for Chemical Biology IBCH PAS and was financially supported by the POIR.04.02.00-00-C004/19-00 project and by the Polish Ministry of Education and Science (previously MNiSW, decision no. DIR/WK/2018/06) for the POL-OPENSCREEN project and involvement in the joint international project EU-OPENSCREEN ERIC.

Jan Gruszczynski

Co-founder & Researcher
BIOLIFY.AI & Poznan University of Medical Sciences
Poznan, Wielkopolskie, Poland