GeneDisco: A Benchmark for Experimental Design in Drug Discovery

Published in ICLR, 2022

In vitro cellular experimentation with genetic interventions, using for example CRISPR technologies, is an essential step in early-stage drug discovery and target validation that serves to assess initial hypotheses about causal associations between biological mechanisms and disease pathologies. With billions of potential hypotheses to test, the experimental design space for in vitro genetic experiments is extremely vast, and the available experimental capacity - even at the largest research institutions in the world - pales in relation to the size of this biological hypothesis space. Machine learning methods, such as active and reinforcement learning, could aid in optimally exploring the vast biological space by integrating prior knowledge from various information sources as well as extrapolating to yet unexplored areas of the experimental design space based on available data. However, there exist no standardised benchmarks and data sets for this challenging task and little research has been conducted in this area to date. Here, we introduce GeneDisco, a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery. GeneDisco contains a curated set of multiple publicly available experimental data sets as well as open-source implementations of state-of-the-art active learning policies for experimental design and exploration.

Recommended citation: Mehrjou A, Soleymani A, Jesson A, Notin P, Gal Y, Bauer S, Schwab P. GeneDisco: A Benchmark for Experimental Design in Drug Discovery. arXiv preprint arXiv:2110.11875. 2021 Oct 22.

Counterfactuals uncover the modular structure of deep generative models

Published in ICLR, 2020

Deep generative models such as Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs) are important tools to capture and investigate the properties of complex empirical data. However, the complexity of their inner elements makes their functioning challenging to assess and modify. In this respect, these architectures behave as black box models. In order to better understand the function of such networks, we analyze their modularity based on the counterfactual manipulation of their internal variables. Experiments with face images support that modularity between groups of channels is achieved to some degree within convolutional layers of vanilla VAE and GAN generators. This helps understand the functional organization of these systems and allows designing meaningful transformations of the generated images without further training.

Recommended citation: Besserve, Michel, Rémy Sun, and Bernhard Schölkopf. "Counterfactuals uncover the modular structure of deep generative models." arXiv preprint arXiv:1812.03253 (2018).

Kernel-Guided Training of Implicit Generative Models with Stability Guarantees

Published in arXiv, 2019

Modern implicit generative models such as generative adversarial networks (GANs) are generally known to suffer from issues such as instability, uninterpretability, and difficulty in assessing their performance. If we see these implicit models as dynamical systems, some of these issues are caused by the inability to control their behavior in a meaningful way during the course of training. In this work, we propose a theoretically grounded method to guide the training trajectories of GANs by augmenting the GAN loss function with a kernel-based regularization term that controls local and global discrepancies between the model and true distributions. This control signal allows us to inject prior knowledge into the model. We provide theoretical guarantees on the stability of the resulting dynamical system and demonstrate different aspects of it via a wide range of experiments.

Recommended citation: Mehrjou, Arash. (2019). "Kernel-Guided Training of Implicit Generative Models with Stability Guarantees." arXiv.

Improved Bayesian information criterion for mixture model selection

Published in Pattern Recognition Letters, 2016

In this paper, we propose a mixture model selection criterion obtained from the Laplace approximation of the marginal likelihood. Our approximation to the marginal likelihood is more accurate than the Bayesian information criterion (BIC), especially for small sample sizes. We show experimentally that our criterion performs as well as other well-known criteria such as BIC and minimum message length (MML) for large sample sizes and significantly outperforms them when fewer data points are available.

Recommended citation: Mehrjou, Arash, Reshad Hosseini, and Babak Nadjar Araabi. "Improved Bayesian information criterion for mixture model selection." Pattern Recognition Letters 60 (2016): 22-27

Automatic malaria diagnosis system

Published in ICRoM, 2013

Malaria diagnosis is normally performed by visual microscopy, which is time-consuming and offers low accuracy because of operator fatigue and lack of expertise. To overcome this limitation, we designed an automatic system that photographs blood smears at a high rate using a sophisticated motorized microscope. Once enough samples have been gathered, the image processing task, which is the core of this paper, is launched. Finally, the results of the whole process are reported to the physician to support prescribing the best treatment. Because correctly diagnosing the stage of malaria and the parasite type is important to the treatment process, this system can attract a great deal of attention for the malaria diagnosis task.

Recommended citation: Mehrjou, Arash, Tooraj Abbasian, and Morteza Izadi. "Automatic malaria diagnosis system." In 2013 First RSI/ISM International Conference on Robotics and Mechatronics (ICRoM), pp. 205-211. IEEE, 2013.