James Cussens: Suggested PhD topics


Learning causal models with latent variables

[Figure: the Verma graph]

The graph above (known as the Verma graph) represents a causal model where X2 is a cause of X4 (via X3) but X2 and X4 also have a common cause L. If L is a latent variable, i.e. one not observed in the data, then learning such a causal model is challenging, since we never get to see L! One approach to this difficult learning task is constraint-based learning, where an algorithm infers the existence of latent variables from patterns of conditional independence among the observed variables.
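To make the constraint-based idea concrete, here is a minimal sketch (using only numpy and scipy) of the kind of conditional independence test such algorithms are built on. It assumes a linear-Gaussian version of the graph described above, with illustrative coefficients: with the latent L present, X2 and X4 stay dependent even after conditioning on X3.

```python
# A minimal sketch of the conditional independence testing that
# constraint-based algorithms are built on, assuming a linear-Gaussian
# version of the Verma graph above: L -> X2, X2 -> X3 -> X4, L -> X4.
# Coefficients and sample size are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100_000

L = rng.normal(size=n)                        # latent: never seen by the learner
X2 = 0.8 * L + rng.normal(size=n)
X3 = 0.8 * X2 + rng.normal(size=n)
X4 = 0.8 * X3 + 0.8 * L + rng.normal(size=n)

def partial_corr(x, y, z):
    """Correlation of x and y after regressing out z (a Gaussian CI test)."""
    Z = np.column_stack([z, np.ones_like(z)])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return stats.pearsonr(rx, ry)

# Without L, d-separation would give X2 _|_ X4 | X3 (zero partial
# correlation); the latent common cause leaves a clear dependence instead.
print(partial_corr(X2, X4, X3))
```

A constraint-based algorithm such as FCI aggregates the outcomes of many tests of this kind to infer where latent confounders must be present.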

An alternative approach is score-based learning, where each candidate causal model (including those with latent variables) has a 'score' (typically a penalised log-likelihood) and the task is to find a causal model with high (preferably maximal) score. There has been some recent progress in score-based learning for causal models with latent variables, much of which has taken advantage of algebraic representations of causal models (see refs). So far, only quite small causal models have been learned in this way. The goal of this project would be to develop algorithms which can accurately learn larger causal models, even when there are latent variables. Developing algorithms which can incorporate prior knowledge would be advantageous.
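As a concrete illustration of the 'score' in question, the sketch below computes a BIC (penalised log-likelihood) local score for a single node given a candidate parent set, assuming a linear-Gaussian model; the data and coefficients are made up for illustration.

```python
# A minimal sketch of a penalised log-likelihood score: the BIC local
# score of one node given a candidate parent set, assuming a
# linear-Gaussian model. Data and coefficients are made up.
import numpy as np

def bic_local_score(child, parents):
    """BIC of `child` regressed on the columns of `parents`; higher is better."""
    n = child.shape[0]
    X = np.column_stack([parents, np.ones(n)]) if parents.size else np.ones((n, 1))
    beta, *_ = np.linalg.lstsq(X, child, rcond=None)
    resid = child - X @ beta
    sigma2 = resid @ resid / n                     # MLE of the noise variance
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = X.shape[1] + 1                             # coefficients + variance
    return loglik - 0.5 * k * np.log(n)

rng = np.random.default_rng(1)
x2 = rng.normal(size=1000)
x3 = 0.8 * x2 + rng.normal(size=1000)
# Scoring X3 with parent set {X2} beats scoring it with no parents:
print(bic_local_score(x3, x2.reshape(-1, 1)), bic_local_score(x3, np.empty((1000, 0))))
```

For a DAG with no latent variables the score of the whole model is just the sum of such local scores over its nodes; with latent variables the likelihood no longer decomposes so neatly, which is part of what makes this project challenging.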

References


Learning directed acyclic graphs (DAGs)

When there are latent variables (see the project above), learning causal models is particularly hard. If there are no latent variables the problem is easier, and so there has been more progress. Nonetheless, finding the 'best' directed acyclic graph (DAG) to explain some data is an NP-hard problem, so it remains an active area of research.

I have been using integer programming to attack this problem with a system called GOBNILP, which has been used by many other researchers. GOBNILP is built on top of the SCIP constraint integer programming system. (A Python version has also been developed.) The basic idea of this approach is to encode the problem of learning a DAG as a constrained optimisation problem that a solver like SCIP (suitably extended) can solve. This has the advantage that constraints representing prior knowledge (e.g. variable A must/mustn't be a cause of variable B) can be incorporated reasonably easily.
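The following is a minimal sketch of this kind of encoding, written with the PuLP modelling library and a generic MIP solver rather than with GOBNILP/SCIP itself, and with made-up local scores: binary 'family' variables select one parent set per node, and 'cluster' constraints rule out directed cycles.

```python
# A minimal sketch of a GOBNILP-style ILP encoding of DAG learning,
# using PuLP and a generic MIP solver (not GOBNILP/SCIP itself).
# The local scores below are made up; in practice they would be
# penalised log-likelihoods computed from data.
from itertools import chain, combinations
import pulp

nodes = ["A", "B", "C"]

def subsets(xs):
    return list(chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1)))

score = {("A", ()): -10.0, ("A", ("B",)): -8.0, ("A", ("C",)): -9.0, ("A", ("B", "C")): -7.5,
         ("B", ()): -12.0, ("B", ("A",)): -9.0, ("B", ("C",)): -11.0, ("B", ("A", "C")): -8.5,
         ("C", ()): -11.0, ("C", ("A",)): -10.5, ("C", ("B",)): -9.5, ("C", ("A", "B")): -9.0}

prob = pulp.LpProblem("dag_learning", pulp.LpMaximize)
# One binary "family" variable per (node, candidate parent set) pair.
x = {(v, W): pulp.LpVariable(f"x_{v}_{'_'.join(W)}", cat="Binary")
     for v in nodes for W in subsets([u for u in nodes if u != v])}

prob += pulp.lpSum(score[v, W] * x[v, W] for (v, W) in x)  # maximise total score
for v in nodes:                                            # one parent set per node
    prob += pulp.lpSum(x[v, W] for W in subsets([u for u in nodes if u != v])) == 1
# Cluster constraints: in every set of nodes, at least one node must take
# all its parents from outside the set. This rules out directed cycles.
for r in range(2, len(nodes) + 1):
    for cluster in combinations(nodes, r):
        prob += pulp.lpSum(x[v, W] for (v, W) in x
                           if v in cluster and not set(W) & set(cluster)) >= 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for (v, W), var in x.items():
    if var.value() > 0.5:
        print(v, "<-", set(W) or "{}")
```

Enumerating all clusters as above is only feasible for toy problems; GOBNILP instead adds cluster constraints on the fly as cutting planes inside SCIP's branch-and-cut search.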

Since DAG learning is NP-hard there is always work to be done, e.g. increasing the size of learning problems for which we can (reliably) find an optimal DAG in reasonable time, or handling user-supplied constraints more efficiently. The goal of this project is to apply ideas from constraint and integer programming to DAG learning. It will suit someone who likes applying theoretical ideas (from, for example, polyhedral geometry; see the second reference below) to the design of more efficient machine learning software.

References


Combinatorial optimisation in machine learning

There is growing interest in methods which combine solving combinatorial optimisation (CO) problems and machine learning. In some cases machine learning is used to learn good strategies for solving an NP-hard problem. In other cases, CO methods (e.g. integer linear programming (ILP)) are used to solve hard machine learning problems. In yet another combination, CO solvers have been included as a layer in a deep learning architecture. The goal of this project is to explore this area and enable further fruitful cross-fertilisation.

References