The graph above (which is known as the *Verma graph*)
represents a causal model where X_{2} is a cause of
X_{4} (via X_{3}) but both X_{2} and
X_{4} have a common cause L. If L is a *latent*
variable, i.e. one not observed in the data, then learning such a
causal model is challenging, since we never get to see L! One
approach to this difficult learning task is *constraint-based
learning* where an algorithm infers the existence of latent
variables from patterns of conditional independence between the
observed variables.
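
Constraint-based methods rest on statistical tests of conditional independence between observed variables. As a rough sketch (assuming linear-Gaussian data; constraint-based algorithms such as FCI use such tests as a subroutine), a Fisher z-test for conditional independence via partial correlation might look like:

```python
import numpy as np

def partial_corr(data, x, y, z):
    """Partial correlation of columns x and y given columns z,
    computed from the precision (inverse covariance) matrix."""
    cols = [x, y] + list(z)
    prec = np.linalg.inv(np.cov(data[:, cols], rowvar=False))
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])

def ci_test(data, x, y, z, alpha_z=1.96):
    """Fisher z-test: True if x appears independent of y given z
    (two-sided test at roughly the 5% level)."""
    n = data.shape[0]
    r = partial_corr(data, x, y, z)
    zstat = np.sqrt(n - len(z) - 3) * 0.5 * np.log((1 + r) / (1 - r))
    return bool(abs(zstat) < alpha_z)

# Synthetic chain X0 -> X1 -> X2: X0 and X2 are dependent,
# but independent conditional on X1.
rng = np.random.default_rng(0)
n = 5000
x0 = rng.normal(size=n)
x1 = x0 + rng.normal(size=n)
x2 = x1 + rng.normal(size=n)
data = np.column_stack([x0, x1, x2])

ci_test(data, 0, 2, [1])  # typically True: X0 _||_ X2 | X1
ci_test(data, 0, 2, [])   # False: X0 and X2 are marginally dependent
```

Patterns of such test results over the observed variables are what let an algorithm infer that a latent common cause must be present.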

An alternative approach is *score-based learning* where
each candidate causal model (including those with latent
variables) has a 'score' (typically penalised log-likelihood) and
the task is to find a causal model with high (preferably maximal)
score. There has been some recent progress in score-based learning
for causal models with latent variables, much of which has taken
advantage of algebraic representations of causal models (see the
references below). In this existing work only quite small causal
models have been learned. The goal of this project would be to
develop algorithms which can accurately learn larger causal models,
even when there are latent variables. Developing algorithms which
can also incorporate prior knowledge would be advantageous.
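
To make the notion of a 'score' concrete, here is a minimal sketch of a BIC (penalised log-likelihood) score for a linear-Gaussian DAG *without* latent variables; scoring models *with* latent variables, as in the references below, is considerably more involved:

```python
import numpy as np

def bic_score(data, dag):
    """BIC (penalised log-likelihood) of a linear-Gaussian DAG.
    `dag` maps each variable index to a tuple of parent indices."""
    n, _ = data.shape
    score = 0.0
    for child, parents in dag.items():
        # Fit the child on its parents (plus an intercept) by least squares.
        y = data[:, child]
        X = np.column_stack([np.ones(n)] + [data[:, p] for p in parents])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / n  # MLE of the residual variance
        loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
        # Penalty: one parameter per coefficient, intercept and variance.
        score += loglik - 0.5 * (len(parents) + 2) * np.log(n)
    return score

# Synthetic data from X0 -> X1: the true structure should outscore
# the empty graph.
rng = np.random.default_rng(1)
n = 2000
x0 = rng.normal(size=n)
x1 = 2 * x0 + rng.normal(size=n)
data = np.column_stack([x0, x1])

true_dag = {0: (), 1: (0,)}
empty_dag = {0: (), 1: ()}
```

Score-based learning is then a search over candidate structures for one maximising this quantity.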

- Zhongyi Hu and Robin Evans. Towards standard imsets for maximal ancestral graphs, March 2023.
- Bryan Andrews, Gregory F. Cooper, Thomas S. Richardson and Peter Spirtes. The m-connecting imset and factorization for ADMG models, July 2022.
- Rui Chen, Sanjeeb Dash and Tian Gao. Integer Programming for Causal Structure Learning in the Presence of Latent Variables. Proceedings of the 38th International Conference on Machine Learning (ICML21), PMLR 139:1550-1560, 2021.
- Kari Rantanen, Antti Hyttinen and Matti Järvisalo. Maximal Ancestral Graph Structure Learning via Exact Search, Proc. UAI 2021.

When there are latent variables (see the project above), learning causal models is particularly hard. If there are no latent variables the problem is easier, and so there has been more progress. Nonetheless, finding the 'best' directed acyclic graph (DAG) to explain some data is an NP-hard problem, so it remains an active area of research.

I have been using *integer programming* to attack this problem using a system called GOBNILP, which has been used by many other researchers. GOBNILP is built on top of the SCIP constraint integer programming system. (A Python version has also been developed.) The basic idea of this approach is to encode the problem of learning a DAG as a constrained optimisation problem that a solver like SCIP (suitably extended) can solve. This has the advantage that constraints representing prior knowledge (e.g. variable A must/mustn't be a cause of variable B) can be incorporated reasonably easily.
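
A full ILP encoding is beyond a short snippet, but the shape of the optimisation problem GOBNILP solves — choose one parent set per variable to maximise a sum of local scores, subject to acyclicity and any prior-knowledge constraints — can be sketched with brute-force enumeration standing in for the solver. The toy local score below is purely illustrative:

```python
from itertools import chain, combinations, product

def subsets(items):
    return chain.from_iterable(combinations(items, r)
                               for r in range(len(items) + 1))

def is_acyclic(parents):
    """Repeatedly peel off vertices whose parents are all peeled."""
    done, remaining = set(), set(parents)
    while remaining:
        ready = {v for v in remaining if set(parents[v]) <= done}
        if not ready:
            return False  # every remaining vertex waits on another: a cycle
        done |= ready
        remaining -= ready
    return True

def exact_search(local_score, variables, required=(), forbidden=()):
    """Pick one parent set per variable, maximising the total score
    subject to acyclicity and must/mustn't-be-a-parent constraints.
    (GOBNILP replaces this enumeration with ILP branch-and-cut.)"""
    candidates = {}
    for v in variables:
        others = [u for u in variables if u != v]
        candidates[v] = [ps for ps in subsets(others)
                         if all(a in ps for a, b in required if b == v)
                         and all(a not in ps for a, b in forbidden if b == v)]
    best, best_score = None, float('-inf')
    for choice in product(*(candidates[v] for v in variables)):
        dag = dict(zip(variables, choice))
        if not is_acyclic(dag):
            continue
        total = sum(local_score(v, dag[v]) for v in variables)
        if total > best_score:
            best, best_score = dag, total
    return best, best_score

def local_score(v, parents):
    # Toy scores: the edge A -> B is strongly supported by the "data";
    # every other parent just pays a complexity penalty.
    bonus = 5.0 if (v == 'B' and 'A' in parents) else 0.0
    return bonus - 2.0 * len(parents)

dag, score = exact_search(local_score, ['A', 'B', 'C'])
# Forcing C to be a parent of B changes the optimum:
dag_req, _ = exact_search(local_score, ['A', 'B', 'C'],
                          required=(('C', 'B'),))
```

The prior-knowledge constraints here simply prune each variable's candidate parent sets; in the ILP encoding they become linear constraints over 0/1 'family' variables.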

Since DAG learning is NP-hard there is always work to be done, e.g. increasing the size of learning problems for which we can (reliably) find an optimal DAG in reasonable time, or handling user-supplied constraints more efficiently. The goal of this project is to apply ideas from constraint and integer programming to DAG learning. It will suit someone who likes applying theoretical ideas (e.g. from polyhedral geometry; see the second reference below) to design more efficient machine learning software.

- James Cussens. Branch-Price-and-Cut for Causal Discovery. Proc. 2nd Conference on Causal Learning and Reasoning (CLeaR 2023), PMLR, 2023.
- James Cussens, Matti Järvisalo, Janne H. Korhonen and Mark Bartlett. Bayesian network structure learning with integer programming: Polytopes, facets and complexity. Journal of Artificial Intelligence Research 58:185-229, 2017.
- James Cussens. Bayesian network learning with cutting planes. Proc. 27th Conference on Uncertainty in Artificial Intelligence (UAI 2011), 2011.

There is growing interest in methods which combine solving combinatorial optimisation (CO) problems and machine learning. In some cases machine learning is used to learn good strategies for solving an NP-hard problem. In other cases, CO methods (e.g. integer linear programming (ILP)) are used to solve hard machine learning problems. In yet another combination, CO solvers have been included as a layer in a deep learning architecture. The goal of this project is to explore this area further to allow fruitful cross-fertilisation.
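
As a toy illustration of the second combination (exact optimisation applied to a learning problem), consider finding a provably optimal depth-1 decision tree (a 'stump') by exhaustive search; Bertsimas and Dunn's optimal classification trees (below) scale this idea up using mixed-integer programming rather than enumeration:

```python
import numpy as np

def optimal_stump(x, y):
    """Exhaustive search for the decision stump minimising 0/1
    training error on labels in {-1, +1} -- a tiny instance of
    treating learning as exact combinatorial optimisation."""
    best = (np.inf, None, None)  # (errors, threshold, sign)
    for t in np.unique(x):       # only observed values matter as thresholds
        for sign in (1, -1):
            pred = np.where(x <= t, sign, -sign)
            errors = int(np.sum(pred != y))
            if errors < best[0]:
                best = (errors, float(t), sign)
    return best

# Separable toy data: positives below 0.4, negatives from 0.7 up.
x = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.7])
y = np.array([1, 1, 1, -1, -1, -1])
errors, threshold, sign = optimal_stump(x, y)  # errors == 0
```

Unlike greedy tree induction, this search carries a guarantee of optimality, which is exactly what the ILP formulations buy at larger scale.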

- Quentin Cappart, Didier Chételat, Elias B. Khalil, Andrea Lodi, Christopher Morris and Petar Velickovic. Combinatorial Optimization and Reasoning with Graph Neural Networks. Journal of Machine Learning Research 24(130):1-61, 2023.
- Bridging the Gap between Machine Learning and Optimization
- Dimitris Bertsimas and Jack Dunn. Optimal classification trees. Machine Learning 106:1039-1082, 2017.
- Claudio Gambella, Bissan Ghaddar and Joe Naoum-Sawaya. Optimization problems for machine learning: A survey, European Journal of Operational Research 290(3):807-828, 2021.