I’m currently studying for a Master of Philosophy in Technology Policy at the University of Cambridge. I’m excited to build on my technical understanding of AI and take a more holistic look at how AI will affect business and society!
In general, I’m interested in how emerging technologies will change the world we live in, and how they can best be integrated into society. These questions touch on technical, economic, and policy issues, so I’m actively expanding my horizons on all these fronts.
I was previously a Research Assistant in the Computer Science Department at the University of Oxford, supervised by Daniel Kroening and Tom Melham. I worked on building robust Machine Learning systems, on verifying that robustness, and on simplifying and explaining a system’s behaviour.
MPhil Technology Policy, 2021
University of Cambridge
Master of Computer Science, 2019
1st Class Honours
University of Oxford
BA Computer Science, 2018
1st Class Honours
University of Oxford
Existing methods for testing DNNs solve the oracle problem by constraining the raw features (e.g. image pixel values) to be within a small distance of a dataset example for which the desired DNN output is known. But this limits the kinds of faults these approaches are able to detect. In this paper, we introduce a novel DNN testing method that is able to find faults in DNNs that other methods cannot. The crux is that, by leveraging generative machine learning, we can generate fresh test inputs that vary in their high-level features (for images, these include object shape, location, texture, and colour). We demonstrate that our approach is capable of detecting deliberately injected faults as well as new faults in state-of-the-art DNNs, and that in both cases, existing methods are unable to find these faults.
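To make the generate-then-check loop concrete, here is a minimal sketch assuming a pre-trained class-conditional generator, whose conditioning label serves as the test oracle. The `generator` and `classifier` interfaces below are illustrative assumptions, not the paper’s actual API:

```python
import torch

def find_faults(generator, classifier, num_classes, n_trials=1000, latent_dim=128):
    """Sample fresh, label-known test inputs from a class-conditional
    generator and report the ones the classifier under test gets wrong."""
    faults = []
    for _ in range(n_trials):
        label = torch.randint(num_classes, (1,))   # oracle: the conditioning class
        z = torch.randn(1, latent_dim)             # fresh latent code
        image = generator(z, label)                # novel input, known desired output
        prediction = classifier(image).argmax(dim=1)
        if prediction != label:                    # classifier disagrees with the oracle
            faults.append((z, label.item(), prediction.item()))
    return faults
```

Because every sampled latent code yields a fresh input with a known desired label, the search is not confined to small neighbourhoods of dataset examples.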
We cannot guarantee that training datasets are representative of the distribution of inputs that will be encountered during deployment. So we must have confidence that our models do not over-rely on this assumption. To this end, we introduce a new method that identifies context-sensitive feature perturbations (e.g. shape, location, texture, colour) to the inputs of image classifiers. We produce these changes by performing small adjustments to the activation values of different layers of a trained generative neural network. Perturbing at layers earlier in the generator causes changes to coarser-grained features; perturbations further on cause finer-grained changes. Unsurprisingly, we find that state-of-the-art classifiers are not robust to any such changes. More surprisingly, when it comes to coarse-grained feature changes, we find that adversarial training against pixel-space perturbations is not just unhelpful: it is counterproductive.
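A rough sketch of the core perturbation step, assuming the generator has been split into a `gen_head` (layers up to the chosen split point) and a `gen_tail` (the remaining layers); these names, and the random-noise perturbation, are illustrative simplifications rather than the method’s actual implementation:

```python
import torch

def feature_perturb(gen_head, gen_tail, classifier, z, label, epsilon=0.05):
    """Perturb the generator's activations at an intermediate layer, so a
    small activation-space change becomes a semantic change in the image."""
    h = gen_head(z)                          # activations at the split layer
    delta = epsilon * torch.randn_like(h)    # small activation-space perturbation
    x_clean = gen_tail(h)                    # unperturbed image
    x_pert = gen_tail(h + delta)             # perturbed image
    fooled = classifier(x_pert).argmax(dim=1) != label
    return x_clean, x_pert, fooled
```

Moving the split point earlier in the generator makes `delta` act on coarser features (shape, location); splitting later perturbs finer ones (texture, colour).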
Policies trained via Reinforcement Learning (RL) are often needlessly complex, making them more difficult to analyse and interpret. In a run with n time steps, a policy will decide n times on an action to take, even when only a tiny subset of these decisions delivers value over selecting a simple default action. Given a pre-trained policy, we propose a black-box method based on statistical fault localisation that ranks the states of the environment according to the importance of the decisions made in those states. We evaluate our ranking method by creating new, simpler policies that prune the decisions identified as unimportant, and we measure the impact on performance. Our experimental results on a diverse set of standard benchmarks (gridworld, CartPole, Atari games) show that in some cases fewer than half of the decisions made contribute to the expected reward. We furthermore show that the decisions made in the most frequently visited states are not the most important for the expected reward.
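The sketch below illustrates the flavour of the approach, though not the paper’s exact procedure: episodes are run with some decisions randomly replaced by a default action, and each (discrete, hashable) state is then scored by how strongly using the policy there co-occurs with a passing episode, in the style of an Ochiai suspiciousness score from fault localisation. The `env` and `policy` interfaces are simplified assumptions:

```python
import random
from collections import defaultdict

def rank_states(env, policy, default_action, episodes=500, pass_threshold=195.0):
    """Rank states by the importance of the policy's decision there.
    Black-box sketch: randomly substitute a default action, then score
    each state with an Ochiai-like statistic over passing/failing runs."""
    used_pass = defaultdict(int)   # policy decided in s, episode passed
    used_fail = defaultdict(int)   # policy decided in s, episode failed
    skip_pass = defaultdict(int)   # default action taken in s, episode passed
    for _ in range(episodes):
        state, done, total = env.reset(), False, 0.0
        used, skipped = set(), set()
        while not done:
            if random.random() < 0.5:      # mutate: ignore the policy here
                action = default_action
                skipped.add(state)
            else:
                action = policy(state)
                used.add(state)
            state, reward, done = env.step(action)   # simplified step interface
            total += reward
        passed = total >= pass_threshold
        for s in used:
            (used_pass if passed else used_fail)[s] += 1
        if passed:
            for s in skipped:
                skip_pass[s] += 1
    def importance(s):
        up = used_pass[s]
        denom = ((up + used_fail[s]) * (up + skip_pass[s])) ** 0.5
        return up / denom if denom else 0.0
    states = set(used_pass) | set(used_fail) | set(skip_pass)
    return sorted(states, key=importance, reverse=True)
```

Pruning then keeps the policy’s decisions only in the top-ranked states and falls back to `default_action` everywhere else.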