Dr Daniel Lawson

CLARITY

I am very proud of our work on CLARITY - short for Comparing simiLARITY matrices. This developing methodology is for comparing anything to anything. As you will see in the Applications, we have already explored comparing Genes to Language, Culture to Economics, and Methylation biomarkers to Gene Expression.

You can compare anything to anything, as long as you can measure them both for the same set of items. The approach works by decomposing the similarity matrix of one dataset into a "structure" and a "relationship" - a sort of soft clustering - and asking which elements of that structure are present in a second dataset.
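The idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CLARITY implementation: it assumes the structure can be stood in for by the leading eigenvectors of the first similarity matrix, and that the relationship is refitted by least squares for the second. The function names (`learn_structure`, `fit_relationship`, `residuals`) and the toy data are hypothetical.

```python
import numpy as np

def learn_structure(Y, k):
    """Return an n x k structure matrix from the k largest-magnitude eigenvectors of symmetric Y."""
    vals, vecs = np.linalg.eigh(Y)
    order = np.argsort(np.abs(vals))[::-1][:k]
    return vecs[:, order]

def fit_relationship(A, Y):
    """Least-squares X minimising ||Y - A X A^T|| (Frobenius norm) for a fixed structure A."""
    P = np.linalg.pinv(A)
    return P @ Y @ P.T

def residuals(A, Y):
    """The part of Y that the structure A cannot explain."""
    return Y - A @ fit_relationship(A, Y) @ A.T

# Hypothetical toy data: 30 items in 3 groups, measured in two datasets.
n, k = 30, 3
labels = np.repeat(np.arange(k), n // k)
A_true = np.eye(k)[labels]
X1 = np.array([[1.0, 0.2, 0.1], [0.2, 1.0, 0.3], [0.1, 0.3, 1.0]])
X2 = np.array([[2.0, 0.5, 0.0], [0.5, 1.5, 0.4], [0.0, 0.4, 1.0]])

Y1 = A_true @ X1 @ A_true.T        # first dataset
Y2_same = A_true @ X2 @ A_true.T   # same structure, different relationship
Y2_diff = Y2_same.copy()           # item 0 now behaves unlike its group
Y2_diff[0, :] += 1.0
Y2_diff[:, 0] += 1.0

A = learn_structure(Y1, k)
err_same = np.linalg.norm(residuals(A, Y2_same))
err_diff = np.linalg.norm(residuals(A, Y2_diff))
# The item with the largest residual row is the one the structure fails to predict.
worst_item = int(np.argmax(np.linalg.norm(residuals(A, Y2_diff), axis=1)))
print(err_same, err_diff, worst_item)
```

When the second dataset shares the structure, the residual is essentially zero even though the relationship differs; when one item departs from its group, the residual is concentrated on that item.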

Bayesian Epidemic Modelling

I have been part of a University-wide team to apply high quality modelling to understand and predict the Epidemic. Our released work addressed bed capacity modelling in the South-West of England, but of course our interests are much wider.

At the Institute of Statistical Sciences, our main role is to improve the quality of statistical tools that can be deployed in practice. To this end I'm leading a team of Undergraduate students exploring the application of Machine Learning tools to learn summary statistics using Approximate Bayesian Computation.
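For readers unfamiliar with Approximate Bayesian Computation, a minimal rejection-sampling sketch in Python shows where the summary statistic enters. Everything here is an assumption made for illustration: the Poisson count model, the uniform prior, the tolerance, and the hand-picked summary (the sample mean), which stands in for the learned summaries described above. The function name `abc_rejection` is hypothetical.

```python
import numpy as np

def abc_rejection(observed, prior_sampler, simulator, summary, eps, n_draws, rng):
    """Keep prior draws whose simulated summary lies within eps of the observed summary."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        s_sim = summary(simulator(theta, rng))
        if abs(s_sim - s_obs) <= eps:
            accepted.append(theta)
    return np.array(accepted)

rng = np.random.default_rng(1)

# Hypothetical observed data: 50 daily counts from a Poisson model with rate 4.
observed = rng.poisson(4.0, size=50)

posterior = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(0.0, 10.0),          # assumed flat prior on the rate
    simulator=lambda theta, r: r.poisson(theta, size=50),  # assumed data-generating model
    summary=np.mean,                                       # hand-picked summary statistic
    eps=0.2,
    n_draws=5000,
    rng=rng,
)
print(len(posterior), posterior.mean())
```

The quality of the approximation depends heavily on the choice of `summary`, which is why learning good summary statistics is worthwhile.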

Bayesian Clustering

For a long time my research has used Bayesian Clustering to understand Genetics, but lately I have been exploring the relationship between the now-standard approach to clustering we deployed in FineSTRUCTURE and the Stochastic Block Model and its many variants. With Prof Patrick Rubin-Delanchy I am looking into how Spectral methods can be used to perform clustering-like tasks, as described in CLARITY above. With Prof Robert Allison I am exploring more model-based approaches.

FineSTRUCTURE

FineSTRUCTURE is a whole pipeline that deserves, and has, its own FineSTRUCTURE website. It is a sophisticated modelling tool that uses Data Science ideas - of identifying computational questions that can be answered, and wrapping them up in a statistical modelling framework that means something. The FineSTRUCTURE algorithm was developed in 2012 but is still the most accurate way to estimate fine-scale variations in Ancestry.

High Profile applications include:

Population genomics of the Viking world in Nature,
The fine-scale genetic structure of the British population in Nature,
Genomic analyses inform on migration events during the peopling of Eurasia in Nature,
The UK10K project identifies rare variants in health and disease in Nature.

Genomic Architecture

Genomic Architecture is a description of how the whole genome comes together to construct a complex trait, such as height, education, body-mass-index, and so on. The relationship is extremely rich and of course depends on all sorts of variables such as cultural practice, personal circumstances, and so on.

My work focusses on population structure and how this has confounded previous analyses, as well as methods to limit this confounding. Key outputs include:

A tool called PCAPRED to correct small studies for population structure in the UK Biobank.
I have built Simulation tools to understand the effect of traits evolving prior to the Out-of-Africa Bottleneck.

I contributed to a key review in Nature Reviews Genetics called "Genetic architecture: the shape of the genetic contribution to human traits and disease".

badMIXTURE

badMIXTURE is an important tool to compare the output of some claimed mixture to another dataset that may or may not show this mixture. It works by comparing mixtures generated using genome-wide unlinked markers (with tools such as ADMIXTURE) to results from FineSTRUCTURE above. These are theoretically the same if the mixture is true.

badMIXTURE is published under the title A tutorial on how not to over-interpret STRUCTURE and ADMIXTURE bar plots. As always, it turned out to be important to understand the details of what the models were doing in order to make the software appropriate for the complexity that is genetic data.

badMIXTURE is the spiritual precursor to CLARITY, which expands this idea to a much wider range of models.

Cultural dynamics

Did you know that Religious change preceded economic change in the 20th century? Damian Ruck wrote this up in The Conversation.

We also established the Cultural prerequisites of socioeconomic development by structuring the changes into a coherent model.

In both cases this uses a large worldwide dataset consisting of several time-points, hundreds of countries and millions of questionnaire results. Sense-making is done through dimensionality reduction to understandable variables, which can be modelled with Time Series methodology.

Wind Energy market models

We want Renewable energy to replace conventional fossil fuels. But how can governments use markets to make this happen? In Performance comparison of renewable incentive schemes using optimal control we showed that there are real implications to the choices made in market manipulation - for the same amount of support given to the industry, some schemes are markedly better than others!

Historical Dynamics

A real out-there application of Mathematics is Historical Dynamics. We found that Apparent strength conceals instability in a model for the collapse of historical states. We tried very hard to make "qualitative data sets" from history, to assess whether our mathematical model was making consistent predictions or not.

The implications are truly fascinating: the empires and great states of the past may have failed not because of some external accident or event, but simply because human nature (game theory) says that Human political systems will evolve to an unstable tipping point!

Data Science Toolbox

Data Science Toolbox is a truly unique experience. It contains everything a mathematician needs to know to do Data Science. Taught as part of the MSc Mathematics of Cybersecurity program, it is carefully integrated to use Cyber Security examples throughout, ensuring that students learn their data and models as well as the core Data Science that is needed to analyse them. We cover everything from Exploratory Data Analysis to calibrated models, Statistics to Machine Learning, RStudio to High Performance Parallel processing.

There is no doubt that Data Science is a hard topic that needs both Big Picture and gory details to be implemented and understood correctly. This 2-semester course lets students do exactly that.

Teaching-level Talks:

See my Talk at the LMS Summer School 2025

OFFICE

Daniel Lawson
School of Mathematics
University of Bristol,
Fry Building, GA.06
Woodland Road
Bristol, BS8 1UG.

Tel: +44 (0) 117 456 0044

EMAIL

dan.lawson [at] bristol.ac.uk

Alternative email address: danjlawson2000 [at] yahoo [dot] com
