One goal of our Climate Action Plans Explorer is to make it easier for good ideas around cutting carbon to be shared and replicated between local areas. For this to happen, the service should be good at helping people in one area identify other areas that are dealing with similar situations or problems.
Currently the climate plans website shows the physical neighbours on a council’s page, but there’s every chance that councils are geographically close while being very different in other ways. We have been exploring an approach that identifies which authorities have similar causes of emissions, with the goal that this leads to better discovery of common approaches to reducing those emissions.
The idea of automatically grouping councils using data is not a new one. The CIPFA nearest neighbours dataset suggests a set of councils that are similar to an input authority (based on “41 metrics using a wide range of socio-economic indicators”). However, this dataset is not open, only covers councils in England rather than across the UK, and is not directly focused on the emissions problem.
This blog post explores our experiment in using the BEIS dataset of carbon dioxide emissions to identify councils with similar emissions profiles. A demo of this approach can be found here (it may take a minute to load).
Using the ‘subset’ dataset in the BEIS data (which excludes emissions local authorities cannot influence), we calculated the per-person emissions in each local authority for the five groupings of emissions (Industry, Commercial, Domestic, Public Sector and Transport). We then calculated the ‘distance’ between every pair of local authorities, based on how much they differed in each of these five areas. For each authority, we can now identify which other authorities have the most similar emissions profile.
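As a rough illustration of the steps above, here is a minimal sketch in Python. The figures are made up for illustration and are not taken from the BEIS dataset, the function names are hypothetical, and Euclidean distance is an assumption: the post does not say which distance measure was used, or whether the values were rescaled first.

```python
import math

# Sector order used in every emissions list below.
SECTORS = ["industry", "commercial", "domestic", "public_sector", "transport"]

# Hypothetical figures for illustration only (emissions in kt CO2).
# These are NOT values from the BEIS dataset.
authorities = {
    "Area A": {"population": 100_000, "emissions": [120.0, 80.0, 150.0, 20.0, 200.0]},
    "Area B": {"population": 200_000, "emissions": [250.0, 150.0, 310.0, 45.0, 390.0]},
    "Area C": {"population": 50_000, "emissions": [300.0, 20.0, 90.0, 5.0, 160.0]},
}


def per_person(record):
    """Divide each sector's emissions by the authority's population."""
    return [value / record["population"] for value in record["emissions"]]


def distance(profile_a, profile_b):
    """Euclidean distance between two five-sector profiles (an assumed metric)."""
    return math.dist(profile_a, profile_b)


def most_similar(name, profiles, n=2):
    """Return the n other authorities closest to `name`, nearest first."""
    others = [
        (distance(profiles[name], profile), other)
        for other, profile in profiles.items()
        if other != name
    ]
    return [other for _, other in sorted(others)[:n]]


profiles = {name: per_person(record) for name, record in authorities.items()}
print(most_similar("Area A", profiles))
```

In this toy data, Area B has twice the population of Area A but an almost identical per-person profile, so it comes out as the nearest match. In practice the sectors sit on different scales, so standardising each one before measuring distance may be worth considering.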
We also wanted to use this data to tell more of a story about why authorities are and are not similar. We’ve done this in two ways.
The first is converting the difficult-to-parse ‘emissions per type per person’ number into relative deciles, where all authorities are ranked from highest to lowest and assigned a decile from one to ten (where ten is the highest level of emissions). This makes it easy to see at a glance how a council’s emissions relate to those of other authorities. For instance, the following table shows the emissions deciles for Leeds City Council. It shows relatively high emissions for the Commercial and Public Sector groupings, and just below average emissions for Industry, Domestic and Transport.
| Emissions type | Decile for Leeds City Council |
| Industry Emissions Decile | 4 |
| Commercial Emissions Decile | 7 |
| Domestic Emissions Decile | 4 |
| Public Sector Emissions Decile | 9 |
| Transport Emissions Decile | 4 |
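The decile step can be sketched as follows. This is a simplified illustration rather than the code behind the explorer: the authority names and values are hypothetical, and ties are broken arbitrarily by sort order.

```python
def assign_deciles(values):
    """Map each authority to a decile from 1 (lowest) to 10 (highest emissions)."""
    ranked = sorted(values, key=values.get)  # names ordered lowest to highest
    n = len(ranked)
    # Position 0 falls in decile 1 and the final position in decile 10.
    return {name: (position * 10) // n + 1 for position, name in enumerate(ranked)}


# Hypothetical per-person emissions for 20 made-up authorities in one sector.
transport = {f"LA{i:02d}": i * 0.1 for i in range(1, 21)}
deciles = assign_deciles(transport)
print(deciles["LA01"], deciles["LA20"])
```

Running this once per emissions grouping would give each authority five deciles, like the Leeds row set above.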
The second story-telling approach is to put easy-to-understand labels on groups of councils to make the similarities more obvious. We’ve used k-means clustering to try to identify groups of councils that are more similar to each other than to councils in other groups. Given the way that the data is arranged, there seemed to be sweet spots at six and nine clusters, and as an experiment we looked at what the six clusters looked like.
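For readers unfamiliar with k-means, here is a small pure-Python sketch of the standard algorithm (Lloyd's algorithm) on made-up two-dimensional points. It is not the code used for this analysis, which would more typically rely on a library implementation. Comparing the within-cluster sum of squares across values of k is one common way to look for a ‘sweet spot’; whether that is how six and nine were chosen here is an assumption.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return labels, centroids


def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances; lower means tighter clusters."""
    return sum(math.dist(p, centroids[label]) ** 2 for p, label in zip(points, labels))


# Hypothetical data: two well-separated groups of three points each.
points = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [5.0, 5.1], [5.1, 5.0], [5.2, 5.1]]
for k in (1, 2, 3):
    labels, centroids = kmeans(points, k)
    print(k, round(inertia(points, labels, centroids), 3))
```

With the real data, each point would be an authority's five per-person emissions values rather than a two-dimensional toy point.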
Raincloud plot showing how ‘Urban Mainstream’ Industrial emissions differ from those of other local authorities.
Using tools demonstrated in this Jupyter notebook, we looked at the features of the six clusters and grouped them into three “Mainstream” clusters (which were generally similar to each other, with some differences in features) and three “Outlier” clusters, which tended to be smaller and much further from the mainstream. After reviewing the properties of the clusters, we relabelled them into six categories that convey the broad features of an area at a glance.
| Label | Description | Authority count | Lower Tier Land Area % | Lower Tier Population % |
| Urban – Mainstream | Below average commercial/industry/transport/domestic emissions. High density. | 165 | 14% | 45% |
| Rural – Mainstream | Above average industry/transport/domestic emissions. Low density. | 122 | 44% | 26% |
| Urban – Commercial | Above average commercial/public sector, below average domestic/transport. High density. | 66 | 4% | 22% |
| Rural – Industrial | Above average transport/industry/domestic emissions. Low density. | 43 | 37% | 7% |
| Urban – High Commercial | Very high commercial/public sector emissions. | 7 | 1% | 2% |
| High Domestic Counties | Well above average domestic and transport emissions in county councils. | 2 | – | – |
This data is not tightly clustered, and the number of clusters could be expanded or contracted, but six seemed to be a good balance: beyond that point, more of the clusters contained only a small number of authorities. The map below shows how these clusters are spread across the country. This map uses an exploded cartogram approach, where authorities with larger populations appear bigger. Each authority is then positioned broadly close to its original location (so the blank space has no meaning).
Joining these different approaches allowed us to build a demo where for any local authority, you can get a short description of the emissions profile and cluster, and identify councils that are similar. This demo can be explored here (it may take a minute to load). The description for Croydon looks like this:
This is not the only possible way of crunching the numbers.
For example, the first thing we did was adjust the emissions data to be per person. This simplifies comparison between areas of different sizes, but it discards population size, which is itself relevant information when helping councils find similar councils.
The BEIS data also breaks down those five categories by more variables (which might better separate agriculture from other kinds of industry for instance): an alternative approach could make more use of these.
We are considering using several different measures to help councils explore similar areas. These could include situational features like flood risk, deprivation scores and EPC data on household energy efficiency, but could also reflect, for example, that some councils are more politically similar to each other and may find it easier to transfer ideas.
The datasets and processing steps are available on GitHub.
Image: Max Böttinger