This blog post is part of a series investigating different demographics and uses of mySociety services. You can read more about this series here.

The FixMyStreet section of the Explorer mini-site helps explore the relationship between demographic features and FixMyStreet reports.

In one use case, it maps the location where a report was made to a ‘neighbourhood’ sized area, and then in turn to sets of statistics measured against those areas, most importantly the indices of multiple deprivation. These areas are Lower Super Output Areas (LSOAs) in England and Wales, Data Zones (DZ) in Scotland and Lower Output Areas (LOA) in Northern Ireland (although NI is not covered separately in the Explorer site due to a relative lack of data). These can be seen as equivalent to census tracts in the US. Each LSOA has a population of around 1,500 people, while Data Zones have around 500 to 1,000 people.

This statistical unit feels neighbourhood-sized, and so is used to examine data for effects that may result from being in the same neighbourhood. The approach has a significant problem, however: what people on the ground perceive as their “neighbourhood” is unlikely to coincide exactly with the statistical unit. On the edge of an LSOA, even a 50m radius around a home will cross into another statistical area.
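To make the edge problem concrete, here is a minimal sketch using a toy grid of square areas. The 400m cell size, the grid itself and all function names are assumptions made for illustration; real LSOAs are irregular polygons, and this is not how the Explorer site assigns reports.

```python
import math

# Hypothetical side length of a toy square "statistical area", in metres.
# Real LSOAs are irregular and vary widely in size.
CELL_SIZE_M = 400.0


def area_id(x, y):
    """Return the (column, row) of the toy grid cell containing the point."""
    return (math.floor(x / CELL_SIZE_M), math.floor(y / CELL_SIZE_M))


def distance_to_boundary(x, y):
    """Distance in metres from the point to the nearest edge of its cell."""
    dx = x % CELL_SIZE_M
    dy = y % CELL_SIZE_M
    return min(dx, CELL_SIZE_M - dx, dy, CELL_SIZE_M - dy)


def radius_crosses_boundary(x, y, radius_m=50.0):
    """True if a circle of radius_m around the point reaches another cell."""
    return distance_to_boundary(x, y) < radius_m


def share_near_edge(radius_m=50.0):
    """Share of a cell's area that lies within radius_m of one of its edges."""
    inner_side = max(CELL_SIZE_M - 2 * radius_m, 0.0)
    return 1 - (inner_side / CELL_SIZE_M) ** 2


if __name__ == "__main__":
    print(area_id(390, 200), radius_crosses_boundary(390, 200))
    print(area_id(200, 200), radius_crosses_boundary(200, 200))
    print(share_near_edge(50.0))
```

In this toy setup a home 10m from a cell edge is assigned to one area while most of its 50m surroundings may sit in the next one. The last function gives a rough sense of how much of an area can be "near the edge" under these assumptions; the figure depends entirely on the made-up cell size.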

Making the problem worse, the idea of a neighbourhood is very variable. People can disagree with each other about the boundaries of their area. Claudia Coulton, Jill Korbin, Tsui Chan and Marilyn Su (2001) found that when citizens were asked to draw the boundaries of their neighbourhood, these very rarely aligned with US census tracts. The gif in this tweet shows a set of citizen-drawn boundaries for Stoke Newington in East London: while there is a clear core, there is substantial disagreement between residents about the size of this area.

Laura Macdonald, Ade Kearns and Anne Ellaway (2013) found that residents in West Central Scotland perceived how well placed they were for ‘local’ amenities differently from what geographic distance alone would suggest. What is viewed as local from the outside might not be viewed the same way by locals: there is a context gap that just cannot be bridged at this scale of analysis.

Neighbourhood effects are often understood in terms of guardianship of a home area. This means that certain kinds of reports might be more common where the boundaries of that area are less clear, because unclear boundaries lead to conflict. Joscha Legewie and Merlin Schaeffer (2016) used New York 311 calls to demonstrate that complaints about blocked driveways, noise from neighbours and drinking in public were more frequent on the boundaries between areas with differing demographics. The same pattern can be seen in the idea that complaints about dog fouling are used for score-settling between neighbours in Chicago. Complaints can reflect conflicts as well as the actual problems reported.

In a related problem, Alasdair Rae and Elvis Nyanzu show that in some places the most deprived 10% of areas and the least deprived 10% are not far from each other. This means that relationships between reports and the features of deprivation might be harder to detect. The less homogeneous the area, the greater the chance that a person will make a report in an LSOA that is substantially different from their ‘home’ area, so that the features affecting how likely they are to report are attributed to the wrong place.

This blog post explores a potential problem with the Explorer mini-site methodology. A big part of what the Explorer site does is try to show how much different kinds of reports are “explained” by different local features. Because of these various forms of fuzziness, the differences it detects may be less sharp than those that actually exist. In general, however, not detecting things that are there is a better problem to have than the opposite.
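The claim that fuzziness blurs rather than invents relationships can be illustrated with a small simulation. This is a sketch under invented assumptions, not the Explorer site's analysis: each simulated report has a ‘home’ area score that drives reporting, but a share of reports are recorded against an unrelated area's score. The distributions, the share and the function names are all hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(misassigned_share, n=5000, seed=1):
    """Return (correlation with home score, correlation with recorded score)."""
    rng = random.Random(seed)
    # Hypothetical 'home' area score for each reporter (e.g. a deprivation measure).
    home = [rng.gauss(0, 1) for _ in range(n)]
    # Reporting behaviour depends on the home score plus unrelated noise.
    reporting = [h + rng.gauss(0, 1) for h in home]
    # Some reports are recorded against an unrelated area's score instead.
    recorded = [
        rng.gauss(0, 1) if rng.random() < misassigned_share else h
        for h in home
    ]
    return pearson(home, reporting), pearson(recorded, reporting)


if __name__ == "__main__":
    true_r, observed_r = simulate(misassigned_share=0.3)
    print(f"correlation using home area:     {true_r:.2f}")
    print(f"correlation using recorded area: {observed_r:.2f}")
```

Under these assumptions the correlation measured against the recorded area is weaker than the one measured against the home area, but it keeps the same sign: the mismatch hides part of a real relationship rather than creating a spurious one.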

