Sunday, 27 June 2010

Hackney statistics - don't try this at home

Last week Blood and Property published this story: Hackney's 1600 'ghost' workers and 28 'ghost' claimants

The Office for National Statistics (ONS) and Hackney Council both supplied explanations for why these figures were not useful.

The statement from the ONS says that the reasons for the discrepancies are 'horribly teccy' so I may have misunderstood them. But from what I did understand, the discrepancy of 28 'ghost' claimants was explained, but I'm still not clear on the 2% leap in the size of the Hackney North working population.

So, what did Hackney Council say?

In response to the claim that there were 28 'ghost' Job Seekers Allowance claimants in Hackney following boundary changes, Hackney Council sent these clipped comments:

Sentence 1: "Data showing total number of claimants for Hackney borough boundaries based on 1991 ward boundaries, however parliamentary constituency data based on recent 2010 revisions."

Sentence 2: "Hence difference in claimant numbers for Hackney vs Hackney constituencies."

Sentence 3: "Hoped that ONS will update JSA claimant numbers (for Hackney) from 1991 boundaries soon."

What I think this means: The ONS separates its data into different categories:

JSA claimants per parliamentary constituency
JSA claimants per local authority

And when the ONS calculates the number of JSA claimants for the whole of Hackney it does so by looking at the borough's 1991 ward boundaries.

However, when the ONS calculates the number of JSA claimants for Hackney's two parliamentary constituencies, Hackney North and Hackney South, it calculates the number of claimants using the latest 2010 revisions.

Before the 2010 constituency changes, the figures for Hackney North and Hackney South added up to the borough total. But they didn't in March this year. Adding Hackney North and Hackney South together gave 28 fewer claimants than the ONS borough-wide figure.

The latest figures show a discrepancy of 20 claimants between the combined total of Hackney North and South and the ONS local authority total. So it is not a one-off glitch. The new constituency figures show a rosier view - fewer JSA claimants.

So the explanation for the discrepancy is this difference between using 1991 and 2010 areas - this is explained in greater detail by the ONS below.

Then Hackney Council took a look at the claim that 1600 ghost workers appeared in Hackney North after the boundary changes:

Sentence 1 (I think this is just the Council describing what I did): "Based on extrapolated working age population combined with averaging monthly data."

Sentence 2 (What the council thought of extrapolation): "Caution against this: constituency rates use mid-2007 population estimates as denominators, whereas borough rates use mid-2008 population estimates as denominators - data on population size and economic activity based on surveys and estimates, always have certain error term."

Sentence 3: "ONS population estimates in Hackney likely underestimating population compared to administrative data (eg via GP surgeries) and GLA population estimates."

I think this is saying that my attempt to compare the population sizes before and after the constituency boundary change is dodgy. Which is probably true, but I don't think this bit of the council response explains why. Here the council 'caution' against comparing borough data with constituency data. But I didn't use borough-wide data to come up with the 1630 ghost worker figure, just constituency data.

I used the number of JSA claimants in Hackney North, which in January 2010 was 4,402, as supplied by the ONS. The ONS also says what percentage of the working population this is. In this case, January 2010, it was 6.3%. So, using these figures, the Hackney North working population was 69,873 (4,402/0.063=69,873).
And in:
February 2010 - 4,450 (6.4%) - (4,450/0.064=69,531)
March 2010 - 4,336 (6.2%) - (4,336/0.062=69,935)
April 2010 - 4,727 (6.3%) - (4,727/0.063=75,031)
May 2010 - 4,637 (6.2%) - (4,637/0.062=74,790)

For Hackney South:
January - 5,503 (7.6%) - (5,503/0.076=72,407)
February - 5,594 (7.7%) - (5,594/0.077=72,649)
March - 5,510 (7.6%) - (5,510/0.076=72,500)
April - 4,908 (7.1%) - (4,908/0.071=69,126)
May - 4,959 (7.2%) - (4,959/0.072=68,875)
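For anyone who wants to check the sums, here is a rough sketch in Python of the division used above. The function and variable names are my own, not anything from the ONS. It also shows something the figures above gloss over: because the published rate is rounded to one decimal place, each implied population could sit anywhere in a band of roughly a thousand people. The band calculation assumes ordinary rounding to the nearest 0.1%, which is a guess about how the rates are published.

```python
# Claimant counts and published rates (percent), copied from the figures above.
FIGURES = {
    "Hackney North": [
        ("January", 4402, 6.3),
        ("February", 4450, 6.4),
        ("March", 4336, 6.2),
        ("April", 4727, 6.3),
        ("May", 4637, 6.2),
    ],
    "Hackney South": [
        ("January", 5503, 7.6),
        ("February", 5594, 7.7),
        ("March", 5510, 7.6),
        ("April", 4908, 7.1),
        ("May", 4959, 7.2),
    ],
}


def implied_population(claimants, rate_percent):
    """Working population implied by a claimant count and a published rate."""
    return claimants * 100 / rate_percent


def rounding_band(claimants, rate_percent, half_step=0.05):
    """Lowest and highest populations consistent with a rate rounded to 0.1%.

    Assumption: the published rate is rounded to the nearest 0.1%, so the
    true rate lies within half_step either side of it.
    """
    low = claimants * 100 / (rate_percent + half_step)
    high = claimants * 100 / (rate_percent - half_step)
    return low, high


for area, rows in FIGURES.items():
    print(area)
    for month, claimants, rate in rows:
        estimate = implied_population(claimants, rate)
        low, high = rounding_band(claimants, rate)
        print(f"  {month:<9} {estimate:>9,.0f}  (between {low:,.0f} and {high:,.0f})")
```

The printed estimates are rounded to the nearest whole person, so one or two may differ by one from the figures above, which appear to have been rounded down.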

The discrepancy appeared between the March and April 2010 figures above, when the working population in Hackney North increased by about 5,000 people. Meanwhile the Hackney South working population shrank by about 3,300. It seemed fair to assume that, because these changes were caused by constituency boundary changes, any increase in the north should have been mirrored by a decrease in the south.

But the figures show that while Hackney North's working population increased by 5000, about 7%, 1600 of these did not come from Hackney South. An unexplained 2%. I'm not sure the council has answered this question (but I'm not sure!).
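The averaging can be sketched like this, again in Python. It is only one possible reading of "averaging monthly data": it assumes the months are grouped as January to March versus April and May, which is where the jump shows up in the implied populations. The names are made up for illustration. On that assumption the unexplained gap comes out at a little over 1,600.

```python
from statistics import mean

# Implied working populations from the post, grouped either side of the jump.
# Assumption: January-March is "before" and April-May is "after".
NORTH_BEFORE = [69873, 69531, 69935]
NORTH_AFTER = [75031, 74790]
SOUTH_BEFORE = [72407, 72649, 72500]
SOUTH_AFTER = [69126, 68875]


def shift(before, after):
    """Change in the average implied population between two groups of months."""
    return mean(after) - mean(before)


north_gain = shift(NORTH_BEFORE, NORTH_AFTER)
south_loss = -shift(SOUTH_BEFORE, SOUTH_AFTER)

# If people simply moved from one constituency to the other, this would be zero.
unexplained = north_gain - south_loss

north_base = mean(NORTH_BEFORE)
print(f"Hackney North gain: {north_gain:,.0f} ({north_gain / north_base:.1%})")
print(f"Hackney South loss: {south_loss:,.0f}")
print(f"Unexplained:        {unexplained:,.0f} ({unexplained / north_base:.1%})")
```

Comparing single months instead of averages (March against April, say) gives a somewhat different gap, so the exact number depends on the grouping chosen.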

Then there's the ONS answer. This seems to confirm the Council's explanation about the 28 'ghost' JSA claimants. But I couldn't quite identify the explanation for the 1600 mystery increase in Hackney North's working population - but that doesn't mean it wasn't there. I suspect it is in the bold sentence at the end. But the bold sentence at the end is only for bold people.

ONS: "I am afraid there are some horrible teccy reasons, regarding the way in which figures for geographic areas are created, how systems change over time, when published population estimates get incorporated into statistical sources etc that are behind these discrepancies - but I will try to explain some of it.

"Firstly, the JSA figures. The older system used for aggregating JSA data to higher levels did so by aggregating to a layer of ward geography that was frozen at a point in time - and then aggregating from these wards to higher geography levels. This is the way in which the local authority JSA counts are created and the way in which the old parliamentary constituencies are created. So for JSA, LA and old PC were consistent.

"However, this method is not considered to be 'best practice' for currently produced aggregation - this is because ward boundaries change, so the frozen wards are out of date and no longer align with a real geography and a few other technical issues. The current best practice is to aggregate to a static statistical building block, known as output areas, which were designed using data relating to the last census. There are around 34,000 of the Lower Layer Super Output Areas (LSOA) - they don't necessarily exactly align with existing geography boundaries - but they are static, of similar size and provide a 'consistent' building block for statistical purposes. They also don't change, get re-used, get used on a temporary basis etc like a geography such as postcodes does. The idea is that if all statistics for all geographies are built from these building blocks, then they may not be precisely accurate with all those geography boundaries - but if everyone does it in the same way, they will be inaccurate in consistent ways and give comparable statistics. The problem is that people are only moving to this best practice as and when new system and requirements are built, simply because of the inefficiency of rewriting all the numerous existing systems in place.

"So - for JSA, LA and old PC are both created from aggregating frozen ward based information. For the 2010 Parliamentary Constituencies they are built on the new basis from LSOAs. So the LA and PC2010 are built on a slightly different geographic building block basis.

"The other figures that you quote will have similar although slightly different issues involved... including population estimate calculation being optimised to calculate LA level figures, which are later modelled to LSOAs, which are then used to construct PCs; employment estimates being based on a combination of direct postcode to LA lookups and PCs constructed from LSOAs; and an added complication that survey estimates only reweight to population estimates once a year, so the latest survey estimates are currently based on last year's published population estimates rather than the most recently published."

A character-building sentence!

