12 Mar 2015

Does GCHQ really only use ‘a fraction’ of its data?

The findings of today’s Intelligence and Security Committee (ISC) report seem at odds with the picture presented by documents from the whistleblower Edward Snowden (pictured below), seen by this programme last year.

The documents showed that GCHQ created a programme called Mastering the Internet. The agency reckoned around 25 per cent of the world’s internet traffic flowed through the UK, and Mastering the Internet aimed to give the spies access to as much of it as possible.

The agency set targets for how many gigabytes of information could be accessed, and issued weekly updates on how much had been achieved. It set up special relationship teams to work with the private comms companies that own the fibreoptic cables that carry the traffic.

Edward Snowden (Getty Images)

The programme was heralded as a massive success. On just one fibreoptic cable, GCHQ was able to access a trillion gigabytes of data per second. (It’s important to note that this is “access” to information; it doesn’t mean GCHQ actually went ahead and pulled in the traffic.)

So it is confusing to read the committee’s assertion that: “GCHQ’s bulk interception systems operate on a very small percentage of the bearers [the individual communications links carried within fibreoptic cables] that make up the internet. It cannot therefore realistically be considered blanket interception.”

Perhaps the explanation for the discrepancy is that a small number of bearers could in fact be carrying the majority of the UK’s internet traffic. There seem to be some semantic shenanigans over the definitions of “blanket” and “bulk”.
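To see how both statements could be true at once, here is a minimal sketch in Python. Every number in it is invented for illustration; neither the ISC report nor the leaked documents give bearer counts or per-bearer capacities. The point is only that traffic is not spread evenly across bearers, so tapping a “very small percentage” of them can still capture most of what flows.

```python
# Toy illustration: a few high-capacity bearers can carry most of the
# traffic. All counts and capacities below are invented for the sake of
# the arithmetic; no real figures are public.

bearers = {
    "high-capacity links": (10, 600.0),   # (number of bearers, Gb/s each)
    "ordinary links": (99_990, 0.05),     # a long tail of small bearers
}

total_count = sum(n for n, _ in bearers.values())
total_traffic = sum(n * rate for n, rate in bearers.values())

tapped_n, tapped_rate = bearers["high-capacity links"]
print(f"Tapped bearers: {tapped_n / total_count:.3%} of all bearers")
print(f"Traffic carried on them: {tapped_n * tapped_rate / total_traffic:.1%}")
# -> roughly 0.010% of the bearers, but over half of the traffic
```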

The ISC then goes on to say that “only a fraction” of the traffic to which GCHQ has access ends up actually being pulled in by GCHQ, and only a “tiny percentage” of what’s pulled in is actually analysed by its officers.

Again, this seems at odds with what the leaked documents show about the GCHQ system called Tempora. It was described in the documents as a “large scale” system which pulled in about 30 per cent of the traffic from a fibreoptic cable and stored the content of messages for three days, and the metadata (who messaged whom, when and where, for example) for a month.
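Taking those figures at face value, a back-of-envelope sketch suggests the scale of the rolling buffer involved. The 30 per cent intake and the three-day content window come from the leaked description; the 10 Gb/s bearer running at full capacity is an assumption made purely for illustration:

```python
# Rough scale of a Tempora-style rolling buffer for a single bearer.
# Assumption (illustrative only): a 10 Gb/s bearer running flat out.
# From the leaked description: 30% intake, content kept for three days.

GB_PER_SECOND = 10 / 8        # 10 gigabits/s expressed in gigabytes/s
INTAKE = 0.30                 # share of the bearer's traffic pulled in
CONTENT_DAYS = 3              # content retention window
SECONDS_PER_DAY = 86_400

content_buffer_gb = GB_PER_SECOND * INTAKE * CONTENT_DAYS * SECONDS_PER_DAY
print(f"Content buffer per bearer: ~{content_buffer_gb / 1_000:.0f} TB")
# -> ~97 TB; metadata is far smaller per message but is kept ten times
#    as long, accumulating over a 30-day window instead.
```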

Perhaps the explanation for this discrepancy is that when the ISC talks about “a fraction” of traffic pulled in, it means a third. As for the “tiny percentage” of traffic that’s actually read by intelligence officers, that makes sense: human eyes could never be expected to work with anything but a minuscule portion of the trillions of gigabytes to which GCHQ has access.
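A last bit of arithmetic, taking the article’s own figures at face value, shows why even a “tiny percentage” is no reassurance about volume. The one-third pulled-in share comes from the Tempora description above; the 0.1 per cent analysed share is an invented stand-in for the ISC’s unquantified “tiny percentage”:

```python
# Scale check using the article's figures. The analysed share (0.1%) is
# a hypothetical placeholder; the ISC gives no number for it.

accessed_gb_per_s = 1_000_000_000_000  # "a trillion gigabytes per second"
pulled_fraction = 1 / 3                # the ISC's "fraction": about a third
analysed_share = 0.001                 # assumed "tiny percentage"

pulled = accessed_gb_per_s * pulled_fraction
analysed = pulled * analysed_share
print(f"Pulled in: {pulled:,.0f} GB per second")
print(f"Analysed:  {analysed:,.0f} GB per second")
# Even the "tiny percentage" works out to hundreds of millions of
# gigabytes per second, far beyond anything human analysts could read.
```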

Follow @geoffwhite247 on Twitter