Deconstructing the Virgin Media Censorship Infrastructure
Recently, members of the UK government have been pushing to increase censorship of the Internet by forcing ISPs to block ‘esoteric websites’, among other things, by default. Once these filters have been around for a while, I suspect it will eventually not be possible to ‘opt out’ of the censorship. That would probably cause a major increase in the purchase of overseas VPNs, and lead users to make themselves less safe by sending their internet traffic through random proxies.
The ‘esoteric’ censorship isn’t currently in place for my ISP, but some UK ISPs have already been forced to block requests to specific web sites. I would not be surprised if the same or a similar blocking framework is used for the next round of censorship, so I have decided to investigate how the censorship of these “copyright infringing” sites currently works.
No DNS changes
I’ve seen comments that some ISPs simply return DNS entries pointing to a server hosting a blocking page (which is easy to circumvent by using Google DNS or another DNS provider, and should not work if DNSSEC is implemented correctly). Virgin Media do not do this. The same DNS entries are returned both inside and outside Virgin Media’s network:
From Virgin Media’s DNS server:
thepiratebay.pe. 83685 IN A 194.71.107.27
From Google DNS:
thepiratebay.pe. 19785 IN A 194.71.107.27
As both resolvers return the same address (only the TTLs differ), we can see Virgin Media are not changing DNS records.
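The comparison above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it works on dig-style answer lines pasted in as text rather than querying the resolvers itself. It ignores the TTL field, since that differs between caches even when the record is the same.

```python
def parse_a_records(dig_answer):
    """Return the set of IPv4 addresses found in dig-style answer lines.

    Each line is expected to look like: name TTL class type address.
    The TTL is deliberately ignored, because it varies between caches.
    """
    addresses = set()
    for line in dig_answer.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[2] == "IN" and fields[3] == "A":
            addresses.add(fields[4])
    return addresses


def same_answers(answer_a, answer_b):
    """True if both answers contain exactly the same A record addresses."""
    return parse_a_records(answer_a) == parse_a_records(answer_b)


# The two answers quoted above.
isp_answer = "thepiratebay.pe. 83685 IN A 194.71.107.27"
google_answer = "thepiratebay.pe. 19785 IN A 194.71.107.27"

print(same_answers(isp_answer, google_answer))
```

If an ISP were rewriting DNS, the two sets of addresses would differ and the check would come back false.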
Hijacking IPs
As DNS still points blocked sites at their correct IP addresses, Virgin Media instead intercept traffic destined for specific IPs and route it to a server that does a 302 redirect to ‘assets.virginmedia.com’:
* About to connect() to thepiratebay.pe port 80 (#0)
* Trying 194.71.107.27...
* Adding handle: conn: 0x12755a0
* Adding handle: send: 0
* Adding handle: recv: 0
* Curl_addHandleToPipeline: length: 1
* - Conn 0 (0x12755a0) send_pipe: 1, recv_pipe: 0
* Connected to thepiratebay.pe (194.71.107.27) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.33.0
> Host: thepiratebay.pe
> Accept: */*
>
< HTTP/1.1 302 Found
< Location: http://assets.virginmedia.com/site-blocked.html
< Content-Type: text/html; charset=UTF-8
* no chunk, no close, no size. Assume close to signal end
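For anyone wanting to check a site themselves, here is a minimal sketch of how the redirect in the output above could be detected in Python. The function names are hypothetical, and the only assumption about the block is what the curl output shows: a 302 status with a Location header on ‘assets.virginmedia.com’. http.client is used because it does not follow redirects on its own.

```python
import http.client
from urllib.parse import urlsplit

# Host seen in the Location header of the curl output above.
BLOCK_HOST = "assets.virginmedia.com"


def is_block_redirect(status, headers):
    """True if the response looks like the 302 to the blocking page."""
    if status != 302:
        return False
    location = ""
    for name, value in headers.items():
        if name.lower() == "location":
            location = value
    return urlsplit(location).hostname == BLOCK_HOST


def fetch_status_and_headers(host, timeout=10):
    """Send a plain HTTP GET for / and return (status, headers)."""
    conn = http.client.HTTPConnection(host, 80, timeout=timeout)
    try:
        conn.request("GET", "/", headers={"Accept": "*/*"})
        response = conn.getresponse()
        return response.status, dict(response.getheaders())
    finally:
        conn.close()


# Checked here against the headers captured above, not a live request.
captured = {
    "Location": "http://assets.virginmedia.com/site-blocked.html",
    "Content-Type": "text/html; charset=UTF-8",
}
print(is_block_redirect(302, captured))
```

On a live connection the two halves would be combined as is_block_redirect(*fetch_status_and_headers("thepiratebay.pe")), though the result obviously depends on which network the request is made from.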
HTTPS connections to the site time out via VM. I suppose this is slightly better than returning a certificate for a different domain name, or otherwise trying to intercept HTTPS traffic.
The http://assets.virginmedia.com/site-blocked.html page that users are redirected to serves an HTML page telling you that the site you are trying to visit is blocked by court order. It also includes some ‘Omniture SiteCatalyst’ JavaScript, which records data about users who hit the blocking page.
Bypassing
Obviously, any restrictions put in place can be bypassed using a VPN connection, SSH tunnel or proxy (or even by googling for ‘sitename proxy’), making the blocking pointless. However, I did find another interesting way to bypass the blocking.
In the HTTP requests to the blocked site, if the ‘Referer’ header is set to ‘assets.virginmedia.com’, then you are actually served the ‘blocked’ site:
> GET / HTTP/1.1
> User-Agent: curl/7.33.0
> Host: thepiratebay.pe
> Accept: */*
> Referer: assets.virginmedia.com
>
< HTTP/1.1 200 OK
< X-Powered-By: PHP/5.4.21
< Cache-Control: no-store, no-cache, must-revalidate
< Cache-Control: post-check=0, pre-check=0
< Pragma: no-cache
< Content-Type: text/html;charset=UTF-8
< Transfer-Encoding: chunked
* Server lighttpd is not blacklisted
< Server: lighttpd
I guess this is to allow VM staff to view the site (perhaps their internal version of the blocking page has a link to visit the site anyway, so they can verify whether the site should actually be blocked), but it’s interesting that it works for anyone using the Virgin Media network, meaning something as simple as the RefControl Firefox plugin could be used to circumvent the blocking.
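The Referer request shown above can also be built with Python’s standard library. This is a rough sketch rather than a tool: build_request is a name invented for this example, and the header values are simply the ones from the curl session. Nothing is sent over the network here; the request object is only constructed and inspected.

```python
import urllib.request


def build_request(url, referer=None):
    """Build a GET request, optionally with a Referer header set."""
    headers = {"User-Agent": "curl/7.33.0", "Accept": "*/*"}
    if referer is not None:
        headers["Referer"] = referer
    return urllib.request.Request(url, headers=headers)


# Same request as the curl session above, with the Referer that
# appeared to switch the blocking off.
request = build_request("http://thepiratebay.pe/",
                        referer="assets.virginmedia.com")
print(request.get_header("Referer"))
```

Passing the request to urllib.request.urlopen would actually send it. Whether the 200 response still comes back depends on Virgin Media not having changed the behaviour since.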