How The Internet Knows Everything About You

Note: This post was written in 2014.

Visiting a website seems simple, doesn’t it? Open a browser, type in the address and watch it load. Perhaps you are the organised sort and like to store sites in your bookmarks: one click and there you go. What most users don’t know, however, is that when you’re visiting one website, more often than not you’re actually visiting several.

The invaluable Firefox plugin Lightbeam provides a visual synopsis of these hidden website visits.
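
To see why one visit turns into several, here is a minimal Python sketch. It assumes a made-up HTML snippet (the markup and the `example-news.co.uk` and `example-tracker.net` hosts are hypothetical, not taken from any real page) and lists every host a browser would have to contact to load it. It is only an illustration of the idea, not how Lightbeam itself works.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical page markup, invented for illustration only.
SAMPLE_PAGE = """
<html><head>
<script src="http://static.example-news.co.uk/app.js"></script>
<script src="http://example-tracker.net/tag.js"></script>
</head><body>
<img src="http://img.example-news.co.uk/logo.png">
<img src="http://example-tracker.net/pixel.gif">
</body></html>
"""


class ResourceHosts(HTMLParser):
    """Collects the host of every src/href attribute that holds an absolute URL."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                host = urlparse(value).hostname
                if host:
                    self.hosts.add(host)


def hosts_in_page(html):
    """Return the sorted list of distinct hosts referenced by the page."""
    parser = ResourceHosts()
    parser.feed(html)
    return sorted(parser.hosts)


if __name__ == "__main__":
    for host in hosts_in_page(SAMPLE_PAGE):
        print(host)
```

Every host in that list receives its own request from your browser, even though only one address appears in the address bar.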

Let’s say I visit a website, for example the popular news website http://www.bbc.co.uk/. The homepage loads and I’m met with news summaries, articles and lots of images and links. My address bar reads http://www.bbc.co.uk/, so that’s where I’m visiting, right? Yes, my computer is connected to bbc.co.uk, but according to Lightbeam I am also connected to the following URLs (this review was done with the Firefox add-on AdBlock Plus installed):

  • bbc.com
  • bbci.co.uk
  • bbcimg.co.uk
  • effectivemeasure.net
  • gemius.pl
  • googletagservices.com
  • 2o7.net
  • scorecardresearch.com

Some of the URLs seem self-explanatory: bbc.com, bbci.co.uk and bbcimg.co.uk are used to store various scripts and images that power the main site. But what about the rest?

effectivemeasure.net – data collection agency, using data for marketing purposes:

“On our journey, we collected a large amount of data on an amazing range of audiences across the emerging markets, such as media consumption, site activity, consumer profile, sentiment and purchasing intent.”

A good chunk of that data comes from bbc.co.uk users.

gemius.pl – Polish site of a digital consultancy. Here’s what they say in their mission statement:

“We run our studies basing on our own innovative and complete methodology and technological infrastructure. We employ around 300 experts, including IT specialists, mathematicians & researchers, who are always ready to support your business. Monthly, we analyze 300 billion events occurring in the virtual reality i.e. page views and clicks made by the internet users!”

Yes, that’s an exclamation point on their mission statement. Classy. Some of those “page views” and “clicks” are coming from you when you use bbc.co.uk.

googletagservices.com – It’s never long before stumbling upon a Google product when performing an analysis of internet privacy. Google tag services is part of Google’s Tag Manager service:

“Google Tag Manager is free and easy, leaving more time and money to spend on your marketing campaigns. You manage your tags and configure your mobile applications yourself, with an easy-to-use web interface, rather than forcing you or your IT department to write or rewrite code.”

It is very kind of them to share such a useful feature for free. Oh, wait, in return Google tracks all user behaviour on bbc.co.uk.

2o7.net – This site redirects to Adobe’s hosted analytics service. The premium service purports to:

“enable companies to personalize and improve the performance of their websites, apps, social networking pages, and marketing activities. Adobe services collect and analyze information, such as clicks made by visitors when they use a company’s websites, apps, or social networking pages or view marketing emails or advertisements.”

So our use of bbc.co.uk is also being tracked and stored by Adobe. In answer to the question “Does Adobe share my personal information”, found on the site’s privacy policy page, we read YES:

“We may share or publish aggregate information that doesn’t specifically identify you, such as statistical information about visitors to our websites or statistical information about how customers use our applications and hosted services.”

You may not be referred to as “Tom Smith” but your actions on bbc.co.uk as USER281748696084 are logged, tracked and stored by a multitude of companies, many of which operate outside of the UK, the country in which the BBC and many of its website users reside.

scorecardresearch.com – Scorecard Research is, you’ve guessed it, another market research company:

“ScorecardResearch conducts research by collecting Internet web browsing data and then uses that data to help show how people use the Internet, what they like about it, and what they don’t…..participating websites agree to deploy a special code throughout their sites. Again, no personally identifiable information is ever transmitted by, or linked to, the web tags.”

There’s that reference to “personally identifiable” information again. So it’s ok, we’re all anonymous then? Unfortunately, that’s not the case.

How Websites Know Who We Are

When you access the internet, you’ll be using an IP (Internet Protocol) address. Each public IP address is unique at any given time. If you share a broadband router with other members of your household, you will usually all appear to websites under the router’s single public IP address, which your ISP assigns to your connection.

IP addresses are random numbers, so what’s the problem? An IP address can tell websites (and other websites tracking those websites):

– Your approximate location

– Your ISP (Internet Service Provider) aka those people you pay too much to every month for poor download speeds.

Your ISP knows your name and address and can (and does) give that information to police organisations when requested.

When you visit our example site, bbc.co.uk, the BBC, Adobe, Google and all those other companies store your IP address and correlate it with your user behaviour.

It’s ok, I delete my cookies!

Actually, it’s not. Even if you delete your cookies every 5 minutes, your IP address usually stays the same, and software can easily piece together your different cookie sessions into a complete collection of your user data.
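
As a rough illustration of how that piecing together could work, here is a minimal Python sketch. The log entries, cookie IDs and IP addresses are all invented, and real tracking systems are likely to combine many more signals than this; the point is only that grouping by IP address ignores whichever cookie happened to be set.

```python
from collections import defaultdict

# Hypothetical tracker log: (ip_address, cookie_id, page_visited).
LOG = [
    ("203.0.113.7", "cookie-A", "/news/politics"),
    ("203.0.113.7", "cookie-A", "/news/health"),
    ("203.0.113.7", "cookie-B", "/sport/football"),  # cookies deleted, new ID issued
    ("198.51.100.4", "cookie-C", "/weather"),
]


def sessions_by_ip(log):
    """Group page visits by IP address, regardless of which cookie was set."""
    cookies = defaultdict(set)
    pages = defaultdict(list)
    for ip, cookie_id, page in log:
        cookies[ip].add(cookie_id)
        pages[ip].append(page)
    return {
        ip: {"cookies": sorted(cookies[ip]), "pages": pages[ip]}
        for ip in pages
    }


if __name__ == "__main__":
    for ip, profile in sessions_by_ip(LOG).items():
        print(ip, profile["cookies"], profile["pages"])
```

In this made-up log, the visits under cookie-A and cookie-B end up in the same profile because they share an IP address.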

But it’s just one website, what’s the problem?!

If each website tracked and stored your ‘anonymous’ user data discretely, then there wouldn’t be much of a problem, but this is not the case. Most of us use more than one website. Let’s visit theguardian.com and see what Lightbeam shows us:

[Image: Lightbeam graph comparing theguardian.com and bbc.co.uk]

The little island off to the left is this site, internetfolks.com. See how we’re not tracking any data. The large ring of triangles around the “g” logo from The Guardian shows how many third party cookies this site uses. This example makes The BBC look tiny in comparison.

Here is a list of all the third party websites theguardian.com makes you visit without your knowledge:

  1. facebook-web-clients.appspot.com
  2. guardian-notification.appspot.com
  3. optimizely.com
  4. quantservice.com
  5. googleadservices.com
  6. guardianapps.co.uk
  7. chartbeat.com
  8. chartbeat.net
  9. googleapis.com
  10. ajax.googleapis.com
  11. google.com
  12. imrworldwide.com
  13. outbrain.com
  14. mathtag.com
  15. wunderloop.net
  16. dqwufkbc3sdtr.cloudfront.net
  17. ophan.co.uk
  18. stumbleupon.com
  19. guardianapis.com
  20. giurn.co.uk
  21. twitter.com

and now for some familiar faces:

2o7.net

and

scorecardresearch.com

As you can see in the Lightbeam diagram above, bbc.co.uk is connected with theguardian.com by two third-party sites: Adobe’s analytics service via 2o7.net and market research company scorecardresearch.com.

Now Adobe and Scorecard Research know everything you do not only on bbc.co.uk, but also on theguardian.com.
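
The overlap can be checked with a few lines of Python. This sketch simply takes the two lists reported above and intersects them; the function name `shared_trackers` is my own, and the lists are only as accurate as the Lightbeam readings they were copied from.

```python
# Third-party hosts reported by Lightbeam for each site, copied from the lists above.
BBC_THIRD_PARTIES = {
    "bbc.com", "bbci.co.uk", "bbcimg.co.uk", "effectivemeasure.net",
    "gemius.pl", "googletagservices.com", "2o7.net", "scorecardresearch.com",
}

GUARDIAN_THIRD_PARTIES = {
    "facebook-web-clients.appspot.com", "guardian-notification.appspot.com",
    "optimizely.com", "quantservice.com", "googleadservices.com",
    "guardianapps.co.uk", "chartbeat.com", "chartbeat.net", "googleapis.com",
    "ajax.googleapis.com", "google.com", "imrworldwide.com", "outbrain.com",
    "mathtag.com", "wunderloop.net", "dqwufkbc3sdtr.cloudfront.net",
    "ophan.co.uk", "stumbleupon.com", "guardianapis.com", "giurn.co.uk",
    "twitter.com", "2o7.net", "scorecardresearch.com",
}


def shared_trackers(*site_lists):
    """Return the hosts that appear in every one of the given site lists."""
    return sorted(set.intersection(*map(set, site_lists)))


if __name__ == "__main__":
    print(shared_trackers(BBC_THIRD_PARTIES, GUARDIAN_THIRD_PARTIES))
    # prints ['2o7.net', 'scorecardresearch.com']
```

Add a third or fourth site to the call and the same function shows which companies follow you across all of them.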

Let’s visit YouTube.com:

[Image: Lightbeam graph after visiting YouTube.com]

As you can see, the network is growing. If you check Lightbeam after a day of normal web usage you will be astounded by the amount of data these tracking companies collect about you. And it’s not just commercial sites that do it either: .gov websites do it too.

The interconnectivity of seemingly distinct websites and major dominators like Google mean that a lot of user data is stored by single companies. Google alone has access to the most data because of the startling number of tracking scripts and cookies they so ‘freely’ distribute. From fonts to website analytics to a search engine to online videos – it’s almost impossible to visit a single website and not come across a Google cookie.

But they don’t know my name!

Oh yes they do. Aside from the fact that your ISP can give your name (or the name of the person on your Internet bill) out if required, all you need to do is register on just one website with your name and/or address and they will all know it. Got a Gmail, Hotmail or Yahoo email account? Perhaps you buy on eBay or Amazon? Or maybe you file your tax returns online. The data holders only need to cross-reference your data as USER 357876969696 with a few other sites to find out who you are and where you live. And if you are an honest user of any social network like Facebook or LinkedIn, they also know where you work, who you live with, what you like, what you will buy... the list goes on. In fact, powerful market research companies today can predict your actions more accurately than your partner can.
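
Here is a minimal Python sketch of that cross-referencing step. Everything in it is invented for illustration: the browsing log, the `shop.example` site and the registration record are hypothetical, and a real data holder would be working with far larger and messier data.

```python
# Hypothetical 'anonymous' browsing history, keyed by a tracker's user ID.
BROWSING_LOG = {
    "USER357876969696": [
        "bbc.co.uk/news",
        "theguardian.com/uk",
        "shop.example/checkout",
    ],
    "USER281748696084": ["bbc.co.uk/sport"],
}

# Hypothetical record from one site where the same ID was seen next to a real name.
REGISTRATIONS = [
    {"tracker_id": "USER357876969696", "site": "shop.example", "name": "Tom Smith"},
]


def identify(browsing_log, registrations):
    """Attach a real name to every browsing history whose tracker ID also registered somewhere."""
    profiles = {}
    for registration in registrations:
        tracker_id = registration["tracker_id"]
        if tracker_id in browsing_log:
            profiles[registration["name"]] = browsing_log[tracker_id]
    return profiles


if __name__ == "__main__":
    for name, pages in identify(BROWSING_LOG, REGISTRATIONS).items():
        print(name, "->", pages)
```

One matching record is enough: the whole history stored under that ID now belongs to a named person, while the other ID stays anonymous only until it turns up on a sign-up form too.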

What’s the problem? Apart from the fact that nobody likes to be spied on, the misuse of your personal usage data can cause a lot of harm. Crime predicting technologies are being developed to rival the pre-cog wailing in Minority Report. At the moment, they are limited to geographical crime data, but what happens when they start tracking online criminal behaviour? If you watch a Dexter marathon on Netflix, will you go on some sort of list?

As we know from the recent revelations about the PRISM surveillance program, governments don’t like to miss out on some free data collection. Couple your website history with mobile phone records and your bank account and there’s very little left to the imagination when it comes to tracking people en masse. If the Internet knows everything about you, so does the government.

We can take action

Privacy and anonymity are not lost causes, but require web users to be more observant. Remember the basic premise of the Internet:

If it’s free, YOU are the product

From social networks to search engines, the web’s ‘free’ services have a hidden cost to your individual privacy. Use them, enjoy them, but don’t think for a second that you are anonymous or secure. If you’re happy for the government, Google and your next door neighbour to know it, put it online. If not, keep it to yourself.

Remember that your browser (Mozilla Firefox, Internet Explorer, Google Chrome etc) knows EVERYTHING about your website usage and those companies will share that data with marketing companies, government and anyone that wants it. Those with technical know-how are encouraged to pioneer a truly private and secure browser that is either paid for or runs on donations, much like Wikipedia does.

Make the effort to delete your browser cookies as often as possible, even though your data can still be tracked. You can also set your web browser to block third-party cookies, although many websites will not work properly if you do.

Some pioneers are taking a stand against information disclosure, such as Ladar Levison, creator of the now defunct secure email service Lavabit. Levison refused to disclose user data to the government (who allegedly were requesting Edward Snowden’s email correspondence) and instead shut down the service after being threatened with fines. A new encryption service from the Dark Mail Technical Alliance looks set to reanimate privacy for email customers.

The Internet is a remarkable tool; while it can be used by those in positions of power to monitor and control, it also possesses an undeniably democratic function. Anything is possible.