The amount of information we leave behind when using the Internet and technologies originating from it is quite surprising to an average user. I don't think I could effectively cover all of it, but here's some of the knowledge I've gathered from work in ethical hacking.
Le Browser
Fingerprint:
A browser (as others have mentioned) has a fingerprint from which a decent amount of data can be recovered:
- Browser Toolkit (Often the browser itself) with a version.
- The Host System OS Information
- Info on the expected return type, supported compression methods, etc.
- And of course the IP.
Observe two such fingerprints below:
Firefox: (v22.0)
root@kali:~# nc -lvp 80
listening on [any] 80 ...
GET / HTTP/1.1
Host: 192.168.1.9
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Connection: keep-alive
Google Chrome: (v27.0.1453.116 m)
root@kali:~# nc -lvp 80
listening on [any] 80 ...
GET / HTTP/1.1
Host: 192.168.1.9
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
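To make this concrete, here is a rough Python sketch of how a server could boil a request like the ones above down to a single identifier. It is only an illustration under my own assumptions: the choice of headers in FINGERPRINT_HEADERS, the helper names parse_headers and fingerprint, and the client IP 192.168.1.5 are all hypothetical and not taken from any particular tracking product. Note that it also hashes the order in which the headers arrive, since the two captures above show Firefox and Chrome sending them in a different order.

```python
import hashlib

# Raw request as captured by the netcat listener (the Firefox 22 example).
RAW_REQUEST = (
    "GET / HTTP/1.1\r\n"
    "Host: 192.168.1.9\r\n"
    "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) "
    "Gecko/20100101 Firefox/22.0\r\n"
    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n"
    "Accept-Language: en-US,en;q=0.5\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "DNT: 1\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

# Assumption: these headers stay fairly stable for one browser install.
FINGERPRINT_HEADERS = (
    "user-agent", "accept", "accept-language", "accept-encoding", "dnt",
)


def parse_headers(raw):
    """Return (header names in arrival order, {lowercased name: value})."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")[1:]  # skip the request line (GET / HTTP/1.1)
    order, values = [], {}
    for line in lines:
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line, ignore it
        key = name.strip().lower()
        order.append(key)
        values[key] = value.strip()
    return order, values


def fingerprint(raw, client_ip):
    """Hash the client IP, header order and selected header values."""
    order, values = parse_headers(raw)
    parts = [client_ip, ",".join(order)]
    parts += [f"{h}={values.get(h, '')}" for h in FINGERPRINT_HEADERS]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    order, values = parse_headers(RAW_REQUEST)
    print("Browser/OS string:", values["user-agent"])
    print("Header order:", order)
    # 192.168.1.5 is a made-up client address for the example.
    print("Fingerprint:", fingerprint(RAW_REQUEST, "192.168.1.5"))
```

The same request from the same address always hashes to the same value, so no cookie is needed to recognise a returning visitor as long as the headers and IP do not change.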
Cookies
Tracking cookies are all too famous for me to explain here. I found the following two references adequate to cover the basics.
- http://www.prontomarketing.com/files/2012/07/WHITEPAPER-The-Myth-Of-Accurate-Conversion-Tracking-Using-Google-Analytics-Summary-Ver-1.pdf
- http://www.postaffiliatepro.com/features/tracking-methods/
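For readers who want the gist without the reading, here is a minimal Python sketch of the server side of a tracking cookie, using the standard library's http.cookies module. The cookie name visitor_id, the one-year lifetime and the helper issue_tracking_cookie are my own assumptions for illustration, not something taken from the references above.

```python
from http.cookies import SimpleCookie
import uuid

ONE_YEAR = 60 * 60 * 24 * 365  # cookie lifetime in seconds (an assumption)


def issue_tracking_cookie(cookie_header):
    """Return (visitor_id, Set-Cookie value or None) for one request.

    cookie_header is the raw value of the request's Cookie header,
    or an empty string if the browser sent none.
    """
    jar = SimpleCookie()
    jar.load(cookie_header)
    if "visitor_id" in jar:
        # Returning visitor: the browser handed our ID back to us.
        return jar["visitor_id"].value, None

    # First visit: mint a random ID and ask the browser to keep it.
    visitor_id = uuid.uuid4().hex
    out = SimpleCookie()
    out["visitor_id"] = visitor_id
    out["visitor_id"]["max-age"] = ONE_YEAR
    out["visitor_id"]["path"] = "/"
    return visitor_id, out["visitor_id"].OutputString()


if __name__ == "__main__":
    first_id, set_cookie = issue_tracking_cookie("")
    print("Set-Cookie:", set_cookie)
    again_id, nothing = issue_tracking_cookie("visitor_id=" + first_id)
    print("Recognised returning visitor:", again_id == first_id)
```

Every later request that carries the cookie can be tied back to the first one, which is all a tracker needs to build a visit history.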
Client-side script-based tracking
JavaScript, for example, runs on the client side. While it can't open a /bin/sh backdoor on your system or read your files, it can request pages and other resources using AJAX. Because the script runs in your browser, which sits on your local network, it can reach intranet hosts. An attacker can exploit this in a number of ways depending on the exact scenario (finding out which router you use, retrieving license keys, accessing identifying info stored on the LAN).
The reference I originally had in mind seems to have 404'd, so please use the following as a proof-of-concept reference instead: http://code.google.com/p/jslanscanner/
Clickjacking
Although using things like your camera, microphone or built-in geolocation tracking is supposed to require explicit user permission, clickjacking is one of the vectors an attacker can exploit to trick you into bypassing this security measure.
Documented uses include:
- Tricking users into enabling their webcam and microphone through Flash
- Tricking users into making their social networking profile information public
- Making users follow someone on Twitter
- Sharing links on Facebook
Offensive Security
Client-Side Vulnerability Exploitation
Client-side vulnerabilities in browsers, browser plugins, or local software allow a remote attacker to gain browser-level access privileges on the victim machine. From there, any files, resources, and global cookies the browser is permitted to use can be accessed directly. Privilege escalation to root may also be possible.
Reference: The IE Aurora vuln is a good example of this. http://www.metasploit.com/modules/exploit/windows/browser/ms10_002_aurora
Server Exploitation
If an attacker compromises a server that you have authorized to, say, use your webcam, then the next time your user ID is encountered the attacker can access your resources with whatever privileges you gave that server.
Government agencies and ISPs are known to track visitors to sites blocked by them.
Man In The Middle
Good ol' MITM attacks can steal sensitive information from users whose connection uses a cryptographic protocol that is too weak (yes, I said it, if your kung fu no good) or none at all. This can happen on a local network, at a routing point, at a Tor exit node, or at a VPN node that has been compromised. I'm pretty sure Google will be able to answer this better.
tl;dr:
There's a lot to cover here, and I'm certain I've missed a major portion of it, but as you can see, you definitely leave a traceable jet trail behind wherever tracking has been put in place as a precautionary measure.