Besides IP addresses, how else could one be identified?

Question

OP: I'm curious how else an actor seeking to identify someone online could accomplish this task besides just using an IP address. What methods would they employ? What knowledge must they have of the target? What technology would they need access to?

Edit: Let's break identity into domains and discuss each domain in turn. First, the client's computer. How can an attacker identify a client from his computer? I.e., what digital fingerprint does he leave behind? (I am assuming that the attacker does not have physical access to the computer. I am asking what forensic traces of a particular computer are exposed to the web through normal internet use.)

Second, the client's cyber persona (all the email, social media and similar accounts, aggregated). Assuming an attacker can crawl the web, how can they effectively assemble the pieces of a persona into a whole package that can be used to identify a particular individual?

Finally, trust on the web. Who does the client have to trust to access commonly used services such as Google and Facebook? What are the points of attack that can leak secure information to untrusted third parties?

I've included a picture: enter image description here

Edit2: After further refinement, we have arrived at a chain connected to a computer user: Real Life Persona and Geography (RLPG) >>> Digital Fingerprint >>> User's Cyber Persona. An attacker shall be defined as an entity that works his way backwards, first aggregating the Cyber Persona, then linking it to digital fingerprints, and from there to the RLPG.

The deepest question can then be broken down into three parts. First, what access, tools and techniques must an attacker possess to rebuild the user's cyber persona? Second, how can they link that cyber persona to the user's digital fingerprints? And finally, how can both sets of information definitively identify an RLPG profile?

Someone's face? Really, you gotta be more clear in what situation you're imagining. I'd assume something online since you're talking about an IP address, but even then you might be talking about website visitors, someone who stole your computer, or an attack on your website with a DNS amplification attack, to name just a few. — Luc, Jul 03 '13 at 13:58
Have a look at Maltego for identifying and linking people with online entities. http://www.paterva.com/web6/products/maltego.php — Henning Klevjer, Jul 03 '13 at 16:38
This question is really way too broad and as such a clear candidate to be put on hold. I have two suggestions how to narrow it down: 1) Define _"identify"_. Do you mean here identifying a unique Internet user, or identifying an Internet user as a real-life persona? And 2) What data do we have access to? Meaning, are we an online service to which some user subscribes, enters a certain amount of personal or imaginary data and we're supposed to correlate this data with actual people, or are we talking of taking an IP address "from ether" (e.g. from a simple web server access log)? — TildalWave, Jul 03 '13 at 23:32
And for what it's worth, I don't think your question needs more incentive in the form of a bounty to attract better answers. What it needs is specificity. As it stands now, one could write a lengthy book on the subject and still not cover all of the possible answers to your question(s). — TildalWave, Jul 03 '13 at 23:44
@CharlesHoskinson, you need to limit this more. 'How can an attacker identify a client from his computer?' His fingerprints? DNA residue? He scratched his name into the screen for teh lulz? Are these acceptable answers? No? Please! Limit the question and you'll get better answers!! — NULLZ, Jul 05 '13 at 01:47
This is ridiculous and obviously a troll answer. I will edit to remove this FUD. — Charles Hoskinson, Jul 05 '13 at 07:41
I don't understand your last comment. We're working with you to make this question answerable within the scope of our Q&A. I hope I just misunderstood it? Regardless, I'm afraid your question is still too broad ever since you included the word "attacker" in it, because we can only speculate to what attacker's scheme "the victim" will be susceptible to, and what information an attacker could gain knowledge of, by, for example, gaining complete remote access to victim's computer. In short, we need to know what attacks are acceptable and what information the target is protecting. — TildalWave, Jul 05 '13 at 11:21
I am assuming an arbitrary computer user who has accessed the internet and is using normal services such as Google and Facebook. The attacker is not a spy with access to the physical hardware, but rather remote agent who is in the middle of the communications between services like Google and Facebook and also has access to the information left behind in the public domain. I am asking about the vectors this attacker may use to identify and understand the user. — Charles Hoskinson, Jul 05 '13 at 17:59

score 18 · Answer 1 · answered Jul 01 '13 at 17:45

In the case of web browsing, your software configuration usually provides a pretty unique fingerprint that can be tracked as you browse. Check out the Panopticlick project.

Also every piece of information you post to different sites will contain information about you. For example the time of your postings will give a clue about which time zone you are in even if you don't provide any personal information.
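The time-zone clue can be made concrete. What follows is a minimal sketch, not a vetted method: it assumes the poster sleeps roughly 23:00 to 07:00 local time, and the function and parameter names are made up for illustration. It looks for the quietest 8-hour window in the UTC hours of someone's posts and converts that into a guessed UTC offset.

```python
from collections import Counter
from datetime import datetime, timezone


def guess_utc_offset(post_times_utc, sleep_start_local=23, sleep_hours=8):
    """Guess a poster's UTC offset (in whole hours) from UTC post timestamps.

    Assumption: the quietest `sleep_hours`-long window is when the person
    sleeps, and sleep starts at `sleep_start_local` local time.
    """
    counts = Counter(t.hour for t in post_times_utc)

    def window_load(start_hour):
        # Number of posts inside the window that begins at start_hour (UTC).
        return sum(counts[(start_hour + i) % 24] for i in range(sleep_hours))

    # On ties, min() keeps the earliest candidate hour.
    quiet_start_utc = min(range(24), key=window_load)
    offset = (sleep_start_local - quiet_start_utc) % 24
    # Fold the result into the range -11..+12.
    return offset - 24 if offset > 12 else offset


# Example: posts spread over 12:00-03:59 UTC, nothing between 04:00 and 11:59.
posts = [datetime(2013, 7, 1, h, tzinfo=timezone.utc)
         for h in list(range(12, 24)) + list(range(0, 4))]
print(guess_utc_offset(posts))
```

With only a handful of posts the guess is noise, and shift workers or scheduled posts will defeat it entirely, so treat the output as one weak signal among many.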

buherator


Yes, fonts are the worst. 1 in 3094513 chance someone has the same font combination as me... – stoicfury Jul 05 '13 at 20:09
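A "1 in N" figure like the one in the comment above is often easier to compare when expressed in bits. This is a small sketch under stated assumptions: the function names are mine, and adding bits across attributes is only valid if the attributes are statistically independent, which they often are not.

```python
import math


def surprisal_bits(one_in_n):
    """Identifying information, in bits, of a trait shared by 1 in N users."""
    return math.log2(one_in_n)


def combined_bits(*one_in_n_values):
    # Assumption: the attributes are independent, so their bits simply add up.
    return sum(surprisal_bits(n) for n in one_in_n_values)


# The font combination from the comment above (1 in 3094513).
print(round(surprisal_bits(3094513), 1))
```

For scale, roughly 33 bits are enough to tell every person on Earth apart, so a single attribute worth more than 20 bits already does most of the work.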

u2702 · Answer 2 · 2013-07-01T18:13:59.430

7

The technology is pretty simple: your system and your browser send lots of interesting information with HTTP requests. All the server has to do is capture and log those attributes. Some combination of them could be used to correlate requests.

Here's a starting point:

TCP stack attributes

Browser capabilities

Browser cookies

Flash cookies (managed by the Flash plugin, not the browser)

Browser version

Browser user-agent

[Edit] HTML5-enabled browsers can also send a location. It's a browser setting that the user can control (you should check the default of your browsers). Devices with GPS can pull that information and send back lat/lng: http://www.w3schools.com/html/html5_geolocation.asp
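The "capture and log, then correlate" idea can be sketched in a few lines. This is an illustration only: the choice of headers, the function name and the truncated hash length are all assumptions, and a real tracker would likely combine many more signals than request headers.

```python
import hashlib

# Header names chosen for illustration; not an exhaustive or authoritative list.
FINGERPRINT_HEADERS = ("user-agent", "accept", "accept-language",
                       "accept-encoding", "dnt")


def header_fingerprint(headers):
    """Return a short, stable identifier derived from selected request headers."""
    # Header names are case-insensitive in HTTP, so normalize them first.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    material = "\n".join(
        f"{name}={normalized.get(name, '')}" for name in FINGERPRINT_HEADERS
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]


first_visit = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0",
    "Accept-Language": "en-US,en;q=0.5",
    "DNT": "1",
}
second_visit = {name.upper(): value for name, value in first_visit.items()}

# Two requests with the same attribute values map to the same identifier.
print(header_fingerprint(first_visit) == header_fingerprint(second_visit))
```

Logging this value next to each request lets a server group visits without any cookie, although many users will share the same value when only a few headers are used.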

edited Jul 01 '13 at 18:13

answered Jul 01 '13 at 17:44

u2702


It's important to note that no browser will collect or send GPS data without explicit permission from the user every time. If it could be collected automatically or silently, that'd be a major problem. To my knowledge no one has found an exploit that can do so so far; it's possible someone could in the future, but probably pretty unlikely. – Anorov Jul 02 '13 at 00:12

score 5 · Answer 3 · answered Jul 05 '13 at 17:59

An interesting but potentially endless question, even with the edits.

I'm going to assume for the sake of interest that the attacker's capabilities and knowledge are virtually limitless... many of these brainstorming ideas are going to take more than a single person on a shoestring budget.

Computer Identity

Here's a vague list:

Identifiers for location of host computer - IP address is the obvious one. Below that, depending on what you are snooping and how, I'd suspect you can get low enough to see what path the routing is taking to get a better sense of physical location. Getting far with this approach will likely require hacking some areas of the internet provider.
MAC address - generally identifies an aspect of the host hardware (like the network interface card) - can be spoofed.
keys for privileged communication - can't easily be gathered without hacking the other end point, but possible.
a lot of information can be gained about the computer or the local network supporting it based on how it communicates. This varies by the type of communication, but it's everything from the obvious browser type that is part of most HTTP communications (you don't see a lot of UNIX systems running IE, for example...) to more subtle nuances, like the fact that in my experience Windows machines use certain protocols ever so slightly differently than UNIX.
volume and speed of transmissions is going to give some sense of how the computer is connecting to the net - if you see video streaming, for example, you can be pretty sure the computer isn't dialing in by modem.
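One concrete instance of "systems speak the same protocol slightly differently" is the initial IP time-to-live value. The sketch below is a rough heuristic, not something the list above spells out: the table reflects commonly cited defaults (any of which can be reconfigured), and the function name is hypothetical.

```python
# Commonly cited default initial TTLs. This is an assumption for illustration;
# administrators can change these values, so the result is only a hint.
INITIAL_TTL_HINTS = {
    64: "Linux, macOS, and most Unix-like systems",
    128: "Windows",
    255: "some routers and network appliances",
}


def guess_from_ttl(observed_ttl):
    """Return (assumed initial TTL, estimated hop count, OS family hint)."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    # Each router on the path decrements the TTL by one, so pick the
    # smallest known default that is not below the observed value.
    initial = min(t for t in INITIAL_TTL_HINTS if t >= observed_ttl)
    return initial, initial - observed_ttl, INITIAL_TTL_HINTS[initial]


print(guess_from_ttl(117))
```

The hop estimate also says something about network distance, which ties in with the routing-path point earlier in the list.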

Social Media & Identity

I suspect it depends. Certainly the same aspect that lets us "find friends" on Facebook, Google+, LinkedIn, etc is a pretty great way of aggregating the identities of a given human. How much and what info you can glean from there has a lot to do with how the given individual is using the Net.

I've generally suspected that gaining enough information to generate a fairly accurate list of identities, email addresses, websites/blogs, and other public behavior of a person would be relatively easy. Many folks maintain consistent usernames from site to site, and pictures and friend links make aggregation easier. From there I'd think it's pretty easy to get an idea of a subset of the places the person may work, shop, visit or otherwise interact with, given the blending of socializing, shopping and marketing that is available at this point.
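The point about consistent usernames can be sketched as a simple grouping step. Everything here is hypothetical: the site names and handles are invented, and the normalization rule (lowercase, strip punctuation) is just one plausible choice that will produce both false matches and misses.

```python
import re
from collections import defaultdict


def normalize_handle(handle):
    """Lowercase a handle and drop separators, so 'J_Doe-42' matches 'jdoe42'."""
    return re.sub(r"[^a-z0-9]", "", handle.lower())


def group_accounts(accounts):
    """Group (site, handle) pairs whose normalized handles are identical."""
    groups = defaultdict(list)
    for site, handle in accounts:
        groups[normalize_handle(handle)].append((site, handle))
    # Keep only handles that show up on more than one site.
    return {key: found for key, found in groups.items() if len(found) > 1}


# Invented sample data.
seen = [
    ("forum.example", "J_Doe-42"),
    ("photos.example", "jdoe42"),
    ("chat.example", "someone_else"),
]
print(group_accounts(seen))
```

A shared handle is only a lead, not proof; profile pictures and friend links, as mentioned above, are what usually confirm that two accounts belong to the same person.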

A fairly reclusive friend of mine pointed out recently that he'd been "outed" on the internet. While the friend himself did not have an account on most social sites, his friends did, and there was no avoiding the pictures of "this is me with X" getting posted around, so that a fairly accurate shadow identity of my friend was formed, without his consent or involvement. He was essentially able to research himself based on what everyone else was mentioning about him. It included things like books he liked, places he'd been, and things he'd worked on.

I've generally thought, too, that it would be pretty easy to intuit a person's patterns - when do they wake, when do they work, what are they usually doing when they tweet/post/etc. - and to glean enough information to do a pretty awesome spear phishing attack or other deft social engineering.

Trust

IMO - far too much. Certainly many services can start as a low-sharing relationship - most Net services require an email, and the email may require a phone number. But the way data builds, private data can grow quickly.

The biggest challenge is that once data is shared, it can't be unshared. At a minimum, you need to trust the service, any transmission mechanism, and any cases where the service trusts or depends on an outside entity.

score 3 · Answer 4 · answered Jul 06 '13 at 17:32

The amount of information we leave behind when using the Internet and technologies originating from it is quite surprising to an average user. I don't think I could effectively cover all of it, but here's some of the knowledge I've gathered from work in ethical hacking.

Le Browser

Fingerprint: A browser (as others have mentioned) has a fingerprint through which a decent amount of data can be recovered:-

Browser Toolkit (Often the browser itself) with a version.
The Host System OS Information
Info on the expected return type, supported compression methods, etc.
And of course the IP.

Observe two such fingerprints below:-

Firefox: (v22.0)

root@kali:~# nc -lvp 80 
listening on [any] 80 ... 
GET / HTTP/1.1 
Host: 192.168.1.9 
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0 
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Connection: keep-alive

Google Chrome: (v27.0.1453.116 m)

root@kali:~# nc -lvp 80
listening on [any] 80 ...
GET / HTTP/1.1
Host: 192.168.1.9
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
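Beyond the header values, the two captures above also differ in the order in which the browsers send their headers, which is itself a usable signal. The sketch below uses abbreviated copies of those captures (with Firefox's Accept-Language, Accept-Encoding and DNT read as separate header lines); the function name is an assumption.

```python
FIREFOX_REQUEST = """GET / HTTP/1.1
Host: 192.168.1.9
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Connection: keep-alive
"""

CHROME_REQUEST = """GET / HTTP/1.1
Host: 192.168.1.9
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
"""


def header_order(raw_request):
    """Return the lowercase header names of a raw HTTP request, in wire order."""
    lines = raw_request.strip().splitlines()
    names = []
    for line in lines[1:]:  # skip the request line ("GET / HTTP/1.1")
        if not line.strip():
            break  # a blank line ends the header section
        name, separator, _value = line.partition(":")
        if separator:
            names.append(name.strip().lower())
    return tuple(names)


print(header_order(FIREFOX_REQUEST))
print(header_order(CHROME_REQUEST))
```

Because the order is baked into the browser, a spoofed User-Agent string that does not match the observed header order stands out.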

Cookies: Tracking cookies are all too famous for me to explain here. I found the following two references adequate to cover the basics:
- http://www.prontomarketing.com/files/2012/07/WHITEPAPER-The-Myth-Of-Accurate-Conversion-Tracking-Using-Google-Analytics-Summary-Ver-1.pdf
- http://www.postaffiliatepro.com/features/tracking-methods/
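For readers who have not seen one from the server side, a tracking cookie boils down to "hand out a random ID once, read it back on every later request". The following is a minimal sketch using only the Python standard library; the cookie name, lifetime and function name are assumptions, not taken from the references above.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical cookie name


def identify_visitor(cookie_header):
    """Return (visitor_id, set_cookie_value or None) for one incoming request.

    `cookie_header` is the raw value of the request's Cookie header, or None.
    """
    jar = SimpleCookie()
    if cookie_header:
        jar.load(cookie_header)
    if COOKIE_NAME in jar:
        # Returning visitor: the browser sent the ID back to us.
        return jar[COOKIE_NAME].value, None

    # New visitor: mint a random ID and ask the browser to keep it for a year.
    visitor_id = uuid.uuid4().hex
    outgoing = SimpleCookie()
    outgoing[COOKIE_NAME] = visitor_id
    outgoing[COOKIE_NAME]["max-age"] = 60 * 60 * 24 * 365
    outgoing[COOKIE_NAME]["path"] = "/"
    return visitor_id, outgoing[COOKIE_NAME].OutputString()


new_id, set_cookie = identify_visitor(None)
print(set_cookie)                                     # value for a Set-Cookie header
print(identify_visitor(f"{COOKIE_NAME}={new_id}"))    # same ID, nothing new to set
```

When the same cookie is set by a third party embedded on many sites, the ID links visits across all of them, which is what makes it a tracking cookie rather than a session cookie.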

Client-side script-based tracking: JavaScript, for example, runs on the client side, and while it can't open a /bin/sh backdoor to your system or access files, it can request pages, etc. using AJAX. Since it's on your local network, it can access intranet hosts. This has a number of applications that an attacker can exploit depending on the exact scenario (find which router they use, get license keys, access identifying info stored on the LAN). While the exact reference for this seems to have 404'd, please use the following as a POC reference. http://code.google.com/p/jslanscanner/

Clickjacking: Although using things like your camera, microphone or built-in geolocation tracking is supposed to require explicit user permission, clickjacking is one of the vectors that an attacker can exploit to get you to bypass this security measure. Documented uses include:

Tricking users into enabling their webcam and microphone through Flash

Tricking users into making their social networking profile information public

Making users follow someone on Twitter

Sharing links on Facebook

Offensive Security

Client-Side Vulnerability Exploitation: Client-side vulnerabilities in browsers, browser plugins, or local software allow a remote attacker to gain browser-level access privileges on the victim machine. Thereafter, any permitted files, resources, and global cookies can be accessed directly. Privilege escalation to root is also possible. Reference: The IE Aurora vuln is a good example of this. http://www.metasploit.com/modules/exploit/windows/browser/ms10_002_aurora

Server Exploitation: If the hacker cracks a server that has authority to, say, use your webcam, then the next time your user ID is encountered it is possible to access your resources as per the privileges you gave to the server. Government agencies and ISPs are known to track visitors to sites blocked by them.

Man In The Middle: Good ol' MITM attacks can steal sensitive information from users whose cryptographic protocol is too weak (yes, I said it, if your kung fu no good) or absent. This can happen on a local network, at a routing point, or at a Tor exit node or VPN node that has been compromised. I'm pretty sure Google will be able to answer this better.

tl;dr:

There's a lot to cover here, and I'm certain that I've missed out on a major portion of it, but as you can see there is definitely a traceable jet trail left behind if the tracking was implemented as a precautionary measure.

I had a hard internal debate between Rohan and Beh's answers. I wish I could do a split award. Good work guys — Charles Hoskinson, Jul 10 '13 at 18:57
@CharlesHoskinson Yep, his answer was pretty good. Thanks btw. ^_^" — Rohan Durve, Jul 12 '13 at 05:21

score 2 · Answer 5 · answered Jul 01 '13 at 17:51

It depends on the context.

Is this just a random Joe Schmoe on the internet? If so, they probably use the same username on more than one site. You can use Google or Spokeo to find other uses of that username and, hopefully, some social media accounts.

Is this a person who knows how to hide their identity online? If so, you probably won't find anything online, and it's time to start looking at motive and questioning people. If that person doesn't know you personally, did a good job of hiding their identity online, and didn't steal or break any of your stuff, just let it go, man. It's gone, and it's not worth it just for internet revenge.

Did this person steal from you, vandalize your websites, or damage you in any way? If so, look for the person who would gain the most from what happened, start asking questions, and involve law enforcement.

Do you own the attacked equipment? If you do, look at the HTTP headers and any connections made to your equipment by that person; they may have slipped up. You may get lucky and see something interesting that can identify that person or the machine they're on.

Good Luck!

Let's assume the attacker does not have access to the computer and is a non-government entity. Let's also assume that the attacker has never met the victim — Charles Hoskinson, Jul 01 '13 at 17:56
Then go with the first point above. Look at the username associated with the identity and try to find that username somewhere else on the net. That's a start at least. Hopefully you get lucky and find a Facebook account. Don't stop at just the username though. Look at all of the information you have right now, think about each piece individually, and think about how you can use it to get more information. When you find more information, repeat the process. — Four_0h_Three, Jul 01 '13 at 18:02
Also if someone puts their real name, age and state in their profile you can use a service like beenverified to find lots of information on them such as address, phone number, criminal records and lots of other information.. How's your brother William doing? ;) — Four_0h_Three, Jul 04 '13 at 15:49

score 2 · Answer 6 · answered Jul 05 '13 at 13:53

Here is a list of things that can be used to identify you and track you on different websites and such.

Your usernames (they are not going to be completely different from website to website).
Your location
Your browser and the settings and browser version (anything browser related)
The language you speak (if you visit Chinese forums, for example, it is easy to infer that you are Chinese, can speak it, or have some relationship with it)
Your profile image (can't be completely different all the time)
Your email
Any cookies you have on your computer
Your OS version

I didn't include IP because you said except IP.

You kinda included it by saying you didn't include it... — NULLZ, Jul 10 '13 at 01:22

score 2 · Answer 7 · answered Jul 05 '13 at 14:03

I notice nobody else has mentioned this one: Evercookie
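For anyone unfamiliar with it, the idea behind Evercookie is to store the same identifier in several browser storage mechanisms and to restore every copy whenever at least one survives. The real thing is a JavaScript library; what follows is only a toy Python model of the respawn logic, with invented storage names and function name.

```python
import uuid


def respawn(stores):
    """Restore a visitor ID across all storage slots and return it.

    `stores` maps a storage mechanism name to the saved ID, or None if the
    user cleared that mechanism. The dict is updated in place.
    """
    surviving = [value for value in stores.values() if value]
    # Reuse any surviving copy; only mint a fresh ID if every copy is gone.
    visitor_id = surviving[0] if surviving else uuid.uuid4().hex
    for name in stores:
        stores[name] = visitor_id
    return visitor_id


# The user cleared HTTP cookies and Flash storage but missed localStorage.
browser_state = {"http_cookie": None, "local_storage": "abc", "flash_lso": None}
print(respawn(browser_state))
print(browser_state)
```

The practical consequence is that clearing cookies alone does not reset the identifier; every storage slot has to be wiped in the same sitting.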

shieldfoss


score 1 · Answer 8 · edited Jul 01 '13 at 18:15

When you browse anything on the Internet a lot of information is logged:

Your IP address
Your location
Your browser version
Your user agent etc..

One possible way to prevent this type of tracking:

Use any premium VPN

Rory Alsop

answered Jul 01 '13 at 17:52

Jijo John


Please note that private mode won't hide any of the information you mention in the first bulletpoints. Also, you have to pay for the VPN somehow and money is easy to trace (especially if you are law enforcement). – buherator Jul 01 '13 at 17:56
To really prevent detection use Tor and Privoxy preferably running a live OS such as Tails and don't log into any personal accounts while browsing anonymously. – Four_0h_Three Jul 01 '13 at 17:57
Unless you live in bitcoin land. What about things like Mac addresses? – Charles Hoskinson Jul 01 '13 at 17:57
@CharlesHoskinson Good point with bitcoin. MAC address lives at layer 2 - it won't leak further than the next switch of the network. – buherator Jul 01 '13 at 18:18
My understanding is that bitcoin is traceable. Better to use a gift card bought with cash (check the fine print, not all gift cards are created equal as far as internet use goes). VPN in conjunction with a Virtual Machine can be used to provide a unique signature that is different from your usual browsing, just heed the cautions about not logging in or doing anything in the VM/VPN that would link you or your behaviors to your outside-of-VM/VPN self. – pseudon Jul 02 '13 at 23:16
VPN (premium or not) is an indicator. Sure, the physical address might not be obvious, but it can be traced down if someone puts enough effort into it. Also, a VPN will not hide IP addresses (it only provides an IP other than your "home IP"), it will not hide your user agent, etc. In fact, it will not change a darn thing besides giving you another IP via a 3rd party system located who-knows-where on this planet. As said, all that can be tracked down. Ask Assange, DotCom, Snowden and even the PirateBay guys, who all privately used VPNs. It didn't help them hide their location in the long run... – e-sushi Jul 06 '13 at 18:08

score 1 · Answer 9 · answered Jul 03 '13 at 13:37

Given a known email address and perhaps a web site login (handle), you may find significant amounts of information through website searches. For example, people tend to use the same user IDs repeatedly across the internet. In a recent investigation I was involved in, a known user ID led to a YouTube channel that included a video of a mobile device review, in which reflections of the reviewer and their precise location at the time were revealed. What can be determined varies on a case-by-case basis.

score 1 · Answer 10 · answered Jul 06 '13 at 17:57

While I agree with 99% of the answers posted here, I would like to add two words that have not been dropped yet: Behavioral Analysis.

Combine that with the multitude of information-snippets that your browser and your computer leak, and you've got a pretty good idea what is used by governmental companies and institutions to track individuals (beyond the usual user tracking you might know from advertisers and metrics-collecting companies). Practically, all one needs is the correct set of “triggers” and you'll be collecting metadata which will enable you to narrow things down to specific individuals.

score 0 · Answer 11 · answered Jul 06 '13 at 17:53

There are many factors to this. Randomness is only as good at the time the Internet was not yet designed. Today, whenever a person lays his hands on the keyboard to surf the net, he is "associating" and attaching a part of his personal identity to the network. This creates a link between his physical identity and the network realm, forming an identifiable "digital persona". This digital persona is a chain of raw information which can be used to identify his "personal identity".

If you are a member of a terrorist group and you dropped a bomb on US soil, then decided to post about it on Facebook through multiple VPN connections, believing you can eliminate any chance of a traceback, then you're just fooling yourself.

How easy or how hard the process of identification and verification is will, of course, depend on three (3) different things.

1. The raw data presented by your online and physical activities.

   a. Timestamps, names, addresses, emails, user IDs, passwords, gender, online shopping, bank transactions, the promo you signed up for at the mall, the search keywords you use and even typing patterns can be used to identify a person.

2. The raw data presented by the machine and external entities.

   a. Timestamps, IP addresses, browser and cache information, chat histories, logs, temp files, MAC addresses, digital signatures, machine fingerprint, network path, etc. The output of your dmesg contains unique identifiers such as your machine's hardware serials.

3. The skill level, motivation and tools of the attacker/person/organization trying to identify you.

   a. Establishing a link between your physical identity and your digital identity requires skill on the part of the attacker or the organization.
   b. Organizations can identify and track you down by consolidating the raw data presented above and through the use of advanced analytics, not including their connections.
   c. A malicious attacker can craft malware which can aid in data consolidation and analysis.
   d. Combining and linking these data, patterns and behaviors will produce a result that can match a person.

All of the above mentioned are components to identification.

Anonymity can only be achieved through constant randomness. The "anonymous" remain anonymous because they act through multiple personas. They move. They never reuse connections. They never allow patterns and behaviors to be studied.

*“Anonymity can only be achieved through **constant randomness**.”* isn't all that correct, as constant randomness is a **constant** and therefore allows dynamic fingerprinting functionalities to track it. — e-sushi, Jul 06 '13 at 18:02
Damn, the talk about a digital persona makes your answer cooler than mine. :/ — Rohan Durve, Jul 06 '13 at 18:15