Fraud looms on the minds of development teams for any website, but fraud prevention doesn’t have to be all-consuming, or even expensive. With a highly accurate identifier it becomes much easier for developers to triage suspicious traffic and restrict access to users attempting to hack into accounts, make fraudulent purchases or simply spam your website. The key to identifying those most likely to commit fraud is either by past activity, or by associating specific patterns of use with a higher likelihood of fraudulence.
Browser fingerprinting technologies are a cornerstone technology for developer-led fraud prevention that cuts through spoofing attempts to accurately identify users, and it can do this without requiring additional permissions from the user. Our open source browser fingerprinting library has over 12K stars on Github and is used by 8,000+ websites. Fingerprinting techniques on their own have been found to be over 90% accurate in correctly identifying a unique user in the browser, and when used in conjunction with usage history, fuzzy matching, and probability engines, this accuracy can be further improved.
Here’s an analogy: let’s say you’re a detective in a large city trying to find one specific car suspected of being a bank robber’s getaway vehicle, as captured by a security camera. To find this car your plan is to go to a busy intersection and take note of all the details of passing cars until you find one that matches the vehicle on the security camera. Ideally, you would like to be able to uniquely identify the car, such that only one vehicle in the city matches your description, otherwise you’ll have to question multiple drivers.
If the security camera caught some basic details (or signals) about the car, you’ll be able to narrow your search considerably:
With these signals, you may be able to uniquely identify the vehicle right away, especially if any of the specifics are particularly rare. However, in a city with millions of drivers, there may be hundreds of red Ford F-150 trucks with standard-issue tires. The more standard the combination of signals, the harder it is to get a unique match.
In those cases, you hope that your camera may have gotten lucky and matched on a more unique signal about the vehicle:
One of these signals may quickly narrow down your search. A red Ford F-150 truck with a local company’s logo could very well be unique, even in a large city.
There is, of course, the most uniquely identifiable element of a car - the license plate. License plates serve the express purpose of uniquely identifying a car, but they aren’t useful if the vehicle owner removes their plates or swaps them with fakes. It’s important to have a backup for when this method of identification fails.
By assembling a broad and comprehensive set of identifiers you can narrow the list of suspects to make singling out a bad-actor much easier.
Fingerprinting works almost exactly the same as the example above. Now, you are trying to identify a visitor to a web or mobile application (suspect) by capturing signals passed via the visitor’s browser or device (car) captured through a fingerprinting function (security camera).
A lot of signals can be captured by the browser, including:
When a visitor lands on a webpage, the fingerprinting function collects signals and compiles them into a hash that can be stored. Any time this visitor returns to the website, their fingerprint can be compared to past visit history to identify suspicious behavior or past fraudulent activity.
For a fingerprint to be useful as a method of identifying visitors, it needs to have a high accuracy. Our FingerprintJS Pro tool has a 99.5% accuracy rate, which means for every 1,000 visits, 995 are correctly associated with a unique identifier.
For the 5 out of 1,000 that are not correctly identified, they can be described as either false positives or false negatives:
To reduce false positives and false negatives, your fingerprint should combine not only many signals, but the right combination of signals that balance both uniqueness and stability. If a signal is highly unique, it will reduce your chances of a false negative, whereas stable signals will reduce your chances of a false positive.
This framework can also be used to remove signals from your fingerprinting function altogether. If a signal has both low uniqueness and low stability, it is likely to change or be spoofed frequently, and doesn’t contribute meaningfully to uniqueness. To our car example, this might be whether a car has a dirty windshield - you cannot count on this signal to improve your chances of finding the correct car. In the world of browser fingerprinting, current battery level is a poor signal, and so while it is accessible, we remove it from our fingerprinting function.
Special consideration should be given to perfectly unique identifiers that are not always available.
Cookies work by storing a unique identifier hash in the browser when a visitor first lands on your website. When a visitor has a cookie that matches a previous visit record in your database, you can be certain that these two visitors are the same. However, cookies are a very easy identifier to conceal:
One of the main advantages of fingerprinting is that it is stateless. A well-implemented fingerprint can remain stable through multiple sessions, incognito browsing, uninstalling or reinstalling apps, or clearing cookies. For that reason, using the two methods in conjunction with one another can give a higher % accuracy than either identification method alone.
FingerprintJS Pro achieves its high rate of accuracy by using fingerprinting, cookies and additional machine learning techniques that incorporate IP address and geolocation. One challenge is keeping up with changes in available signals as new browser versions are released. Anytime Chrome or Safari is updated, for example, identification techniques need to be re-evaluated to determine if further tweaks need to be made to keep accuracy high. Our team is constantly looking to improve our accuracy by iterating on the signals, algorithms, and techniques used.
An important first principle for dealing with fraud is that only a small percentage of visitors are responsible for the majority of fraud cases. A developer team aiming to reduce fraud on their website will need to find ways to isolate these visitors, verify their identity through authentication, and blacklist them as needed. However, you will want to avoid putting up these roadblocks for the rest of your ‘trusted’ traffic, as additional authentication can be detrimental to the user experience, slowing your users’ ability to access their account, make purchases, and engage with your website.
Account takeover is a common form of fraud where malicious users try to log in to other users’ accounts, and is an excellent use-case for fingerprinting technology. Additional security at login can make account takeover much more difficult, though the type of authentication used may depend on the suspicious behavior your website experiences most often:
For bot or brute force attacks (one user or a network of bots trying many combinations of usernames/passwords):
For phished accounts (a user obtained someone else’s legitimate login information through a scam or social engineering):
In each of these cases, an accurate fingerprint allows your team to pinpoint users that are attempting fraudulent behavior, while keeping your trusted users unhindered. The type of authentication needed can be incorporated into your website by using existing workflows without having to fundamentally change the architecture of your site.
It is also important to note that users intending to commit fraud are much more likely to use techniques to conceal their identity, including using incognito mode, browsing via a VPN, and disabling cookies. That’s also where fingerprinting shines, as it can associate these users without needing easily concealed identifiers like cookies and IP addresses.
Our FingerprintJS open source library as well as our Pro API are intended for browser fingerprinting - they can accurately identify visitors to a website using all modern mobile and desktop browsers. However, if you want to identify users of a native mobile app, you will need to use a device fingerprinting function that is made specifically for each mobile operating system. The signals available for mobile app developers are different from signals that can be retrieved in the browser, and vary between iOS, Android, and other mobile operating systems.
Our team recently launched Fingerprint Android, our first open source library for identifying unique Android devices. You can read more about how our Fingerprint Android library works in our article about its launch. Get Involved We would love to hear your questions and get feedback from the developer community on our fingerprinting technology.
Here are a few ways you can get involved
To make an accurate browser fingerprint, you need to gather as many signals as possible. In this article, we go over some of the techniques used to generate signals that vary between site visitors enough to be useful for browser fingerprinting.
Many websites are losing significant revenue from account sharing and may not even know the extent of the problem. In this article, we go over why subscription sharing might be costing your business more than you thought, and how to prevent it.