What is a Browser Fingerprint?
Browser fingerprinting is a method of tracking a web browser through the configuration and settings information that the browser exposes to websites.
Browser fingerprints are like the fingerprints on our hands: each one identifies an individual.
Just as the folds of human skin form a fingerprint that is different for everyone, a browser's combination of settings forms a pattern that few other browsers share.
To build a browser fingerprint, collect the identifying information of the browser and run a calculation (typically a hash) over it. The resulting value is the browser fingerprint.
Identifying information includes language, time zone, user agent, geographic location, and so on. The information you select determines the accuracy of the browser fingerprint.
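The "perform some calculations" step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method of any particular fingerprinting library: the attribute names are hypothetical, and FNV-1a is used only because it is short enough to show inline.

```javascript
// Minimal sketch: derive a fingerprint value from a set of browser attributes.
// The attribute names and the choice of FNV-1a are assumptions for illustration.

// 32-bit FNV-1a over the UTF-16 code units of a string, as 8 hex characters.
function fnv1a(text) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned
  }
  return hash.toString(16).padStart(8, "0");
}

// Serialize attributes in sorted key order so the same attributes always
// produce the same string, then hash that string.
function fingerprintFrom(attributes) {
  const canonical = Object.keys(attributes)
    .sort()
    .map((key) => `${key}=${String(attributes[key])}`)
    .join("|");
  return fnv1a(canonical);
}

const sample = {
  language: "en-US",
  timeZone: "America/New_York",
  userAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
};
console.log(fingerprintFrom(sample));
```

Sorting the keys keeps the value stable regardless of the order in which attributes were collected. Adding or removing an attribute changes the result, which is why the chosen attribute set determines how well the fingerprint distinguishes browsers.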
For a website, the fingerprint alone has little meaning; what matters is the user information that corresponds to it.
Collecting browser fingerprints and recording what each fingerprint's user does is valuable to a website, especially in scenarios where there is no user identity.
For example, on a content distribution website, user A likes to browse military content, and the browser fingerprint can record this interest. The website can then push military content to user A on the next visit, even without a login.
Now that personal computers are so widespread, this is a practical way to distribute content.
For users, linking personal browsing behavior to a browser fingerprint infringes on privacy to some degree.
Fortunately, the infringement is limited, and a site that abuses this tracking will wear out its users' goodwill.
The development of browser fingerprinting
First generation: stateful
This generation relies mainly on cookies and requires the user to log in before it yields useful information.
Second generation: browser fingerprint
Users are made more distinguishable by continually adding browser feature values,
for example the UA, browser plug-ins, Canvas, and AudioContext information.
Third generation: user behavior big data
By collecting user behavior and habits to build feature values, and even models, for each user, true tracking becomes possible. Current implementations of this approach are relatively complicated and are still being explored.
How to get the browser fingerprint
Get browser fingerprint from HTTP
Entropy is the average amount of information contained in each message received. The higher the entropy, the more information a message carries; the lower the entropy, the less it carries.
A browser fingerprint is synthesized from many browser features, and each feature value has a different information entropy.
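To make the idea of per-feature entropy concrete, here is a small sketch. The function names are my own, and the shares used in the example are made-up numbers rather than measured data.

```javascript
// Sketch: how many bits of identifying information a feature value carries.
// The example shares below are invented for illustration, not measurements.

// Surprisal of one value seen in a given share of browsers (0 < share <= 1).
function surprisalBits(share) {
  return -Math.log2(share);
}

// Shannon entropy of a whole feature: the average surprisal over its values.
// `shares` lists the share of each possible value and should sum to 1.
function entropyBits(shares) {
  return shares
    .filter((p) => p > 0)
    .reduce((sum, p) => sum + p * surprisalBits(p), 0);
}

console.log(surprisalBits(1 / 8));
console.log(entropyBits([0.5, 0.25, 0.25]));
```

A value shared by one browser in eight carries 3 bits, because log2(8) = 3. A feature whose values are spread evenly over many possibilities has high entropy, while a feature where nearly every browser reports the same value has entropy close to zero.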
Browser fingerprints can also be simply divided into ordinary fingerprints and advanced fingerprints.
Ordinary fingerprints can be understood as parts that are easy to find and easy to modify.
For example, HTTP headers:
"Accept-Encoding": "gzip, deflate, br",
"Sec-Ch-Ua": "\" Not A;Brand\";v=\"99\", \"Chromium\";v=\"90\"",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36",
HTTP headers such as Accept-Language and User-Agent reveal the browser's language and other details.
The Accept-Language value may come from the language of your operating system or from the language set in the browser.
This header is not necessarily accurate. Some websites ignore it and use the IP address to determine the language of the user's region.
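As a rough sketch of how a server might read the language preference, the function below parses an Accept-Language value into tags ordered by weight. The function name and the sample header value are illustrative assumptions; the sample is not taken from the capture above.

```javascript
// Sketch: parse an Accept-Language header into language tags ordered by
// preference. The sample header value below is an invented example.
function parseAcceptLanguage(header) {
  return header
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => {
      const [tag, ...params] = part.split(";").map((s) => s.trim());
      const qParam = params.find((p) => p.startsWith("q="));
      const q = qParam ? Number.parseFloat(qParam.slice(2)) : 1; // default weight is 1
      return { tag, q: Number.isNaN(q) ? 0 : q };
    })
    .sort((a, b) => b.q - a.q) // stable sort keeps header order for equal weights
    .map((entry) => entry.tag);
}

console.log(parseAcceptLanguage("zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7"));
```

The full ordered list is more identifying than the first tag alone, since two users with the same primary language often differ in their fallback languages.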
User-Agent contains information about the browser and operating system.
For example, I am currently using macOS and version 77 of Chrome.
If the UA is faked only in the HTTP header, a script on the page can still read the browser's own value through navigator.userAgent.
Other basic information, such as the IP address, physical address, and geographic location, can also be obtained.
Other ways to get browser fingerprints
In addition to fingerprints obtained from HTTP, browser characteristic information can also be obtained in other ways.
1. User-agent string for each browser
2. HTTP ACCEPT header sent by the browser
3. Screen resolution and color depth
4. System setting time zone
5. Browser extensions/plugins installed in the browser
For example, Quicktime, Flash, Java or Acrobat, and the versions of these plug-ins
6. Fonts installed on the computer, font fingerprint technology
8. Browser cookie
9. Hash of the image generated by Canvas fingerprint
10. Hash of the image generated by the WebGL fingerprint
11. Whether the browser is set to “Do Not Track”
12. The system platform
For example Win32, Linux x86
13. System language
For example, zh-CN, en-US
14. Whether the browser supports a touch screen
15. Hardware concurrency
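Several of the items in this list can be read from the standard navigator and screen objects. The sketch below is only an illustration: the function name and the shape of the returned object are my own choices, and the objects are passed in as parameters so the function can also be tried outside a browser with stand-in values.

```javascript
// Sketch: gather several of the listed attributes from the standard
// navigator and screen objects. The returned field names are illustrative.
function collectAttributes(nav, scr) {
  return {
    userAgent: nav.userAgent,
    platform: nav.platform, // e.g. "Win32"
    language: nav.language,
    hardwareConcurrency: nav.hardwareConcurrency,
    doNotTrack: nav.doNotTrack,
    cookieEnabled: nav.cookieEnabled,
    touchPoints: nav.maxTouchPoints, // 0 usually means no touch screen
    screen: `${scr.width}x${scr.height}x${scr.colorDepth}`,
    timeZone: Intl.DateTimeFormat().resolvedOptions().timeZone,
  };
}

// In a browser, pass the real objects.
if (typeof window !== "undefined") {
  console.log(collectAttributes(window.navigator, window.screen));
}
```

Some of these properties, such as navigator.platform and navigator.doNotTrack, are deprecated and may return fixed or empty values in newer browsers, so a collector should tolerate missing fields.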
After getting these values, you can perform some calculations to obtain the information entropy of the browser fingerprint and a UUID for the browser.
Combining the features into one fingerprint greatly reduces the collision rate and improves the accuracy of the client UUID.
Features are also ranked: feature values with greater information entropy rank higher.
The information in ordinary fingerprints is still not unique enough. After all, a great many people in the United States use macOS.
Advanced fingerprints narrow the range further and can almost directly determine a unique browser identity.
Types of browser fingerprints
Canvas is a dynamic drawing element in HTML5 that can also be used to generate or process images.
The same drawing operations on an HTMLCanvasElement produce slightly different images on different operating systems and browsers.
At the image-format level, different browsers use different graphics processing engines, image export options, and default compression levels.
At the pixel level, each operating system uses different settings and algorithms for anti-aliasing and sub-pixel rendering.
As a result, even with identical drawing operations, the CRC checksum of the generated image data differs.
Canvas is supported by almost all major browsers and is available on most PCs, tablets, and smartphones.
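A Canvas fingerprint along these lines can be sketched as below. The drawn text, colors, and canvas size are arbitrary choices for illustration, and CRC-32 is used only because the text mentions a CRC check; other hashes would work equally well.

```javascript
// Sketch: draw fixed content on a canvas and checksum the exported image.
// The drawn text, colors, and sizes are arbitrary illustrative choices.

// Bitwise CRC-32 (reflected polynomial 0xEDB88320) over the low byte of
// each UTF-16 code unit, returned as 8 hex characters.
function crc32(text) {
  let crc = 0xffffffff;
  for (let i = 0; i < text.length; i++) {
    crc ^= text.charCodeAt(i) & 0xff;
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 1 ? (crc >>> 1) ^ 0xedb88320 : crc >>> 1;
    }
  }
  return ((crc ^ 0xffffffff) >>> 0).toString(16).padStart(8, "0");
}

// Returns a checksum of the rendered image, or null if no 2D context exists.
function canvasFingerprint(canvas) {
  canvas.width = 240;
  canvas.height = 60;
  const ctx = canvas.getContext("2d");
  if (!ctx) return null; // canvas unavailable or blocked
  ctx.textBaseline = "top";
  ctx.font = "16px Arial";
  ctx.fillStyle = "#f60";
  ctx.fillRect(10, 10, 100, 30);
  ctx.fillStyle = "#069";
  ctx.fillText("fingerprint, 123 !", 4, 20);
  // The default export is a base64 PNG data URL, which is plain ASCII.
  return crc32(canvas.toDataURL());
}

// In a browser: canvasFingerprint(document.createElement("canvas"))
```

Text rendering is included deliberately, because font rasterization and anti-aliasing are where systems tend to differ most at the pixel level.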
A WebGL context (canvas.getContext("webgl")) can be obtained from an HTMLCanvasElement, and the user's hardware information can be read through it:
graphics card name, graphics card model, graphics card manufacturer, and so on.
For example: ANGLE (NVIDIA GeForce GTX 1050 Ti Direct3D11 vs_5_0 ps_5_0), Google Inc.
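Strings like the one above are typically read through the WEBGL_debug_renderer_info extension. The following is a hedged sketch: the function name is my own, and because browsers may withhold or mask this extension, every step is treated as optional.

```javascript
// Sketch: read the GPU vendor and renderer strings through WebGL.
// Browsers may withhold or mask WEBGL_debug_renderer_info, so the code
// falls back to the generic strings when the extension is missing.
function webglInfo(canvas) {
  const gl = canvas.getContext("webgl") || canvas.getContext("experimental-webgl");
  if (!gl) return null; // WebGL unavailable or disabled
  const ext = gl.getExtension("WEBGL_debug_renderer_info");
  if (!ext) {
    return {
      vendor: gl.getParameter(gl.VENDOR),
      renderer: gl.getParameter(gl.RENDERER),
    };
  }
  return {
    vendor: gl.getParameter(ext.UNMASKED_VENDOR_WEBGL),
    renderer: gl.getParameter(ext.UNMASKED_RENDERER_WEBGL),
  };
}

// In a browser: webglInfo(document.createElement("canvas"))
```

Note that the context name is lowercase: getContext("webgl") succeeds where getContext("WebGL") returns null.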
Hardware is rarely replaced; some computers keep the same components until they are scrapped.
There are also many kinds of computer hardware. Although the collision rate of this value is high, it can still be used as part of the user's fingerprint.
The more information is collected from a user, the more uniquely the combined fingerprint identifies that user, so even this value should not be ignored.
With the AudioContext API, audio can be generated, processed, and reused, for example by shifting the pitch or splitting a track. It could even be called a web version of Adobe Audition.
The AudioContext fingerprint works as follows:
Method 1: Generate an audio stream (triangle wave), apply an FFT to it, and calculate the SHA hash of the result as the fingerprint.
Method 2: Generate an audio stream (sine wave), apply dynamics compression to it, and calculate the MD5 hash of the result.
In both methods the audio is discarded before it is output to the audio device, so the user is unaware that a fingerprint was taken.
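Method 2 can be sketched with OfflineAudioContext, which renders audio into a buffer without playing it. This is an illustration under assumptions rather than a reference implementation: the oscillator frequency, the sample window, and the reduction to a sum of absolute values are my own choices, whereas the method described above hashes the output with MD5.

```javascript
// Sketch of "Method 2": render a sine wave through a dynamics compressor
// with OfflineAudioContext (nothing is played) and reduce the samples to
// one number. Frequency, window, and reduction are illustrative choices.

// Reduce rendered samples to a single value by summing absolute values.
function summarizeSamples(samples) {
  let sum = 0;
  for (const sample of samples) sum += Math.abs(sample);
  return sum;
}

// Browser-only: resolves to the summary value, or null if unsupported.
async function audioFingerprint() {
  const Ctx = globalThis.OfflineAudioContext;
  if (!Ctx) return null;
  const ctx = new Ctx(1, 44100, 44100); // 1 channel, 1 second at 44.1 kHz
  const oscillator = ctx.createOscillator();
  oscillator.type = "sine";
  oscillator.frequency.value = 1000;
  const compressor = ctx.createDynamicsCompressor();
  oscillator.connect(compressor);
  compressor.connect(ctx.destination);
  oscillator.start(0);
  const buffer = await ctx.startRendering();
  // Use a fixed slice of the rendered samples as the signal to summarize.
  return summarizeSamples(buffer.getChannelData(0).subarray(4500, 5000));
}
```

Because the context is offline, no sound reaches the speakers and no permission prompt appears, which matches the point that the user does not notice the measurement.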
Basic principles of the AudioContext fingerprint:
Subtle differences in the hardware or software of the host or browser cause differences in how audio signals are processed.
The same browser on the same device produces the same audio output.
Different machines or different browsers produce different audio output.
As this shows, the AudioContext and Canvas fingerprint principles are very similar.
Both exploit differences in hardware or software. The former generates audio, the latter generates images, and each then computes a hash value as the identifier.
WebRTC (Web Real-Time Communication) gives the browser real-time audio and video communication capabilities.
It provides three main APIs that let JS obtain and exchange audio and video data in real time: MediaStream, RTCPeerConnection, and RTCDataChannel.
To establish a WebRTC connection, the peers must learn each other's real IP addresses (NAT traversal), so RTCPeerConnection exposes address candidates through its API.
As a result, JS can obtain the user's IP address directly.
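The sketch below shows roughly how a page could read addresses from ICE candidates. The function names are my own and the STUN server URL is a placeholder, not a real service. Modern browsers often replace local addresses with random ".local" mDNS names, so the result may not be a usable IP.

```javascript
// Sketch: gather ICE candidates and pull the address field out of each.
// The STUN URL is a placeholder assumption, not a real service.

// A candidate line looks like:
// "candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ..."
function parseCandidate(line) {
  const fields = line.trim().split(/\s+/);
  if (fields.length < 8 || fields[6] !== "typ") return null;
  return { address: fields[4], port: Number(fields[5]), type: fields[7] };
}

// Browser-only: calls onAddress for each candidate the browser reveals.
// Returns the connection, or null when WebRTC is unavailable.
function watchCandidates(onAddress, stunUrl = "stun:stun.example.org:3478") {
  if (typeof RTCPeerConnection === "undefined") return null;
  const pc = new RTCPeerConnection({ iceServers: [{ urls: stunUrl }] });
  pc.createDataChannel("probe"); // a channel is needed so ICE gathering starts
  pc.onicecandidate = (event) => {
    if (!event.candidate) return; // a null candidate marks the end of gathering
    const parsed = parseCandidate(event.candidate.candidate);
    if (parsed) onAddress(parsed);
  };
  pc.createOffer().then((offer) => pc.setLocalDescription(offer));
  return pc;
}
```

Candidates of type "srflx" carry the public address seen by the STUN server, while "host" candidates carry local interface addresses or their mDNS replacements.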
The browser fingerprints described above are all obtained within a single browser.
However, many feature values are not stable across browsers. For example, UA and canvas fingerprints are completely different when a page is opened in different browsers on the same device.
The same fingerprinting algorithm therefore does not produce a matching value in different browsers.
Cross-browser fingerprinting relies on stable features that yield the same or similar values even in different browsers.
How to protect the browser fingerprint
Unless you have enough expertise or change your browser information very frequently, your browser fingerprint will almost certainly be exposed. Of course, this is not necessarily a bad thing.
The leaked privacy is very one-sided: it reveals only part of the user's behavior when browsing the web.
The value is limited: the behavior is not tied to an actual account or a specific person, so little can be gained from it.
There are beneficial uses: browser fingerprinting can help isolate some abusive users and prevent vote rigging or other malicious behavior.
Even so, there are some measures for protecting against browser fingerprinting.
Do Not Track
The HTTP header can carry a “DNT” (“Do Not Track”) flag. A value of 1 means “do not track my behavior on this site”, and 0 means tracking is allowed.
Even without cookies, this flag tells the server that you do not want to be tracked or to have your behavior recorded.
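On the server side, honoring the flag amounts to reading one header. The sketch below assumes the headers arrive as a plain object with lower-cased names, which is how Node's http module exposes them; the function name and the returned labels are my own.

```javascript
// Sketch: interpret the DNT request header on the server.
// Assumes `headers` is a plain object with lower-cased header names.
function trackingPreference(headers) {
  const value = headers["dnt"];
  if (value === "1") return "do-not-track";
  if (value === "0") return "tracking-allowed";
  return "unspecified"; // header absent or unrecognized
}

console.log(trackingPreference({ dnt: "1" }));
console.log(trackingPreference({}));
```

Treating a missing header as "unspecified" rather than as consent keeps the three cases distinct, so a site can choose its own default.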
The bad news is that most websites currently do not abide by this agreement and completely ignore the “Do Not Track” signal.
From the discussion above, it is easy to see that the more distinctive features your browser has, the easier it is to trace.
Conversely, if you deliberately hide certain browser features or heavily customize them, then congratulations: your browser may now have a unique fingerprint that distinguishes you from everyone else without any calculation at all.
Therefore, an effective approach is to make your feature values as common as possible.
For example, the most widely used combination is Windows 10 + Chrome.
Changing your UA to this combination is an effective method. At the same time, try to prevent websites from acquiring feature values with very high information entropy, such as canvas fingerprints.
ClonBrowser does a lot of work in this area to prevent these techniques from being used to track ClonBrowser users.
In response to Panopticlick and other fingerprinting experiments, ClonBrowser now includes several patches.
Font fingerprinting is prevented by restricting the fonts a website can use, and Canvas fingerprinting is prevented by detecting reads of HTML5 Canvas objects and requiring user approval.
For example, for Canvas fingerprinting, ClonBrowser lets you choose settings such as off, noise, or block to suit your business scenario.
In summary, these measures make ClonBrowser a powerful defense against fingerprinting that helps protect your privacy.