Introduction
In today’s digital age, vast amounts of data are collected from Internet users every time they interact with the Internet and its services. Whether users know it or not, each mouse click, keystroke, and interaction leaves behind data. Organizations then scoop up this data to build profiles of users. This constant stream of information has raised concerns about privacy and the ethical use of data.
This document describes the data collected from Internet users and the methods used to collect it. In doing so, it aims to shed light on the complexities of online data collection and to encourage readers to actively protect their digital privacy.
What is Data?
Data can be formally defined as digital information that is generated, processed, or collected. This includes everything from a simple email to payment information entered on Amazon. At its core, this information can be broken down into binary code: ones and zeros. A single binary digit is called a bit, the smallest unit of data. The next unit up is the byte, which is made up of eight bits. A kilobyte (KB) and a gigabyte (GB) comprise one thousand and one billion bytes, respectively. You may have seen these units when downloading something, as they are commonly associated with file sizes and storage limits.
With that in mind, researchers estimated that one hundred and twenty-three zettabytes of data were generated globally in 2023, a substantial portion of it Internet user data. One zettabyte is equivalent to one trillion gigabytes.
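To make the scale concrete, the unit relationships above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the decimal (SI) definitions used in this article; the constant and function names are illustrative only.

```python
# Decimal (SI) storage units expressed in bytes, as described in the text.
BITS_PER_BYTE = 8
KILOBYTE = 10**3    # one thousand bytes
GIGABYTE = 10**9    # one billion bytes
ZETTABYTE = 10**21  # one trillion gigabytes


def zettabytes_to_gigabytes(zettabytes: int) -> int:
    """Convert a whole number of zettabytes to gigabytes."""
    return zettabytes * ZETTABYTE // GIGABYTE


# One zettabyte expressed in gigabytes (one trillion).
one_zb_in_gb = zettabytes_to_gigabytes(1)

# The 2023 estimate quoted above (123 ZB), expressed in gigabytes.
estimate_2023_in_gb = zettabytes_to_gigabytes(123)

# Number of bits in a single gigabyte.
bits_per_gigabyte = GIGABYTE * BITS_PER_BYTE

print(one_zb_in_gb, estimate_2023_in_gb, bits_per_gigabyte)
```

Integer arithmetic is used throughout so that the very large values stay exact.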
Types of Data Collected
Data collected from internet users falls into three primary categories:
1. Personal data: This includes identifiable information like names, addresses, and emails. Users often provide this information directly when creating accounts or making purchases online.
This figure shows examples of personal data [2].
2. Behavioral data: This type tracks a user’s activities on the internet, such as which websites they visit, what products they view, and how long they stay on specific pages.
This figure provides examples of behavioral data [6].
3. Metadata: This data is collected in the background without direct input from the user. It includes details like the user’s IP address, browser type, and location. Unlike personal data, metadata doesn’t directly identify a person but can be used to infer patterns and behaviors.
This figure shows examples of metadata [1].
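As a rough illustration of how metadata arrives without any input from the user, the sketch below pulls a few fields out of an ordinary web request. The function name, the field names, and the sample values are assumptions made for this example; the header names (User-Agent, Accept-Language) are standard HTTP headers, and the IP address comes from a range reserved for documentation.

```python
def extract_metadata(remote_addr: str, headers: dict[str, str]) -> dict[str, str]:
    """Collect background details that a browser sends with every request."""
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "ip_address": remote_addr,
        "browser": lowered.get("user-agent", "unknown"),
        "language": lowered.get("accept-language", "unknown"),
    }


# Hypothetical request: the user typed nothing, yet these details are available.
sample = extract_metadata(
    "203.0.113.7",
    {"User-Agent": "ExampleBrowser/1.0", "Accept-Language": "en-US"},
)
print(sample)
```

None of these fields names a person on its own, but stored over many visits they can reveal patterns of behavior.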
Explicit vs. Implicit Data Collection
Data collection falls into two categories: explicit and implicit. Explicit data collection occurs when the user knowingly provides their information. This often happens when filling out forms, registering accounts, and making online purchases. In these cases, the user is aware that they are providing information such as their name, email address, and payment details.
Implicit data collection is usually done in the background without the user’s direct input. It tracks things like browsing habits, location information, and device details.
Methods of Data Collection
There are multiple methods of data collection. One of the most common is the use of cookies. Cookies are small text files stored on a user’s device when they visit a website. They allow websites to remember essential information such as user preferences, login details, and site history. While this can improve the user experience, cookies also enable tracking of user activity across multiple sites [5]. This data can be sold to third-party entities like advertisers, allowing them to deliver targeted ads.
Another primary data collection method is the deployment of tracking pixels, also called web beacons. These are tiny, invisible images embedded in digital content (e.g., web pages and emails). These pixels report data such as which emails were opened, which links were clicked, and how long a user spent on a page [3].
Finally, data collection is also prevalent in mobile applications. Smartphone apps often request access to personal information such as the camera roll, location, and contacts.
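The cookie mechanism can be sketched with Python's standard http.cookies module. This is an illustration under stated assumptions rather than how any particular site works: the cookie names visitor_id and theme, their values, and the 30-day lifetime are all hypothetical.

```python
from http.cookies import SimpleCookie

# Site side: create a cookie that asks the browser to remember an identifier.
response_cookie = SimpleCookie()
response_cookie["visitor_id"] = "abc123"                    # hypothetical identifier
response_cookie["visitor_id"]["max-age"] = 60 * 60 * 24 * 30  # keep for 30 days
response_cookie["visitor_id"]["path"] = "/"

# This is the header line the site would send back to the browser.
set_cookie_header = response_cookie.output(header="Set-Cookie:")
print(set_cookie_header)

# Later visit: the browser returns its stored cookies in a Cookie header,
# and the site parses them to recognize the same visitor.
request_cookie = SimpleCookie()
request_cookie.load("visitor_id=abc123; theme=dark")
visitor_id = request_cookie["visitor_id"].value
theme = request_cookie["theme"].value
print(visitor_id, theme)
```

Because the same identifier comes back on every request, a site can link separate visits together, which is what makes both remembered preferences and tracking possible.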
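A tracking pixel can be sketched in a few lines: the sender embeds an image whose address carries identifying parameters, and the server reads those parameters when the image is requested. Everything specific here is assumed for illustration, including the tracker.example.com address, the rid and c parameter names, and the function names.

```python
from html import escape
from urllib.parse import parse_qs, urlencode, urlparse


def build_pixel_url(base_url: str, recipient_id: str, campaign: str) -> str:
    """Attach identifying parameters to the address of a 1x1 image."""
    query = urlencode({"rid": recipient_id, "c": campaign})
    return f"{base_url}?{query}"


def build_pixel_tag(pixel_url: str) -> str:
    """Return the HTML that embeds the invisible image in a page or email."""
    return f'<img src="{escape(pixel_url, quote=True)}" width="1" height="1" alt="">'


def record_open(requested_url: str) -> dict[str, str]:
    """Server side: recover who loaded the image from the requested address."""
    params = parse_qs(urlparse(requested_url).query)
    return {"recipient_id": params["rid"][0], "campaign": params["c"][0]}


pixel_url = build_pixel_url(
    "https://tracker.example.com/pixel.gif", "user-42", "spring-sale"
)
pixel_tag = build_pixel_tag(pixel_url)
opened = record_open(pixel_url)
print(pixel_tag)
print(opened)
```

The image itself carries no information; the act of requesting it is what tells the server that a specific message or page was opened.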
Why Does it Matter?
The collection of user data has significant privacy implications. Companies often claim that data collection improves user experience by providing features such as personalized advertisements and recommendations. However, some users may be unaware of how much data about them is being collected and shared. This leads to concerns about data privacy, especially in cases of surveillance and data breaches, where sensitive personal information about users is exposed.
When using the Internet, it is important to know that users are constantly generating data that companies collect and share with third-party entities, such as advertisers and ad brokers.
How Can Users Protect Their Privacy?
Users looking to protect their data and digital privacy have many methods available. Some of these methods are as follows:
- Ad-Blockers: Ad-blockers guard your internet privacy by restricting advertisements and blocking trackers from monitoring digital activity [4].
- Virtual Private Networks (VPNs): Virtual private networks allow users to mask their IP address and location, making it harder for their internet service provider (ISP) and others to track their online activity [4].
- Managing Cookies: When accessing a site on the web, the user will often be met with a pop-up regarding cookie usage on the site. Usually the user can select “Accept all cookies” or “Manage cookies.” Selecting “Accept all cookies” allows the site to use all of its cookies. The “Manage cookies” option lets users pick and choose what data they want to share with the site.
- Updating Browser Privacy and Security Settings: All modern browsers have privacy and security settings that can limit how much data online companies gather. This can be one of the easiest and most effective changes users can make [4].
These are just a few of the ways users can protect themselves on the web. Many other methods exist for maintaining digital privacy.
Conclusion
Collecting data from Internet users has become a common practice in the modern digital world. From behavioral patterns to metadata and personal information, a data trail will always be generated as users interact with and access the Internet and its services. While the data can benefit business and user experience, it also raises significant privacy concerns. As data collection methods become more advanced and the mountains of data grow higher, it becomes increasingly crucial for users to understand what is happening behind their screens.
With a better understanding of data and the ways in which it is extracted, individuals can take proactive steps to defend their digital privacy. As the digital landscape continues to evolve, it is vital that web users stay ahead of the curve, stay informed, and exercise caution when traversing the Internet, as doing so gives them power over their data in an increasingly data-driven world.
References
[1] admin. (2019, May 24). What is Metadata? – Privacy Proficient. Privacy Proficient. https://privacyproficient.com/what-is-metadata/
[2] Barney, N. (2024). data marketplace (data market). WhatIs; TechTarget. https://www.techtarget.com/whatis/definition/data-marketplace-data-market
[3] Eckersley, P. (2009, September 21). How Online Tracking Companies Know Most of What You Do Online (and What Social Networks Are Doing to Help Them). Electronic Frontier Foundation. https://www.eff.org/deeplinks/2009/09/online-trackers-and-social-networks
[4] Shelest, D. (2024, May 18). How to protect your privacy online: 15 essential ways in 2024. Onerep. https://onerep.com/blog/how-to-protect-your-privacy-online
[5] Simplilearn. (2023, September 1). What is data collection: Methods, types, tools, and techniques. Simplilearn. https://www.simplilearn.com/what-is-data-collection-article
[6] What is Behavioral Data & Why is it Important? | Fullstory. (2024). Fullstory.com. https://www.fullstory.com/blog/behavioral-data/
