Windows 10 was a huge success after it launched in 2015. As of March 2017, it was running on over a quarter of desktop computers worldwide, trailing only Windows 7, which was released six years earlier. Microsoft appeared to have listened to feedback about Windows 8 and sought to mix the best of Windows 7 and 8 while pushing the OS into the future.
Despite the success of the launch, there was one area in which Microsoft appeared to falter: privacy. Windows 10 included mandatory and opt-out data collection for “product improvement and personalization” purposes. Microsoft further complicated the situation by being unclear about what data it was collecting and for what purpose, leading to claims that Windows 10 was a “privacy nightmare”.
With the advent of the Windows 10 Creators Update, though, Microsoft seems to have decided that the time is right to be more transparent about its data gathering activities.
What Privacy Concerns?
Gathering data through Windows is nothing new. In fact, it has been happening since 2009 through the Customer Experience Improvement Program (CEIP). The biggest difference between Windows 10 telemetry and the CEIP, though, is that the data gathering is no longer opt-in.
Once you were upgraded to Windows 10, Microsoft began gathering data on your use of your Windows computer and even linked that data to your Microsoft account. Although the use of an opt-out approach is generally frowned upon, Microsoft took it a step further by complicating the privacy options, and giving you little choice in the matter.
If these privacy changes had happened pre-2013, then there’s a good chance that they may have been overlooked. However, in that year Edward Snowden leaked documents from the NSA that laid bare the mass surveillance of U.S. citizens and internet users around the world.
Microsoft, PRISM, and the NSA
PRISM was one of the most controversial of all the programs that the NSA undertook, gathering user data from some of the largest technology companies, including Facebook, Yahoo, Google, and Microsoft. In fact, some Microsoft software that people believed to be secure, like Skype, Hotmail, and even Word, was susceptible to surveillance.
The timing of the addition of mandatory data gathering, with little explanation and only two years after the initial revelations, was particularly problematic. This led many to aggressively question Microsoft’s data gathering, even going so far as to develop tools to disable Windows telemetry, or to suggest abandoning Windows altogether for a more secure Linux-based OS.
Unfortunately, Microsoft decided to stay silent on the matter, which only made the fears appear more rational.
The third-party tools that were designed to disable the data collection actually proved to be a security risk to users. Services like Windows Update and malware protection rely on connecting to Microsoft servers. With all connections blocked, users became unable to patch critical security holes.
Fortunately, Microsoft has decided to remedy this potentially dangerous situation by finally granting Windows users clear and granular privacy settings in the Windows 10 Creators Update. To coincide with the update’s release, it also published an in-depth guide to the different areas of data collection on TechNet.
The Creators Update also simplifies the data collection levels down to either Basic or Full. A companion TechNet post listed every data point collected at the Basic level, along with technical information.
In the TechNet post Microsoft broke down the Full level data collection into nine distinct categories:
- Common Data
- Device, Connectivity, and Configuration data
- Product and Service Usage data
- Product and Service Performance data
- Software Setup and Inventory data
- Content Consumption data
- Browsing, Search and Query data
- Inking, Typing, and Speech Utterance data
- Licensing and Purchase data
Microsoft has so far only published descriptions and example data for each category at the Full collection level.
For diagnostic events at either Basic or Full level, Microsoft collects a header of what they term “common data” which includes:
- OS Version
- Device Type (Mobile, Desktop, Server)
- User ID (Not recorded at Basic collection)
- Diagnostic Level (Basic or Full)
- Device ID
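To make the shape of that header concrete, here is a minimal sketch in Python of the five fields listed above. The class name, field names, and sample values are illustrative assumptions and not Microsoft's actual schema; the only behavior modeled is the one stated in the list, namely that the User ID is not recorded at the Basic level.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class CommonData:
    """Hypothetical model of the "common data" header (names are assumptions)."""
    os_version: str
    device_type: str          # "Mobile", "Desktop", or "Server"
    diagnostic_level: str     # "Basic" or "Full"
    device_id: str
    user_id: Optional[str] = None

    def as_sent(self) -> dict:
        """Return the header fields, dropping the User ID at the Basic level."""
        header = asdict(self)
        if self.diagnostic_level == "Basic":
            header.pop("user_id")
        return header


# Sample values are made up for illustration.
basic = CommonData("10.0.15063", "Desktop", "Basic", "device-123", user_id="user-456")
full = CommonData("10.0.15063", "Desktop", "Full", "device-123", user_id="user-456")
```

In this sketch, `basic.as_sent()` contains no `user_id` key, while `full.as_sent()` includes it alongside the other four fields.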
Browsing, Search, and Query Data
Given the uproar after Congress voted to allow ISPs to sell your internet history, it’s clear that people believe their browsing and search data to be very personal. This makes the Browsing, Search and Query data type particularly contentious.
- Microsoft browser data — Text typed into the address bar, text selected for Cortana search, autocomplete, URLs, and page titles.
- On-device file query — Search type, number of items found, file extension of item opened, App ID of opening app, search scope.
When searching locally on your device, only metadata about the search is collected, presumably to help search find what you are looking for more efficiently. The good news about the browsing and online search history is that if you don’t want to be involved, you can just use another browser, as the collection applies only to Internet Explorer and Edge.
It’s worth keeping in mind that Microsoft only tracks searches in its own browsers, and also allows you to opt out. This is in stark contrast to Google, which monitors all searches on its platform in order to serve you ads.
Inking, Typing, and Speech Utterance
With the rise of smart home devices along with the high-profile legal case involving Amazon’s Echo device in 2016, people have become very aware of the potential for their devices to be always listening to them. The Inking, Typing, and Speech Utterance category aims to clarify exactly what Microsoft does with the data it processes.
- Type of pen used (highlighter, ballpoint, pencil), color, stroke, height, width, and how long it is used.
- Text of speech recognition results.
- Ink strokes written, text before and after ink insertion, recognized text.
- Whether the user is known to be a child.
- Confidence and Success/Failure of speech recognition.
The post makes it clear that any ink strokes that are converted to text are stripped of information that could reconstruct the content or associate it with a user. If the collection of voice data outlined here seems oddly brief, that is because the main voice input, Cortana, is governed by a separate data collection policy.
Content Consumption Data
After the Windows 10 launch, Microsoft appeared to stay purposefully silent on the matter of data collection. Then it entered the fray by publishing some interesting usage statistics on its blog. Among the data included was that there had been “over 82 billion photos viewed within the Windows 10 Photo app”. This did little to calm people’s worries.
In an effort to rectify this, the Content Consumption data type is explicit that it “includes diagnostic details about Microsoft applications that provide media consumption functionality (such as Groove Music), and is not intended to capture user viewing, listening or reading habits.”
There are four categories of Content Consumption:
- Movies — Video width, height, and color palette. Encoding type and streaming instructions.
- Music & TV — URL of songs being downloaded, media type, and local media library statistics.
- Reading — Name of the app accessing Windows Store books, book language, and time spent reading.
- Photos App — File source (SD card, network device, OneDrive), image/video size and resolution, and media view (full screen or collection view).
Under the Content Consumption data type Microsoft isn’t tracking what you consume but rather how you consume it.
And the Rest
In addition to the more controversial categories of data collection, Microsoft also provided information on some of the less disputed categories.
Device, Connectivity, and Configuration Data
As the name suggests, this data type is all about the type of device you are using, how it connects to the internet, and how it is configured. The TechNet post gives a comprehensive list of the data collected, but the highlights are:
- Device properties — Operating system, OEM name, serial numbers & hardware configurations
- Device capabilities — Touchscreen support, cameras, wireless capabilities, voice input devices
- Device preferences and settings: user settings, encryption status, default app choices, language preferences, Windows Update settings
- Network information — Network type, access point manufacturer, model, and MAC address, paid or free network
- Device peripherals
Although this list seems particularly long and potentially invasive, it’s not all that different from the data that can be gathered by specification tools like Belarc Advisor. In many ways it’s not even that different from the data your browser can leak to the websites you visit.
Product and Service Usage Data
The original purpose behind the CEIP was to “[help] Microsoft identify which Windows features to improve”. By tracking which features users spent most of their time with, or even which had the most problems, Microsoft was able to focus its efforts in useful ways. The Product and Service Usage category is an extension of that purpose.
Product and Service Performance Data
This category primarily covers information used for diagnostics and device health. When an app crashes or something unexpected happens, this is the data that may help get to the bottom of it.
- Device Health and Crash data — Error codes/messages, system generated log files, user generated files indicated as potential cause of crash, crash and hang dumps.
- Device Performance and reliability data
There is a lot of data nested underneath Device performance and reliability which may make you feel uneasy. However, a closer look shows that very little being recorded is sensitive or personal information. Instead, it is nearly all related to the health of the hardware and software configuration of your device.
Software Setup and Inventory Data
While updating to Windows 10, some users noticed that Microsoft was removing apps that weren’t installed through the Windows Store. This led to several Reddit threads where the mood was best summed up by u/pcg79:
For the last few iterations of Windows, Microsoft would check your upgrade eligibility and would warn you of any potential issues before you went ahead. Instead, Windows 10 was making the decision for you to remove the potentially problematic apps. This fueled speculation that Microsoft was collecting data on which applications were installed on your computer.
In the future, could Microsoft remove apps they don’t approve of?
- Installed Applications and Install History — App (driver, update package, name, ID), Product, install date, method, install directory, MSI package and product codes.
- Installation type — Clean install, repair, restore, OEM, upgrade, update.
- Device update information — Information about Windows Update including machine ID, number of applicable updates, update download and size.
Although the TechNet post does little to assuage those fears, Microsoft has at least admitted that they are tracking which applications you have installed on your computer.
Licensing and Purchase Data
In a world of online shopping and app stores, you perhaps already suspected that this information was being stored and collected. The data collected for Licensing and Purchasing allows Microsoft to verify that you are running a legitimate copy of Windows, as well as providing you with account information.
- Purchase History — Product name, price, time of purchase, and payment method.
- Entitlements — Subscriptions, license types and details, and DRM details.
What Can You Do?
The publication of Microsoft’s data collection categories was timed to coincide with the release of the Windows 10 Creators Update. As explained in a blog post by the Windows and Devices Group EVP Terry Myerson and Privacy Officer Marisa Rogers, the update gives you more control over your privacy.
As outlined in the post, Microsoft has improved the information you see about the privacy settings throughout Windows by including descriptions and “Learn More” buttons. The major improvement to the Windows 10 privacy settings, though, comes during the Creators Update installation process.
As part of the update process, you will now be able to review your privacy settings, even if you were already running Windows 10. As well as being able to choose between Basic and Full levels of data collection, you will also be able to fine-tune other privacy settings, including location access and personalized ads.
While the majority of the data outlined in the TechNet post is specific to the device you are using, there are areas that overlap with your Microsoft account. This includes the use of Cortana, as the personal assistant stores your preferences and interests. Microsoft recently launched a web-based privacy dashboard that allows you to view and remove data that has been collected and associated with your Microsoft account.
Are You Ready to Trust Microsoft Again?
To Microsoft’s credit, the company has listened to its users and made a concerted effort to be more transparent about data collection. It has provided more controls and options as to what data is stored and how it is used. This may have reassured you that Microsoft, despite its ties to the PRISM program, isn’t overreaching in its data collection.
However, it’s important to note that all the new privacy features are only available in, and relevant to, the Windows 10 Creators Update. Windows 7, 8, and “vanilla” 10 will not be receiving the same treatment or level of transparency.
If you find yourself still struggling to accept Microsoft’s data collection tactics, then there are many ways to protect yourself. You could switch to an encrypted email provider, use a VPN, or use a privacy-focused web browser like Firefox. If you are thinking of ditching Windows for good, then you may be tempted by one of the privacy-focused Linux distros.
Do you think this is a breakthrough for Microsoft? Do you feel more relaxed about using Windows 10? Or do you think this has been totally overblown? Let us know in the comments below!