Threat Modeling


In the digital privacy community our why is always the same: controlling the digital data we generate. What differs from person to person is the threats they want to protect against and the effort they are willing to go through to mitigate those threats. This is my primer on threat modeling and what my threat model is. This is a bit of a longer post, so if you are well versed in threat modeling, skip to My Threat Model.

Definitions

When people talk about digital privacy, it is all about access to and control of data you own or generate. However, it is useful to differentiate the two types of data which people generate.

  1. Data - without any qualifier, when referring to digital data we mean the message, document, or other information being stored/transmitted
  2. Metadata - data about the digital data including, but not limited to, who generated it, when it was generated, where it was generated, etc.

It can be just as important to control access to metadata as it is to control data. Even the NSA prefers using metadata for surveillance over reading message data. Consider a messaging service which end-to-end encrypts data, but not metadata. Then also consider the typical questions one may want to know about your messaging habits: who, what, where, when, why, how. Using metadata, I can determine who (you messaged), where (you sent the message), when (the message was sent), and how (messenger used). Only the what and why are ambiguous to me.
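To make the split concrete, here is a minimal sketch of a message as a service might see it when only the body is end-to-end encrypted. The `Envelope` class, its field names, and the sample values are all made up for illustration and do not describe any real messenger's format.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Envelope:
    """Hypothetical message envelope: metadata in the clear, body encrypted."""
    sender: str            # who
    recipient: str         # who (you messaged)
    sent_at: str           # when
    sender_location: str   # where
    app: str               # how
    ciphertext: bytes      # the "what": opaque without the key


def observer_view(envelope: Envelope) -> dict:
    """Return the fields a party without the decryption key can still read."""
    visible = asdict(envelope)
    visible.pop("ciphertext")  # the body is unreadable, everything else is not
    return visible


msg = Envelope(
    sender="alice@example.org",
    recipient="bob@example.org",
    sent_at="2023-01-15T08:30:00Z",
    sender_location="203.0.113.7",
    app="ExampleMessenger",
    ciphertext=b"\x8f\x1c\x00\xa2",  # placeholder bytes, not real ciphertext
)

print(observer_view(msg))
```

Everything `observer_view` returns is available to the service even though the message body itself stays private.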

[Figure: metadata. From dataedo.com blog on metadata]

There are a few other terms commonly used in privacy parlance. Security of data is primarily divided into three categories.

  1. Security - the methods an individual undertakes to make sure only individuals or parties they authorize have access to their data. Typically this applies to the message data and normally excludes metadata.
  2. Privacy - the ability to control disclosure and access to data specifically pertaining to your identity. Such data points include but are not limited to name, permanent address, gender, marital status, etc. This is primarily, but not exclusively about metadata associated with a digital action.
  3. Anonymity - the ability to conceal your identity when taking an action digitally. This can be thought of as the logical extreme of privacy.

You’ll also notice that in order to achieve these concepts, you must master them in order. Digital privacy is impossible if you have no security. Similarly anonymity is impossible without having good security and privacy. With these terms defined, we can now move on to the topic of this article: Threat Modeling.

For the purposes of this article I will focus on digital threat modeling and will not discuss other types of threat modeling such as physical surveillance. However, depending on the threats you are mitigating, physical threats should be considered for the highest threat models (say, a government dissident in an authoritarian regime).

Data Control

The method used to exercise control over your data is most commonly encryption controlled by a password which you know. One caveat is that encryption typically doesn’t cover metadata. Consider email: PGP is a common technology used to encrypt email, but the metadata about the email (originating IP address, time stamps, subject line) is not encrypted. Whenever people talk about encryption as a tool, check what is being encrypted. There are several other methods that can be used to exercise data control. One such method is to store as much data as possible on offline media like a USB drive. Such a drive is immune to remote hacking for as long as it is not connected to an internet-enabled device.

While immune to remote attacks, the USB drive is vulnerable to physical attacks like theft. Thus many people will layer these protection methods together to greatly improve their security. A common method of layered digital file security is to store data on removable media encrypted with a password they know. Assuming the password and encryption method are sufficiently strong, the only vulnerability is when the drive is decrypted and under active use. However, each layer of security incurs a cost in both maintaining the security and accessing the data secured.
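As a rough illustration of what “encrypted with a password they know” usually involves, the sketch below stretches a password into a fixed-size key using PBKDF2 from Python’s standard library. The function name, the sample password, and the iteration count are assumptions chosen for illustration; disk encryption tools handle this step for you, and you should rely on them rather than rolling your own.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Stretch a password into a 32-byte key with PBKDF2-HMAC-SHA256.

    The iteration count is an illustrative assumption, not a recommendation
    taken from the article.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


# The salt is random but not secret; it is stored next to the encrypted data.
salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
print(len(key), "byte key derived")
```

The same password and salt always yield the same key, which is why the strength of the password matters as much as the strength of the cipher.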

[Figure: privacy spectrum. From VMware blog on zero trust]

This image is only slightly misleading. It is far more accurate to say that the Pareto principle applies where people will get 80% of the maximum security benefit for a service that is 20% less convenient. With that said, every measure taken to mitigate a threat comes with some convenience tradeoff. It is up to the individual to decide where the right tradeoff is between better security and more inconvenience.

To make these abstract concepts more tangible, we’ll start with some concrete examples.

Threat Modeling Examples

I think it’s easier to ingest these principles by using personas with example threats. Let’s start with the most permissive example and get progressively more restrictive as we go. To start with, we will take the average user threat model which is primarily concerned with convenience.

Average Avery

Avery is a person who thinks “I have nothing to hide” and “nothing I have is worth stealing” so he does not make any active decisions regarding his digital threat model. He reuses passwords often and lets Chrome remember them as it’s just soo convenient. He uses all of the popular social networks (Facebook, Instagram, Twitter) on a regular basis to post about anything.

What’s the problem here? The issue is the amount of data which Avery is inadvertently or purposely making public. Let’s say there’s a data leak at his employer which leaks a lot of personally identifiable information (hey, it happened to the DoD). Using this breach, criminals start to apply for fraudulent credit using Avery’s identity. However, they encounter questions meant to prove they are Avery, including information like where he has lived and what cars he has bought. Using his social media posts they can get this information and create hell for Avery for years to come.

If the breach also includes a password in plain text then these hackers could gain access to other accounts and cause additional pain to Avery. If they were able to use this information to get access to his Google account, they could sign in and get access to all of his passwords and logins for all his accounts.

Secure Stacy

To an external observer, Stacy may seem similar to Avery. She is on the same social networks and services. However, Stacy has taken some basic steps to secure all of her accounts and improve her privacy. Firstly, Stacy’s social media timeline is only visible to friends, and it is less a complete story of her life than a record of important events with friends and family. She also uses a password manager like 1Password which allows her to generate unique and complex passwords for each digital account without having to memorize them herself.
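For a sense of what “unique and complex passwords” means in practice, here is a small sketch using Python’s `secrets` module. It is not how 1Password or any other manager works internally; the alphabet, the length, and the account names are arbitrary choices for illustration.

```python
import secrets
import string

# Letters, digits, and punctuation; some sites restrict which symbols they accept.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Build a password from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# One independent password per account, so a breach of one does not expose the others.
accounts = ["bank", "email", "dating-site"]
passwords = {name: generate_password() for name in accounts}

for name in accounts:
    print(name, len(passwords[name]), "characters")
```

Because every account gets its own random password, a plain text leak like the one that hurt Avery only exposes the single account it came from.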

Let’s take the same scenario from before and see how it plays out. Stacy’s personally identifiable information is leaked by a hack of her employer. However, while the criminals have enough information to start a new account process, they are unable to clear the hurdle of finalizing the credit. Because the application cannot be completed, Stacy is notified that she has pending applications at a bank and immediately flags them as fraudulent. She can then take additional safeguards to prevent further harm from this breach.

Most people would benefit from moving from an unconscious threat model to one where all of their “important” accounts are secured. It is imperative to consider what is under the “important” umbrella. Even innocuous accounts may have sensitive information you don’t want made public. That dating site you signed up for may not be “important” to your finances, but leaked messages or perhaps revealing pictures on it might make it important. For an interesting take on how this can happen to average people, episode 4 (“Sextortion”) of the Netflix true crime docuseries “Web of Make Believe: Death, Lies and the Internet” shows how social media hacking was used to extort women.

Hardcore Harlan

Harlan is concerned about the effect that advertising and social media are having on society. While he was initially part of a few social networks and used email and other services from big tech companies, he has come to see such surveillance capitalism approaches as harmful and in need of mitigation. Thus he makes sure to use services which respect user privacy, open source if possible. Some examples include Proton’s suite of email, calendar, and cloud storage which use end-to-end encryption, Bitwarden as one of the few open source and audited password managers, and secure messaging services like Signal with friends. He occasionally posts to Mastodon after scrubbing the metadata from any posted images.
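To show roughly what scrubbing image metadata involves, the sketch below walks the segments of a JPEG file and drops the APP1 segments, which is where EXIF data (camera model, timestamps, GPS coordinates) normally lives. It is a simplified illustration with a made-up function name: it assumes a well-formed file, and other metadata such as comments or IPTC blocks may remain. A dedicated metadata removal tool is the safer choice for real posts.

```python
def strip_app1(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG with its APP1 (EXIF/XMP) segments removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    out = bytearray(jpeg[:2])
    i = 2
    while i + 4 <= len(jpeg):
        if jpeg[i] != 0xFF:
            raise ValueError("unexpected byte where a segment marker should be")
        marker = jpeg[i + 1]
        if marker == 0xDA:
            # Start of scan: everything after this is compressed image data.
            out += jpeg[i:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        length = int.from_bytes(jpeg[i + 2:i + 4], "big")
        if marker != 0xE1:
            out += jpeg[i:i + 2 + length]  # keep every segment except APP1
        i += 2 + length
    out += jpeg[i:]  # no scan found; keep whatever is left unchanged
    return bytes(out)
```

Usage would look like reading a photo with `open(path, "rb")`, passing the bytes through `strip_app1`, and writing the result to a new file before uploading.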

To a lot of people this is very extreme. While it is still possible for very motivated threat actors to steal his data, the attack surface is much smaller, and most hackers prefer to target either many easy-to-hack targets (think Avery) or a few high-value targets with sophisticated phishing and/or social engineering (think DoD hack).

Anonymous Amy

Amy is a government dissident in Syria. She needs to keep a low profile as communications with certain individuals or publicly posting anti-government messages can lead to death. With that in mind she uses a specific set of technologies to communicate and organize with people she trusts. She is using Graphene OS instead of stock Android to remove any tracking based software and generally removes her SIM card when she doesn’t need the internet. She uses Briar messenger to communicate with her colleagues and posts to social media sites using a VPN and the TOR network to make sure any internet action will not be traced back to her.

While most people in an established democracy will not need to fear for their lives due to political opinions, a knowledge of these tools and how to set them up is useful to have in mind. What is free speech one day is not guaranteed to remain so after the next bill, and internet privacy and security are under attack even in democracies.

My Threat Model

Whew, all this just to get to my threat model. Of the people above, Harlan is my self-insert character. I am skeptical of government surveillance programs, which have been shown to be of little value when reviewed by oversight boards. However, I specifically guard against surveillance capitalism. People and corporations generally follow their incentives. In the case of corporations, their incentive is to make money, specifically by selling your attention to advertisers in exchange for the “free” services. I don’t think these companies set out to be literally evil corporations, but you end up with such an outcome when profit is put above all else.

To that end, I have divested myself from as many “free” and non-privacy-respecting services as I can, preferring open source where feasible. I also seek to remove as many advertisements and as much algorithmically promoted content from my media consumption as possible. However, that is not an absolute. I’ll cover each of the major categories I’ve migrated separately. What I have not done is pursue anonymity. I register anonymously if a service provides that option. However, if I am uncomfortable giving a service the identity data it requests, then I reconsider using the service instead of using pseudonyms.

I’d also like to note that my wife does not follow my practices. She has been patient with the changes I made moving off of Google services, but she did not agree that every mitigation I was taking was worth the cost. I’ll go over where we differed on our threat model. It’s important to remember this isn’t just about your personal threat model. My threat model also includes keeping my marriage safe.

Operating System

For my desktop operating systems I’ve migrated from Windows to Linux. Windows is a privacy nightmare and Linux has only been getting better and easier to run. I tend to bounce around between Linux OSs and don’t have any particular advice other than to use a Linux distro which best matches the workflow you’re moving from. For example, the Linux Mint Cinnamon desktop is very similar to Windows 10, while Elementary OS is very similar to Mac. I’m personally using Nobara on my laptop and Ubuntu on my desktop for various reasons.

On my phone, I use the Google Pixel line. However, I’ve replaced Google’s Android with Calyx OS. Depending on your threat model, I’d recommend this or Graphene OS. I didn’t use Graphene OS since I prefer to get most of my phone applications from F-Droid, and Calyx OS includes changes to allow F-Droid to run without user intervention. This is similar to how Google Play runs on other Android phones. I can still get Google Play apps from the Aurora Store for the few apps I need to install from there.

My wife actually uses an iPhone as her phone and a Chromebook as her computer. I had tried to get her to use Calyx OS, but the lack of a good mapping alternative she could easily use was a deal breaker. Thus Apple was deemed to be the best commercial offering for privacy and she uses Apple Maps for navigation. I’ll give her my Linux laptop either when her computer dies or when I decide to upgrade my laptop. That is one issue with switching to more private options: they tend to be more expensive.

For internet browsing I keep a couple of different browsers for different purposes. My setup is pretty similar across all of them though. For general internet information lookup, I use a privacy-focused fork of Firefox (specifically LibreWolf for Linux and Mull for Android). I have these browsers set up to have most of the hardened features on and to clear browsing data on close.

For websites where I like or need to stay logged in, I use Brave on desktop and Calyx OS’s degoogled Chromium browser on mobile. I try to use the built-in Chromium feature to install websites as desktop applications. This method also works for mobile browsers. My wife has switched over to Brave on mobile since its default setup increases privacy without breaking websites and doesn’t require any addons.

For search, I have switched over to using Brave Search. For a long time I was using Duck Duck Go, but some of their privacy snafus caused me to reconsider using their service. I then switched over to Startpage as my default. However, it’s important to know that Duck Duck Go and Startpage are what are known as metasearch engines. In other words, they are proxies for other search engines, namely Bing and Google respectively. I was convinced to use Brave Search because it uses its own index, so my search queries don’t go to a big tech company.

Video Services

I experimented for a long time to find an affordable streaming setup that my family could use which provided additional privacy. Each option was either too expensive (buying a media center computer), not usable (stick computer couldn’t play the service), or not intuitive enough for the rest of the family to use. I opted to just use Roku and mitigated the privacy issue with them by blocking telemetry tracking using a DNS filter (I use NextDNS).
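The idea behind a DNS filter can be sketched in a few lines: before a hostname is resolved, it is checked against a blocklist, and blocked names never get a real address. The domains below are made-up placeholders and the matching rule is a simplification; this is not a description of how NextDNS is implemented.

```python
# Hypothetical blocklist entries; a real filter loads curated lists.
BLOCKLIST = {"telemetry.example-tv.test", "ads.example.test"}


def is_blocked(hostname: str, blocklist: set[str]) -> bool:
    """True if the hostname or any parent domain of it is on the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "a.b.c", then "b.c", then "c" against the blocklist.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


def answer(hostname: str) -> str:
    """Describe what the filter would do with a lookup for this hostname."""
    if is_blocked(hostname, BLOCKLIST):
        return "0.0.0.0"  # unroutable answer, so the device's request goes nowhere
    return "forward to upstream resolver"


print(answer("logs.telemetry.example-tv.test"))
print(answer("video.example.test"))
```

Because the check happens at the DNS level, it works for devices like a streaming box where no ad blocker can be installed.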

In terms of other video services, I do watch YouTube videos through alternative front ends and clients like Invidious, FreeTube, or NewPipe. In addition, I try to watch creators’ content on alternative platforms like PeerTube or Odysee to reduce my reliance on YouTube. I’ve particularly liked the Nebula service as it supports the creators I watch and fits my threat model. Most of the edutainment content I watch on YouTube has migrated over to Nebula, making that a very clear choice for me.

Messaging Services

I have been able to move the people I most commonly message to Signal. Signal is a great option as it has been shown to be private even under court orders. Unless your threat model needs a fully decentralized messenger with anonymity, Signal is a proven private messenger.

My email choice may raise some eyebrows from digital privacy advocates: Zoho Mail. I needed an email service which allowed me to affordably enroll my entire family (5 people), share calendars, and integrate well with mobile devices. Options like Proton and Tutanota didn’t integrate well on mobile OSes and were generally pretty pricey (about $200 or more for all 5 people). Mailbox.org was an option I considered that largely met my criteria, but it was still pretty expensive (€3 per user per month, or about €180 per year for all 5 people before converting to USD). Compare that to Zoho Mail, which is $12 per user per year for basic business mail, or $60 a year for all 5 people. On top of that, their privacy policy is reasonable and their incentives align with my threat model.
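The arithmetic behind that comparison, as a quick sketch. It assumes the Mailbox.org price of €3 is per user per month, which is what the €180 yearly figure implies; the variable names are only for illustration and no currency conversion is attempted.

```python
USERS = 5
MONTHS_PER_YEAR = 12

# Mailbox.org: assumed to be EUR 3 per user per month.
mailbox_org_eur_per_year = 3 * MONTHS_PER_YEAR * USERS

# Zoho Mail: USD 12 per user per year.
zoho_usd_per_year = 12 * USERS

print(f"Mailbox.org: EUR {mailbox_org_eur_per_year} per year")
print(f"Zoho Mail:   USD {zoho_usd_per_year} per year")
```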

Social Media Services

This was pretty easy for me as I was rarely on social media to begin with and had deactivated my Facebook account on and off for years. I then took the plunge and completely deleted my Facebook account. The only other social media I use is LinkedIn as it has been a good tool for me to meet recruiters and find other employment. I have also recently started using Mastodon, the biggest FOSS decentralized federated network, primarily for finding other FOSS and privacy advocates.

My wife is still on Facebook and has, at times, removed it from her phone. However, her entire friend group is on there and likes to communicate using FB Messenger. So this is a must for her to stay in communication with her friends.

Everything Else

The other big thing in the privacy community is the use of VPNs. I don’t use them. All you are doing is transferring trust, and both the consolidation in the industry and the number of broken promises from providers which tout a “no logs policy” are alarming. I know that my ISP is spying on me, so I use tools like DoH and HTTPS for all regular web traffic and the TOR browser when needed. I may change my mind about this in the future, but it can get quite complicated when I think about it for my whole family.

Phone navigation has also been a hard one to replace. I have bounced back and forth between Magic Earth and Here Maps. It seems that the digital privacy community has a sufficient replacement for everything except phone navigation. And before you yell at your screen, yes I know about OsmAnd, but they don’t have street address numbers when doing address lookups. Also their long distance navigation has always been lacking.

For file storage, I switched to using a Synology NAS instead of NextCloud. I actually used to run a lot of self-hosted services from a computer I used as a server, with NextCloud and a few other key services running. However, we had to move and it was difficult to keep my services up uninterrupted during the move. In the end, the experience forced me to reconsider the idea of self-hosting all of my key services, as I was trading very high privacy for being the tech support when things go wrong. I make sure to keep the contents of the NAS synced to both of my computers and backed up encrypted to Backblaze B2 to follow the 3-2-1 backup principle.
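The 3-2-1 principle (at least three copies, on at least two kinds of media, with at least one offsite) can be written down as a simple check. The sketch below applies it to the setup described above; the `Copy` class and the media labels are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # label for the kind of storage (illustrative values)
    offsite: bool  # stored somewhere other than the home


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """At least 3 copies, on at least 2 kinds of media, with at least 1 offsite."""
    return (
        len(copies) >= 3
        and len({c.medium for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


copies = [
    Copy("Synology NAS", "nas", offsite=False),
    Copy("laptop", "internal-disk", offsite=False),
    Copy("desktop", "internal-disk", offsite=False),
    Copy("Backblaze B2 (encrypted)", "cloud-object-storage", offsite=True),
]

print(satisfies_3_2_1(copies))
```

Dropping the Backblaze copy makes the check fail, since every remaining copy would sit in the same house.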

Final Thoughts

I hope this was educational, entertaining, or interesting to you! This was a bit of a long post, but there was a lot to cover. Even with the length, I’m sure I missed some important aspect of my threat model which is just an automatic default to me. I don’t plan to write a lot of posts like this, or at least not very often. After all my goal for this blog is to be a communication avenue for my projects, not be my primary project! Until next time!

References