Uncomplicating Cloud Security - Data Protection (Part 5)
We all know how the poem goes: “Roses are red, violets are blue, if your data isn’t protected, then neither are you.” Or something like that.
What do the collapse of FTX and a book cover have in common? They both may look like one thing from the outside but are something completely different on the inside. Of course, we all know not to judge a book by its cover, yet that’s exactly what many did for several years with FTX. SBF fooled many financial experts and public figures into investing in and endorsing what now appears to have been a crypto Ponzi scheme that almost nobody could have imagined. Yet it’s hard to keep up a lie when you leave damning evidence and sloppy bookkeeping arguably in plain sight. For example, what good is an end-to-end encrypted messaging service if you name the group “Wirefraud” in the first place? Of course, with this post I don’t want to make scammers better at what they do, but I do want to touch on what can keep your data on AWS safer.
As you’ll see, protecting data is not about choosing “off-menu” AWS jungle AZs and keeping your data far away from attackers by hiding it in the dense rainforest.
We can get our direction from the Data Protection section of the AWS Well-Architected Framework's Security pillar, which shows us how to safely store, process, and transmit data. We will cover the best practices for ensuring the confidentiality, integrity, and availability of data in the cloud. Organizations need to understand these practices and implement them effectively in order to protect their data and maintain compliance with relevant regulations and standards.
Lock and key
One key aspect of data protection in AWS is the use of encryption. Encryption is the process of encoding data in such a way that it can only be accessed by someone with the correct decryption key. AWS offers several options for encrypting data at rest and in transit, including server-side encryption, client-side encryption, and key management services.
Before diving into the different encryption methods, let’s look at why it is crucial to use encryption at every layer. One well-known breach involving unencrypted data occurred in 2017, when the credit reporting company Equifax announced that it had suffered a major cyberattack that exposed the personal and financial data of 143 million people, a figure it later revised upward.
According to reports, the hackers accessed the data through a vulnerability in Equifax's website software and were able to gain access to a database containing the personal and financial information of millions of consumers, including names, addresses, social security numbers, and credit card numbers. The data was not encrypted, which made it easier for the hackers to access and use the information. The breach had significant consequences for Equifax and its customers, including financial losses, damage to the company's reputation, and legal action.
Types of encryption
Server-side encryption is a process in which data is automatically encrypted by AWS when it is stored in a service such as Amazon S3, Amazon EBS, or Amazon RDS. This means that the data is encrypted at the server level and stays unreadable even if someone were to gain physical access to the underlying storage. AWS supports a variety of encryption algorithms, including AES-256, which is considered to be highly secure.
When should you use it:
- When you do not want to manage the encryption keys yourself: With server-side encryption, the cloud provider can manage the encryption keys, which means you do not need to worry about managing and securing them.
- When you need to maintain data confidentiality at rest: Server-side encryption helps keep stored data confidential, even if unauthorized individuals get hold of the underlying storage.
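To make server-side encryption a bit more concrete, here is a minimal sketch of a default-encryption configuration for an S3 bucket. The function name `default_encryption_config`, the bucket name, and the key ARN are hypothetical and only there for illustration. The dictionary follows the shape that boto3's `put_bucket_encryption` takes as `ServerSideEncryptionConfiguration`, to the best of my knowledge, so double-check the current documentation before relying on it.

```python
import json


def default_encryption_config(kms_key_arn=None):
    """Build a default-encryption configuration for an S3 bucket.

    Without a key ARN the bucket uses S3-managed keys (SSE-S3, AES-256).
    With a key ARN it uses the given KMS key (SSE-KMS).
    """
    if kms_key_arn is None:
        rule = {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
    else:
        rule = {
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": kms_key_arn,
            },
            # Bucket keys reduce the number of requests S3 makes to KMS.
            "BucketKeyEnabled": True,
        }
    return {"Rules": [rule]}


if __name__ == "__main__":
    # Hypothetical key ARN, for illustration only.
    config = default_encryption_config(
        "arn:aws:kms:eu-west-1:111122223333:key/example-key-id"
    )
    print(json.dumps(config, indent=2))
    # With boto3 installed and credentials configured, this could be applied with:
    # boto3.client("s3").put_bucket_encryption(
    #     Bucket="my-example-bucket",
    #     ServerSideEncryptionConfiguration=config,
    # )
```

Calling the function without arguments gives you the "let AWS handle the keys" variant, while passing a KMS key ARN moves you one step toward controlling the keys yourself.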
Client-side encryption is a process in which data is encrypted by the client before it is sent to AWS. This allows organizations to retain control over the encryption keys and ensures that the data is encrypted even if it is transmitted over an unsecured network. AWS offers several tools and libraries that can be used to implement client-side encryption, including the AWS Encryption SDK and the AWS Key Management Service (KMS).
When should you use it:
- When regulatory requirements mandate it: For example, the EU's General Data Protection Regulation (GDPR) requires organizations to take additional steps to protect the privacy of personal data. GDPR does not explicitly state whether encryption alone is enough to protect personal data, so given the ambiguity, protect yourself by doing thorough due diligence.
- When you want control over the encryption keys: You manage access to the keys yourself, so you can ensure that only authorized individuals can decrypt the data.
Both encryption methods are effective:
- When storing sensitive data: such as financial records, personal information, or intellectual property.
- When you need to maintain data integrity: This can be especially important in cases where the data is used for critical business processes and you can’t afford for it to be compromised or tampered with.
Need to manage encryption keys? Use a managed service.
The AWS Key Management Service (KMS) is a fully managed service that makes it easy for organizations to create and manage encryption keys. KMS allows organizations to generate, rotate, and destroy encryption keys as needed, and it provides auditing and monitoring capabilities to ensure that keys are used correctly. KMS also integrates with other AWS services, such as Amazon S3 and Amazon EBS, to enable the encryption of data at rest.
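The generate, rotate, and destroy lifecycle can be sketched roughly as follows. The helper names `create_rotating_key` and `retire_key` are my own, and the code assumes that `kms_client` behaves like `boto3.client("kms")`; it only uses the `create_key`, `enable_key_rotation`, and `schedule_key_deletion` calls. Treat it as a sketch under those assumptions rather than a finished key management routine.

```python
def create_rotating_key(kms_client, description):
    """Create a symmetric KMS key and turn on automatic rotation.

    `kms_client` is assumed to behave like boto3.client("kms").
    """
    response = kms_client.create_key(
        Description=description,
        KeyUsage="ENCRYPT_DECRYPT",
    )
    key_id = response["KeyMetadata"]["KeyId"]
    kms_client.enable_key_rotation(KeyId=key_id)
    return key_id


def retire_key(kms_client, key_id, waiting_days=30):
    """Schedule a key for deletion after a waiting period.

    KMS does not delete keys immediately; the waiting period must be
    between 7 and 30 days, which leaves time to cancel a mistake.
    """
    if not 7 <= waiting_days <= 30:
        raise ValueError("waiting_days must be between 7 and 30")
    kms_client.schedule_key_deletion(
        KeyId=key_id,
        PendingWindowInDays=waiting_days,
    )
```

Passing the client in as a parameter keeps the helpers easy to test with a stub, and it makes it obvious which account and region a key ends up in.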
Top secret clearance
Another important consideration for data protection in AWS is the management of access to data. AWS provides fine-grained access controls that allow administrators to specify who has access to which data and what actions they can perform on it. This can help to prevent unauthorized access and protect against data breaches. Administrators can stratify permissions into internal tiers and assign users to those tiers. You can get creative and set up clearance tiers similar to those used by the FBI, such as confidential, secret, and top secret.
For example, an organization might use AWS Identity and Access Management (IAM) to control access to data stored in Amazon S3. IAM allows administrators to create user accounts, assign permissions to those accounts, and define which actions users are allowed to perform on specific resources. This could include permissions to read, write, delete or list objects in a specific S3 bucket.
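As a rough illustration of the S3 example, the sketch below builds an IAM policy document for a single bucket. The function name `s3_bucket_policy` and the bucket name are hypothetical. The one detail worth noticing is that listing is granted on the bucket itself, while reading, writing, and deleting are granted on the objects inside it.

```python
import json


def s3_bucket_policy(bucket, read_only=True):
    """Build an IAM policy document for one S3 bucket."""
    object_actions = ["s3:GetObject"]
    if not read_only:
        object_actions += ["s3:PutObject", "s3:DeleteObject"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Reading, writing, and deleting apply to the objects in it.
                "Sid": "ObjectAccess",
                "Effect": "Allow",
                "Action": object_actions,
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name, for illustration only.
    print(json.dumps(s3_bucket_policy("my-example-bucket"), indent=2))
```

Defaulting to `read_only=True` is a small nod to least privilege: write and delete permissions have to be asked for explicitly.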
IAM also provides support for multi-factor authentication (MFA), which adds an additional layer of security by requiring users to provide a code from a hardware or software token in addition to their login credentials. This can help to prevent unauthorized access to data, even if someone were to obtain a user's login credentials. If this is one of the security tips that you already know but skip because it’s inconvenient, I would think twice: the exploitation of misconfigured MFA has been an increasingly common entry point for hackers. In 2022, the travel booking website Expedia reportedly suffered a data breach that exposed the personal information of hundreds of thousands of customers. As a result of the breach, Expedia was quite embarrassed and was forced to notify affected customers and offer credit monitoring services to help protect against identity theft.
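One common way to enforce MFA is a deny statement that kicks in whenever a request was not authenticated with MFA. The sketch below is an assumption-laden example: the function name `deny_without_mfa` and the bucket are hypothetical, and the policy relies on the `aws:MultiFactorAuthPresent` condition key.

```python
import json


def deny_without_mfa(actions, resources):
    """Build a policy that denies the given actions unless MFA was used."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyWithoutMFA",
                "Effect": "Deny",
                "Action": list(actions),
                "Resource": list(resources),
                # BoolIfExists also matches requests where the key is absent,
                # such as calls made with long-term access keys.
                "Condition": {
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = deny_without_mfa(
        ["s3:DeleteObject"],
        ["arn:aws:s3:::my-example-bucket/*"],  # hypothetical bucket
    )
    print(json.dumps(policy, indent=2))
```

A deny statement like this grants nothing on its own, so it would sit next to an allow policy and simply switch off the sensitive actions for sessions without MFA.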
Wait a minute, back it up!
Apart from encryption and access controls, organizations should also consider implementing backup and disaster recovery measures to protect against data loss. AWS offers a range of services for this purpose, including snapshot backups, snapshot copy, and replication.
Snapshot backups are point-in-time copies of data that are stored in a service such as Amazon EBS or Amazon RDS. These snapshots can be used to recover data in the event of data loss or corruption, and they can also be used to create new volumes or database instances with the same data. Amazon RDS takes automated backups on a daily basis when automated backups are enabled, whereas EBS snapshots have to be scheduled (for example with Amazon Data Lifecycle Manager or AWS Backup) or created manually. Organizations can also create manual snapshots in either service whenever needed.
Snapshot copies are copies of snapshots that can be created and stored in different AWS regions or accounts. Snapshot copies can be used to enable disaster recovery by allowing data to be quickly restored in the event of a regional outage or other disasters.
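Taking a snapshot and copying it to another region can be sketched like this. The helper names are hypothetical, and both functions assume a client that behaves like `boto3.client("ec2")`, using only `create_snapshot` and `copy_snapshot`. In a real workflow you would wait for the snapshot to complete before copying it; that step is left out here to keep the sketch short.

```python
def snapshot_volume(ec2_client, volume_id, description):
    """Create a point-in-time snapshot of an EBS volume and return its ID."""
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=description,
    )
    return response["SnapshotId"]


def copy_snapshot_to_region(destination_ec2_client, source_region, snapshot_id):
    """Copy a snapshot into the region of `destination_ec2_client`.

    The copy is requested from the destination region, so the client
    passed in must be created for that region.
    """
    response = destination_ec2_client.copy_snapshot(
        SourceRegion=source_region,
        SourceSnapshotId=snapshot_id,
        Description=f"DR copy of {snapshot_id} from {source_region}",
        # Ask for the copy to be encrypted in the destination region.
        Encrypted=True,
    )
    return response["SnapshotId"]
```

The second function returns a new snapshot ID, because the copy is an independent snapshot that lives in the destination region.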
Replication is the process of copying data from one location to another, typically in real-time or near-real-time. In AWS, replication can be used to protect data by creating copies of data in multiple locations, which can then be used to restore data in the event of data loss or corruption. Replication can be implemented using a variety of AWS services.
Judge data by its label
A large part of the Data Protection section of the AWS Well-Architected Framework, Data Classification, focuses on ensuring that data is classified and handled appropriately based on its sensitivity and value. This includes establishing clear policies and procedures for identifying and classifying data, as well as implementing appropriate controls to protect data based on its classification.
This section recommends using a consistent, systematic approach to data classification that takes into account the potential impact of a data breach on the organization, as well as any regulatory requirements or industry standards that may apply. It also suggests that organizations establish clear roles and responsibilities for data classification and implement controls to protect the data such as the access controls and encryption methods we touched on above.
There are several different approaches to data classification, but a common method is to use a four-tier system, with each tier representing a different level of sensitivity and risk. These tiers typically include:
- Public: This category includes data that is available to the general public and does not require any special protection or handling. Examples include publicly available information, such as news articles or company press releases.
- Internal: This category includes data that is intended for internal use only and is not meant to be shared with external parties. Examples include internal memos and financial reports.
- Confidential: This category includes data that is sensitive and requires some level of protection. Examples include HR reports, customer data, intellectual property, and trade secrets.
- Highly Confidential: This category includes data that is extremely sensitive and requires the highest level of protection. Examples include financial data, legal documents, and strategic plans.
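The four tiers above can be sketched as a small lookup that maps a classification to the controls it calls for. Everything here is illustrative: the `Tier` enum, the `CONTROLS` table, and the control names are my own assumptions and do not come from the Well-Architected Framework, so swap in whatever your own risk assessment produces.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Classification tiers, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    HIGHLY_CONFIDENTIAL = 3


# Hypothetical controls added at each tier; higher tiers inherit lower ones.
CONTROLS = {
    Tier.PUBLIC: set(),
    Tier.INTERNAL: {"authenticated-access"},
    Tier.CONFIDENTIAL: {"encryption-at-rest", "least-privilege-access"},
    Tier.HIGHLY_CONFIDENTIAL: {"customer-managed-keys", "mfa-required"},
}


def required_controls(tier):
    """Return every control required at `tier`, including inherited ones."""
    controls = set()
    for level in Tier:
        if level <= tier:
            controls |= CONTROLS[level]
    return controls


def can_read(clearance, data_tier):
    """A user may read data at or below their own clearance tier."""
    return clearance >= data_tier


if __name__ == "__main__":
    for tier in Tier:
        print(tier.name, sorted(required_controls(tier)))
```

Using an ordered enum means the same tiers can drive both questions: which controls a dataset needs, and whether a given user's clearance is high enough to read it.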
When classifying data, consider the potential impact on the organization if the data were to be compromised. For example, a breach of highly confidential data could have significant financial consequences, while a breach of internal data may not have the same level of impact. Since the data being classified is specific to each company, it’s crucial to perform internal risk assessments to decide which risks you can and can’t afford. These need not be huge and cumbersome undertakings; check out this article to get to the core of what a good risk assessment exercise should look like.
Conclusion
Hopefully, you will never have to go through the harrowing experience of an uncomfortable conversation with your manager about how an outage could have been avoided if you had just backed up the production database, or if the confidential data had been encrypted properly instead of being stored in plain text. It is probably unintuitive to think just how delicate our cloud and compute systems really are, especially compared to the amazing things that they are capable of when they are running properly. But they are. If our planning and security measures aren’t implemented correctly, we might be unaware of the dreaded single points of failure that can bring our systems to their knees in the blink of an eye, so it pays to stay on top of data classification, access controls, encryption, and data backup and recovery protocols. By regularly reviewing and updating data protection measures, organizations can adequately protect their data and meet the needs of their customers and stakeholders.
Whether you are a developer, DevOps engineer, or cloud engineer, dealing with the cloud can be tough at times, especially on your own. If you are using Tailwarden or Komiser and want to share your thoughts, doubts, and insights with other cloud practitioners, feel free to join our Tailwarden Discord server, where you will find tips, community calls, and much more.