Article series: defence in depth, part 5 of 7

Infrastructure and data storage

The first three articles were about designing and obtaining an access token. In the fourth article, we discussed how to validate incoming calls and build strong access control in our API.

In this article we will discuss the infrastructure you use to deploy the system we have developed so far. We will also look at what you need to consider in terms of data storage.

Infrastructure is a large and broad area. Based on our experience from security work, we have chosen to focus this article on the following key points:

  • Minimize public exposure of services and features

  • Encrypt all traffic with TLS end-to-end

  • Secure the use of third-party dependencies

  • Secure data storage

Minimize public exposure of services and features 

An important aspect of the principle of "Least Privilege" and of defence in depth is not to expose more than necessary. For example, if your API is not meant to be used by public clients, there is no need for it to be reachable from the Internet. One way to limit which functions of our network are exposed is perimeter protection in the form of firewalls and gateways. We should not rely on perimeter protection alone, but it provides increased protection against misconfiguration and other mistakes further down the chain of functionality.

Some frameworks for developing APIs include reverse proxies, which may have features that can be exploited by an attacker, or outright vulnerabilities. Security may not be a priority for the developer of the framework on which you are building your API. In many cases, the vendor of your framework recommends using an external, security-focused product as the first line of protection against incoming traffic.

The concepts and products in this area have overlapping functions. What they have in common is the aim of strengthening our protection against simpler, often automated attacks.

A classic firewall gives us good basic protection against accidentally exposing services from our infrastructure, such as FTP, shared file systems and the like. An attacker looking for open, weak services to gain a foothold in our system will have a harder task if we use a properly configured firewall as a first line of defence.

A Web Application Firewall (WAF) can inspect incoming HTTP packets and detect, for example, common injection attacks, insecure HTTP headers or certain cross-site scripting (XSS) attacks.

An application gateway has additional features that may be important in a security context, such as rate limiting traffic from individual clients. We can also limit outgoing traffic from our system to a list of approved addresses. This makes it more difficult for an attacker to extract data after gaining a foothold in our system.
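As a minimal illustration of the rate-limiting idea (not a substitute for a gateway product), here is a sketch in TypeScript/Express. The limit of 100 requests per 60-second window per client IP is an assumption you would tune to your own traffic:

```typescript
import express from "express";

// Minimal in-memory rate limiter, keyed on client IP.
// Assumed (hypothetical) limits: 100 requests per 60-second window.
const WINDOW_MS = 60_000;
const MAX_REQUESTS = 100;

const hits = new Map<string, number[]>();

const app = express();

app.use((req, res, next) => {
  const now = Date.now();
  const key = req.ip ?? "unknown";

  // Keep only timestamps that fall within the current window.
  const recent = (hits.get(key) ?? []).filter((t) => now - t < WINDOW_MS);
  recent.push(now);
  hits.set(key, recent);

  if (recent.length > MAX_REQUESTS) {
    res.status(429).send("Too Many Requests");
    return;
  }
  next();
});

app.get("/api/resource", (_req, res) => {
  res.json({ ok: true });
});

app.listen(3000);
```

A real gateway or WAF keeps this state outside the application and enforces it before traffic reaches your code; the sketch only shows the principle.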

Another important feature of this type of product is the ability to prevent distributed denial-of-service (DDoS) attacks from causing downtime in our system.

Following the principle of "Least Privilege", you should also turn off all services and functions that you do not use in your system. This is particularly important if you are running your own servers rather than using cloud services. Servers must also be kept up to date with any security patches made available by the provider.

Depending on your organisation, cloud computing can have an advantage here over running your own servers. A cloud provider with a strong security profile, and with experience and expertise around the product, may do a better job of security than an in-house organisation.

Consider limiting the exposure of data sources in the cloud. These are often publicly reachable by default, even if the cloud platform offers various types of network protection.

Segment your internal networks so that an attacker with a foothold in one part of your system has a harder time extending their presence to other subsystems. Also ensure that all user accounts with access to your infrastructure use strong passwords and multi-factor authentication (MFA).

User account management is a very important aspect of security that is often overlooked. Employees who no longer work with the system should of course not have access to the infrastructure. All accounts should also be personal so that we can trace who has made changes to the system. Traceability is an important aspect of investigation and follow-up when we have had a breach of our systems.

Rotating the passwords used by services is important for the same reason. Someone who no longer works with the system may still know old passwords to databases and the like. See the section on secure data storage below for more information.

Encrypt all traffic with TLS end-to-end

Traffic on the network must be protected by TLS. It is important to understand what encrypted network traffic does and does not give us. With TLS we get: 

  • Integrity

  • Confidentiality

  • Authentication of the recipient of the call (e.g. an API) by the client

  • The ability for the recipient to authenticate the client via mTLS (requires a client certificate)

TLS does not give us:

  • Anonymity

  • Traceability

  • Invisibility

Even if the traffic is encrypted, someone monitoring a node can see the IP addresses of the client and the recipient. This makes it possible to identify which parties are communicating, even if we can't see the content of the traffic. In other words, we don't get full anonymity.

TLS does not provide traceability, because it only encrypts the communication.

Non-repudiation is a stronger concept than traceability and means that the person who performed an operation cannot deny that it took place. That is, the system can cryptographically link a user to a given read or write operation. TLS does not give us this stronger form of traceability, because we cannot guarantee that we can link the traffic to a user.

We consider traffic over HTTP with strong TLS (HTTPS) to be sufficiently protected to prevent an attacker from viewing our traffic. It protects all data included in the call, even against an attacker who has full control over a node in the network. It does not, however, protect against an attacker who controls a node running some part of our system, such as the API or the client.

TLS encrypts all data in the HTTP message, i.e. headers, body and query string. However, the recipient's address (the host) is not protected.

Be sure to also use TLS in your test systems and development environments. It is perfectly reasonable today to require developers to use HTTPS during development. Settings and configuration that make exceptions for test and development environments have a way of sneaking into production systems as well.

Also make sure to redirect all traffic that arrives over unencrypted HTTP to HTTPS. We also recommend using HTTP Strict Transport Security (HSTS) to tell browsers to use HTTPS only.
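As a sketch of what this can look like in practice, here is a minimal TypeScript/Express middleware that redirects plain HTTP to HTTPS and sets an HSTS header. The max-age of one year and the includeSubDomains directive are assumptions; adapt them to your own policy:

```typescript
import express from "express";

const app = express();

// If TLS terminates in a proxy in front of the application, trust the
// proxy so that req.secure reflects the original protocol
// (via the X-Forwarded-Proto header).
app.set("trust proxy", true);

app.use((req, res, next) => {
  if (!req.secure) {
    // Permanently redirect plain HTTP requests to HTTPS.
    res.redirect(301, `https://${req.headers.host}${req.originalUrl}`);
    return;
  }
  // Tell browsers to use HTTPS only for the next year (assumed policy).
  res.setHeader(
    "Strict-Transport-Security",
    "max-age=31536000; includeSubDomains"
  );
  next();
});
```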

The choice of algorithms in the X.509 certificates used for TLS is important. Weak algorithms provide insufficient protection against a powerful attacker. You can test your certificates yourself and verify that you are using strong cryptography, for example at https://www.ssllabs.com/.

Where HTTPS is terminated may vary between systems and how they are deployed. We strive for end-to-end encryption, but where termination occurs is a trade-off. OAuth2 and OpenID Connect require strong transport-layer protection with TLS 1.2 or later. Choosing where to terminate TLS in a large system is important and can be a difficult balance between simplicity, security and complexity.

For example, in a Kubernetes environment it is common for TLS to be terminated at the ingress, and for all traffic within the cluster to be unencrypted. This of course means that an attacker who has gained a foothold inside the cluster has access to all traffic, and thus, for example, all access tokens. If the cluster is limited to a single system with few, well-controlled administrators, this does not open up a large attack surface. We can then choose to terminate TLS at the ingress, which is a simpler technical solution than running TLS all the way to the pod.

However, in large, unsegmented networks shared by many systems with many administrators, an attacker has a large attack surface to work with. A foothold on any node in the network means that we can no longer maintain the integrity and confidentiality of the systems running in an unsegmented network.

From a security perspective, there may be reason to treat an internal network as public, precisely because it is so large that we cannot reasonably expect it to provide a good level of integrity or confidentiality for the purposes of our system.

A common scenario is that the node terminating TLS at the edge of an internal network re-encrypts the traffic before passing it on. We then have good transport-layer protection internally, but note that this node can read all traffic and needs to be secured accordingly.

Terminating TLS before the call reaches our API makes it more difficult to handle certificate-bound access tokens, as they rely on mTLS. It is still possible to use certificate-bound access tokens even if TLS terminates before the call reaches our API. One solution is for the node terminating TLS to pass the certificate information our API needs in an HTTP header.
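A sketch of how such a check could look in TypeScript (Node). It assumes, hypothetically, that the TLS-terminating node forwards the client certificate in PEM form in a header such as X-Forwarded-Client-Cert, and that the access token has already been validated and carries the cnf claim with an x5t#S256 thumbprint as described in RFC 8705:

```typescript
import { createHash, X509Certificate } from "node:crypto";

// Assumed: the proxy forwards the client certificate (PEM) in a header,
// and token validation (signature, expiry, scopes) happens elsewhere.
interface AccessToken {
  cnf?: { "x5t#S256"?: string }; // certificate confirmation claim (RFC 8705)
}

export function certificateBoundTokenMatches(
  forwardedPem: string,
  token: AccessToken
): boolean {
  const expected = token.cnf?.["x5t#S256"];
  if (!expected) return false;

  // Compute the SHA-256 thumbprint of the DER-encoded certificate,
  // base64url-encoded, and compare it with the token's cnf claim.
  const cert = new X509Certificate(forwardedPem);
  const thumbprint = createHash("sha256").update(cert.raw).digest("base64url");
  return thumbprint === expected;
}
```

Note that the node terminating TLS must be the only party able to set this header, otherwise an attacker could forge the certificate information.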

Secure the use of third-party dependencies

A modern system is based on components and services that we do not develop ourselves. All these dependencies need to be continuously updated to avoid creating vulnerabilities that could be exploited by an attacker. For example, a flaw in any of the components we use in our web client could create opportunities for an attacker to perform an XSS attack. Security, long-term maintenance and updating of these dependencies are important considerations when we choose components and services.

Note that this applies throughout the lifetime of the system, not just during development. The operation and maintenance of a system must include updating all components that affect security. Careful selection, where we assess both the security impact of the component and its future maintenance, is an important part of our security work.

A modern web application consists of packages, at least some of which are downloaded directly from external sources. Google Analytics and Google Tag Manager are two examples where we often download both packages and other content into our JavaScript runtime. If an attacker can gain control of the package source and deliver her own content, she also has full control over your application. The same is true for our API. The extent of the problem varies with the type of framework we use to build our application.

To reduce the risk of third-party malware reaching our application, we can fetch the packages when the application is built instead of dynamically at runtime. This allows us to detect problems before they reach our systems.

This makes it more difficult for an attacker, as she then needs to control the source over a longer period of time. If we still choose to fetch packages directly from an external source at runtime, we should verify that the package does not contain irregularities or vulnerabilities.

There are many tools that help us with both static and dynamic code analysis when building the application. Examples include scanners that integrate with your build pipeline and look for known vulnerabilities in both your own code and external packages.

For packages that we nevertheless download dynamically in a web application, we can strengthen protection by using Subresource Integrity (SRI). We then verify in our application that we are retrieving the right package and that the content is what we expect. The disadvantage is increased administration when packages are updated.
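As an illustration, the integrity value is simply a hash of the exact file contents. Here is a sketch in TypeScript (Node) of how such a value could be computed for a downloaded package file; the file name and CDN URL are hypothetical:

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

// Compute a Subresource Integrity (SRI) value for a file, e.g. a
// JavaScript bundle served from a CDN. The file name is hypothetical.
function sriHash(path: string): string {
  const content = readFileSync(path);
  const digest = createHash("sha384").update(content).digest("base64");
  return `sha384-${digest}`;
}

// The resulting value goes into the integrity attribute of the script
// tag that loads the package, for example:
//   <script src="https://cdn.example.com/lib.js"
//           integrity="sha384-..." crossorigin="anonymous"></script>
console.log(sriHash("./lib.js"));
```

If the file served by the CDN ever differs from the hashed version, the browser refuses to execute it, which is exactly the extra administration mentioned above when packages are updated.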

Secure data storage

All accounts used by our APIs to access data should have minimal rights to the data source they are linked to. Connection strings should be rotated at regular intervals. Passwords in connection strings should always be machine-generated and have high entropy. A human should never choose passwords for databases.
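A sketch of generating such a machine-generated, high-entropy secret in TypeScript (Node); the length of 32 random bytes is an assumption:

```typescript
import { randomBytes } from "node:crypto";

// Generate a high-entropy database password: 32 random bytes
// (256 bits of entropy, an assumed length), base64url-encoded so it is
// safe to place in a connection string without escaping.
function generateDatabasePassword(): string {
  return randomBytes(32).toString("base64url");
}

console.log(generateDatabasePassword());
```

Ideally the generated secret goes straight into a secrets manager or vault and is rotated from there, so that no human ever needs to see it.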

The user accounts used by administrators to connect to data sources must be personal and should be as limited as possible. This is partly to reduce the security risk, but also to minimise the scope for human error.

Data must be stored encrypted, often using encryption support built into the database product or operating system.

Don't forget about database backup management. Many of the biggest breaches of IT systems have exploited poor database backup practices, for example unencrypted backups placed in a public S3 bucket.

Some types of data are so sensitive that the database's own encryption is not enough; the content also needs to be protected by the application. A good example is the storage of passwords, which should be hashed with a recommended algorithm (e.g. bcrypt or PBKDF2).
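A minimal sketch of password hashing with PBKDF2 using Node's built-in crypto module in TypeScript; the iteration count and salt length below are assumptions and should follow current recommendations:

```typescript
import { pbkdf2Sync, randomBytes, timingSafeEqual } from "node:crypto";

// Hash passwords with PBKDF2-HMAC-SHA256. Iteration count and salt
// length are assumed values; adjust to current recommendations.
const ITERATIONS = 310_000;
const KEY_LENGTH = 32;

export function hashPassword(password: string): string {
  const salt = randomBytes(16);
  const hash = pbkdf2Sync(password, salt, ITERATIONS, KEY_LENGTH, "sha256");
  // Store salt and hash together; the format here is illustrative.
  return `${salt.toString("base64")}:${hash.toString("base64")}`;
}

export function verifyPassword(password: string, stored: string): boolean {
  const [saltB64, hashB64] = stored.split(":");
  const salt = Buffer.from(saltB64, "base64");
  const expected = Buffer.from(hashB64, "base64");
  const actual = pbkdf2Sync(password, salt, ITERATIONS, KEY_LENGTH, "sha256");
  // Constant-time comparison to avoid leaking information via timing.
  return timingSafeEqual(actual, expected);
}
```

The point is that only the salted hash is stored; even an attacker who obtains the database (or a backup of it) cannot read the users' passwords directly.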

Summary

In this article, we have discussed the infrastructure you use to deploy the system and what you need to consider about data storage.

Just as we need centralised logging of our application, we need logging and monitoring of our infrastructure to detect intrusions and misuse. There are many logging and monitoring products on the market. Choose a solution that gives you a good overall picture and good possibilities for relevant, automated alerts.

Updating and maintaining your system is a very important aspect of security and includes everything from operating systems and services to the software components of your system.

In the next article, we'll take a closer look at the browser and the security challenges it presents for web applications.


Read the other parts of the article series