Building for Billions: Addressing Security Concerns for Platforms at Scale
Security operations once consisted of manual processes built around alerts, thresholds and severity levels. As systems scale and platforms continue to grow, how do you keep up with the expanding requirements to secure these transactions and the networks they are built upon?
Multinational organizations that serve billions of users and process trillions of transactions per day must ensure that failure is not an option. There are lessons to learn in the “old ways” they are no longer willing to accept as they scale their existing operations to meet the challenges of securing a global platform.
As of September 2019, Facebook had 2.45 billion monthly active users worldwide and 1.62 billion daily active users – a massive amount of interaction with the platform. With so many varied types of user input, and so much code being deployed to keep shipping feature updates, how would an organization as large as Facebook improve its software-development lifecycle (SDLC) to scale its codebase securely?
In this case, Facebook chose to address the problem early in the development stages (as best practice dictates) with Hack, an open-source programming language for the HipHop Virtual Machine (HHVM) that interoperates seamlessly with the general-purpose scripting language PHP. Hack provides the discipline of static typing without sacrificing PHP’s rapid development cycle: the type checker catches errors early and makes code quick to inspect, which is particularly useful in larger codebases.
A common mistake is calling a method on an object that could unexpectedly be null – an error that, in untyped PHP, wouldn’t be caught until runtime. Hack’s type checker flags it before the code ever runs:
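(The article’s original snippet is not reproduced here; the sketch below is an illustrative stand-in for that class of bug, with hypothetical names such as `User` and `findUser`.)

```hack
class User {
  public function __construct(private string $name) {}
  public function getName(): string {
    return $this->name;
  }
}

// The ?User return type tells the checker this lookup may fail.
function findUser(int $id): ?User {
  return $id === 1 ? new User('alice') : null;
}

function greet(int $id): string {
  $user = findUser($id);

  // Type error: $user may be null, so calling ->getName() directly
  // is rejected at type-check time, not discovered at runtime:
  //   return 'Hello, '.$user->getName();

  // The checker forces an explicit null check first:
  if ($user === null) {
    return 'Hello, stranger';
  }
  return 'Hello, '.$user->getName();
}
```

In plain PHP the commented-out line would deploy cleanly and crash only when a request hit the null path; Hack surfaces it before the code ships.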
By moving the controls responsible for finding errors and potential security issues closer to the beginning of the development lifecycle, Facebook increases its overall go-to-market speed for new updates while maintaining the security of its code.
Google has taken a similar approach – not just hunting for point solutions to add to its arsenal, but changing the very architecture its environment is built on. It has extended the industry-influencing BeyondCorp model it created to secure its own enterprise network.
Like BeyondCorp, Google’s BeyondProd project is built on the premise that no trusted zone exists outside the application itself, and that no trust derives from a service’s or application’s IP address. Instead, trust is rooted in code provenance and service identity. The principles in BeyondProd are being hailed as the future of application security.
Because the transactions the modern internet runs on increasingly require equally modern microservice architectures, BeyondProd assumes that VPNs, firewalls and trusted network ranges are not the way to establish trust within an application.
And since new businesses are being built cloud-native around containerized microservices, this model provides greater security than porting existing, legacy security architectures into those cloud environments.
Key BeyondProd concepts are:
- Mutually authenticated service endpoints
- Transport security
- Edge termination with global load balancing and denial-of-service protection
- End-to-end code provenance
- Runtime sandboxing
As Google’s own CIO-level documentation states:
- “Google’s infrastructure deploys workloads as individual microservices in containers, and manages these workloads using Borg – our container orchestration system. This is an inspiration and template for what’s widely known today as a “cloud-native” architecture.
- Google’s infrastructure has been purposefully designed with security in mind; not added later as an afterthought. Our infrastructure assumes no trust between its services.
- Google protects its microservices with an initiative called BeyondProd. This protection includes how code is changed and how user data in microservices is accessed.
- Moving from a traditional security model to a cloud-native security model required us to make changes to two main areas, namely our infrastructure and our development process. Building shared components into a shared fabric enveloping and connecting all microservices, also known as a service mesh, made it easier to roll out changes and achieve consistent security across services.”
Note the key phrases above: infrastructure that assumes no trust between its services, mutually authenticated service endpoints, end-to-end code provenance, runtime sandboxing, and a service mesh that envelops and securely connects all microservices. It sounds like a punch list for application security and zero-trust euphoria.
The only catch: you must have a cloud-native deployment, and it must run on Google’s public cloud infrastructure. If both apply, these services can help your organization piggyback on two solid initiatives to improve your security posture at scale.
If your environment is not cloud-native and you must secure a large environment that’s already built – or you’re transitioning between “on-prem” and cloud environments – the most common approach is to place a service in front of your environment that performs security functions on your behalf before traffic ever reaches it, letting you scale your infrastructure and security processes together while lowering overall risk.
Some content delivery networks (CDNs) fit this role. By putting a CDN in front of your computing environment, you can tackle difficult security problems at great scale, without the cost of trying to replicate that scale and depth of security controls at a traditional cloud hosting provider.
Security at scale has proven difficult for many large organizations, and there is no silver bullet. But by leveraging new security architectures as well as emerging cloud-platform capabilities, building for billions can be done securely.