Deploying Identity and Access Management (IAM) Infrastructure in the Cloud - PART 2: DESIGN

Technology Blog

May 20

Written By Robert Miranda

Blog Series: Deploying Identity and Access Management (IAM) Infrastructure in the Cloud

Part 2: DESIGN - Architecting the Cloud Infrastructure

In Part 1 of this series, we discussed:

Overall design
Cloud components and services
“Security of the Cloud” vs. “Security in the Cloud”
Which teams from within your organization to engage with on a cloud initiative

In Part 2, we will take a closer look at core cloud infrastructure components and design concepts.

For a visual deep dive into Running IAM using Docker and Kubernetes, check out our webinar with ForgeRock.

The VPC Framework

VPCs

The “VPC” or “Virtual Private Cloud” represents the foundation of the cloud infrastructure where most of our cloud resources will be deployed. It is a private, logically isolated section of the cloud which you control, and can design to meet your organization’s functional, networking and security requirements.

At a minimum, one VPC in one of AWS’s operating regions will be required.

Regions and Availability Zones

AWS VPCs are implemented at the regional level. Regions are separate geographic areas, and are further divided into “availability zones” which are multiple isolated locations linked by low-latency networking. VPCs should generally be designed to operate across at least two to three availability zones for high availability and fault tolerance within a region.

Interconnectivity

When you are in the early stages of your design, it will be helpful to ascertain what kind of interconnectivity you will need. A VPC will more than likely need to be connected to some combination of other networks, such as the internet, peer VPCs and on-prem data centers. This needs to be considered in your design.

Public-facing resources that need to be accessible from the internet will require publicly routable IP addresses, and must reside on subnets within your VPC that are configured with a direct route to the internet. For resources that do not directly face the internet but need outbound internet access, one or more internet facing NAT Gateways can be added to the VPC.

Establishing connectivity to other VPCs either in the same or different AWS account can be achieved through a) “VPC Peering”, which establishes point-to-point connections, or by b) implementing a “Transit Gateway”, which uses a hub and spoke model.

Reference Architecture for Identity and Access Management (IAM) cloud deployment (select to expand) — Reference Architecture for Identity and Access Management (IAM) cloud deployment *(select to expand)*

Connecting VPCs and on-prem networks also presents some options to consider. Point-to-point IPSec VPN tunnels can be created between VPCs and on-prem terminating equipment capable of supporting them. A Transit Gateway can also be used as an intermediary, where the VPN Tunnels can be established between the on-prem network and the Transit Gateway, and the VPCs connected via “Transit Gateway Attachments.” This will reduce the number of VPN tunnels you need to support when integrating multiple VPCs. AWS “Direct Connect” is also available as an option where VPN connections are insufficient for your needs.

VPC IP Address Space and Subnet Layout

IP address space of VPCs connected to existing networks needs to be planned carefully. It must be sized to support your cloud infrastructure without conflicting with other networks or consuming significantly more addresses than you need. Each VPC will require a CIDR block that will cover the full range of IP subnets used in your implementation. Determining the proper size of CIDR blocks will be possible once the functional design of the cloud infrastructure is further along.

VPC Security and Logging

In keeping with the principle of “Security in the Cloud,” AWS provides a comprehensive set of tools to help secure your cloud environment. At a minimum, “Network ACLs” and “Security Groups” need to be configured to establish and maintain proper firewall rules both at the perimeter and inside of your VPC.

Other AWS native services can help detect malicious activity, mitigate attacks, monitor security alerts and manage configuration so it remains compliant with security requirements. It is also possible to capture and log traffic at the packet level on subnets and network interfaces, and log cloud API activity. You should take the opportunity to familiarize yourself with and utilize these services.

Some organizations may also have requirements to integrate 3rd party security tools, services, appliances, etc. with the cloud infrastructure.

EKS

ForgeRock solutions can be implemented as containerized applications by leveraging “Kubernetes” - an open-source system for automating the deployment, scaling and management of containerized applications.

“EKS” is Amazon’s “Elastic Kubernetes Service.” It is a managed service that facilitates deploying and operating Kubernetes as part of your AWS cloud infrastructure. From a high-level infrastructure perspective, a Kubernetes cluster consists of “masters” and “nodes.” Masters provide the control plane to coordinate all activities in your cluster, such as scheduling application deployments, maintaining their state, scaling them and rolling out updates. Nodes provide the compute environment to run your containerized applications. In a cluster that is deployed using EKS, the worker nodes are visible as EC2 instances in your VPC. The master nodes are not directly visible to you and are managed by AWS.

An EKS cluster needs to be designed to have the resources needed to effectively run your workloads. Selection of the correct EC2 instance type for your worker nodes will depend on your particular environment. ForgeRock provides general guidance in the ForgeOps documentation to help in this process based on factors like the number of user accounts that will be supported; however, the final selection should be determined based on the results of performance testing before deploying into production.

While Kubernetes will automatically maintain the proper state of containerized applications and related services, the cluster itself needs to be designed in a manner that will be self-healing and recover automatically from infrastructure related failures, such as a failed worker node or even a failed AWS availability zone. Implementing Kubernetes in AWS is accomplished by deploying nodes in a VPC across multiple availability zones, and leveraging AWS Auto Scaling Groups to maintain or scale the appropriate number of running nodes.

It should be noted that EKS worker nodes can consume a large number of IP addresses. These addresses are reserved for use with application deployments and assigned to Kubernetes pods. The number of IP addresses consumed is based on the EC2 instance type used, particularly with respect to the number of network interfaces the instance supports, and how many IPs can be associated with these interfaces. Subnets hosting worker nodes must be properly sized to accommodate the minimum number of worker nodes you anticipate running in each availability zone. These subnets must also have the capacity to support additional nodes driven by events such as scaling out (due to high utilization), or the failure of another availability zone used by the VPC.

Other EC2 Resources

Your VPC may need to host additional EC2 resources, and this will be a factor in your capacity planning.

ForgeRock Instances

Some ForgeRock solutions will require additional EC2 instances for ForgeRock components beyond what is allocated to the Kubernetes cluster. For example, these could include DS, CS, CTS, User, Replication and Backup instances depending on your specific design.

Utility Instances

Other instances that will likely be a part of your deployment include jump boxes and management instances. Jump boxes are used by systems admins to gain shell access to instances inside the VPC. They are particularly useful when working in the VPC before connectivity to the on-prem environment has been established. Management instances, or what we tend to refer to as “EKS Consoles” are used for configuring and launching scripted deployments of EKS as well as providing a command line interface for managing the cluster and application deployments. Additional tools such as Helm or Skaffold are commonly installed here.

Load Balancers

Applications deployed on EKS are commonly exposed outside the cluster via Kubernetes services, which launch AWS Load Balancers (typically Application Load Balancers). ALBs can be configured as public (accessible from the internet) or private. Both types may be utilized depending on your design. Application load balancers should be deployed across your availability zones, and each subnet that ALBs are assigned to currently must have a minimum of eight free IP addresses to launch successfully. An ALB uses dynamic IP addressing and can add or reduce the number of network interfaces associated with it based on workload, therefore it should always be accessed via DNS.

Global Accelerator

AWS recently introduced a service called “Global Accelerator.” This service provides two publicly routable static IP addresses that are exposed to the internet at AWS edge locations, and can be used to front-end your ALBs or EC2 instances. This service can enhance performance for external users by leveraging the speed and dependability of the AWS backbone.

It also can be used to redistribute or redirect traffic to different back-end resources without requiring any changes to the way users connect to the application, e.g. in blue / green deployments or in disaster recovery scenarios. When external clients require static IP addresses for white listing purposes, this service addresses the ALB’s lack of support for static addresses. Furthermore, the global accelerator has built-in mitigation for DDOS attacks. The global accelerator should be associated with subnets in each availability zone in your VPC, and will consume at least one IP address from each of these subnets.

DNS

Route 53 is AWS’s DNS service. It can host zones that are publicly accessible from the internet as well as private hosted zones that are accessible within your cloud environment. In some situations it may be desirable for systems in on-prem and / or peer VPCs to be able to perform name resolution in zones hosted by on-prem DNS servers. Conversely, there are situations where on-prem systems need to be able to resolve names in your private hosted zones in Route 53. This can be achieved by establishing inbound and/or outbound Route 53 resolver endpoints in your VPC, as well as configuring the appropriate zone forwarders. Resolver endpoints should be associated with subnets in each availability zone in your VPC, and will consume at least one IP address from each of these subnets.

Environments

A significant element of planning your cloud infrastructure is determining the types of environments you will need and how they will be structured.

For non-production, you may want to create separate lower environments for development, code testing and load / performance testing. Depending on your preferences or specific needs, these environments can share one VPC or have their own dedicated VPCs. In a shared VPC, these environments can be hosted on the same EKS cluster using different namespaces, or have their own dedicated EKS clusters.

For your production environment, you may want to implement a blue / green deployment model. This also presents choices, such as whether to run entirely separate cloud infrastructure for each, or whether to have shared infrastructure where the EKS cluster is divided into blue and green namespaces.

Again, these choices depend on your preferences and specific requirements, but your operating budget for running a cloud infrastructure will also influence how much cloud infrastructure you will want to build and operate. The more infrastructure that is shared, the less your ongoing operating costs will be.

Putting It All Together

Preparing Cost Estimates For Your Cloud Infrastructure

By now you should be developing a clearer picture of what your cloud infrastructure will look like and be better situated to put together a high-level design. The next step is to take an inventory of the components in your high-level design and enter this information into the various budgeting tools AWS makes available to you, like the Simple Monthly Calculator and TCO calculator. This step will provide you with estimates of your cloud infrastructure operating expenses and help you determine if the cost of what you have designed is consistent with your budget.

If costs come in higher than anticipated, there are approaches you can take to reduce your expenses. As previously discussed, sharing some infrastructure can help; for example, consolidating your lower environments into a single VPC if that is feasible. Since EC2 instances generally represent the largest percentage of your operating costs, exploring the use of “reserved instances” would also present cost savings opportunities once you are closer to finalizing design and are prepared to commit to using specific instance types over an agreed time period.

VPC IP Address Space and Subnet Layout – Part 2

We’ve already talked briefly about the criticality of planning your IP address space carefully. Now we will put this into practice.

Once you have prepared a high-level design that meets your functional and budgetary requirements, the next step is to identify the networking requirements of each component; e.g. the number of IPs EKS worker nodes will consume, or how many IPs are needed to always be available on a subnet for an Application Load Balancer to successfully be deployed.

Another factor to consider is the number of IP addresses reserved on each subnet by the cloud provider. On AWS, five addresses from each subnet are reserved and unavailable for your use. These factors, along with the number of availability zones you will be deploying into will drive the number and size of the subnets within each VPC.

You should now have the basis for determining the CIDR block sizes needed to build your cloud infrastructure as designed, and can work with your organization’s networking team to have them allocated and assigned to you.

Anecdotally, we have encountered situations where clients could not obtain the desired size CIDR blocks from their organization and had to scale back their cloud infrastructure design to meet the constraint of using smaller CIDR blocks. Engaging your networking team early in the design process will help identify if this is a potential risk and help you to more efficiently work through any IP address space constraints.

Next Steps

In this installment, we’ve provided you with detailed insight into what goes into designing public cloud infrastructure to host your ForgeRock implementation. In the next part of this series, we will move into the development phase, including tools, resources and automation that can be leveraged to successfully deploy your cloud infrastructure.

Next Up in the Series: DEVELOPMENT - Tools, Resources and Automation

CloudIAMIdentityAccessNetworking

Robert Miranda