The Importance of Proper Cloud Design
by Eric Rousseau, Senior Architect
For anyone unfamiliar with the cloud, the simplest way to describe it is as software and services that run on the Internet instead of locally in your own data center. The cloud can be consumed as SaaS (Software as a Service), with products such as Dropbox, Netflix, and Carbon Black Cloud Endpoint Protection for next-generation antivirus. These are often used through a subscription without concern for the underlying components. The other way the cloud is consumed is through PaaS (Platform as a Service) or IaaS (Infrastructure as a Service). In these scenarios, cloud providers such as Microsoft Azure and Amazon Web Services provide almost endless building blocks for designing a virtual data center and infrastructure or a cloud-based application. Components are available for virtual servers, storage, networking, load balancing, firewalls, and microservices that can run code without the need for a server, with more being added all the time. This discussion focuses on these areas, where all deployment and design decisions are left up to the customer. If a business is not already using the cloud for some aspect of its IT environment, then it is most definitely looking into it.
The cloud appears to make things very easy and allows for extremely quick deployment of resources. The cloud providers have already built all of the underlying infrastructure, across many regions and data centers around the world, so customers need only sign in and start deploying. Engineers who have been supporting physical servers and virtual machine hypervisor environments like VMware, Hyper-V, and XenServer no longer need to spend the time necessary to deploy the physical foundation and ensure everything is fault-tolerant and properly networked. So, with even the most basic knowledge of computers and networking, a person can obtain a cloud subscription and start building the environment of their dreams. Many times, things are deployed and all seems to be working fine, but without proper planning and design, these deployments are destined for problems.
Like myself, many people out there are DIYers and have found themselves empowered by being able to accomplish something they would otherwise have to pay for. You can even compare the cloud to a major home improvement store where you are able to buy every part necessary for anything from fixing a small problem to building an entire house. Stepping inside those doors, you feel like anything is possible. The problem many of us have found is that just because we are able to buy these tools and materials doesn’t mean we have all the knowledge or skills necessary to build that house from scratch. Sure, I can fix a sheetrock wall, but my seams may not be perfect when I look at the finished paint. Plumbing fixed incorrectly will lead to major leaks, electrical work done improperly will lead to fire, and without knowledge of the proper construction code, my house could be unstable, dangerous, and may require more money to repair than having the expert do the job. Think of the cloud as your house.
In order to use and grow into the cloud, there are many things that need to be considered, planned for, and built into the designs. If these topics are thoroughly planned for in advance, then patterns can be created and the cloud can be a secure, stable, and standardized environment. There are many detailed sources available for cloud design planning and execution, but here is a shortlist of major reasons why proper cloud design is important. This list is by no means complete, and the descriptions are kept short, but hopefully they will give a little insight as to why organizations should invest in someone like Agilant Solutions to design their cloud solutions instead of going the DIY route and hoping everything was thought out and planned.
Security
The cloud is, after all, on the Internet. Cloud networks can be isolated with only VPN connections to internal networks, but cloud platforms also make it very easy to expose data publicly. A virtual server deployed with a public IP and ports open for Remote Desktop Protocol is easy prey for attackers. Publicly available database servers, storage accounts, remote access ports, etc. are all high risks. Designing the cloud with security in mind ensures that network security rules are enforced, remote access is controlled, web application firewalls protect web servers, users work with the least privilege, and credentials are vaulted.
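As a rough illustration of enforcing network security rules, the sketch below scans a list of firewall rules for sensitive ports that are open to the whole Internet. The rule layout (`name`, `direction`, `action`, `source`, `port`) and the `SENSITIVE_PORTS` list are assumptions made for this example and do not match any particular provider's format.

```python
import ipaddress

# Ports treated as remote-access or database ports for this sketch (an assumption).
SENSITIVE_PORTS = {22: "SSH", 3389: "RDP", 1433: "SQL Server", 3306: "MySQL"}


def find_exposed_rules(rules):
    """Return (rule name, service) pairs for inbound allow rules open to any address."""
    findings = []
    for rule in rules:
        if rule["direction"] != "inbound" or rule["action"] != "allow":
            continue
        source = ipaddress.ip_network(rule["source"])
        open_to_world = source.prefixlen == 0  # 0.0.0.0/0 or ::/0
        if open_to_world and rule["port"] in SENSITIVE_PORTS:
            findings.append((rule["name"], SENSITIVE_PORTS[rule["port"]]))
    return findings


# Hypothetical rules, written in a made-up dictionary format.
rules = [
    {"name": "allow-rdp-any", "direction": "inbound", "action": "allow",
     "source": "0.0.0.0/0", "port": 3389},
    {"name": "allow-https-any", "direction": "inbound", "action": "allow",
     "source": "0.0.0.0/0", "port": 443},
    {"name": "allow-ssh-vpn", "direction": "inbound", "action": "allow",
     "source": "10.0.0.0/8", "port": 22},
]

print(find_exposed_rules(rules))  # [('allow-rdp-any', 'RDP')]
```

Only the RDP rule is flagged: HTTPS is open to the world but is not in the sensitive list, and SSH is limited to an internal range. A real audit would read rules from the provider's own tooling rather than a hand-written list.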
Networks and Global Placement
The cloud makes it easy to locate resources in different parts of the country or world. Network design is critical as it is the data highway between systems and users. Cloud networks need to be planned using IP spaces that do not conflict with existing networks and should be allocated for growth. Analysis needs to be done to understand where resources will be placed across regions, how the communication will route and perform, and where costs will be incurred. Will resources communicate across the public Internet, through service endpoints, or through private endpoints? What is the required SLA and how will the network support that?
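Checking a proposed cloud address plan against existing networks is easy to automate. The sketch below uses Python's standard `ipaddress` module; the function name `find_conflicts` and all of the address ranges are made up for illustration.

```python
import ipaddress


def find_conflicts(existing, proposed):
    """Return (proposed, existing) CIDR pairs whose address ranges overlap."""
    existing_nets = [ipaddress.ip_network(cidr) for cidr in existing]
    conflicts = []
    for cidr in proposed:
        candidate = ipaddress.ip_network(cidr)
        for net in existing_nets:
            if candidate.overlaps(net):
                conflicts.append((str(candidate), str(net)))
    return conflicts


# Hypothetical address plan: on-premises ranges and proposed cloud ranges.
on_prem = ["10.0.0.0/16", "192.168.10.0/24"]
cloud = ["10.0.128.0/20", "10.1.0.0/16"]

print(find_conflicts(on_prem, cloud))  # [('10.0.128.0/20', '10.0.0.0/16')]
```

The first proposed range falls inside the on-premises 10.0.0.0/16 block and would cause routing problems over a VPN, while 10.1.0.0/16 is free and leaves room for growth.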
Stability and Fault Tolerance
Those who have worked with services on cloud providers know that they go down. They provide their own SLA for each component type, but customers must decide if that SLA is acceptable. Each cloud provides for designing fault tolerance into applications and infrastructure, but the correct components must be designed around the correct usage, acceptable cost, and whether or not their use is supported for the situation. Availability zones (different datacenters) can be used within the same region, services can be spread out between regions, load balancers are available and external DNS services can be used. If a failure is not planned for, business disruption can occur and money can be lost.
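To judge whether a provider's SLA is acceptable, it helps to work out the combined availability of a design. The sketch below assumes component failures are independent, which real outages do not always respect, and the availability figures are placeholders rather than any provider's published SLA.

```python
def serial_availability(*components):
    """Availability of a chain in which every component must be up."""
    result = 1.0
    for availability in components:
        result *= availability
    return result


def redundant_availability(single, copies):
    """Availability when any one of `copies` independent instances is enough."""
    return 1.0 - (1.0 - single) ** copies


def downtime_minutes(availability, period_minutes=30 * 24 * 60):
    """Expected downtime over a period (default: a 30-day month)."""
    return (1.0 - availability) * period_minutes


# Placeholder figures for illustration only.
web, database = 0.999, 0.9995

single_zone = serial_availability(web, database)
two_zones = serial_availability(redundant_availability(web, 2), database)

print(f"single zone: {downtime_minutes(single_zone):.1f} min/month")
print(f"two zones:   {downtime_minutes(two_zones):.1f} min/month")
```

Chaining components lowers availability below that of the weakest part, while spreading the web tier across two zones removes most of its contribution to downtime and leaves the database as the limiting factor.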
Sizing and Cost Analysis
With cloud components, different SKUs are available based on the need. Virtual Machines have SKUs that represent the amount of CPU, RAM, supported disk types, GPUs, etc., and each comes at a different cost. VPNs have SKUs based on ISP connection and whether or not they should have zone redundancy. Since the cloud is a subscription and consumption-based model, the right analysis and design will provide systems that work as expected with an acceptable and understood price.
Automation and Standardization
The cloud can be fully managed as code. Each cloud has its own specific system for automating resource deployment (Azure has ARM, AWS has CloudFormation) or a multi-cloud solution like HashiCorp Terraform can be used. If an organization relies on multiple engineers to deploy and configure cloud resources their own way, it will mirror the issues of having non-standard physical server deployments and configurations. Cloud designs should use code as much as possible, tags should be assigned to all resources to provide searchable metadata and policies can add a layer of control to the environment. Proper planning in this area will lead to fewer unknowns and cleanup later.
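A tagging standard is only useful if it is checked. As a minimal sketch, the code below reports resources that are missing required tags; the tag names in `REQUIRED_TAGS` and the resource dictionaries are hypothetical, and in practice the same rule would usually be enforced through the provider's policy service or in the deployment pipeline.

```python
# Hypothetical tagging policy for this example.
REQUIRED_TAGS = {"owner", "environment", "cost-center"}


def missing_tags(resource):
    """Return the required tag names that a resource does not carry, sorted."""
    return sorted(REQUIRED_TAGS - set(resource.get("tags", {})))


def audit(resources):
    """Map each non-compliant resource name to its list of missing tags."""
    report = {}
    for resource in resources:
        missing = missing_tags(resource)
        if missing:
            report[resource["name"]] = missing
    return report


# Made-up inventory for illustration.
resources = [
    {"name": "vm-web-01", "tags": {"environment": "prod"}},
    {"name": "sql-core", "tags": {"owner": "dba-team", "environment": "prod",
                                  "cost-center": "1234"}},
    {"name": "storage-tmp"},
]

print(audit(resources))
```

Here `vm-web-01` is reported as missing `cost-center` and `owner`, `storage-tmp` is missing all three tags, and `sql-core` passes.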
Monitoring, Performance, and Scaling
Now that mission-critical resources have been deployed to the cloud, how will you know all is well? This area is one of the most neglected because the focus is primarily on deploying resources and then moving on to the next project. Cloud platforms provide capable native monitoring tools, and a wealth of open-source tools is available for monitoring across different cloud platforms. Now that systems are running in the cloud and no longer locally, it becomes even more important to have visibility. With a proper understanding of the applications running in the cloud, metric baselines can be set up, log analysis can be done, alerts can be established, dashboards can be created for visualization, and notification channels can be set up. Using these tools, cloud providers also allow resources to be configured to scale out automatically to handle increased demand during peaks of activity.
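One simple way to turn a metric baseline into an alert is to flag values that sit well above the historical mean. The sketch below uses the mean plus three standard deviations as the threshold; that rule, the function names, and the CPU figures are all assumptions for illustration, and native monitoring services offer their own threshold and anomaly features.

```python
from statistics import mean, stdev


def build_threshold(history, k=3.0):
    """Alert threshold: historical mean plus k sample standard deviations."""
    return mean(history) + k * stdev(history)


def breaches(samples, threshold):
    """Return (index, value) pairs for samples above the threshold."""
    return [(i, value) for i, value in enumerate(samples) if value > threshold]


# Made-up CPU utilization percentages.
history = [40, 42, 38, 41, 39, 43, 40, 41]
latest = [41, 44, 72, 40]

threshold = build_threshold(history)
print(breaches(latest, threshold))  # [(2, 72)]
```

With this history the threshold lands a little above 45 percent, so only the 72 percent reading would raise an alert or trigger a scale-out rule.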
Backup and Disaster Recovery
This topic is self-explanatory, but it is critical to build it into the design. Never leave out backup and disaster recovery. Policies that exist for on-premises applications must apply to those in the cloud as well. Will the configurations work as expected and guarantee the same RPO and RTO?
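The RPO and RTO question can be checked with simple arithmetic before any failure occurs. In the sketch below, worst-case data loss is taken to be the time between backups, and recovery time is the sum of the restore steps; both are simplifying assumptions, and the durations are invented for the example.

```python
from datetime import timedelta


def meets_rpo(backup_interval, rpo):
    """Worst-case data loss is roughly one backup interval (an assumption)."""
    return backup_interval <= rpo


def meets_rto(restore_steps, rto):
    """Recovery time is modeled as the sum of sequential restore steps."""
    return sum(restore_steps, timedelta()) <= rto


# Invented targets and measurements for illustration.
rpo = timedelta(hours=1)
rto = timedelta(hours=4)
backup_interval = timedelta(hours=6)
restore_steps = [timedelta(minutes=30),   # provision replacement resources
                 timedelta(hours=2),      # restore data
                 timedelta(minutes=45)]   # validate and redirect users

print("RPO met:", meets_rpo(backup_interval, rpo))  # RPO met: False
print("RTO met:", meets_rto(restore_steps, rto))    # RTO met: True
```

In this example the restore fits within the four-hour RTO, but a six-hour backup interval cannot satisfy a one-hour RPO, so the backup schedule would need to change. Only a tested restore confirms that the measured durations are realistic.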
Refactoring and Going Cloud-Native
The cloud now has many new resource types that never existed before. The proper design will determine the best use of these and possibly allow applications to be refactored to new cloud-native designs. Azure Functions, Logic Apps, and AWS Lambda functions allow code and workflows to be active without requiring a server to be maintained. Cloud providers offer DB as a service, Web applications as a service, machine learning components, distributed compute nodes, etc. So many options are available for modernizing applications and moving away from the monolithic approach.
Multi-Cloud
Now that all of the above topics have been designed in the cloud of your choice, does it also make sense to distribute your infrastructure across multiple clouds for diversity? Properly understanding the application and business requirements will determine if the design calls for multi-cloud tenancy.
Many more topics can be added to this list, but what is here should begin to shine the spotlight on why a well-designed cloud deployment is crucial to its success. Like the home improvement store, the cloud will give you all the tools and parts to build amazing things, but their proper use must be understood and coupled with a solid design. In these days of ransomware and increased Internet criminal activity, no stone can be left unturned as companies extend their infrastructure and applications to the cloud.