some time to evaluate the scenarios that a user might encounter if your cloud provider runs into problems. How well can you provide a reasonable experience to the user until the situation is resolved? Is fallback to a different provider or another endpoint feasible?

Thinking ahead to the kinds of situations your users might encounter in the event of a problem with your cloud environment can provide valuable foresight. It can also be the starting point for updating your edge services to compensate for issues closer to your application.
Strategy 2: Embrace Processing at the Edge as
Part of Your Total Design
As edge services and edge devices become more advanced and powerful, it is an oversight not to take advantage of their functionality as part of an overall edge-to-cloud deployment strategy. For example, in many cases Internet of Things (IoT) edge devices can run analytics at the edge to produce useful, more compact data rather than sending raw data to the cloud for processing. Processing at the edge can reduce and/or complement processing that would normally be routed to the cloud. It can also provide a pathway for processing to continue even if cloud functionality is temporarily unavailable.
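To make this concrete, here is a minimal Python sketch (all names are illustrative, not from any particular IoT platform) of the kind of edge-side aggregation a device might perform, uploading one compact summary record instead of every raw sensor reading:

```python
# Sketch: aggregate raw sensor readings at the edge so only a compact
# summary is sent to the cloud. Names and window size are illustrative.
from statistics import mean

def summarize_readings(readings):
    """Reduce a window of raw readings to a compact summary record."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": round(mean(readings), 2),
    }

# Example: 60 raw temperature samples become one 4-field record.
raw = [20.0, 21.5, 19.5, 20.0] * 15
summary = summarize_readings(raw)
```

Even if the cloud link is down, the device can keep computing and queueing summaries like this locally, then flush them when connectivity returns.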
Strategy 3: Engage with Your Cloud Provider
to Arrive at the Optimal Topology
Moving to the cloud can mean moving into a world where every service is hosted and managed by the cloud provider. Cloud services are usually designed around a security model that shares responsibility between the cloud provider and the customer. Risk exposure is best minimized by having open conversations at the beginning of a relationship with any internet or managed DNS provider with whom you partner.
Conversations should not shy away from the questions that need to
be asked, such as these:
• How can we help address the challenges we face getting our
applications closer to our users?
Chapter 3: Strategies to Meet the Challenges
• In the service model chosen, where does your responsibility
begin and end?
• What do you recommend as the optimal approach or technology for load balancing internet traffic between endpoints?
• How many distinct transit providers are available for bandwidth?
• What are your methodologies and procedures for outages/failovers?
Such engagements do not require limiting your approach to only what that provider offers, but they will give you a wider context to draw upon in making decisions.
Strategy 4: Increase Redundancy and
Reliability with Multicloud and Hybrid Cloud
Strategies
While focusing on functionality at the user edge, it’s important not to overlook cloud topologies that can help ensure edge-to-endpoint resiliency and reliability. Two topologies to consider are multicloud and hybrid cloud. Just as you might add two or more bandwidth/telecommunications (telco) providers or a redundant server, you should apply the same methodology to your cloud deployments. Why? If your single cloud instance or provider is unavailable, how would you steer traffic to an alternate vendor or site?
The most common approach today is a hybrid cloud strategy, combining and integrating on-premises resources with resources from cloud providers. This approach is typically used to shift gradually from an on-premises model to the cloud, or because workloads are not cloud native or ready for such a deployment. But strategically having on-premises resources that serve and interact with users, regardless of whether cloud resources are available, can help ensure that your users stay reliably connected to their applications.
A newer approach is taking shape. In the simplest terms, a multicloud strategy centers on using two or more cloud service providers for a single business need. The reasons can include options for pricing, protection against a single point of failure, better coverage across multiple availability zones, or higher value propositions in the areas of SaaS, PaaS, or IaaS. Many of these same benefits also serve the overall business needs of resiliency and reliability.
Strategy 5: Involve DevOps Staff in All Aspects
of Edge Services Planning and Implementation
DevOps is a set of ideas and recommended practices focused on making it easier for development and operations teams to work together on developing and releasing software.1 If you have a DevOps practice within your organization, it’s important to involve those teams at every stage as you design and plan for resiliency. Developers need to understand what functionality should be implemented at the site and user edge, and operations specialists need to know how to configure the edge services or devices to implement the desired resiliency. Ensuring shared awareness and buy-in upfront will help make your strategy effective once deployed. Finally, DevOps teams in many organizations have been asked to run areas outside their traditional wheelhouse, namely managing the edge and core infrastructure. With the newer drive to be nimbler, DevOps teams have found themselves on the front line of responsibility for ensuring that the cloud edge is secure, fast, and redundant enough to meet performance requirements.
Strategy 6: Inject Chaos to Find Weaknesses
Before They Affect Customers in Production
Chaos engineering involves “experimenting on a (distributed) system in order to build confidence in the system’s capability to withstand turbulent conditions in production,” according to the Principles of Chaos Engineering.

Chaos engineering is a proactive discipline that aims to help you build trust and confidence in a system’s reliability under various
conditions before your customers experience those conditions. A chaos engineering team works closely and collaboratively with development teams to help them explore and overcome any weaknesses revealed through chaos experimentation and testing.

1 The Phoenix Project and Effective DevOps are a couple of great resources for learning more about DevOps.
There are two main techniques in chaos engineering: game days and automated chaos experiments. Both offer ways of exploring a system’s resilience across everything from infrastructure, through platforms and the applications themselves, to the people, practices, and processes that surround production. Game days offer a cheap and often fun exercise that exposes weaknesses in a system by manually causing failures, usually in a staging environment, and then exploring how teams of people and the tools they use detect, diagnose, and respond to those situations.
Game days are extremely powerful and often require no specialized tooling, but they are expensive in people’s time, especially if the system is evolving rapidly and confidence in its resilience needs to be maintained continuously. This is where automated chaos experiments come in. The free and open source Chaos Toolkit is one tool in this space; you can use it to define automated chaos experiments and then execute them, without manual intervention, across a range of chaos-creating and system-probing activities.
Automated chaos often begins with chaos experiments, in which an experiment is defined and then executed against the target system to see whether there is any deviation from normal functioning that indicates a weakness is present. In this “experimental” phase, automated chaos experiments are deemed valuable and successful when they find a weakness. A weakness is an opportunity to improve the system, and this is celebrated.
The next phase of an automated chaos experiment’s life is often to become more of a chaos test. In contrast to the first phase, the automated chaos test is now executed continually to prove that prior weaknesses, or even new ones, are not reintroduced over time. A chaos experiment empirically proves that a weakness is present and needs to be overcome, and this learning is celebrated. A chaos test, often executed continuously, is celebrated when it does not detect a weakness. Over time, a suite of automated chaos experiments and tests becomes the baseline for confidence and trust in your production system’s resilience, and it is extended and grown as new weaknesses are encountered or anticipated.
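The experiment-then-test life cycle described above can be sketched generically. The following Python illustrates the control flow only; it is not the Chaos Toolkit’s actual API, and every name here is hypothetical:

```python
# Sketch of an automated chaos experiment: verify the steady state,
# inject a failure, verify again, and always roll back. The probe and
# action callables are stand-ins for real checks (e.g., an HTTP health
# check) and real failure injections (e.g., stopping a VM).

def run_chaos_experiment(probe_steady_state, inject_failure, rollback):
    """Return 'weakness-found' if the steady state breaks under failure."""
    if not probe_steady_state():
        return "aborted"            # system unhealthy before we start
    inject_failure()
    try:
        healthy = probe_steady_state()
    finally:
        rollback()                  # always restore the system
    return "passed" if healthy else "weakness-found"

# Toy target: a "service" with two replicas; losing one should be tolerated.
replicas = {"a": True, "b": True}
result = run_chaos_experiment(
    probe_steady_state=lambda: any(replicas.values()),
    inject_failure=lambda: replicas.update(a=False),
    rollback=lambda: replicas.update(a=True),
)
```

Run continuously as a chaos test, a function like this is “green” when it returns `"passed"`, confirming that a previously found weakness has not crept back in.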
Strategy 7: Use Managed DNS Functionality to
Limit Endpoint Exposure and Network Volatility
DNS services have matured over the past 30 years from a simple lookup service into powerful cloud-hosted services. Managed DNS infrastructure and services let you better manage internet volatility and routing issues at the edge, before they affect your content or service endpoints.
Managed DNS refers to a service from a provider that allows users to manage their DNS traffic via specified protocols for cases such as failover, dynamic IP addresses, geographically targeted DNS, and more. To illustrate this approach, consider an application that must remain available across several large cities around the globe. In this example, network traffic can be heavy, with ever-increasing user demand.
“Data sovereignty comes into play when an organization’s data is stored outside of their country and is subject to the laws of the country in which the data resides. The main concern with data sovereignty is maintaining privacy regulations and keeping foreign countries from being able to subpoena data.”2
In some cases, you might need to create an environment in which
certain content is served up based on a specific region. When you
need to apply certain region-based security requirements and data
sovereignty regulations, you can use DNS to keep users and their
data within a defined region versus routing to all available sites
across the world.
The concept of geolocation deserves a more specific example. If your company operates an online retail business in Germany, and the German government requires that all users’ personal data stay within Germany, you can configure DNS to meet that requirement and ensure that all queries originating in that country stay in that country. You can accomplish this by intelligently routing these requests to endpoints within Germany in a way that prevents routing through another country and back again. This matters not only for complying with regulations, but also for targeting the most efficient pathways for a given area’s service providers.

2 R&G Technologies, “Data Sovereignty: What it is and why it matters”.
In this example, you are using DNS-based geolocation load balancing at the user edge to route internet traffic based on policy that you design, both to reduce the risk of internet volatility along the way and to ensure that every user across the globe receives the optimal experience.
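As a rough illustration of such a policy, the decision logic amounts to a country-to-pool lookup. The endpoint names below are hypothetical; a real managed DNS provider expresses this through its own geolocation configuration rather than application code:

```python
# Hypothetical geolocation routing policy: queries from Germany resolve
# only to in-country endpoints; other recognized countries get their own
# pool; everyone else falls back to a global default.

GEO_POOLS = {
    "DE": ["fra1.example.com", "muc1.example.com"],   # stays in Germany
    "US": ["nyc1.example.com", "sfo1.example.com"],
}
DEFAULT_POOL = ["global.example.com"]

def resolve_endpoint(client_country):
    """Pick the endpoint pool for a client's country of origin."""
    return GEO_POOLS.get(client_country, DEFAULT_POOL)
```

Because German queries can only ever see German pool members, personal data never has a reason to transit endpoints outside the country.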
No strategy is a complete solution for the challenges facing your network. The range of threats and the complexity of technology continue to grow at a pace that is daunting for even the most proactive business to keep up with. And it can be impractical to retrofit these strategies onto a network that wasn’t designed for resiliency from the beginning.
But even without the opportunity to start over and design a more resilient network, the approaches outlined in this chapter are valuable as a starting point. They can serve as the genesis for crucial conversations with colleagues or management about the exposures your business faces or soon will. Questions you should be asking yourself include: What kind of experience will our users have if our cloud provider goes down? Would we benefit from a hybrid cloud (or multicloud) strategy versus what we have now? Would adding a managed DNS service buy us anything?
Eventually, another question will likely force its way to the front of your thoughts: What can we do right now to start protecting our networks? Realistically, of all the strategies outlined in this chapter, the most comprehensive answer is the near-term adoption of a managed DNS service. You can adapt a managed DNS service to an existing network and immediately begin shielding critical entry points from attacks, service degradation, and other points of failure.
Even though a managed DNS service might not be the right answer for every business, its potential value as a solution you can implement in the near term is worth understanding. This is especially true for anyone who can imagine the potential near-term business disruptions from unseen threats at the edge. In Chapter 4, we further explore what a managed DNS service is and what you should know as you consider it as a potential solution.
Managed DNS Services
A managed DNS solution might represent the best near-term strategy for protecting a business from the challenges that face today’s networks at the edge. In this chapter, we dive deeper into the benefits it can provide.

At a high level, we can define a managed DNS service as a service sourced through a specialized DNS service provider that enables users not only to manage DNS traffic, but also to access advanced features, including active failover, load balancing, dynamic IP addresses, and geographically targeted DNS.

Each managed DNS service provider brings its own value proposition to users looking for such services. We’ll explore some typical provider services later in this chapter. But first, it’s worth discussing how you can use traditional DNS services as a foundation for understanding what a managed DNS provider can offer.
Benefits of DNS
Some of the many uses to which organizations apply their DNS service include the following:
• You can use geolocation load balancing for performance optimization, routing each request to the server closest either to the user edge or to the endpoint.
• Ratio load balancing allows for a gradual transition to the cloud; you can migrate some traffic to new cloud-hosted environments to test and validate access, then slowly move more traffic when ready.
• Active failover allows you to establish a second endpoint, or multiple alternate endpoints, to which the first can fail over to ensure availability and a healthy connection path.
• Containers can be published to multiple clouds, but are the clouds themselves load balanced? DNS traffic steering enables users to keep containers highly available, load balanced, and redundant.
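Ratio load balancing, for instance, boils down to weighted random selection among endpoints. This Python sketch simulates a 90/10 split during a migration; the hostnames and weights are illustrative, and a real implementation lives inside the DNS provider’s answer logic rather than your application:

```python
# Illustrative ratio load balancing for a gradual cloud migration:
# about 10% of answers point at the new cloud endpoint, 90% at legacy.
import random

WEIGHTS = {"legacy.example.com": 90, "cloud.example.com": 10}

def pick_endpoint(rng=random):
    """Choose one endpoint, biased by the configured weights."""
    endpoints = list(WEIGHTS)
    return rng.choices(endpoints, weights=[WEIGHTS[e] for e in endpoints])[0]

# Over many simulated queries the split approximates the 90/10 ratio.
random.seed(7)
counts = {e: 0 for e in WEIGHTS}
for _ in range(10_000):
    counts[pick_endpoint()] += 1
```

Nudging the weights over successive change windows (80/20, 50/50, 0/100) gives you the “slowly move more traffic” behavior described above, with an instant rollback path if the new environment misbehaves.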
In the sections that follow, we look a little more closely at three
areas: performance, availability, and security. These benefits apply
whether the DNS is delivered from a managed DNS service provider
or bundled with services from a CDN, a local ISP, a web application
firewall (WAF) provider, or even your own data center staff.
Performance
Internet traffic is at an all-time high and shows no signs of slowing down. Correspondingly, network infrastructure in most companies is struggling to keep up. In any number of scenarios today, servers (whether physical or virtual) can become overloaded. Any time a server is at or near capacity, the direct negative effect can ripple throughout the network. By taking advantage of the routing capabilities of DNS, traffic and requests can be routed to alternative systems not experiencing as much load. To make this truly effective, though, there must be a predetermined plan that can be put into action quickly (preferably automated).
Another, less obvious, benefit is the ability to direct traffic to test systems for performance testing. Using DNS infrastructure, developers can exercise test environments in real time. To move traffic from test to production, developers can change time-to-live (TTL) settings, redirecting traffic to the chosen location.
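The TTL maneuver can be sketched as an ordered change plan. Everything below (record names, step values) is illustrative, not a specific provider’s API:

```python
# Hypothetical TTL-driven cutover plan: shrink the record's TTL ahead of
# the switch so resolver caches drain quickly, repoint the record, then
# restore a longer TTL once the move looks stable.

def cutover_plan(record, old_target, new_target,
                 normal_ttl=3600, cutover_ttl=60):
    """Return the ordered DNS changes for a low-risk traffic move."""
    return [
        # Step 1: lower TTL well before the switch (caches expire fast).
        {"record": record, "target": old_target, "ttl": cutover_ttl},
        # Step 2: repoint the record; stale answers age out within 60 s.
        {"record": record, "target": new_target, "ttl": cutover_ttl},
        # Step 3: once stable, raise the TTL again to reduce query load.
        {"record": record, "target": new_target, "ttl": normal_ttl},
    ]

plan = cutover_plan("app.example.com",
                    "test-lb.example.com", "prod-lb.example.com")
```

The low-TTL window is what makes the redirect feel “real time”: resolvers worldwide pick up the new target within seconds rather than hours.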
On a related point, using decentralized DNS nameservers to resolve your queries inherently reduces latency and maintains a smooth user experience. It can also eliminate the need to troubleshoot unidentified performance issues with your telco provider.
Availability
Outages can happen at any time. If your business has multiple datacenters, or uses a service with multiple datacenters, DNS can divert traffic from the outage area to another location and keep your customers in business. Where practical, you can also do this on a smaller scale with other failover/clustering technologies.
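A minimal sketch of the failover decision looks like the following. The hostnames are hypothetical, and the health check is simulated; a real implementation would probe each endpoint over HTTP or TCP:

```python
# Minimal active-failover sketch: answer with the first endpoint whose
# health check passes, walking down an ordered preference list.

ENDPOINTS = ["dc1.example.com", "dc2.example.com", "dr.example.com"]

def select_healthy(endpoints, is_healthy):
    """Return the first healthy endpoint, or None if all checks fail."""
    for ep in endpoints:
        if is_healthy(ep):
            return ep
    return None

# Simulate an outage at the primary datacenter.
down = {"dc1.example.com"}
active = select_healthy(ENDPOINTS, lambda ep: ep not in down)
```

Because the decision happens at resolution time, clients are steered away from the failed datacenter on their very next lookup, with no client-side changes.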
You can also use DNS to route traffic away from your legacy datacenter during maintenance windows. The ability to take control at the DNS layer and reroute traffic quickly and transparently provides a key advantage in continued availability to your customers.
Security
Increased levels of security threats occur at the DNS level every day. They arise from DDoS attacks, malicious bots, malware, and other application vulnerabilities that propagate via the network’s back roads. Access via DNS must be guarded, but DNS is also the first place where threat mitigation can begin. Often, DNS amplification or reflection attacks are prime suspects during a DDoS event.
DNS has a significant role to play in providing edge resilience, protection, and stability. It sits close to the edge, and we can use it to direct traffic where we need it to go, transparently to the customer. Think about how traditional failover occurs at the network or server layer. An intelligent DNS network that can steer traffic at the apex of a domain is faster than routing to an end node and only then making a decision. This matters for reducing latency and quickly establishing a new session should a failure occur, so that you can keep serving your clients.
Note that none of the scenarios we have discussed requires a managed DNS service, just DNS. With enough staff, planning, and prioritized response, it is possible to use DNS as a shield without additional tooling. Beyond that, it is also possible to automate some aspects of a DNS strategy in-house, assuming resources are available.
DNS Anycast Networks
Anycast is a one-to-many network routing scheme in which a destination address has multiple routing paths to a variety of endpoints (at least two). Anycast DNS routing allows traffic to be distributed to multiple datacenters, providing global active-active load balancing.
A DNS anycast network offers several benefits. With a DNS anycast network, you can route requests to the closest point of presence (PoP) for the best response; take advantage of the one-to-many relationship between IP addresses and their associated nameservers; distribute traffic from a single IP address to different nameservers based on the origin of the request; and add multiple telco providers to these nameservers, adding another level of redundancy at these PoPs. Why does all this matter?
By routing requests to the closest nameserver, resolution time is greatly reduced and users experience improved overall performance. This effect is magnified for websites that require multiple DNS lookups for additional files and assets that must load before a page completes. Web apps must resolve their various components successfully at the user edge, and this is where DNS can potentially make or break the online experience. Some organizations believe their cloud provider or telco speeds and feeds drive performance, and this is true, but DNS and CDN networks also play a significant role.
An intelligent edge positions your organization to deliver continued
optimal responses to:
• Planned interruptions resulting from routine maintenance or a
switch to new cloud services
• Unplanned outages due to inclement weather, power failures, or
faulty fiberoptic lines
• Redundancy based on having multiple anycast PoPs or multiple
transit providers per PoP