Amazon CloudFront

Overview

Amazon CloudFront is a content delivery network (CDN) service from AWS. It offers an easy, cost-effective way to distribute content with low latency and high data transfer rates.

Many websites and web applications use a CDN to serve their content and to speed up uploads and downloads.

Why do we need CloudFront?

We need CloudFront for low latency and fast data retrieval. Latency is the time between the client requesting content from the server and the server delivering that content to the client. The lower the latency, the higher the performance.

  • Suppose the web server is deployed in an American region, and clients in that region see the lowest latency, say 2 ms. When a client in Asia or any other region accesses the same web application, latency is higher, and it will never match or come close to what the American client sees.

  • In the same scenario, the load on the server rises rapidly when many users send requests at the same time.

  • Auto scaling can absorb that extra load, but it does nothing for latency.

  • We could deploy the web application in multiple regions, which may solve the latency issue, but it increases cost and creates data replication problems.

    Each region would need its own database holding the same data, and keeping those databases synchronized is a challenge.

How Does CloudFront Work?

In CloudFront, the deployed application is called the origin, and each client connects to its nearest edge location. Every request and response passes through an edge location between the client and the server.

A CloudFront (CDN) edge location acts as an intermediary between the client and the server. On a cache miss, an edge location can fetch content from a Regional Edge Cache, which in turn fetches it from the origin server when needed. Some requests go from the edge location directly to the origin.

  • Say a client in Asia wants to access the server (example.com). Because the site uses CloudFront, DNS routes the request to the nearest edge location instead of the origin.

  • The edge location looks for the requested content in its cache. If the content is there, it is delivered to the client.

  • If the edge location doesn't find the content in its cache, it fetches it from the origin server, delivers it to the client, and stores it in its cache, so that later clients requesting the same content get it with low latency instead of waiting on the origin.

  • As a result, the first client to request content that the edge location hasn't cached sees higher latency on that very first request.

  • On that first request, the edge location delivers the data to the client and stores it in its cache at the same time.

  • A CDN brings advantages such as reduced load on the origin server, fast data retrieval, and the ability to restrict access.

  • CloudFront has more than 400 edge locations and 13 regional edge caches.

Note: The nearest edge location is chosen by latency, not by geographic distance, i.e., it is the location that can deliver the content fastest.

Where edge locations are placed depends on the number of users in an area.
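The hit/miss flow described above can be sketched as a toy model. This is only an illustration, not how CloudFront is implemented: the `EdgeLocation` class, the `origin` function, and the `edge-asia` name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EdgeLocation:
    """Toy edge location: a cache sitting in front of an origin."""
    name: str
    fetch_from_origin: Callable[[str], bytes]  # hypothetical origin fetcher
    cache: Dict[str, bytes] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def get(self, path: str) -> bytes:
        if path in self.cache:
            # Cache hit: serve from the edge, the origin is not contacted.
            self.hits += 1
            return self.cache[path]
        # Cache miss: fetch from the origin, store a copy, then serve it.
        self.misses += 1
        body = self.fetch_from_origin(path)
        self.cache[path] = body
        return body


origin_requests: List[str] = []


def origin(path: str) -> bytes:
    """Stand-in origin server that records every request it receives."""
    origin_requests.append(path)
    return f"content of {path}".encode()


edge = EdgeLocation(name="edge-asia", fetch_from_origin=origin)
first = edge.get("/index.html")   # miss: goes to the origin
second = edge.get("/index.html")  # hit: served from the edge cache
print(edge.hits, edge.misses, len(origin_requests))  # prints "1 1 1"
```

Only the first request reaches the origin, which is why the first client sees higher latency and every later client benefits.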

What is CloudFront?

  • CloudFront is a global service.

  • Amazon CloudFront is a web service that speeds up the distribution of your static and dynamic web content, such as HTML, CSS, and image files, to users.

  • When a user requests content that you're serving with CloudFront, the user is routed to the edge location that provides the lowest latency, so that the content is delivered with the best possible performance.

  • If the content is already cached at that edge location, CloudFront delivers it immediately.

  • If not, CloudFront retrieves it from the Amazon S3 bucket or HTTP/web server that you have identified as the source for the definitive version of your content (the origin server).

  • Serving from the edge dramatically reduces the number of networks that a user's request must pass through, which improves performance.

  • CloudFront also keeps persistent connections with the origin server, so files are fetched from the origin as quickly as possible.

  • You can access Amazon CloudFront in the following ways:

    1. AWS Management Console

    2. AWS SDK

    3. CloudFront API

    4. AWS Command Line Interface
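As an example of the SDK route, here is a minimal sketch that lists distributions with the AWS SDK for Python (boto3). The helper name `list_distribution_domains` is an assumption made for illustration. The function takes the client as a parameter, so the block itself has no dependency on boto3 or on AWS credentials.

```python
from typing import Any, Dict


def list_distribution_domains(client: Any) -> Dict[str, str]:
    """Map each distribution ID to its domain name, following pagination.

    `client` is expected to behave like boto3.client("cloudfront").
    """
    domains: Dict[str, str] = {}
    paginator = client.get_paginator("list_distributions")
    for page in paginator.paginate():
        # "Items" is absent when a page contains no distributions.
        for item in page["DistributionList"].get("Items", []):
            domains[item["Id"]] = item["DomainName"]
    return domains


# Typical use (requires boto3 and configured AWS credentials):
#   import boto3
#   print(list_distribution_domains(boto3.client("cloudfront")))
```

The roughly equivalent CLI call is `aws cloudfront list-distributions`.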

CloudFront Edge Locations

  • An edge location acts as a cache for a specific area and delivers content to the clients that request it from that area.

  • Edge locations are not tied to Availability Zones or Regions.

  • Amazon CloudFront has more than 400 edge locations and 13 regional edge caches.

CloudFront Regional Edge Cache

  • Amazon CloudFront has added several regional edge cache locations globally, in proximity to your viewers.

  • They sit between your origin web server and the global edge locations that serve content directly to your viewers.

  • As objects become less popular, individual edge locations may remove them to make room for more popular content.

  • A regional edge cache acts as a second cache layer in front of the origin, which reduces the load on the origin.

  • A regional edge cache has a larger cache than any individual edge location, so objects remain in the cache longer at the nearest regional edge cache.

How Regional Edge Cache Works

  • When a viewer makes a request on your website or through your application, DNS routes the request to the CloudFront edge location that can best serve it.

  • This location is typically the nearest CloudFront edge location in terms of latency.

  • In the edge location, CloudFront checks its cache for the requested files.

  • If the files are in the cache, CloudFront returns them to the user.

  • If the files are not in the cache, the edge servers go to the nearest regional edge cache to fetch the object.

  • If the regional edge cache doesn't have the object either, the request goes on to the origin, and the response is cached at the regional edge cache and at the edge location on its way back to the viewer.

  • An object that an edge location has evicted is therefore often still available in the Regional Edge Cache, because objects are cached there as they pass through from the origin.

  • The Regional Edge Cache holds data longer than the edge location does, i.e., more than 24 hours.

  • Regional edge caches have feature parity with edge locations. For example, a cache invalidation request removes an object from both the edge caches and the regional edge caches before it expires. The next time a viewer requests the object, CloudFront returns to the origin server to fetch the latest version of the object.

  • Proxy HTTP methods (PUT, POST, PATCH, OPTIONS, and DELETE) go directly to the origin server from the edge location and do not pass through the regional edge cache.

  • Dynamic content, as determined at request time, doesn't flow through the regional edge cache but goes directly to the origin.
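The routing rules in this section can be summarized in a small decision function. This is a simplified sketch, not CloudFront's actual logic: `request_path` and its parameters are hypothetical names, and the cache state is passed in as flags instead of being looked up.

```python
from typing import List

CACHEABLE_METHODS = {"GET", "HEAD"}
PROXY_METHODS = {"PUT", "POST", "PATCH", "OPTIONS", "DELETE"}


def request_path(method: str,
                 dynamic: bool = False,
                 in_edge_cache: bool = False,
                 in_regional_cache: bool = False) -> List[str]:
    """Return the hops a request takes, starting at the edge location."""
    method = method.upper()
    if method in PROXY_METHODS or dynamic:
        # Proxy methods and dynamic content skip the regional edge cache.
        return ["edge", "origin"]
    if method not in CACHEABLE_METHODS:
        raise ValueError(f"unsupported method: {method}")
    if in_edge_cache:
        return ["edge"]
    if in_regional_cache:
        return ["edge", "regional edge cache"]
    return ["edge", "regional edge cache", "origin"]


print(request_path("GET"))                          # full path on a cold cache
print(request_path("GET", in_regional_cache=True))  # served by the regional cache
print(request_path("POST"))                         # straight to the origin
```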

Advantages

  • Improves read performance, because content is cached at the edge

  • 400+ Points of Presence globally (edge locations)

  • DDoS protection, with integration with AWS Shield and AWS Web Application Firewall (WAF)

  • Can expose HTTPS externally and talk to internal backends over HTTPS

CloudFront Origins

S3 bucket:

    • For distributing files and caching them at the edge

    • Enhanced security with CloudFront Origin Access Control (OAC), which replaces the legacy Origin Access Identity (OAI)

    • CloudFront can be used as an ingress (to upload files to S3)

Custom Origin (HTTP):

    • Application Load Balancer

    • EC2 instance

    • S3 website (must first enable the bucket as a static S3 website)

    • Any HTTP backend you want

CloudFront Caching

  • Cache based on

    • Headers

    • Session Cookies

    • Query String Parameters

  • The cache lives at each CloudFront Edge Location

  • You want to maximize the cache hit rate to minimize requests on the origin

  • Control the TTL (0 seconds to 1 year), which the origin can set using the Cache-Control and Expires headers

  • You can invalidate part of the cache using the CreateInvalidation API
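To make the two ideas above concrete, here is a simplified sketch of how a cache key and a TTL might be derived. It is a toy model under stated assumptions: `cache_key` and `effective_ttl` are hypothetical helpers, the TTL defaults (0 seconds minimum, 24 hours default, 1 year maximum) mirror CloudFront's documented defaults, and the Expires header is not handled.

```python
from typing import Dict, Iterable, Optional, Tuple


def cache_key(path: str,
              query: Dict[str, str],
              headers: Dict[str, str],
              cookies: Dict[str, str],
              key_query: Iterable[str] = (),
              key_headers: Iterable[str] = (),
              key_cookies: Iterable[str] = ()) -> Tuple:
    """Build a key from the path plus only the selected request values."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names ignore case
    return (
        path,
        tuple((n, query.get(n)) for n in sorted(key_query)),
        tuple((n.lower(), lowered.get(n.lower())) for n in sorted(key_headers)),
        tuple((n, cookies.get(n)) for n in sorted(key_cookies)),
    )


def effective_ttl(cache_control: Optional[str],
                  min_ttl: int = 0,
                  default_ttl: int = 86400,
                  max_ttl: int = 31536000) -> int:
    """Seconds to cache an object, given the origin's Cache-Control header."""
    if not cache_control:
        return default_ttl
    directives: Dict[str, str] = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value.strip('"')
    if any(d in directives for d in ("no-cache", "no-store", "private")):
        return min_ttl
    for name in ("s-maxage", "max-age"):  # s-maxage wins for shared caches
        if directives.get(name, "").isdigit():
            return max(min_ttl, min(int(directives[name]), max_ttl))
    return default_ttl


print(effective_ttl("public, max-age=60"))  # prints 60
```

The fewer values go into the cache key, the more requests map to the same cached object, which is what raises the cache hit rate.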

CloudFront Geo Restriction

  • You can restrict who can access your distribution

  • Whitelist (allow list): Allow your users to access your content only if they're in one of the countries on a list of approved countries.

  • Blacklist (block list): Prevent your users from accessing your content if they're in one of the countries on a list of banned countries.

  • The "country" is determined using a 3rd party Geo-IP database

  • Use case: controlling access to content to comply with copyright laws
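The allow/block decision can be sketched as follows. The function name `is_viewer_allowed` is hypothetical, the `"whitelist"`, `"blacklist"`, and `"none"` values follow the restriction types used in the CloudFront distribution configuration, and the Geo-IP lookup that produces the country code is assumed to have happened already.

```python
from typing import Iterable


def is_viewer_allowed(country: str,
                      restriction_type: str,
                      countries: Iterable[str]) -> bool:
    """Decide whether a viewer from `country` (two-letter code) may access content."""
    listed = country.upper() in {c.upper() for c in countries}
    if restriction_type == "whitelist":
        return listed          # only listed countries get through
    if restriction_type == "blacklist":
        return not listed      # everyone except listed countries gets through
    if restriction_type == "none":
        return True
    raise ValueError(f"unknown restriction type: {restriction_type}")


print(is_viewer_allowed("IN", "whitelist", ["US", "IN"]))  # prints True
print(is_viewer_allowed("IN", "blacklist", ["US", "IN"]))  # prints False
```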

CloudFront Signed URL / Signed Cookies

  • You want to distribute paid shared content to premium users all over the world

  • To restrict viewer access, we can create a CloudFront Signed URL / Cookie

  • How long should the URL be valid?

    • Shared content (movie, music): make it short (a few minutes)

    • Private content (private to the user): you can make it last for years

  • Signed URL = access to individual files (one signed URL per file)

  • Signed Cookies = access to multiple files (one signed cookie for many files)

CloudFront Signed URL Process

Two types of signers:

  • A trusted key group (recommended)

    • Can leverage APIs to create and rotate keys (and IAM for API security)

  • An AWS account that contains a CloudFront key pair

    • Need to manage keys using the root account and the AWS console

    • Not recommended because you shouldn't use the root account for this

  • In your CloudFront distribution, create one or more trusted key groups

  • You generate your own public/private key pair

    • The private key is used by your applications (e.g. EC2) to sign URLs

    • The public key (uploaded) is used by CloudFront to verify URLs
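The signing step can be sketched as below for a canned policy, where the URL is valid until a fixed expiry time. This is a minimal illustration under assumptions: the helper names are hypothetical, and `sign` stands in for an RSA SHA-1 signature made with your private key, which the Python standard library does not provide.

```python
import base64
import json
from typing import Callable


def cloudfront_b64(data: bytes) -> str:
    """Base64-encode, then swap the characters that are not URL-safe."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def canned_policy(url: str, expires_epoch: int) -> str:
    """Policy statement with no whitespace: valid until `expires_epoch`."""
    policy = {"Statement": [{
        "Resource": url,
        "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
    }]}
    return json.dumps(policy, separators=(",", ":"))


def signed_url(url: str,
               expires_epoch: int,
               key_pair_id: str,
               sign: Callable[[bytes], bytes]) -> str:
    """Append Expires, Signature, and Key-Pair-Id to the URL.

    `sign` is a placeholder for signing with the private key;
    `key_pair_id` is the ID of the matching public key in CloudFront.
    """
    signature = sign(canned_policy(url, expires_epoch).encode("utf-8"))
    separator = "&" if "?" in url else "?"
    return (f"{url}{separator}Expires={expires_epoch}"
            f"&Signature={cloudfront_b64(signature)}"
            f"&Key-Pair-Id={key_pair_id}")
```

In practice you would probably not hand-roll this: botocore ships a `CloudFrontSigner` class that takes a key ID and a signing callable and produces the URL for you.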

CloudFront – Field-Level Encryption

  • Protect user-sensitive information through the application stack

  • Adds a layer of security along with HTTPS

  • Sensitive information is encrypted at the edge close to the user

  • Uses asymmetric encryption

Usage:

  • Specify a set of fields in POST requests that you want to be encrypted (up to 10 fields)

  • Specify the public key to encrypt them
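Conceptually, the edge rewrites the selected POST fields before forwarding the request. The sketch below only illustrates that idea: `encrypt_fields` is a hypothetical helper, and the `encrypt` callable stands in for the asymmetric (public key) encryption that CloudFront performs, which is not reproduced here.

```python
from typing import Callable, Iterable
from urllib.parse import parse_qsl, urlencode

MAX_FIELDS = 10  # limit on the number of fields, as stated above


def encrypt_fields(body: str,
                   fields: Iterable[str],
                   encrypt: Callable[[str], str]) -> str:
    """Encrypt only the named fields of a URL-encoded POST body."""
    targets = set(fields)
    if len(targets) > MAX_FIELDS:
        raise ValueError(f"at most {MAX_FIELDS} fields can be encrypted")
    pairs = parse_qsl(body, keep_blank_values=True)
    # Fields that are not selected pass through unchanged.
    return urlencode([(k, encrypt(v) if k in targets else v) for k, v in pairs])


def fake_encrypt(value: str) -> str:
    """Placeholder only: NOT encryption, just marks the value as transformed."""
    return "enc-" + value[::-1]


print(encrypt_fields("name=Ann&card=4111", ["card"], fake_encrypt))
# prints "name=Ann&card=enc-1114"
```

Because only the holder of the private key can decrypt those fields, the rest of the application stack handles them without ever seeing the plain values.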

CloudFront - Pricing

  • CloudFront Edge locations are all around the world

  • The cost of data out per edge location varies

  • You can reduce the number of edge locations used to cut costs

  • Invalidation requests: no additional charge for the first 1,000 paths requested for invalidation each month. Thereafter, $0.005 per path requested for invalidation.

  • CloudFront has three price classes:

    1. Price Class All: all regions – the best performance

    2. Price Class 200: most regions, but excludes the most expensive regions

    3. Price Class 100: only the least expensive regions

Refer to the official Amazon CloudFront pricing page for more about CloudFront pricing.
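The invalidation charge described above is simple arithmetic, sketched here with the figures quoted in this article (1,000 free paths per month, then $0.005 per path). The function name is hypothetical, and the prices are assumptions that may change, so check the pricing page before relying on them.

```python
from decimal import Decimal

FREE_PATHS_PER_MONTH = 1000
PRICE_PER_EXTRA_PATH = Decimal("0.005")  # USD, figure quoted in this article


def invalidation_cost(paths_this_month: int) -> Decimal:
    """Monthly invalidation charge in USD for the given number of paths."""
    if paths_this_month < 0:
        raise ValueError("path count cannot be negative")
    billable = max(0, paths_this_month - FREE_PATHS_PER_MONTH)
    return billable * PRICE_PER_EXTRA_PATH


# 1,500 paths: 500 are billable, so 500 x 0.005 = 2.50 USD
print(invalidation_cost(1500))
```

`Decimal` is used instead of `float` so that small amounts such as 0.005 are not distorted by binary rounding.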
