JWT – Token Based Authentication



In my earlier post on Cryptography, we looked at some of the cryptographic techniques and functions that are commonly used to secure data.

In this post, we’ll discuss JSON Web Token (JWT), one of the most commonly used token-based authentication mechanisms. It has become quite popular since it allows distributed systems to securely communicate with each other without needing to share the user id and password. In a microservices architecture, it is important that the APIs are able to authenticate and authorize the end-user identity seamlessly, in a way that is both secure and scalable. JWT allows the APIs to share the identity of the principal and verify it to establish trust before sending the response back.

Authentication Token Types

Token by Reference – A reference-based token contains no information except a random string that is meaningful only to the authentication server or the entity that generated the token. The token information, including the identity of the principal, is known only to the authentication server or the entity that generated the token.

Token by Value – A value-based token contains all the information in the token itself. Anyone can read the information contained inside the token. JWT falls under this category since it carries all the information pertaining to the token, as discussed later in the post.

JSON Web Token

JWT is a compact and URL-safe means of representing claims to be transferred between two parties. JWT is self-contained since all the necessary information is carried in the payload within the token itself. JWTs are widely used for securing REST APIs in a microservices architecture. A JWT can also be used as the identity token in OpenID Connect, as the access token in OAuth, and as a session id stored in a cookie after the user is authenticated.

A JWT can be either a JSON Web Signature (JWS) or a JSON Web Encryption (JWE). When the JWT is signed using either a symmetric key or an asymmetric key, it is known as a JWS; when the payload is encrypted, it is a JWE.

JWT/JWS Token Structure

A JWT consists of 3 sections – header, payload and signature – that are base64URL encoded and separated by dots.



Header

The JWT header is a JSON object that usually contains 2 attributes and provides information about how the JWT signature should be computed.


typ – Identifies the type of object i.e. JWT

alg – Identifies the hashing algorithm (HS256, RS256 etc.) used to sign the token


Payload

The payload is a JSON object containing information about the token, i.e. the claims. Some of the claims are predefined by the JWT standard; custom claims which are not part of the standard can also be added. Even though anyone can read the claims, the payload cannot be tampered with since the JWT becomes invalid if any of the claims are modified.

Below are some of the standard claims that identify the entity which issued the token, subject of the token, TTL (Time To Live) and the audience that identifies the group of entities for which the token is valid.

  • iss (Issuer) – Identifies the principal that issued the token
  • sub (Subject) – Identifies the principal that is the subject of the token. It is usually the entity for which claim statements are being made.
  • aud (Audience) – Identifies the recipients (audience) that the JWT is intended for. JWT is rejected if the principal processing the claim does not identify itself with the value in the audience when the claim is presented.
  • exp (Expiration Time) – Identifies the expiration time for the token
  • nbf (Not Before) – Identifies the time before which the token is invalid for processing
  • iat (Issued At Time) – Identifies the time at which the token was issued
  • jti (JWT Id) – Unique Identifier that can be used to prevent the JWT from being replayed
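To make the claim structure concrete, here is a short Python sketch (standard library only) that base64url-decodes the payload of a token. The token segments are built inline from hypothetical claim values, not issued by a real server:

```python
import base64
import json

def decode_segment(segment: str) -> dict:
    """Base64url-decode one JWT segment, restoring the stripped '=' padding."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

# Hypothetical header and payload segments (signature omitted for brevity)
header = base64.urlsafe_b64encode(
    json.dumps({"typ": "JWT", "alg": "HS256"}).encode()).rstrip(b"=").decode()
payload = base64.urlsafe_b64encode(
    json.dumps({"iss": "auth-server", "sub": "user-42", "exp": 1700000000}).encode()
).rstrip(b"=").decode()

claims = decode_segment(payload)
print(claims["iss"], claims["sub"])
```

Note that decoding requires no key at all, which is exactly why the payload offers no confidentiality.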


Signature

The JWT is secured by using a digital signature or by generating a Message Authentication Code (MAC). An asymmetric algorithm (e.g. RS256) signs the token using the private key of the entity generating the token. A symmetric algorithm (e.g. HS256), on the other hand, uses a shared secret key known to both parties, i.e. the token generator and the token validator.

Signature algorithm – HS256

key= "Shared Secret Key"

unsignedToken = base64URLEncode(header) + "." + base64URLEncode(payload)

signature = HMAC-SHA256(unsignedToken, key)
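The HS256 steps above can be written as runnable Python using only the standard library; the key and claims below are placeholder values:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without the trailing '=' padding, per the JWT format."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def hs256_jwt(header: dict, payload: dict, key: bytes) -> str:
    """Build header.payload, then append the HMAC-SHA256 signature."""
    unsigned = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(payload).encode())
    signature = hmac.new(key, unsigned.encode(), hashlib.sha256).digest()
    return unsigned + "." + b64url(signature)

token = hs256_jwt({"typ": "JWT", "alg": "HS256"},
                  {"sub": "user-42"}, b"shared-secret-key")
print(token)  # three dot-separated base64url sections
```

The validator recomputes the same HMAC over the first two sections with the shared key and compares it to the third section.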

Signature algorithm – RS256

key= "Private Key"

unsignedToken = base64URLEncode(header) + "." + base64URLEncode(payload)

signature = RSA-SHA256(unsignedToken, key)

The diagram below shows how the symmetric and asymmetric algorithms can be used together to authenticate an API consumer to an API provider via an authentication server. The API consumer (microservice A) generates an HS256 JWT using its own shared secret key and sends it to the authentication server. The authentication server validates the HS256 JWT using the shared secret key of microservice A. If the token verification is successful, the authentication server generates an RS256 JWT using its own private key and sends it back to microservice A. Microservice A then sends its request to microservice B along with the RS256 JWT it got from the authentication server. Microservice B can authenticate the request from microservice A by validating the RS256 JWT using the public key of the authentication server. Since microservice A has already established trust with the authentication server, microservice B can trust the request from microservice A if the RS256 token is successfully validated.


JWT does not encrypt the data and hence does not offer confidentiality. Instead, it provides a standards-based (RFC 7519) and scalable solution for authenticating requests by encoding and signing the data in the token.

AWS – Relational Database Service



Amazon Relational Database Service (RDS) is a fully managed and cost-efficient database service that makes it easy to provision, manage, and scale a relational database in the cloud. Amazon RDS provides an option to choose from 6 available relational database engines –

  • Commercial
    • Oracle
    • Microsoft SQL Server
  • Open Source
    • MySQL
    • PostgreSQL
    • MariaDB
  • Cloud Native
    • Aurora

Benefits of using the RDS service –

  • No Infrastructure Management
  • Cost Effective – Pay as you go model
  • Instant Provisioning – Database up and running in a few minutes
  • Ease of Scaling up or down – Resize the database instance and storage to get an appropriate configuration of CPU and memory
  • Compatibility with existing applications

When compared to DynamoDB, which offers push-button scaling without any downtime, RDS requires a bigger instance type or read replicas to scale the database for larger workloads.

RDS Backups

Automated Backups

AWS RDS creates automated backups of the database instance during the backup window selected for the primary instance. If no backup window is selected, a default window of 30 minutes is automatically assigned to the DB instance. The backup is taken from the stand-by instance if the RDS service is running in multi-AZ. Automated backup allows you to recover the database to any point in time during the retention period. Automated backups are retained according to the retention period specified for the DB instance. During automated backups, storage I/O may be suspended briefly, resulting in elevated latency for a few minutes.

The retention period can be set between 1 and 35 days, with a default value of 1 day when the DB instance is created using the AWS CLI or RDS REST APIs, or 7 days when it is created using the AWS console. Automated backup can be disabled by setting the retention period to 0 days. There is no limit on the number of automated snapshots that can be taken in a given region.

Automated backup takes a full daily snapshot of the DB instance and also stores the transaction logs throughout the day. During database recovery, RDS selects the appropriate full daily snapshot captured during the retention period and applies the transaction logs in order to recover the database to a point in time, up to a second. DB restoration creates a new RDS instance with a different endpoint than the original RDS instance from which it was restored.

Automated backups are deleted when the database instance is deleted, after which the backups cannot be recovered. Automated backup allows you to recover the database only in the same region as the database instance.

Manual Snapshots

Manual snapshots are user-initiated and hence do not get deleted when the DB instance is deleted. They enable you to recover the entire database from a single snapshot. Manual snapshots are useful for retention periods longer than the 35-day limit of automated snapshots.

Manual snapshots can be used for cross-region backup by copying the manual snapshot to a different region and then restoring the database. There is a limit of 100 manual snapshots per region.

RDS Features

High Availability

RDS provides high availability by allowing DB instances to be created in multiple availability zones. Multi-AZ allows you to create an exact copy of the database in a different availability zone than the primary DB instance. Any data written to the primary DB instance is automatically replicated to the stand-by instance in a different availability zone within the same region.

The replication is synchronous and hence the database can be failed over to the stand-by instance automatically without any manual intervention. During automated fail-over, the stand-by instance is promoted to be the primary instance. This is done by updating the DNS CNAME record of the DB endpoint of the master to the endpoint of the stand-by instance. This DNS fail-over can take up to 2 minutes and is completely transparent to the application since the DNS endpoint used by the application does not change.


The Multi-AZ setup is for high availability and disaster recovery only. It is not designed for improving performance or scaling the database for heavy workloads.

Read Replicas

One of the options to scale an RDS database with a read-heavy workload is to create read replicas. A read replica is a read-only copy of your primary database. The data is automatically replicated asynchronously from the primary database to the read replica copies, which can be in other regions. A read replica can be used only for reading data, not for writing it; all the data is written to the primary DB instance.

It is possible to create up to 5 read replicas, either from the primary DB instance or from a read replica itself. Each read replica has its own DNS endpoint.

Read replicas can only be used for scaling; they are not designed for high availability or disaster recovery. However, a read replica can be promoted to primary for faster recovery in case of a disaster.

Read replicas can be created in a different region, which is currently supported for MySQL and MariaDB. While creating a read replica, you can select a DB instance class different from the instance class of the primary DB.

Data Protection

  • In Transit Encryption
    • Can enable SSL for all the inbound and outbound traffic from the database instance
  • At Rest Encryption
    • Supports AWS Key Management Service (KMS)
    • Two tiered key hierarchy
      • Data Key encrypts customer data
      • AWS KMS master-key encrypts data keys
    • Provides an option to encrypt the database at the time of provisioning
    • Encryption can only be enabled at the time of database creation and cannot be removed once the database is encrypted
    • Unencrypted snapshots can be turned into encrypted snapshots while copying the snapshots
    • Cannot copy encrypted snapshots or replicate encrypted DB across regions

Aurora – Cloud Native Relational Database 

Amazon Aurora is a MySQL- and PostgreSQL-compatible cloud native relational database designed to provide better performance at a lower cost when compared to other relational databases.

Traditional relational databases were not designed for the cloud: multiple layers of functionality (SQL, transactions, caching and logging) sit in a single monolithic stack that stores the data in the storage layer. Aurora has a distributed design based on a service-oriented architecture. The logging and storage layers have been separated from the monolith and made distributed, scale-out and multi-tenant. The storage layer is distributed across 3 Availability Zones and striped across hundreds of nodes.

Aurora Features –

  • Scaling
    • Storage Scaling – Starts at 10 GB and scales in 10 GB increments up to 64 TB
    • Compute Scaling – Up to 32 vCPUs and 244 GB of memory
  • High Availability
    • Storage – Maintains 6 copies of the data on the storage nodes (2 copies in each of 3 Availability Zones). These 6 storage nodes are on a peer-to-peer gossip network and use a quorum system for reads and writes.
    • Designed to transparently handle the loss of 2 copies of the data without affecting write availability, and the loss of 3 copies without affecting read availability
  • Read Replicas
    • Aurora Replicas – Up to 15 replicas with automatic fail-over. You can assign a priority tier which is used to decide which replica is promoted to primary during fail-over
    • MySQL Replicas – Up to 5 replicas without automatic fail-over
    • A single DNS read replica endpoint is available for Aurora
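The 6-copy quorum behaviour described above can be sketched as follows. The quorum sizes (4 of 6 for writes, 3 of 6 for reads) are Aurora's published design; the code itself is only an illustrative model, not how Aurora is implemented:

```python
TOTAL_COPIES = 6   # 2 copies in each of 3 Availability Zones
WRITE_QUORUM = 4   # a write must reach 4 of the 6 storage nodes
READ_QUORUM = 3    # a read must reach 3 of the 6 storage nodes

def can_write(lost_copies: int) -> bool:
    """Writes stay available while a write quorum of nodes survives."""
    return TOTAL_COPIES - lost_copies >= WRITE_QUORUM

def can_read(lost_copies: int) -> bool:
    """Reads stay available while a read quorum of nodes survives."""
    return TOTAL_COPIES - lost_copies >= READ_QUORUM

# Losing 2 copies (e.g. a whole AZ) keeps writes available;
# losing 3 copies still keeps reads available.
print(can_write(2), can_read(3))  # True True
```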

AWS S3 – Access Management



In my earlier posts on Cloud Storage and AWS S3, we discussed different storage types offered by the cloud providers and the Simple Storage Service (S3) by Amazon for storing the objects in the cloud. In this post, we’ll look at how to manage the access to buckets and objects stored in the AWS S3.

Buckets and objects created in AWS S3 are private by default, with read and write access granted only to the owner who created the resources. The resource owner can grant access to other entities by creating policies that can be attached to buckets, objects or users.

Depending upon whether the policy is attached to a user or a resource, policy options can be broadly categorized as either user-based policies or resource-based policies.

Resource Based Policy

Resource-based policies are applied to buckets and the objects contained within a bucket. There are 2 types of resource-based policy – Bucket Policy and S3 Access Control List (ACL).

S3 Bucket Policy

S3 bucket policies are attached to buckets only and determine which principals are allowed or denied which actions on the bucket. The permissions granted via the bucket policy also apply to all the objects within the bucket.

A bucket policy includes the following information –

  • Version – Version element specifies the policy language version. The most recent version that is currently available for use is 2012-10-17.
  • Statement.Effect – Effect element under statement specifies whether the statement will result in an allow or deny.
  • Statement.Principal – Principal element under statement specifies the user entity to which the policy applies.
  • Statement.Action – Action element under statement specifies what set of actions are allowed or denied on the S3 resource.
  • Statement.Resource – Resource element under statement specifies the S3 resource to which the policy statement applies.
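Putting these elements together, a minimal bucket policy might look like the following. The bucket name `example-bucket` is hypothetical; this sketch grants anonymous read access to all objects in that bucket:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
```

The `/*` in the Resource ARN is what extends the statement to the objects inside the bucket rather than the bucket itself.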

S3 bucket policies provide a simple way to manage access to a bucket, allowing cross-account access without IAM roles. S3 supports bucket policies of up to 20 KB. A bucket policy also provides better visibility into who can access the specific S3 bucket to which the policy is attached.


S3 Access Control List (ACL)

S3 ACL is a sub-resource attached to every S3 bucket and object; the default ACL grants full access to the owner who created the resource. The ACL policy identifies which users and groups are granted access and the type of access.

S3 ACL allows you to manage access at both the bucket and the object level. ACL is the only way to manage S3 access at the object level. ACL is also suitable in cases where a bucket policy or IAM policy grows large and exceeds the size limits imposed on it.

User Policy (IAM Policy)

User policies are applied via IAM to users, groups, or roles to specify which actions are allowed or denied on which S3 resources. A user policy is similar to a bucket policy, specifying statements that identify the S3 resources and the corresponding actions that are allowed or denied to the users to which the policy is applied. Unlike a bucket policy, which has a principal defined in the policy itself, there is no principal defined in a user policy since the principal is the user, group or role to which the policy is attached.

An IAM user policy is better suited for use cases where you want better visibility into what a specific user is allowed to do in the AWS environment, including on S3 resources. It also makes S3 access easier to manage when there are a lot of buckets and it's hard to manage access by defining a bucket policy for each S3 bucket.

IAM policies have a size limit of 2 KB for users, 5 KB for groups and 10 KB for roles.

S3 Authorization with multiple policies

Whenever an AWS principal issues a request to S3, the authorization decision depends on the union of all the bucket policies, S3 ACLs and IAM policies that apply. The decision logic follows the principle of least privilege by defaulting resource access to deny; an explicit Deny always supersedes an Allow.
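The least-privilege evaluation just described can be sketched in Python. This is a simplified model of the documented behaviour (default deny, explicit deny wins), not the actual S3 implementation:

```python
def authorize(decisions):
    """decisions: the 'Allow'/'Deny' effects gathered from the union of
    bucket policies, ACLs and IAM policies matching the request."""
    if "Deny" in decisions:      # an explicit Deny always supersedes Allow
        return "Deny"
    if "Allow" in decisions:     # at least one policy must grant access
        return "Allow"
    return "Deny"                # nothing matched: default deny

print(authorize(["Allow", "Deny"]))  # Deny
```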

Cloud Storage Types – Object, Block and File



Cloud storage is making inroads and is increasingly common in enterprises these days due to the advantages it offers in terms of availability, durability and cost.

Cloud storage solution can either be deployed in the private cloud or accessed over the internet in the public cloud depending upon the sensitivity of the data and compliance requirements. Cloud storage solutions hosted on the public cloud can be easily provisioned since the cloud provider manages the underlying infrastructure supporting the cloud storage solution.


  • Availability – All the data is highly available when needed. Depending upon the type of data, there might be a requirement for the storage solution to be highly available. In some scenarios, where the data is archived or the data itself could be recreated, e.g. image thumbnails from the original image, the cost of the storage could be reduced by accepting slightly lower availability (e.g. 99.9% instead of 99.99%).
  • Durability – Data is stored redundantly across multiple facilities and physical devices to avoid the loss of data in case of natural disaster or device failure.
  • Cost – The total cost of ownership for storing data in the cloud is lower since there is no CAPEX involved in buying the hardware and the initial set-up. Also, there is no operational cost involved in maintaining the hardware. Capacity on public cloud storage can be added or removed as needed, further reducing the cost since you don’t pay for ‘idle’ infrastructure provisioned ahead for future use.

Storage Types

There are primarily three types of storage solutions provided by the cloud providers. Each storage type provides unique features and functionality for its respective use cases.

Block Storage

Block storage stores files/data by splitting the data into small chunks known as blocks, which can be accessed by the application using the address of the block. Files stored in block storage can be updated by replacing the data stored in individual blocks. This type of storage is analogous to a SAN (Storage Area Network).

Enterprises use SAN as block storage that can be attached to servers so that the storage volume appears to be running locally. Servers get access to dedicated SAN logical units (virtual hard disks) which are not shared among servers. There are 2 protocols used to access SAN storage – Fibre Channel and iSCSI (Internet Small Computer System Interface). iSCSI uses the existing Ethernet network to access the SAN and hence has a lower cost; however, it has lower performance when compared to Fibre Channel.

The diagram below shows a SAN with Logical Unit Numbers shared among multiple servers.


Elastic Block Store (EBS) is the Amazon service for block storage in the cloud. EBS is used as a disk volume when an EC2 instance is provisioned and is also suitable for running databases. EBS volumes are replicated within an availability zone and can be mounted to only one EC2 instance running within the same availability zone.

File Storage

File storage is also based on block storage, with a file system optimized for serving files to a large number of network connections. It is analogous to NAS (Network Attached Storage).

Enterprises use NAS to store files that are shared by multiple systems. The NFS protocol is used for accessing files stored on a NAS share.

Elastic File System (EFS) is the AWS service for storing and sharing files in the cloud. EFS can be mounted on multiple servers at the same time for sharing files. It can also be mounted on on-premise servers over VPN or Direct Connect. EFS is replicated across multiple availability zones within an AWS region.

Object Storage

Traditionally, enterprises have stored objects (unstructured data) on disks running an OS with a file system. This solution can only scale to a certain level due to the limits imposed by the file system. Object storage addresses the limitations of storing objects in a file system.

Object storage is a key-value store in which each object is stored as a single entity, as opposed to multiple blocks in block storage. Any update to an object requires replacing the full object with a new version. Object storage is not suitable for hosting an operating system or running databases. It can, however, be used for storing backups or snapshot volumes of a disk or EBS.

Objects are accessed via the HTTP protocol using a RESTful API, allowing the application to create, update, copy and delete objects in a bucket/container. Object storage can store billions of objects and still offer very good performance due to the underlying key-value store. Each object is addressable as a URL. Unlike a file system, objects are not stored in a hierarchy; however, object names can contain ‘/’, providing a pseudo-nested hierarchy for organizing the objects within a namespace.
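The pseudo-hierarchy created by ‘/’ in key names can be illustrated with a flat key-value store. The bucket contents and key names below are made up:

```python
# A flat object store: keys are plain strings, '/' is just another character
bucket = {
    "photos/2024/beach.jpg": b"...",
    "photos/2024/city.jpg": b"...",
    "docs/readme.txt": b"...",
}

def list_objects(prefix: str):
    """Emulate listing a 'folder' by filtering keys on a shared prefix."""
    return sorted(k for k in bucket if k.startswith(prefix))

print(list_objects("photos/2024/"))
```

There is no directory structure underneath; "listing a folder" is simply a prefix scan over the keys.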

An object consists of the actual file, metadata and a globally unique identifier. The metadata contains contextual information such as what the data is, its confidentiality and any other data required for accessing the object.

Simple Storage Service (S3) is the Amazon service for storing the objects in the cloud.

AWS DNS Service – Route 53



In one of my earlier posts on DNS, we looked at the basic functionality provided by the DNS service and some of the important concepts related to the DNS protocol.

AWS Route 53 is a distributed managed service that provides both public and private DNS lookup with very high availability and scalability. It makes it easy to manage application traffic globally through a variety of routing policies, enabling low-latency and fault-tolerant architectures.

Route 53 can be used to configure DNS health check such that requests are routed to healthy endpoints only.

DNS Setup

  • Register a domain name – Domain can be registered with Route 53 or any registrar offering the domain registration service.
  • Create a hosted zone in Route 53 – Route 53 automatically creates a new hosted zone when a new domain is registered with it. If the domain is registered elsewhere, a new hosted zone needs to be created in order to use the Route 53 DNS service.
  • Create record sets in the hosted zone – This step involves creating the A record, CNAME record, Alias record etc. within the hosted zone. The Alias record is unique to Route 53; it is similar to a CNAME and allows you to map the domain to other AWS services like ELB, S3 websites, CloudFront etc. that have dynamic IP addresses.
  • Connect domain name to the hosted zone – This is known as delegation which involves updating the name server with the domain registrar. If the domain is registered elsewhere, name servers are updated with the registrar to point to Route 53 hosted zone.

Routing Policies

Simple Routing Policy

Simple Routing Policy is the most basic routing policy, defined using an A record that always resolves to a single resource without any specific rules. For instance, a DNS record can be created to resolve the domain to an Alias record that routes the traffic to an ELB load balancing a set of EC2 instances.


Weighted Routing Policy

Weighted Routing Policy is used when there are multiple resources for the same functionality and the traffic needs to be split across them based on predefined weights.
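A weighted routing decision can be sketched as follows. The endpoint names and weights are hypothetical, and real Route 53 performs this selection on the server side when answering DNS queries:

```python
import random

def pick_endpoint(weighted, rng=random):
    """Pick an endpoint with probability proportional to its weight."""
    endpoints = list(weighted)
    weights = [weighted[e] for e in endpoints]
    return rng.choices(endpoints, weights=weights, k=1)[0]

# Send roughly 70% of traffic to the primary stack, 30% to the canary
weights = {"primary.example.com": 70, "canary.example.com": 30}
counts = {e: 0 for e in weights}
rng = random.Random(0)  # seeded for a reproducible demonstration
for _ in range(1000):
    counts[pick_endpoint(weights, rng)] += 1
print(counts)
```

Over many queries the split converges to the weight ratio, which is how weighted records gradually shift traffic during migrations or canary releases.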


Latency Routing Policy

Latency Routing Policy is used when there are multiple resources for the same functionality and you want Route 53 to respond to DNS queries with answers that provide the best latency i.e. the region that will give the fastest response time.


Failover Routing Policy

Failover Routing Policy is used to create an Active/Passive set-up such that one of the sites is active and serves all the traffic while the other, the Disaster Recovery (DR) site, remains on standby. Route 53 monitors the health of the primary site using health checks.


Geolocation Routing Policy

Geolocation Routing Policy is used to route traffic based on the geographic location from which the DNS query originates. This policy allows you to send traffic to resources in the same region from which the request originated, i.e. it provides site affinity based on the location of the users.


AWS Elastic Load Balancer



A load balancer is a device that acts as a reverse proxy and distributes the application traffic across multiple servers. This results in increased capacity and greater reliability of the applications running behind the load balancer.

Generally, load balancers are grouped into 2 types –

  • Layer 4 load balancer – Acts on the data available in the network and transport layers, such as TCP and UDP
  • Layer 7 load balancer – Acts on the data available in the application layer such as HTTP

AWS Elastic Load Balancer (ELB) automatically distributes incoming application traffic across multiple applications, microservices and containers hosted on Amazon EC2 instances in multiple Availability Zones.

ELB is configured to accept incoming traffic on one or more listeners that check for client connection requests. A listener is configured with a protocol and port number for connections from the client, and forwards the traffic to EC2 instances registered with the ELB.

When an Availability Zone is enabled on the load balancer, ELB creates a load balancer node in that Availability Zone. EC2 instances in an Availability Zone that has not been enabled on the ELB won’t receive traffic.

ELB Features

  • Elasticity – ELB is inherently elastic and automatically scales to meet the increased incoming traffic load.
  • Security – ELB works with VPC so that security groups can be assigned to restrict which ports are open to a list of allowed sources. It also integrates with AWS Certificate Manager to make it easy to enable SSL connections for the application.
  • Integrated – ELB is integrated with other AWS services like CloudWatch, Route53, Auto Scaling etc. that makes it easy to design and implement robust solutions.
  • Highly Available – ELB automatically routes traffic to multiple EC2 instances running across multiple availability zones. ELB itself runs in multiple availability zones. It also monitors the health of the EC2 instances and takes any unhealthy instances out of rotation.
  • Choice of Classic Load Balancer (Layer 4) vs Application Load Balancer (Layer 7)

Classic Load Balancer – Layer 4

  • Supports both the TCP and SSL protocols
  • Incoming client connection is bound to the server connection
  • Does not allow any header modification
  • Proxy protocol prepends the source and destination IP addresses along with the ports to the request

Application Load Balancer – Layer 7

  • Supports both HTTP and HTTPS protocols
  • Incoming client connection is terminated at the load balancer and pooled to the servers behind the load balancer
  • Headers may be modified. For instance, an X-Forwarded-For header is added that contains the client IP address
  • Supports path based routing which allows the requests to be routed to different applications running behind a single load balancer
  • Supports containers
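As an example of the header modification mentioned above, here is a small sketch of parsing X-Forwarded-For to recover the original client address; the IP addresses are made up, and the header is only trustworthy when it is set by your own load balancer:

```python
def client_ip(x_forwarded_for: str) -> str:
    """X-Forwarded-For holds 'client, proxy1, proxy2, ...'; the left-most
    entry is the address the first proxy (here, the ALB) saw."""
    return x_forwarded_for.split(",")[0].strip()

# Header value as an ALB might forward it
print(client_ip("203.0.113.7, 10.0.1.12"))  # 203.0.113.7
```

This matters for Layer 7 load balancing because the server otherwise only sees the load balancer's address as the TCP peer.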

The following diagram shows how an ELB configured in multiple availability zones can be used to load balance EC2 instances that are also running in multiple availability zones. This type of network topology provides high availability and fault tolerance by leveraging redundancy across multiple availability zones.


ELBs can be configured for cross-zone load balancing such that the load balancer node in availability zone 1 can send application traffic to EC2 instances running in availability zone 2 and vice versa. If cross-zone load balancing is disabled, traffic is evenly distributed across the Availability Zones irrespective of the number of instances running in each Availability Zone. This can lead to uneven distribution of the traffic at the instance level if there are more instances running in one Availability Zone than in another. If cross-zone load balancing is enabled, traffic is evenly distributed across all the EC2 instances running in the enabled Availability Zones.

Cross zone load balancing is always enabled for an Application Load Balancer and is disabled by default for a Classic Load Balancer.

Domain Name System – An Overview



Domain Name System (DNS) is a networking protocol that converts a human-friendly domain name to an IP address. IP addresses (IPv4 or IPv6) uniquely identify the devices connected to the internet and help in routing network packets from source to destination. A DNS server can be thought of as a directory that maintains the list of all the domains on the internet and their corresponding public IP addresses.

IPv4 addresses are made up of 32 bits, giving approximately 4 billion IP addresses, which is not sufficient to assign a unique IP address to each device connected to the internet. Due to the exponential rise in the number of devices connected to the internet in the last decade, and an even greater proliferation now due to the Internet of Things (IoT), IPv6 is becoming more common since it is made up of 128 bits and has a very large number of IP addresses (2^128). Some IP addresses are reserved for use inside a private network and hence cannot be assigned to a device connected to the internet. One special IP address is the loopback address 127.0.0.1, which is used to identify the host machine and routes all packets to the host machine itself without sending them over the internet.

IP addresses can be dynamic or static depending upon the network configuration and the use case. A Dynamic Host Configuration Protocol (DHCP) server assigns a new dynamic IP address, along with other configuration, when a device registers with the network. Web servers and other devices requiring static IP addresses that do not change over time can be assigned IP addresses associated with the MAC address of the network interface.

Top Level Domains

Domain names consist of character strings separated by dots, e.g. abc.xyz.com. The string after the last dot represents the top level domain. In this example, ‘com’ is the top level domain and ‘xyz’ is the second level domain. Domain names are hierarchical, with the top level domain at the root of all the sub-domains defined under it. Following are some of the most widely used top level domain names.

  • .com
  • .org
  • .net
  • .edu
  • .gov
  • .io

The top level domains are controlled by the Internet Assigned Numbers Authority (IANA) in the root zone database, which is essentially a database of all the available top level domain names.

Domain Registrar

All the sub-domains under a top level domain need to be unique so that there are no duplicate domain names resulting in name resolution conflicts. A domain registrar controls the list of domains that can be assigned under the top level domain, avoiding any duplicates. Each domain registration becomes part of a central domain registration database known as the WHOIS database.

DNS Name Resolution

Internet Service Providers (ISPs) host DNS servers that maintain a small database of domain names and associated IP addresses. When a DNS resolution request is received by the ISP's DNS server, it tries to resolve the domain name, and if it doesn't have the DNS record it delegates the resolution to other DNS servers on the internet.

A domain server that manages a specific domain is called the Start Of Authority (SOA) for that domain. Over time, the DNS lookup results propagate from the SOA to other DNS servers on the internet. Each DNS server can cache the result of a DNS lookup for a specific period of time known as the Time To Live (TTL). The TTL value can be configured for each DNS server and allows DNS lookups to be more efficient with minimal latency. Root name servers are at the top of the hierarchy for a given top level domain, and other DNS servers can contact the root name server for the SOA record.
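The TTL-based caching described above can be sketched in a few lines; this is a minimal, hypothetical simulation (the domain name, IP and TTL values are examples), not a real resolver:

```python
import time

class DnsCache:
    """Toy DNS cache: each record expires TTL seconds after it is stored."""

    def __init__(self):
        self._records = {}  # domain -> (ip, expiry timestamp)

    def put(self, domain, ip, ttl):
        self._records[domain] = (ip, time.time() + ttl)

    def get(self, domain):
        entry = self._records.get(domain)
        if entry is None:
            return None              # never cached: must resolve upstream
        ip, expiry = entry
        if time.time() >= expiry:    # TTL elapsed: record is stale
            del self._records[domain]
            return None
        return ip

cache = DnsCache()
cache.put("example.com", "93.184.216.34", ttl=300)
print(cache.get("example.com"))  # cached answer served until the TTL expires
```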

DNS Configuration

A Record

An A record is the basic mapping from a host name to an IP address. ‘A’ in the A record stands for address and associates the IP address with the host name. It is mandatory to have an A record defined for every DNS entry. The A record is defined for the naked domain name, i.e. the domain name without the ‘www’ sub-domain.

CNAME Record

Canonical Name (CNAME) record is used to resolve one domain name to another. It acts like an alternative domain name, such that anyone accessing the CNAME is automatically directed to the IP address mapped in the A record. A CNAME cannot be defined for the naked domain name.
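How an A record and a CNAME record work together can be sketched as a tiny resolution loop; the zone data below is hypothetical, and real resolvers handle many more record types:

```python
# Toy zone: name -> (record type, value). The CNAME for 'www' points at the
# naked domain, whose A record holds the actual IP address.
zone = {
    "xyz.com":     ("A", "203.0.113.10"),
    "www.xyz.com": ("CNAME", "xyz.com"),
}

def resolve(name, zone, max_hops=10):
    """Follow CNAME records until an A record yields an IP address."""
    for _ in range(max_hops):        # guard against CNAME loops
        rtype, value = zone[name]
        if rtype == "A":
            return value
        name = value                 # CNAME: restart lookup at the target name
    raise RuntimeError("CNAME chain too long")

print(resolve("www.xyz.com", zone))  # 203.0.113.10
```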


AWS VPC Network Security



One of my earlier posts on AWS Virtual Private Cloud described the basics of VPC, including some of the security features it offers to control which packets move in and out of the VPC. In this article, let’s look at VPC network security in further detail.

The following diagram shows an example of how security groups and ACLs are associated with the subnets defined within a custom VPC.


Security Groups

Security groups control the inbound and outbound network traffic at the instance level. They are the first layer of defense. One or more security groups (max 5) can be assigned to an EC2 instance. Each instance in a subnet can be assigned to the same security group or different security groups. The default security group is automatically assigned to an EC2 instance if no security group is selected at the time of launching the instance.

Security groups support ‘allow’ rules only, and the rules are stateful, i.e. if an inbound rule is defined to allow the traffic then the outbound traffic for that connection is automatically allowed, and vice versa for outbound rules. All the rules are evaluated before deciding whether to allow the traffic. Any change in a security group rule is applied immediately. Security groups cannot be used for blocking specific IP addresses.
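The allow-only evaluation can be sketched as follows; the rule fields (protocol, port, source CIDR) are a simplified, hypothetical model of real security group rules:

```python
import ipaddress

def sg_allows(rules, protocol, port, source_ip):
    """Return True if ANY allow rule matches; there are no deny rules."""
    src = ipaddress.ip_address(source_ip)
    for rule in rules:  # every rule is an 'allow'; a single match suffices
        if (rule["protocol"] == protocol
                and rule["port"] == port
                and src in ipaddress.ip_network(rule["cidr"])):
            return True
    return False        # no matching allow rule -> traffic is denied

# Example: allow HTTPS from anywhere, nothing else.
rules = [{"protocol": "tcp", "port": 443, "cidr": "0.0.0.0/0"}]
print(sg_allows(rules, "tcp", 443, "198.51.100.7"))  # True
print(sg_allows(rules, "tcp", 22, "198.51.100.7"))   # False
```

Note that because security groups are stateful, the return traffic for an allowed connection needs no extra rule, which this sketch does not model.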

  • Default Security Group – Each VPC comes with a default security group which allows the instances in the default security group to talk to each other.
    • Inbound – Allows inbound traffic from instances assigned to the same security group
    • Outbound – Allows all outbound traffic
  • Custom Security Group – Custom security groups can be defined; by default they do not allow the instances in the security group to talk to each other.
    • Inbound – When a new custom security group is created no inbound rules are defined and hence all the inbound traffic is automatically denied
    • Outbound – Includes an outbound rule that allows all the traffic

Access Control List

ACLs control the inbound and outbound network traffic at the subnet level. They are the second layer of defense. ACL rules are applied to all the EC2 instances created within the associated subnet. Each subnet must be associated with exactly one ACL. If a custom ACL is not explicitly associated with a subnet, the default ACL is automatically assigned to it. One ACL can be associated with multiple subnets. ACLs can be used for blocking specific IP addresses.

ACLs support both allow and deny rules, which are stateless, i.e. rules for the return traffic must be explicitly defined. The rules are evaluated in the order in which they are defined (ascending rule number), and the first rule that matches decides whether the traffic is allowed or denied.

  • Default ACL – It allows both the inbound and outbound network traffic
  • Custom ACL – It denies all the inbound and outbound traffic unless rules are added to allow the network traffic. Ephemeral ports need to be explicitly defined in both the inbound and outbound rules when creating a custom ACL. When a host initiates a network connection, a unique ephemeral port is automatically assigned as per the TCP/IP protocol when the packets are sent to the destination. The ephemeral port is used to route the return packets back to the host process associated with it.
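The ordered, first-match evaluation of ACL rules can be sketched like this; the rule numbers, CIDRs and actions are hypothetical examples (note how a low-numbered deny rule blocks a specific IP, something security groups cannot do):

```python
import ipaddress

def acl_evaluate(rules, source_ip):
    """Evaluate ACL rules in ascending rule number; first match wins."""
    src = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r["number"]):
        if src in ipaddress.ip_network(rule["cidr"]):
            return rule["action"]   # stop at the first matching rule
    return "deny"                   # implicit deny when nothing matches

rules = [
    {"number": 100, "cidr": "203.0.113.5/32", "action": "deny"},   # block one IP
    {"number": 200, "cidr": "0.0.0.0/0",      "action": "allow"},  # allow the rest
]
print(acl_evaluate(rules, "203.0.113.5"))   # deny  (rule 100 matches first)
print(acl_evaluate(rules, "198.51.100.9"))  # allow (falls through to rule 200)
```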

Flow Logs

Flow logs allow capturing information about the IP traffic going to and from the EC2 instances created in the VPC. Flow log data is published to CloudWatch Logs, which can then be used to troubleshoot any issues with IP networking.

AWS VPC – NAT Instances and NAT Gateway



NAT Overview

Network Address Translation (NAT) is a technique of assigning a public IP address to a host or a group of hosts within a private network such that all egress network packets have the same public source IP address. NAT helps in limiting the number of public IP addresses required for a private network to connect to the internet and also hides the private IP address space from the outside world. All the hosts inside a private network can use the same public IP address to send packets to the internet. The destination receives packets from different hosts that look like they are coming from the same host, as all the packets have the same public source IP address. A NAT table is needed on the router, or any NAT-enabled device, for the ingress packets to be routed back to the respective hosts in the private network.
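The NAT table described above can be sketched as a simple two-way mapping; the IP addresses and port numbers below are hypothetical, and real NAT devices also track protocol and connection state:

```python
import itertools

class NatTable:
    """Toy NAT table: many private hosts share one public IP address."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self._ports = itertools.count(49152)  # hand out ephemeral public ports
        self._map = {}                        # public port -> (private ip, port)

    def translate_egress(self, private_ip, private_port):
        # Outgoing packet: rewrite source to (public IP, fresh public port)
        # and remember which private host the port belongs to.
        public_port = next(self._ports)
        self._map[public_port] = (private_ip, private_port)
        return self.public_ip, public_port

    def translate_ingress(self, public_port):
        # Return packet: look up which private host owns this public port.
        return self._map[public_port]

nat = NatTable("198.51.100.1")
src = nat.translate_egress("192.168.0.10", 5000)
print(src)                            # ('198.51.100.1', 49152)
print(nat.translate_ingress(src[1]))  # ('192.168.0.10', 5000)
```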

The following diagram shows an overview of how a NAT-enabled router translates the private IP address of all the devices inside the private network to the public IP address of the router before the packets get routed to the internet.


By default, a private subnet created in an AWS VPC does not have access to the internet. NAT can be used to allow the instances created in the private subnet to securely access the internet. It is possible to set up either NAT instances or a NAT gateway to allow the instances within the private subnet to connect to the internet. All the traffic from the private subnet is NATed via the NAT instances or NAT gateway configured in the public subnet.

NAT Instances

NAT instances are EC2 instances running in the public subnet that allow EC2 instances running in the private subnet to connect to the internet. NAT instances require a security group to be configured to allow the ingress and egress traffic from and to the internet. EC2 instances perform a source/destination check by default to make sure that the instance is the source or destination of any traffic it receives or sends. A NAT instance should be able to send and receive traffic even when it’s not the source or destination of the traffic, so you’d need to disable the source/destination check on the EC2 instance created for NAT. Also, the main route table attached to the private subnet needs to be updated to route the packets destined for the internet (0.0.0.0/0) via the NAT instance.

NAT instances can be made highly available by creating multiple instances and adding them to the Elastic Load Balancer group. The network bandwidth depends upon the bandwidth of the EC2 instance type.

NAT Gateway

NAT Gateway is preferred over NAT instances and more commonly used since it’s fully managed by AWS and offers several benefits. A NAT gateway is created in a public subnet and assigned an Elastic IP at the time of creation. Once the NAT gateway is created, the main route table attached to the private subnet can be updated to route the packets destined for the internet (0.0.0.0/0) via the NAT gateway. A NAT gateway doesn’t need to be behind a security group, as opposed to NAT instances, which are always behind a security group.

NAT gateways are highly available and created with redundancy within the availability zone. A NAT gateway supports bandwidth bursts of up to 10 Gbps.


AWS Virtual Private Cloud



AWS Virtual Private Cloud (VPC) is a web service that allows provisioning of a logically isolated infrastructure in the public cloud with its own IP address range, subnets, internet gateway, ACLs and route table configuration. It can be thought of as an isolated data center in AWS. VPC does all the heavy lifting and makes it very easy for enterprises and start-ups to create virtual data centers hosted by cloud providers, with all the features like security, IP addressing, subnetting and routing. Hybrid cloud, which is a mix of on-premise and hosted cloud infrastructure, is becoming increasingly common for big enterprises looking to reduce capital expenditure and increase agility.

It is not uncommon for enterprises to have multiple security zones that partition the infrastructure and resources based on characteristics like public or internal facing and data protection requirements. Inter-zone communication is usually controlled and restricted by a firewall for all the traffic that flows between the security zones. There could be a public-facing subnet that hosts the web servers that talk to the internet and take requests from end-users. Backend systems like application servers and databases are hosted on a private subnet that cannot be accessed directly from the internet. All the traffic from the web servers hosted in the lower trust zone flows through the firewall to the application servers hosted in the higher trust zone.

AWS VPC provides defense in depth by allowing users to define security groups, network access control lists and route tables to restrict the traffic between the public-facing subnet and the private subnet.

A VPN connection can be created between the corporate data center and the VPC to set up a hybrid cloud environment where the VPC acts as an extension of the corporate data center.

AWS automatically provides a default VPC, which is used when an EC2 instance is provisioned without specifying a VPC. All subnets within a default VPC have a route to the internet. EC2 instances created in a default VPC will have both a public and a private IP address.

When a new VPC is created, it automatically gets a main route table, a default security group and a default network ACL. It doesn’t create any subnets, as those are expected to be explicitly defined by the user, and it also doesn’t automatically create an internet gateway.

VPC IP Addresses

Private networks created in a VPC can use IP addresses anywhere in the following RFC 1918 private ranges, listed with the Classless Inter-Domain Routing (CIDR) notation-

  • 10.0.0.0 – 10.255.255.255 (10.0.0.0/8)
  • 172.16.0.0 – 172.31.255.255 (172.16.0.0/12)
  • 192.168.0.0 – 192.168.255.255 (192.168.0.0/16)

AWS allows the creation of a new VPC if the IP address block size is between a /16 netmask and a /28 netmask. It also allows users to select the tenancy for the new VPC: VPCs can be created on shared hardware or dedicated hardware based on the needs of the user and other factors like compliance or regulations.
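The /16 to /28 block-size rule is easy to check with the standard `ipaddress` module; this is a hypothetical helper, not an AWS API call:

```python
import ipaddress

def valid_vpc_cidr(cidr):
    """A VPC CIDR block must be between a /16 and a /28 netmask."""
    net = ipaddress.ip_network(cidr)
    return 16 <= net.prefixlen <= 28

print(valid_vpc_cidr("10.0.0.0/16"))  # True
print(valid_vpc_cidr("10.0.0.0/8"))   # False - block is too large
print(valid_vpc_cidr("10.0.0.0/30"))  # False - block is too small
```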

It’s best to avoid using IP address range that might conflict and overlap with other networks to which the VPC might connect. VPC is created within a specific AWS region and could span across multiple availability zones within the region.

The following diagram shows a VPC with a private /16 IP address range and 3 availability zones, each having subnets with their own IP address range falling under the VPC IP address range.

A VPC with a /16 block has a total of 65,536 IP addresses available that can be assigned across all the subnets in the region. A subnet with a /24 block has a total of 256 IP addresses that can be assigned within the subnet; however, 5 IP addresses are reserved by AWS, due to which only 251 IP addresses are available in each subnet.
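The subnet arithmetic above can be verified with the `ipaddress` module (the 10.0.0.0/16 and 10.0.1.0/24 blocks are example values):

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")      # example VPC block
subnet = ipaddress.ip_network("10.0.1.0/24")   # example subnet inside it

print(vpc.num_addresses)         # 65536 addresses in the /16
print(subnet.num_addresses)      # 256 addresses in the /24
print(subnet.num_addresses - 5)  # 251 usable after AWS reserves 5 per subnet
print(subnet.subnet_of(vpc))     # True - the subnet falls within the VPC range
```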


VPC Routing

A route table contains a set of rules that determines how the network packets are routed from source to destination. A VPC comes with a default route table. Each subnet should be associated with exactly one route table, and multiple subnets can be associated with the same route table. Based on the destination, packets can either be routed locally within the VPC or be routed to the internet gateway with destination 0.0.0.0/0, which is the CIDR notation for matching any destination IP address. The most specific rule that matches the destination gets applied to the packet.
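The most-specific-match behavior is a longest-prefix match, which can be sketched as follows; the routes below are hypothetical, including the made-up internet gateway id:

```python
import ipaddress

def pick_route(routes, destination):
    """Return the target of the most specific (longest prefix) matching route."""
    dest = ipaddress.ip_address(destination)
    matches = [(ipaddress.ip_network(cidr), target)
               for cidr, target in routes.items()
               if dest in ipaddress.ip_network(cidr)]
    # The longest prefix is the most specific route; 0.0.0.0/0 (prefix 0)
    # only wins when nothing narrower matches.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

routes = {
    "10.0.0.0/16": "local",        # traffic within the VPC stays local
    "0.0.0.0/0":   "igw-123456",   # everything else goes to the internet gateway
}
print(pick_route(routes, "10.0.1.5"))       # local
print(pick_route(routes, "93.184.216.34"))  # igw-123456
```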

VPC Network Security

Since a VPC can make a connection to the internet, it is important to have a mechanism to control which packets move in and out of the VPC. There are 2 ways in which the VPC network can be secured-

  • Access Control List (ACL) – ACLs are analogous to a stateless firewall and are applied at the subnet level. There is a separate set of rules for ingress and egress traffic.
  • Security Groups – These are stateful rules defined at the host level. They control which packets are allowed when someone initiates a connection and automatically track the connection to allow the response traffic back to the source.


VPC features

  • One subnet is associated with only one availability zone; a subnet cannot span multiple availability zones. Security groups, ACLs and route tables can span multiple availability zones.
  • Instances can be launched in either a private subnet or a public subnet
  • Custom IP address ranges can be assigned in each subnet
  • The route table can be configured to connect to the internet gateway, which determines whether the subnet is internet facing or private
  • A VPC can have only one internet gateway attached to it
  • Allows defining security groups, which are stateful
  • Allows defining ACLs at the subnet level, which are stateless

VPC Peering

VPC peering allows connecting one VPC with another VPC via a direct network route using private IP addresses. Peering is done in a star configuration, and transitive peering is not allowed.