From Amazon Web Services in Action, Second Edition by Michael Wittig and Andreas Wittig

In this article, we will compare various file systems and show how you can share data across EC2 instances.


Save 37% on Amazon Web Services in Action, Second Edition. Just enter code fccwittig into the discount code box at checkout at manning.com.


Many legacy applications store state in files on disk. Therefore, using Amazon S3, an object store, is impossible by default. Using block storage might be an option, but it won’t allow access to files from multiple machines in parallel. Hence you need a way to share the files between virtual machines. With Elastic File System (EFS), you can share data between multiple EC2 instances, and your data is replicated between multiple Availability Zones (AZ). EFS is based on the NFSv4.1 protocol, which allows you to mount it like any other file system. In this article you’ll learn how EFS works and how it compares to other storage options on AWS.

NOTE: EFS ONLY WORKS WITH LINUX

At this time, EFS isn’t supported by Windows EC2 instances.

EXAMPLES ARE 100% COVERED BY THE FREE TIER

The examples in this article are covered by the Free Tier. As long as you don’t run the examples longer than a few days, you won’t pay anything. Keep in mind that this only applies if you created a fresh AWS account for this article and nothing else is going on in your AWS account. Try to complete the article within a few days; you’ll clean up your account at the end.

Let’s have a closer look at how EFS works compared to Elastic Block Store (EBS) and Instance Store. An EBS volume is tied to a data center, also called an Availability Zone (AZ), and can only be attached over the network to a single EC2 Instance from the same data center. Usually EBS volumes are used as root volumes, which contain the operating system, or by relational database systems to store their state. An Instance Store consists of a hard drive directly attached to the hardware on which the virtual machine is running. Instance Store can be regarded as ephemeral storage and is only used for caching or for NoSQL databases with embedded data replication. In contrast, the EFS file system can be used by multiple EC2 instances from different data centers in parallel. Additionally, the data of the EFS file system is replicated among multiple data centers and remains available even if a whole data center suffers from an outage, which isn’t true for EBS and Instance Store. Figure 1 shows the differences.


Figure 1. Comparing EBS, Instance Store, and EFS File System


Now that you know about the differences, it’s time to have a closer look at EFS. Two main components require your attention:

  1. EFS File System: Stores your data
  2. EFS Mount Target: Makes your data accessible

The EFS File System is the resource that stores your data in an AWS region. But you can’t access the file system directly. To access your file system, you must create an EFS Mount Target in a subnet. The mount target provides a network endpoint that you can use to mount the file system via NFSv4.1. With the mount target endpoint, you can finally mount the EFS File System on an EC2 Instance. The EC2 Instance must be in the same subnet as the EFS Mount Target, but you can create mount targets in multiple subnets. Figure 2 demonstrates how to access an EFS File System from EC2 instances running in multiple subnets.


Figure 2. Mount targets provide an endpoint for EC2 instances to mount the file system in a subnet
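The relationship between file system, mount targets, and mount endpoint can be sketched in a few lines of shell. This is a minimal sketch, not the book's setup: the file system ID, region, subnet IDs, security group ID, and the /mnt/efs mount point are all hypothetical placeholders, and the functions only print the commands instead of running them, so nothing is created in your account.

```shell
# Hypothetical values (assumptions): replace with your own IDs and region.
FILE_SYSTEM_ID="fs-12345678"
REGION="us-east-1"
SUBNET_IDS="subnet-aaaa1111 subnet-bbbb2222"
SECURITY_GROUP_ID="sg-cccc3333"

# Build the regional DNS name that resolves to a mount target of the file system.
efs_dns_name() {
  printf '%s.efs.%s.amazonaws.com' "$1" "$2"
}

# Print (do not run) one AWS CLI create-mount-target call per subnet.
print_mount_target_commands() {
  for subnet in $SUBNET_IDS; do
    echo "aws efs create-mount-target --file-system-id ${FILE_SYSTEM_ID} --subnet-id ${subnet} --security-groups ${SECURITY_GROUP_ID}"
  done
}

# Print (do not run) the NFSv4.1 mount command for the given mount point.
print_mount_command() {
  echo "sudo mount -t nfs4 -o nfsvers=4.1 $(efs_dns_name "$FILE_SYSTEM_ID" "$REGION"):/ $1"
}

print_mount_target_commands
print_mount_command /mnt/efs
```

Printing the commands first makes it easy to review them before running anything. The security group attached to the mount targets would also need to allow inbound NFS traffic (TCP port 2049) from your EC2 instances.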


Equipped with the EFS theory about file systems and mount targets, you can now apply your knowledge to solve a real problem.

The Linux operating system is a multiuser system. Many users can store data and run programs isolated from each other. Each Linux user can have a home directory, which is usually stored under /home/$username. If the user name is michael, the home directory would be /home/michael, and only the michael user would be allowed to read and write in /home/michael. The ls -d -l /home/* command lists all home directories.

 
 $ ls -d -l /home/*
 drwx------ 2 andreas andreas    4096 Jul 24 06:25 /home/andreas
 drwx------ 3 michael michael    4096 Jul 24 06:38 /home/michael
 

❶   List all home directories with absolute paths

❷   /home/andreas can only be accessed by the user and group andreas

❸   /home/michael can only be accessed by the user and group michael
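To see where a permission string like drwx------ comes from, you can reproduce it on a throwaway directory. The following is a small illustration that uses a temporary directory as a stand-in for a home directory; the variable names are made up for this example.

```shell
# Create a temporary directory that stands in for a home directory.
demo_home="$(mktemp -d)"

# Mode 700: read, write, and execute for the owner; nothing for group and others.
chmod 700 "$demo_home"

# Keep only the first ten characters of the long listing: file type plus permission bits.
home_perms="$(ls -ld "$demo_home" | cut -c1-10)"
echo "$home_perms"

# Clean up the temporary directory.
rmdir "$demo_home"
```

The leading d marks a directory, and the nine characters after it are the owner, group, and other permissions, in that order.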

If you’re using multiple EC2 instances, your users will have a separate home folder on each EC2 instance. If a Linux user uploads a file on one EC2 instance, she can’t access the file on another EC2 instance. To solve this problem, you’ll create an EFS File System and mount it on each EC2 Instance under /home. The home directories are then shared across all your EC2 instances, and users feel at home no matter which virtual machine they log in to.
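One way to mount the file system under /home on every boot is an entry in /etc/fstab. The sketch below only builds and prints such an entry. The file system ID and region are hypothetical placeholders, and the mount options are a reduced set chosen for illustration, so treat it as a starting point rather than a tuned configuration.

```shell
# Hypothetical values (assumptions): replace with your own file system ID and region.
FILE_SYSTEM_ID="fs-12345678"
REGION="us-east-1"

# Build an /etc/fstab line that mounts the file system under /home.
# _netdev marks the entry as a network file system, so it is mounted only once the network is up.
efs_home_fstab_entry() {
  printf '%s.efs.%s.amazonaws.com:/ /home nfs4 nfsvers=4.1,hard,_netdev 0 0\n' "$1" "$2"
}

# Print the entry for review; nothing is written to /etc/fstab here.
efs_home_fstab_entry "$FILE_SYSTEM_ID" "$REGION"
```

On an instance, you would append the printed line to /etc/fstab as root and run mount -a. Keep in mind that mounting over /home hides any home directories that already exist on the local disk for as long as the file system is mounted.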

That’s all for this article.


If you’re interested in learning more about the book, check it out on liveBook here and see this slide deck.