A Distributed Computing Lecture by Steven Choy
An Overview of Amazon Web Services
- Amazon Web Services Portal: http://aws.amazon.com
- A set of APIs and business models which give developers access to Amazon technology and content
- Data As a Service
- Amazon E-Commerce Service
- Amazon Historical Pricing
- Infrastructure As a Service
- Amazon Simple Queue Service
- Amazon Simple Storage Service
- Amazon Elastic Compute Cloud
- Amazon CloudFront
- Search As a Service
- Alexa Web Information Service
- Alexa Top Sites
- Alexa Site Thumbnail
- Alexa Web Search Platform
- People As a Service
Amazon S3
Reference: ONJava.com: Introduction to Amazon S3 with Java and REST
Introduction to Amazon S3
- Amazon S3 = Amazon Simple Storage Service (http://aws.amazon.com/s3)
- Allows a customer to store files in remote storage
Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites.
- Example usage
- S3 is used by companies to store photos and videos of their customers, back up their own data, and more.
- Pricing: an overview
Storage: $0.15 per GB-Month of storage used
Data Transfer in: $0.10 per GB
Data Transfer out: ~$0.18 per GB
- S3 provides both SOAP and REST APIs
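To get a feel for these rates, here is a minimal sketch that estimates one month's bill. The rates are copied from the overview above (the outbound rate is approximate there too), and the usage figures in the example call are hypothetical.

```python
# Rates in USD, copied from the pricing overview in these notes.
STORAGE_PER_GB_MONTH = 0.15
TRANSFER_IN_PER_GB = 0.10
TRANSFER_OUT_PER_GB = 0.18  # the notes give this rate as approximate


def s3_monthly_cost(stored_gb, gb_in, gb_out):
    """Estimate one month's S3 charges in USD for the given usage."""
    return (stored_gb * STORAGE_PER_GB_MONTH
            + gb_in * TRANSFER_IN_PER_GB
            + gb_out * TRANSFER_OUT_PER_GB)


# Hypothetical usage: 100 GB stored, 20 GB uploaded, 50 GB downloaded.
estimate = s3_monthly_cost(100, 20, 50)
print(f"Estimated monthly cost: ${estimate:.2f}")
```

Storage dominates here (15 dollars of the total), with outbound transfer second.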
S3 Basics
- S3 handles objects and buckets.
- An object corresponds to a stored file.
- Each object has an identifier, an owner, and permissions.
- Objects are stored in a bucket.
- A bucket has a unique name that must be compliant with internet domain naming rules.
- An object is addressed by a URL (e.g. http://s3.amazonaws.com/bucketname/objectid)
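The addressing scheme can be sketched in a few lines. The function names are hypothetical, and the bucket-name check is only a simplified approximation of domain naming rules (lowercase letters, digits, hyphens, dot-separated labels, 3 to 63 characters), not Amazon's exact validation.

```python
import re
from urllib.parse import quote

S3_ENDPOINT = "http://s3.amazonaws.com"

# One DNS-style label: starts and ends with a letter or digit.
_LABEL = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")


def looks_like_dns_name(bucket):
    """Simplified check that a bucket name resembles a domain name."""
    if not 3 <= len(bucket) <= 63:
        return False
    return all(_LABEL.match(label) for label in bucket.split("."))


def object_url(bucket, object_id):
    """Build the URL that addresses an object inside a bucket."""
    # quote() percent-encodes spaces and similar characters but keeps "/".
    return f"{S3_ENDPOINT}/{bucket}/{quote(object_id)}"


print(object_url("bucketname", "objectid"))
```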
S3 Key Features
- The number of objects a customer can store is unlimited.
- Write, read, and delete objects containing up to 5 gigabytes of data each.
- Authentication mechanisms are provided to ensure that data is kept secure from unauthorized access.
- Objects can be made private or public, and rights can be granted to specific users.
- Uses standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit.
S3 Security
- An AWSSecretKey is assigned to each AWS customer, and this key is identified by an AWSAccessKeyID. The key must be kept secret and will be used to digitally sign REST requests.
- Authentication: Requests include AWSAccessKeyID
- Authorization: An Access Control List (ACL) can be applied to each resource
- Integrity: Requests are digitally signed with AWSSecretKey
- Confidentiality: S3 is available through both HTTP and HTTPS; use HTTPS when the data must be encrypted in transit
- Non-repudiation: Requests are time-stamped (combined with integrity, this serves as proof of a transaction)
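The notes do not say which signing algorithm is used, so the sketch below assumes the HMAC-SHA1 scheme of S3's original REST authentication: the request's verb, date, and resource are joined into a string, signed with the secret key, and sent alongside the access key ID. The keys shown are made up, and the sketch ignores the extra x-amz-* headers that a full implementation would also sign.

```python
import base64
import hashlib
import hmac


def sign_request(secret_key, verb, date, resource,
                 content_md5="", content_type=""):
    """Return a Base64 HMAC-SHA1 signature over the request description."""
    # Assumed layout: one field per line, resource last.
    string_to_sign = "\n".join([verb, content_md5, content_type, date, resource])
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(access_key_id, signature):
    """Pair the public key ID (authentication) with the signature (integrity)."""
    return f"AWS {access_key_id}:{signature}"


# Made-up credentials, for illustration only.
signature = sign_request("not-a-real-secret", "GET",
                         "Tue, 27 Mar 2007 19:36:42 +0000",
                         "/bucketname/objectid")
print(authorization_header("NOT-A-REAL-KEY-ID", signature))
```

Because the date is part of the signed string, a captured request cannot be replayed with a different timestamp without invalidating the signature, which is what ties the time stamp to the integrity guarantee.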
Demonstration (S3 in Action)
S3 Upload Applet
Developing S3 Applications
Companies that use Amazon S3: three examples
Jungle Disk is an application that lets you store files and back up data securely to Amazon.com's S3™ Storage Service.
SmugMug is an independent, self-funded, profitable, and debt-free company with one programmer and just 15 employees. They are a subscription-based online photo sharing company with over 150,000 paying customers who depend on SmugMug to safely store more than 70 million photos on their behalf.
Webmail.us provides email hosting to more than 60,000 businesses, hosting more than 600,000 paid mailboxes.
It provides a flexible API that allows developers to easily integrate videos on their site while keeping full control of the videos and encoding options.
Probing further
Amazon EC2
Introduction
- Amazon EC2 = Amazon Elastic Compute Cloud (http://aws.amazon.com/ec2)
- A Web service that provides resizable compute capacity in the cloud.
- Designed to make Web-scale computing easier for developers.
- A simple Web service interface that provides complete control of your computing resources
Benefits of Amazon EC2
- Reduces the time required to obtain and boot new server instances to minutes
- Quickly scales capacity, both up and down, as your computing requirements change
- Changes the economics of computing: pay only for capacity that you actually use, with no start-up, monthly, or fixed costs
Instances: $0.10 per instance-hour consumed (or part of an hour consumed)
Data Transfer In: $0.10 per GB - all data transfer in
Data Transfer Out: $0.18 per GB - first 10 TB, etc
Inter-service bandwidth is free
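The "part of an hour" rule is easy to miss, so here is a small sketch of the arithmetic. It uses the rates listed above and assumes a single instance run; the outbound rate covers only the first 10 TB tier, since the notes do not give the later tiers.

```python
import math

# Rates in USD, copied from the pricing lines in these notes.
INSTANCE_PER_HOUR = 0.10
TRANSFER_IN_PER_GB = 0.10
TRANSFER_OUT_PER_GB = 0.18  # first 10 TB tier only


def ec2_cost(run_minutes, gb_in=0.0, gb_out=0.0):
    """Estimate the charge in USD for one instance run."""
    # A started hour is billed as a full instance-hour.
    billed_hours = math.ceil(run_minutes / 60)
    return (billed_hours * INSTANCE_PER_HOUR
            + gb_in * TRANSFER_IN_PER_GB
            + gb_out * TRANSFER_OUT_PER_GB)


print(f"${ec2_cost(60):.2f}")  # exactly one hour
print(f"${ec2_cost(61):.2f}")  # one minute over is billed as two hours
```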
Amazon EC2 Concepts
- Amazon Machine Image (AMI):
- Bootable root disk
- Pre-defined or user-built
- Catalog of user-built AMIs
- OS: Fedora, CentOS, Gentoo, Debian, Ubuntu, Windows Server
- App Stack: LAMP, mpiBLAST, Hadoop
- Instance:
- Running copy of an AMI
- Launch in less than 2 minutes
- Start/stop programmatically
- Network Security Model:
- Explicit access control
- Security groups
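One way to picture the explicit access control model: a security group starts with nothing allowed, and each authorization opens one port. The class and method names below are hypothetical and are not part of any EC2 API; real groups also carry protocol and source-address rules, which this sketch leaves out.

```python
class SecurityGroup:
    """Toy model of a default-deny security group keyed by port number."""

    def __init__(self, name):
        self.name = name
        self.allowed_ports = set()  # empty: nothing is reachable yet

    def authorize(self, port):
        """Explicitly open one inbound port."""
        self.allowed_ports.add(port)

    def permits(self, port):
        """Traffic passes only if the port was explicitly authorized."""
        return port in self.allowed_ports


group = SecurityGroup("default")
group.authorize(22)  # SSH
group.authorize(80)  # HTTP
print(group.permits(22), group.permits(3306))
```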
How Amazon EC2 Works
- Amazon’s EC2 infrastructure is built using a large number of machines based on x86 hardware running Xen.
- To start an instance, you use an XML web service API to instruct EC2 to download a series of encrypted and compressed 10 MB chunks from Amazon’s S3 service for the particular image you wish to use.
- EC2 then reassembles, decrypts and decompresses the image and boots the operating system.
- The kernel on your Amazon Machine Image is replaced with a Xen 2.6.16 kernel compiled with GCC 4.0, because Amazon does not allow custom kernels, although you can use custom kernel modules.
- You are free to use any of the images (AMIs) that Amazon has created, use ones from third parties, or create your own.
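The chunk-and-reassemble idea can be illustrated with a toy round trip. This is not Amazon's bundling format: zlib stands in for whatever compression EC2 uses, the encryption step is omitted entirely, and the function names are hypothetical.

```python
import zlib

CHUNK_SIZE = 10 * 1024 * 1024  # 10 MB, the chunk size mentioned in the notes


def bundle(image, chunk_size=CHUNK_SIZE):
    """Compress an image and split it into fixed-size chunks."""
    compressed = zlib.compress(image)
    return [compressed[i:i + chunk_size]
            for i in range(0, len(compressed), chunk_size)]


def unbundle(chunks):
    """Reassemble the chunks in order and decompress the result."""
    return zlib.decompress(b"".join(chunks))


# Small stand-in for a disk image; a tiny chunk size forces several chunks.
image = bytes(range(256)) * 100
chunks = bundle(image, chunk_size=64)
assert unbundle(chunks) == image
print(f"{len(chunks)} chunk(s) round-tripped")
```

Every chunk except possibly the last has the full chunk size, and the order of the chunks matters when they are joined back together.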
Getting Started: A Brief Overview
- Install Java 1.5 or above on your computer
- Create your account at http://aws.amazon.com
- Create an X.509 certificate and download the private key and certificate to your computer
- Download the command line tools for working with EC2
- Set up environment variables
- List images:
ec2-describe-images
- Create a Keypair:
ec2-add-keypair soc-keypair
- Save the returned private key to a file on your computer (e.g. soc-keypair)
- Create an instance of your machine:
ec2-run-instances ami-xxxxxxxx -k soc-keypair
- Wait for the instance to boot on Amazon's infrastructure
- Check on the status of the instance:
ec2-describe-instances
- Authorize SSH and HTTP:
ec2-authorize default -p 22
ec2-authorize default -p 80
- Use a web browser to visit your page at the URL reported by ec2-describe-instances
- You can SSH into the machine using the keypair you created earlier
Notes:
Use SSH key pairs to connect to an Amazon server as root without a password
Use the XML web services API (via Java command line tools) to control EC2 instances
Use the AMI tools to create your own images
To View More
Amazon CloudFront
Introduction
- Amazon CloudFront is a web service for content delivery. It integrates with other Amazon Web Services to give developers and businesses an easy way to distribute content to end users with low latency, high data transfer speeds, and no commitments. (http://aws.amazon.com/cloudfront/)
- Functionality:
- Store the original versions of your files in an Amazon S3 bucket.
- Create a distribution to register that bucket with Amazon CloudFront through a simple API call.
- Use your distribution’s domain name in your web pages or application. When end users request an object using this domain name, they are automatically routed to the nearest edge location for high performance delivery of your content.
- Pay only for the data transfer and requests that you actually use.
- Pricing: please refer to http://aws.amazon.com/cloudfront/
- To Learn More: How to Setup Amazon S3 with CloudFront as a Content Delivery Network
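In practice, the third step mostly comes down to swapping the host name in your links. A minimal sketch, assuming a made-up distribution domain (a real one is assigned when the distribution is created) and a hypothetical helper name:

```python
from urllib.parse import quote


def cloudfront_url(distribution_domain, object_key):
    """Build the URL end users would request through a distribution."""
    # The object itself still lives in the S3 bucket behind the distribution.
    return f"http://{distribution_domain}/{quote(object_key)}"


# "d1234example.cloudfront.net" is a placeholder, not a real distribution.
print(cloudfront_url("d1234example.cloudfront.net", "images/logo.png"))
```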
Thanks for Reading
If you find any problems in these lecture notes, please feel free to contact Steven at steven@findaway.hk