Tuesday, June 7, 2016

03_AWS - Database

Amazon Web Services

 

2006: Amazon launched Amazon Web Services (AWS) on a utility computing basis, although the initial release dates back to July 2002.

Amazon Web Services (AWS) is a collection of remote computing services (also called web services) that together make up a cloud computing platform, offered over the Internet by Amazon.com.

The most central and well-known of these services are Amazon EC2 (Elastic Compute Cloud) and Amazon S3 (Simple Storage Service).

 

Book:

Amazon Web Services is based on SOA standards, including HTTP, REST, and SOAP transfer protocols, open source and commercial operating systems, application servers, and browser-based access.

 

Topics:

 

1.       Amazon RDS

2.       AWS Schema Conversion Tool

3.       Amazon DynamoDB

4.       Amazon ElastiCache

5.       Amazon Redshift

 

 

1). Amazon RDS (Relational Database Service)

 

·         Amazon RDS is a web service that makes it easier to set up, operate, and scale a relational database in the cloud.

·         It provides cost-efficient, resizable capacity for an industry-standard relational database and manages common database administration tasks.

 

·         Amazon RDS manages backups/automated backups, software patching, automatic failure detection, and recovery.

·         In order to deliver a managed service experience, Amazon RDS does not provide shell access to DB instances, and it restricts access to certain system procedures and tables that require advanced privileges.

·         You can get high availability with a primary instance and a synchronous secondary instance that you can fail over to when problems occur.

·         You can use the database products you are already familiar with: MySQL, MariaDB, PostgreSQL, Oracle, Microsoft SQL Server, and the new, MySQL-compatible Amazon Aurora DB engine.

·         You can also control who can access your RDS databases by using AWS IAM to define users and permissions.  You can also help protect your databases by putting them in a VPC.
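The options described above (engine choice, a synchronous standby, automated backups, VPC security groups) map onto request parameters when a DB instance is created. The following is a minimal sketch that only assembles such a request as a dictionary and never contacts AWS. The key names follow the boto3 `create_db_instance` call as I understand it; the helper name, instance class, storage size, user name, and security group ID are assumptions made for illustration.

```python
def build_db_instance_params(identifier, engine, password, multi_az=True,
                             backup_retention_days=7, security_group_ids=()):
    """Assemble keyword arguments for an RDS create-instance request.

    Hypothetical helper: it only builds a dict and never calls AWS.
    """
    params = {
        "DBInstanceIdentifier": identifier,
        "Engine": engine,                     # e.g. "mysql" or "postgres"
        "DBInstanceClass": "db.t2.micro",     # assumed instance size
        "AllocatedStorage": 20,               # assumed size in GiB
        "MasterUsername": "admin_user",       # assumed user name
        "MasterUserPassword": password,
        "MultiAZ": multi_az,                  # request a synchronous standby
        "BackupRetentionPeriod": backup_retention_days,  # automated backups
    }
    if security_group_ids:
        # Restrict network access when the instance lives in a VPC.
        params["VpcSecurityGroupIds"] = list(security_group_ids)
    return params


params = build_db_instance_params(
    "demo-db", "mysql", "change-me-please",
    security_group_ids=["sg-0123example"],    # placeholder ID
)
print(params["Engine"], params["MultiAZ"])
```

With boto3 installed and credentials configured, a dictionary like this could presumably be passed as `boto3.client("rds").create_db_instance(**params)`; check the current API reference before relying on any individual parameter.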

 

Components:

·         DB Instances

·         Regions and Availability Zones

·         Security Groups

·         DB Parameter Groups

·         DB Option Groups

 

2). AWS Schema Conversion Tool

 

·         This tool is a desktop application that helps you convert your database schema from an Oracle or Microsoft SQL Server database to one of the following targets:

o    Amazon RDS MySQL DB instance

o    Amazon Aurora DB cluster

o    PostgreSQL DB instance
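To give a feel for what schema conversion involves, here is a toy sketch that translates a few Oracle column types into MySQL equivalents. The mapping table and the function name are simplified assumptions for illustration; this is not the tool's actual rule set, which covers far more types and objects.

```python
import re

# Simplified, assumed mapping from Oracle to MySQL column types.
ORACLE_TO_MYSQL = {
    "VARCHAR2": "VARCHAR",
    "NUMBER": "DECIMAL",
    "DATE": "DATETIME",
    "CLOB": "LONGTEXT",
}

# Target types that take no length or precision argument in this sketch.
NO_ARGUMENT_TYPES = {"DATETIME", "LONGTEXT"}


def convert_column_type(oracle_type):
    """Translate one Oracle column type, keeping any (length, precision) suffix."""
    match = re.fullmatch(r"\s*([A-Za-z0-9_]+)\s*(\(.*\))?\s*", oracle_type)
    if match is None:
        raise ValueError(f"unrecognised type: {oracle_type!r}")
    base = match.group(1).upper()
    args = match.group(2) or ""
    if base not in ORACLE_TO_MYSQL:
        raise ValueError(f"no mapping for type: {base}")
    target = ORACLE_TO_MYSQL[base]
    if target in NO_ARGUMENT_TYPES:
        args = ""
    return target + args


for source_type in ("VARCHAR2(50)", "NUMBER(10,2)", "DATE", "CLOB"):
    print(source_type, "->", convert_column_type(source_type))
```

A real conversion also has to deal with constraints, indexes, stored procedures, and types that have no direct equivalent, which is where a dedicated tool earns its keep.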

 

 

3). Amazon DynamoDB

 

·         Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.

·         You can use DynamoDB to create a database table that can store and retrieve any amount of data, and serve any level of request traffic.

·         DynamoDB automatically spreads the data and traffic for your tables over a sufficient number of servers to handle your throughput and storage requirements, while maintaining consistent and fast performance.

·         All of your data is stored on Solid State Disks (SSDs) and automatically replicated across multiple Availability Zones in an AWS region, providing built-in high availability and data durability.

 

·         DynamoDB lets you offload the administrative burdens of operating and scaling a distributed database, so that you don't have to worry about hardware provisioning, setup and configuration, replication, software patching, or cluster scaling.
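The idea of spreading a table's data and traffic over several servers can be illustrated with a toy hash-partitioned table. This is only a sketch of the general technique under assumed names (`PartitionedTable`, `put_item`, `get_item`); it is not DynamoDB's internal algorithm and it ignores replication across Availability Zones.

```python
import hashlib


class PartitionedTable:
    """Toy key-value table that spreads items over a fixed number of partitions."""

    def __init__(self, partition_count):
        # Each dict stands in for the storage of one server.
        self.partitions = [dict() for _ in range(partition_count)]

    def _partition_for(self, key):
        # Hash the key so that items spread roughly evenly across partitions.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def put_item(self, key, item):
        self.partitions[self._partition_for(key)][key] = item

    def get_item(self, key):
        return self.partitions[self._partition_for(key)].get(key)


table = PartitionedTable(partition_count=4)
for i in range(1000):
    table.put_item(f"user-{i}", {"id": i})

sizes = [len(partition) for partition in table.partitions]
print(sizes)                      # item count held by each partition
print(table.get_item("user-42"))
```

Because the partition is derived from the key alone, a read goes straight to the one partition that holds the item, which is what keeps lookups fast as the table grows.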

 

4). Amazon ElastiCache

 

·         Amazon ElastiCache is a web service that makes it easy to set up, manage, and scale distributed in-memory cache environments in the cloud.

·         It provides a high-performance, resizable, and cost-effective in-memory cache, while removing the complexity associated with deploying and managing a distributed cache environment.

 

·         With ElastiCache, you can quickly deploy your cache environment, without having to provision hardware or install software.

·         For enhanced security, ElastiCache can be run in the Amazon VPC environment, giving you complete control over network access to your clusters.

·         With just a few clicks in the AWS Management Console, you can add or remove resources such as nodes, clusters, or read replicas to your ElastiCache environment to meet your business needs and application requirements.

 

·         You can choose from Memcached or Redis protocol-compliant cache engine software, and let ElastiCache perform software upgrades and patch management for you.

·         Existing applications that use Memcached or Redis can use ElastiCache with almost no modification; your applications simply need to know the host names and port numbers of the ElastiCache nodes that you have deployed. The ElastiCache Auto Discovery feature for Memcached lets your applications identify all of the nodes in a cache cluster and connect to them, rather than having to maintain a list of available host names and port numbers; in this way, your applications are effectively insulated from changes to node membership in a cluster.
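The benefit of discovering nodes instead of hard-coding them can be sketched with a small client that asks a discovery function for the current node list and maps each key to one node. Everything here is hypothetical: the `CacheClient` class, the host names, and the lambda standing in for the discovery call are assumptions, and no real cache server is contacted.

```python
import hashlib


class CacheClient:
    """Toy client that picks a cache node per key from a discovered node list."""

    def __init__(self, discover_nodes):
        # discover_nodes is any callable returning (host, port) tuples.
        self._discover_nodes = discover_nodes
        self.nodes = []
        self.refresh()

    def refresh(self):
        # Re-read cluster membership instead of relying on a static list.
        self.nodes = sorted(self._discover_nodes())

    def node_for(self, key):
        if not self.nodes:
            raise RuntimeError("no cache nodes available")
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]


# Simulated cluster membership with placeholder host names.
cluster = [("node-a.example.internal", 11211), ("node-b.example.internal", 11211)]
client = CacheClient(lambda: list(cluster))
before = client.node_for("session:42")

cluster.append(("node-c.example.internal", 11211))  # a node joins the cluster
client.refresh()                                    # client sees it without a config change
after = client.node_for("session:42")
print(before, after)
```

Simple modulo hashing moves many keys to different nodes whenever membership changes; production clients commonly use consistent hashing to limit that churn.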

 

·         ElastiCache has multiple features to enhance reliability for critical production deployments:

o    Automatic detection and recovery from cache node failures.

o    Automatic failover (Multi-AZ) of a failed primary cluster to a read replica in Redis replication groups.

o    Flexible Availability Zone placement of nodes and clusters.

o    Integration with other Amazon Web Services such as Amazon EC2, CloudWatch, CloudTrail, and Amazon SNS to provide a secure, high-performance, managed in-memory caching solution.

 

 

5).  Amazon Redshift

 

·         Amazon Redshift is a fast, fully managed, petabyte-scale Data Warehouse Service that makes it simple and cost-effective to efficiently analyze all your data using your existing Business Intelligence tools. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.

·         An Amazon Redshift data warehouse is a collection of computing resources called Nodes, which are organized into a group called a Cluster.

·         Each cluster runs an Amazon Redshift engine and contains one or more databases.

 

·         An Amazon Redshift data warehouse is an enterprise-class Relational Database Query and Management System.

·         Amazon Redshift supports client connections with many types of applications, including business intelligence (BI), reporting, data, and analytics tools.

·         When you execute analytic queries, you are retrieving, comparing, and evaluating large amounts of data in multiple-stage operations to produce a final result.

·         Amazon Redshift achieves efficient storage and optimum query performance through a combination of

o    Massively Parallel Processing,

o    Columnar Data Storage, and

o    very efficient, Targeted Data Compression Encoding Schemes
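These three ideas can be sketched on a tiny in-memory table: pivot rows into columns, compress a repetitive column with run-length encoding, and sum a column by combining partial results from separate slices. The sample rows and all function names are made up for illustration, and run-length encoding is only one possible scheme, chosen here for brevity; this does not describe how Redshift itself is implemented.

```python
from itertools import groupby

# Made-up sample rows: (sale_date, country, amount).
rows = [
    ("2016-06-01", "US", 120),
    ("2016-06-01", "US", 80),
    ("2016-06-01", "DE", 50),
    ("2016-06-02", "DE", 70),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column (columnar layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


def parallel_sum(values, slices):
    """Sum a column by combining per-slice partial sums.

    The slices are processed one after another here; in an MPP system
    each slice would be handled by a different worker at the same time.
    """
    chunks = [values[i::slices] for i in range(slices)]
    partials = [sum(chunk) for chunk in chunks]
    return sum(partials)


columns = to_columns(rows, ("sale_date", "country", "amount"))
encoded_dates = rle_encode(columns["sale_date"])
total = parallel_sum(columns["amount"], slices=2)
print(encoded_dates)   # [('2016-06-01', 3), ('2016-06-02', 1)]
print(total)           # 320
```

An aggregate such as the total amount only needs the `amount` column, so a columnar layout lets the query skip the other columns entirely, and sorted or repetitive columns compress well.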

 

Figure: Amazon Redshift Data Warehouse System Architecture

 

 

Regards,

Arun Manglick

 

 

 

 

 

 

 

 

 

 

 

 

 

 
