Summary for the AWS Solutions Architect Associate

After almost a year of piling up drafts on the blog, I am finally publishing something. Here is the summary (with notes in English and Spanish) that I put together for the AWS Solutions Architect Associate certification (which I obtained via the A Cloud Guru course). I took the exam at the end of October 2017, so as of today it should be useful to anyone planning to sit the exam soon.

The summary consists of the following sections:
1. IAM
2. S3
3. EC2
4. EFS
5. Lambda
6. Route53
7. Databases
8. VPC
9. Application Services
10. Whitepaper: Security
11. Exam feedback
12. Random notes

At the end of the post there is a link to the summary in PDF.

1. IAM

Users, Groups (users inherit the group's policies), Roles (for AWS resources) and Policies (JSON)
Global
Root Account (never use it) + MFA
By default new users have no permissions
- Programmatic access
- AWS Management console access
IAM password policy
Roles:
- AWS Service Role – The usual one, and the one we are interested in
- AWS service-linked role: for Alexa
- Role for cross-account access: allows IAM users to access other AWS accounts
- Role for identity provider access: grants access from Cognito, OpenID (Facebook, Google, Amazon), SAML, etc
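
Since policies are plain JSON documents, a small sketch may help picture one. The bucket name below is hypothetical and the policy is only an illustration of the usual Version/Statement layout, not something taken from the course:

```python
import json

# Hypothetical bucket name, used only for illustration.
BUCKET = "example-bucket"

# A policy document granting read-only access to a single bucket.
# New users start with no permissions, so anything not allowed here stays denied.
read_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",
                f"arn:aws:s3:::{BUCKET}/*",
            ],
        }
    ],
}

policy_document = json.dumps(read_only_policy, indent=2)
print(policy_document)
```

The same document could be attached to a user, a group or a role.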

2. S3

Key-value object storage. Files from 0 bytes to 5 TB. Unlimited storage
S3 buckets: universal namespace. Default: max 100 buckets/account
Read after write for PUTs of new objects
Eventual consistency for overwrite PUTS and DELETES

An S3 object consists of:

Key: the file name
Value: the file content (a sequence of bytes)
Version ID
Metadata
Subresources
Access Control List

S3 tiers

S3 Standard: minimum object size 0 bytes
S3 IA (Infrequent Access): data you access about once a month (or every 6 months), but that you need quickly when you do
- Minimum object size 128 KB. It is the cheapest S3 option
RRS (Reduced Redundancy Storage): for files you can afford to lose, e.g. thumbnails
Glacier: for archiving. Retrieving a file takes 3 to 5 hours
- You restore via the S3 API or the AWS console.

	S3 Standard	S3 IA	S3 RRS
Durability	99.999999999%	99.999999999%	99.99%
Availability	99.99%	99.9%	99.99%
Concurrent facility fault tolerance	2	2	1

Bucket URL formats

http://s3-[region].amazonaws.com/[bucket]
http://[bucket].s3-[region].amazonaws.com
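
As a quick illustration of the two formats above, here is a minimal sketch that fills in the placeholders. The function names are made up for this example; the URL patterns are the ones listed in the notes:

```python
def path_style_url(bucket: str, region: str) -> str:
    """Path-style: the bucket is the first path segment."""
    return f"http://s3-{region}.amazonaws.com/{bucket}"


def virtual_hosted_url(bucket: str, region: str) -> str:
    """Virtual-hosted style: the bucket is part of the host name."""
    return f"http://{bucket}.s3-{region}.amazonaws.com"


print(path_style_url("my-bucket", "eu-west-1"))
print(virtual_hosted_url("my-bucket", "eu-west-1"))
```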

Versioning

Once enabled it cannot be disabled, only suspended
Each update is a file in its own right, with its own ID
Deleting a file means marking it as deleted (delete marker) > it disappears from the bucket, not from the history > only the bucket owner can truly delete versions
You can enable MFA for deletes

Cross region replication (CRR)

Requires “versioning” to be enabled.
- Supports subsets via prefixes. Also replicates metadata and ACLs
When something new is uploaded (or updated) in the bucket, it is replicated to another bucket (in another region). The destination bucket also requires versioning, but may use a different storage class (IA, RRS…). Requires IAM roles.

Lifecycle & Glacier

Without versioning
- 30 days from S3 to IA (only for objects larger than 128KB)
- 30 days from IA to Glacier
With versioning
- I have 2 lifecycle (LC) rules, one for the current object and another for the previous versions
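
To make the transitions above more concrete, here is a rough sketch of a lifecycle configuration expressed as a Python dict, in the shape the S3 API expects for a bucket lifecycle configuration. The rule ID and the helper name are invented for the example, and it assumes transition days are counted from object creation (so "30 days in IA" becomes day 60):

```python
def build_lifecycle_rules(days_to_ia: int = 30, days_in_ia: int = 30) -> dict:
    """Build a lifecycle configuration: Standard -> IA -> Glacier."""
    return {
        "Rules": [
            {
                "ID": "archive-old-objects",  # hypothetical rule name
                "Filter": {"Prefix": ""},      # empty prefix = whole bucket
                "Status": "Enabled",
                # Rules for the current version of each object.
                "Transitions": [
                    {"Days": days_to_ia, "StorageClass": "STANDARD_IA"},
                    {"Days": days_to_ia + days_in_ia, "StorageClass": "GLACIER"},
                ],
                # With versioning, previous versions get their own rules.
                "NoncurrentVersionTransitions": [
                    {"NoncurrentDays": days_to_ia, "StorageClass": "STANDARD_IA"},
                ],
            }
        ]
    }


config = build_lifecycle_rules()
print([t["Days"] for t in config["Rules"][0]["Transitions"]])
```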

S3 Security & Encryption

By default buckets are private
Access control via bucket policies (apply to all objects) or ACLs
You can enable logging > it is stored in another bucket
Encryption
- In transit (SSL/HTTPS): SSL/TLS endpoints using the HTTPS protocol
- At rest
  - Server Side Encryption (SSE)
    - S3 Managed Keys (SSE-S3). Amazon takes care of everything.
    - AWS Key Management Service (SSE-KMS). Provides an audit trail
    - Customer Provided Keys (SSE-C): you control the keys

S3 Transfer Acceleration

Uses CloudFront edge locations to upload data through the one closest to you
- Additional cost. You must use the provided URL for those transfers

S3 Static Website Hosting

If you use Route53 with S3, the bucket name must match the domain name (e.g. example.com)
http://[bucketname].s3-website-[region].amazonaws.com
You can specify index/error pages and redirect rules

CloudFront

Edge location: cache, TTL (default 24h); you can enable writes/updates at edge locations, which then update the origin
- You can choose the «Allowed HTTP methods» (GET, HEAD, PUT, DELETE…)
Origin (multiple origins are allowed for the same distribution)
- S3 bucket: you can restrict the bucket so that it can only be accessed through the CDN -> Origin Access Identity
- EC2 instance
- ELB
- Route53
- Outside AWS
Distribution
- Web distribution: for websites
- RTMP: media streaming

S3 multipart upload API

Aborted or failed uploads can be cleaned up via lifecycle policies. Can be used with Transfer Acceleration
Recommended for files > 100MB

Storage Gateway

VM that you install in your datacenter and that replicates to S3.
3 types
- Gateway Stored Volumes: your data stays local, SGW replicates to S3 (backup)
- Gateway Cached Volumes: your data lives in S3, SGW acts as a local cache
- Gateway Virtual Tape Library (VTL): replaces tape backups > uses S3

Import/Export

Now replaced by Snowball. It allows you to:
- Export from S3
- Import to S3, Glacier and EBS

Snowball

Import/export to/from S3.
Snowball: Petabyte scale data transport solution
Snowball Edge: + compute capabilities, e.g. gathering data during a flight
Snowmobile: the truck. Exabyte scale

3. EC2

Pricing

On demand
Reserved: 1 or 3 years. Predictable usage or Reserved Capacity
Spot: flexible start/end, only feasible at low prices, urgent compute needs
- If AWS terminates it, you do not pay for that partial hour
Dedicated hosts: hourly or Reserved. For licensing or regulatory requirements

Types

Dr Mc Gift Px (mnemonic for the instance families D, R, M, C, G, I, F, T, P, X)

EBS (Elastic Block Storage)

General Purpose SSD (gp2). 3 IOPS/GiB, max 10K IOPS
Provisioned IOPS SSD (io1). For when you need more than 10K IOPS (up to 20K)
Throughput Optimized HDD (st1). Frequent access. Large amounts of sequential data, such as data warehousing or log processing. Cannot be a boot volume
Cold HDD (sc1). Less frequent access. Typical use: file server. Cannot be a boot volume
Magnetic (Standard). Infrequent access, lowest cost
By default, the root volume is deleted when the instance is terminated
Volumes must be in the same AZ as the instance that wants to use them
EBS keeps redundant copies within the same AZ
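
The gp2 rule of 3 IOPS per GiB capped at 10K can be written as a one-line formula. This is only a sketch based on the limits quoted in these notes; the floor of 100 IOPS for very small volumes is an assumption that does not appear above:

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume using the limits quoted in these notes."""
    per_gib = 3 * size_gib           # 3 IOPS per GiB
    floor = 100                      # assumed minimum for small volumes
    ceiling = 10_000                 # cap quoted in the notes
    return min(max(floor, per_gib), ceiling)


for size in (10, 100, 1000, 4000):
    print(size, "GiB ->", gp2_baseline_iops(size), "IOPS")
```

Past the cap, the notes point to io1 instead of a bigger gp2 volume.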

EBS: upgrading volumes (cambiar tamaño o tipo)

BEST PRACTICE: stop the instance, detach, take a snapshot, create a new volume, attach.
EBS volumes can be modified on the fly (except Magnetic Standard)
- Only one change every 6 hours
Size can only be increased (even when restoring from a snapshot)

RAID & EBS

Increase IOPS = RAID 0 (striped) or 10
Application Consistent Snapshots:
- You need to 1) stop disk writes from the application 2) flush the cache
- 3 ways to do this:
  - Freeze the filesystem
  - Unmount the RAID array
  - (BEST OPTION) Stop the instance, take the snapshot, start the instance

EBS Snapshots

I can: create a volume, create an AMI, copy it to another region and/or create an “encrypted” copy
I cannot delete a snapshot used by an AMI (created from it)
Snapshots are stored in S3 and are incremental (allow point-in-time recovery)

Encrypt Root device volume and create AMI

I cannot create an encrypted snapshot of an unencrypted volume
Snapshots taken of encrypted volumes are encrypted automatically
Volumes restored from encrypted snapshots are encrypted automatically
You can only share UNENCRYPTED AMIs (with other AWS accounts or publicly)
AMIs are “per region” but I can copy them

EBS root vs instance (ephemeral) storage

If the root device is EBS, the instance is launched from an AMI created from an EBS snapshot
If it is instance store, it is launched from an AMI created from a template in S3 (slow)
Instances with instance store cannot be stopped (if the host fails, the data is lost)
You can choose not to delete EBS root volumes on termination, but NOT instance store volumes.
You cannot detach the root EBS volume without stopping the instance, of course

Security Groups

By default: inbound denied, outbound allowed
Changes are applied immediately
They are stateful: they create (non-visible) rules for the related traffic

ALB/ELB and health checks -> self-sanitization of instances

They have their own security group
The LB is associated with a VPC. It can (should) work across several AZs
They have no IP, only a DNS record
Cross-Zone enabled = balances across instances regardless of their AZ
ELB (layer 4)
- Instances created from the Amazon DevPay site are not supported
- SSL termination: you have to install the certificate on the ELB
- You can log the activity with CloudTrail
ALB (layer 7), cheaper
- Internet facing or internal
- Routing > target groups = path based routing! (i.e. /a > target1, /b > target2)
- The health check can optionally check the HTTP success code
- Stop doing SSL termination on the instances

CloudWatch for EC2

Default metrics on EC2 instances: CPU, disk, Network, Instance status
Standard monitoring (5min) vs detailed (1min)
Dashboards, alarms, events (respond to changes in AWS resources) and logs (require an agent installed on the instance; let you aggregate and store logs)
CloudWatch (monitoring and logging) vs CloudTrail (for auditing)
Alarm states: OK, ALARM, INSUFFICIENT_DATA

Userdata & Metadata

Bootstrap scripts: user data section (max 16KB)
Instance Metadata: http://169.254.169.254/latest/meta-data/
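
A minimal sketch of reading instance metadata from inside an instance, using only the standard library. The helper names are made up for this example, and the plain GET shown here only works from within EC2; instances configured to require session tokens would reject it:

```python
from urllib.request import urlopen

METADATA_BASE = "http://169.254.169.254/latest/meta-data/"


def metadata_url(path: str = "") -> str:
    """Build the URL for a metadata key such as 'instance-id' or 'public-ipv4'."""
    return METADATA_BASE + path.lstrip("/")


def fetch_metadata(path: str = "", timeout: float = 2.0) -> str:
    """Fetch one metadata value. Only reachable from inside an EC2 instance."""
    with urlopen(metadata_url(path), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(metadata_url("instance-id"))
```

A bootstrap script passed as user data could call `fetch_metadata("instance-id")` to find out which instance it is running on.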

Launch configuration & ASG

Launch configuration: template describing how instances are created
ASG: size, VPC and subnets where instances are created, ELB, health check (ELB or EC2)
- + Scaling Policies: min/max & increase/decrease when…
- termination: selects the AZ with the most instances > deletes the one using the oldest launch configuration
- cooldown: seconds that must pass before another scaling event can happen
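
The termination rule described above (busiest AZ first, then oldest launch configuration) can be simulated in a few lines. This is a toy model with invented names, not the real ASG implementation, and it ignores further tie-breakers:

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: str
    az: str
    launch_config_age: int  # higher = older launch configuration


def pick_instance_to_terminate(instances: list) -> Instance:
    """Pick the AZ with the most instances, then the oldest launch config in it."""
    per_az = Counter(i.az for i in instances)
    busiest_az = max(per_az, key=per_az.get)
    candidates = [i for i in instances if i.az == busiest_az]
    return max(candidates, key=lambda i: i.launch_config_age)


fleet = [
    Instance("i-a", "eu-west-1a", launch_config_age=1),
    Instance("i-b", "eu-west-1a", launch_config_age=3),
    Instance("i-c", "eu-west-1b", launch_config_age=5),
]
print(pick_instance_to_terminate(fleet).instance_id)
```

Note that `i-c` has the oldest launch configuration overall but is spared, because its AZ is not the most populated one.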

EC2 termination protection is disabled by default

EC2 Placement groups

Logical group of instances that need low latency and/or high network throughput
- 10 Gbps. Same AZ
The PG name must be unique within your AWS account
Only for certain instance types (CPU, RAM, storage and GPU)
You cannot merge PGs, nor move an instance from one PG to another.

4. EFS

Supports NFSv4 and thousands of concurrent connections
Petabytes. Data stored in multiple AZs in a region
Read after write consistency
Has its own SG for each mount target = subnet = AZ
Can store database data (just like EBS)

5. Lambda

Lambda

You can use it:
- Event-driven compute service: in response to events
- In response to HTTP requests via API Gateway
Languages: Java, NodeJS, Python, C#
Triggers:
- API Gateway
- IoT
- Alexa
- CloudFront
- CloudWatch
- CodeCommit
- Cognito
- DynamoDB
- Kinesis
- S3
- SNS
Maximum duration: 5 min
Executions are independent
Scales horizontally (scale out) automatically
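
As an illustration of the event-driven model, here is a small sketch of a Python handler reacting to an S3 trigger. The handler name is arbitrary, and the event below is a trimmed-down, hand-written sample of an S3 notification, not a full real payload:

```python
def handler(event, context):
    """Collect the bucket/key of every object mentioned in an S3 event."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        uploaded.append(f"{bucket}/{key}")
    return {"processed": uploaded}


# Hand-written sample event, for local experimentation only.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "photos"}, "object": {"key": "cat.jpg"}}}
    ]
}
print(handler(sample_event, context=None))
```

Each invocation gets its own event, which matches the note that executions are independent of each other.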

API Gateway

Publish, maintain, monitor and secure APIs to EC2 or Lambda
You can enable API caching to cache (for a TTL) the API response
You can throttle API GW to prevent attacks
You can log results to CloudWatch
CORS (Cross-Origin Resource Sharing) > allows serving content from a domain other than the original one

6. Route 53

ELBs do not have a fixed IPv4 address; you resolve them via their DNS name
Understand Alias (you can resolve individual AWS resources) vs CNAME
Routing policies:
- Simple (default): round robin
- Weighted: A/B
- Latency: routes to the region with the lowest network latency (ms)
- Failover: active/passive setup -> health checks
- Geolocation: latency & serving a geo-customized website
Default limit of 50 domain names (can be increased by contacting support)
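
To get a feel for the weighted policy (the A/B case), here is a toy simulation of weighted record selection. It only mimics the idea of proportional traffic splitting with invented record names; it is not how Route 53 is implemented:

```python
import random
from collections import Counter


def pick_weighted(records: dict, rng=random) -> str:
    """Pick one record name with probability proportional to its weight."""
    names = list(records)
    weights = [records[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical A/B split: roughly 90% of answers for "blue", 10% for "green".
records = {"blue": 90, "green": 10}
rng = random.Random(42)
tally = Counter(pick_weighted(records, rng) for _ in range(1000))
print(tally)
```

Setting a weight to 0 takes that record out of rotation, which is handy for rolling back the B side of a test.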

7. Databases

RDS for OLTP

You have to select instance type, EBS size (max 6TB/16TB for SSD), VPC, etc.
- SQL Server Express max 300GB disk size
Backups: automated (enabled by default, retention 1-35 days) vs database snapshots > they impact performance! -> Backup window (changes to it are applied immediately)
- Automated backups are deleted when the instance is terminated (only the latest snapshot could be kept)
Encryption only at creation time!!! Not even from snapshots (I think)
Multi-AZ: only for disaster recovery. Does not improve performance. AWS handles failover -> sync replication
Read Replica: read performance. Requires automated backups on. Max 5, same AZ. Async
- Available for MySQL and PostgreSQL engines
Lets you apply table partitioning to use several RDS instances
Aurora: 5x faster than MySQL
- Maintains 2 copies of the physical data in each of 3 AZs (min 6 copies)
  - Can lose 2 copies without affecting writes, 3 without affecting reads
- 2 types of replica: Aurora (max 15, fault tolerance) & MySQL Read Replicas

DynamoDB – NoSQL

Really scalable (no downtime!), fast (SSD) and flexible
Spread across 3 data centers
Eventually Consistent Reads (if you can wait 1 second)
vs Strongly Consistent Reads (if you can’t) -> increases cost
Very cheap for reads
Provisioned capacity = I/O per table
There is an option for Cross Region Replication
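
The cost difference between the two read modes can be sketched with the usual read capacity arithmetic. The rule used here (one unit = one strongly consistent read per second of up to 4 KB, or two eventually consistent ones) is not stated in the notes above and is brought in as an assumption:

```python
import math


def read_capacity_units(item_size_kb: float, reads_per_second: int,
                        strongly_consistent: bool) -> int:
    """Estimate the provisioned read capacity a table needs."""
    units_per_read = math.ceil(item_size_kb / 4)   # items are billed in 4 KB steps
    total = units_per_read * reads_per_second
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


print(read_capacity_units(4, 10, strongly_consistent=True))
print(read_capacity_units(4, 10, strongly_consistent=False))
```

Under that assumption, strongly consistent reads need twice the provisioned capacity for the same workload.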

Redshift for OLAP (& BI)

Single node (160 GB)
vs multi-node, which consists of
- Leader Node
- up to 128 Compute Nodes
Fast because
- Columnar data storage (block size = 1MB)
- Advanced compression (by columns)
- Massive Parallel Processing (MPP) across all nodes

ElastiCache

Memcached and Redis

Extra notes

SSD performs better than magnetic for DBs on EC2 instances
RDS troubleshooting > look for “error” nodes in the XML RDS API response

8. VPC

VPC

Private datacenter
Max 1 IGW per VPC. Once created, it is detached (you have to attach it to the VPC)
- The route table has to have a route through the IGW
VPC peering, even with other AWS accounts (NO TRANSITIVE PEERING)
- IP ranges cannot overlap!!
Custom VPC creates
- Default ACL > all allowed by default
- Default SG
- Main Route Table > allow local (private) connections > so by default, all subnets within the VPC can communicate to each other
By default, max 5 VPCs per region
Instances in the default VPC will have a public and a private IP
VPC endpoints to access AWS resources
VPC Flow Logs: capture traffic within the VPC and send it to CloudWatch
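
The peering restriction on overlapping IP ranges is easy to check up front with the standard library. A minimal sketch, where `can_peer` is a helper name invented for this example:

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Two VPCs can only be peered if their CIDR blocks do not overlap."""
    a = ipaddress.ip_network(cidr_a)
    b = ipaddress.ip_network(cidr_b)
    return not a.overlaps(b)


print(can_peer("10.0.0.0/16", "10.1.0.0/16"))   # disjoint ranges
print(can_peer("10.0.0.0/16", "10.0.1.0/24"))   # second range is inside the first
```

Planning non-overlapping CIDR blocks before creating the VPCs avoids having to rebuild one of them later.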

Subnet

1 subnet = 1AZ
Can only be attached to 1 ACL and associated with 1 route table
Public means the route table it is associated with has a route to an IGW, and its instances have a public IP

NAT

To allow instances within a private subnet to reach the internet (e.g. for yum)
Must be placed in a public subnet (so one with an IGW route)
Needs an entry in the route table associated with the private subnet

A NAT instance is just a regular EC2 instance with a specific AMI
- Needs a public IP
- Needs “source/destination check” disabled
- HA via ASG, multiple subnets and a script to automate failover
- Throughput depends on instance type
NAT GW
- Scales automatically up to 10Gbps, within a single AZ

ACLs

Security groups	ACLs
Instance level (1st)	Subnet level (2nd)
Allow rules	Allow/Deny
Stateful	Stateless
All rules evaluated before deciding	FW: Rules in asc order > first match
Only applies to the instance if attached	Applies to all instances in the subnet

Ephemeral ports for outbound connections (1024-65535)
Your VPC automatically has a default ACL, in which all inbound/outbound traffic is allowed by default
But when you create a custom network ACL, all inbound/outbound traffic is denied
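
The difference between "first match in ascending order" (ACLs) and "all rules evaluated, allow only" (security groups) is easier to see in code. This is a simplified model that only looks at ports, with invented class and function names:

```python
from dataclasses import dataclass


@dataclass
class AclRule:
    number: int      # rules are evaluated in ascending rule number
    port_from: int
    port_to: int
    allow: bool


def nacl_allows(rules: list, port: int) -> bool:
    """Network ACL: the first matching rule decides; no match means deny."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port_from <= port <= rule.port_to:
            return rule.allow
    return False


def security_group_allows(open_ranges: list, port: int) -> bool:
    """Security group: allow-only rules, any match is enough."""
    return any(low <= port <= high for low, high in open_ranges)


rules = [
    AclRule(100, 22, 22, allow=False),    # deny SSH first
    AclRule(200, 0, 65535, allow=True),   # then allow everything else
]
print(nacl_allows(rules, 22), nacl_allows(rules, 443))
print(security_group_allows([(443, 443)], 22))
```

Because ACLs are stateless, a real setup would also need matching rules for the return traffic on the ephemeral port range, which this sketch leaves out.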

9. Application Services

SQS: pull. queue. message oriented API

Simple Queue Service: pull-based message queue system
To decouple your components < EXAM!!
Message size 256KB, any format (text, json, xml)
Messages stay in the queue from 1 min to 14 days. Default 4 days
Visibility timeout: time a consumer has to process the message (max 12h)
- If it times out, the message goes back to the queue > it can be duplicated!
Long polling: instead of asking every X seconds whether there are messages, you ask once and get an answer when messages arrive or when the long poll times out (ReceiveMessageWaitTimeSeconds>0)
2 types: default (duplicates possible, no ordering guarantee) and FIFO
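
The visibility timeout is the part that tends to cause duplicates, so here is a toy in-memory queue that imitates it. It is a teaching model with an explicit clock passed in as `now`, not a client for the real service:

```python
class TinyQueue:
    """In-memory model of a queue with a visibility timeout."""

    def __init__(self, visibility_timeout: float):
        self.visibility_timeout = visibility_timeout
        self._messages = {}   # message id -> [body, invisible_until]
        self._next_id = 0

    def send(self, body: str) -> None:
        self._messages[self._next_id] = [body, 0.0]
        self._next_id += 1

    def receive(self, now: float):
        """Return (id, body) of the first visible message and hide it for a while."""
        for msg_id, entry in self._messages.items():
            if now >= entry[1]:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id: int) -> None:
        """Consumers must delete a message once it has been processed."""
        self._messages.pop(msg_id, None)


queue = TinyQueue(visibility_timeout=30)
queue.send("resize image")
print(queue.receive(now=0))    # delivered to a consumer
print(queue.receive(now=10))   # hidden while the consumer works
print(queue.receive(now=31))   # consumer never deleted it: delivered again
```

The third call shows why consumers should be idempotent: a slow consumer and a second one may both end up processing the same message.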

SWF: task oriented API

Simple Workflow Service. Can include human interaction
Workflows max 1 year
A task is assigned only once, never duplicated, and in order
SWF tracks all events. With SQS you have to implement your app-level tracking
Parameters in JSON
“Domains” are a collection of related workflows.
- Includes “workflow starters”, “deciders” and “activity workers”.

SNS: push. message oriented API

Simple Notification Service: publish-subscribe service
Mobile push notifications, Email/Email-JSON, SMS, SQS or Lambda
SNS topics: access points that let clients subscribe to notifications (also HTTP(S))
Data format in JSON

Elastic Transcoder: media converter

Kinesis

Streams: consist of shards. Data retained max 7 days (default 1)
- Producer > Shards within the stream > Consumers
Firehose: no shards, streams or consumers. Data sent to S3. Optional Lambda analysis
- Producer > Firehose (optional Lambda) > S3
Analytics: adds SQL analytics on top of Streams/Firehose

10. Whitepaper: Security

Shared security model

AWS is responsible for the security config of its managed service products (DynamoDB, RDS, Redshift, EMR, WorkSpaces, etc.) and the underlying infra
YOU: IaaS (EC2, VPC, S3) is under your control
YOU are responsible for account & user access.
- Recommended: MFA, SSL/TLS for communications and CloudTrail for user activity logging

Storage Decommissioning

AWS uses DoD 5220.22-M (National Industrial Security Program Operating Manual) or NIST 800-88 (Guidelines for Media Sanitization) to destroy data.
Magnetic storage devices are physically destroyed

Network Security

You can connect to AWS via HTTP or HTTPS using SSL
VPC allows you to use IPSec VPNs to tunnel between AWS and your datacenter
AWS network is segregated from the Amazon Corporate (.com) network

Network Monitoring & Protection

By default, AWS provides protection for
- DDoS
- Man in the middle
- IP Spoofing: the AWS host-based firewall will not allow instances to send traffic with a source IP or MAC other than its own.
- Port Scanning
- Packet sniffing by other tenants
Unauthorized port/vulnerability scans by EC2 users are a violation of the AWS Acceptable Use Policy. You may request permission in advance!

AWS Credentials

Passwords
MFA
Access Key
Key Pairs: SSH login to EC2. Cloudfront signed URLs
X.509 Certificates: SSL certificates for HTTPS/ SOAP-based requests to AWS API

Trusted Advisor

Inspects your AWS environment and makes recommendations to
- Save money
- Improve performance
- Close security gaps
- Fault Tolerance
Provides alerts of common security misconfigurations

Instance Isolation

Instances running on the same box, are isolated from each other via the Xen hypervisor.
- AWS firewall in the hypervisor layer between physical and EC2 NICs
Physical RAM is separated using similar mechanisms
- Memory allocated to guest is scrubbed (set to zero) when unallocated.
Instances have no raw access to disk, but a virtual disk.
- AWS automatically resets (disk zeroing) every customer’s block of storage

Other considerations

Guest OS:
- virtual instances are completely controlled by you. No backdoors for AWS!
- good security practice: EBS volumes and snapshots encrypted with AES-256
ELB: supports SSL termination on the LB > instances can identify the source IP address
Direct Connect: dedicated connection from your datacenter to your AWS VPC, using 802.1q VLAN standard, allowing you to connect to AWS public resources (S3) and private ones (EC2 in a private subnet)

11. Exam feedback

Virtualization types

Paravirtual (PV)
Hardware Virtual Machine (HVM)
- Better performance
- Can take advantage of hardware extensions and run on top of the hardware

Directory Service’s AD Connector: lets you connect your existing AD to AWS
Simple AD: inexpensive AD compatible with the common AD features
You can authenticate with AD to AWS using SAML
- Authenticate to AD first, then to STS

AWS Organizations & Consolidated Billing

Account Management service to manage multiple AWS accounts from a central location
Consolidated billing: 1 billing-only account. Up to 20 linked accounts. Global discounts

Resource Groups & Tagging

Groups resources that share one or more tags

Security Token Services (STS)

Federation (typically AD), meaning joining groups
- Uses SAML
- Allows users to log in to AWS without IAM credentials (using AD ones instead)
Federation with Mobile Apps
- Uses Fb, Google, OpenID to log in
Cross Account Access

Workspaces

VDIs. Are persistent.
Runs Win7. By default users are local admins (allow to install applications)
All data on D: is backed up every 12h

ECS

ECR: EC2 Container Registry
ECS task definitions are JSON files describing one or more containers that make up your application (include CPU, RAM, links, etc)
ECS service is like ASG using Task Definitions
Clusters (region specific) are logical groups of container instances to place tasks in
Service Scheduler: ensures a specific number of tasks is constantly running (ELB reg)
Custom Scheduler: third party
ECS Agent (docker agent)
EC2 uses IAM roles to access ECS (Security groups still at host (EC2) level)
ECS tasks use IAM roles to access services and resources
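
For reference, a task definition with two linked containers could look roughly like the sketch below, written as a Python dict and dumped to JSON. The family, container names and images are all hypothetical, and only a handful of the available fields are shown:

```python
import json

# Hypothetical two-container application: a web front end linked to an API.
task_definition = {
    "family": "example-web",
    "containerDefinitions": [
        {
            "name": "web",
            "image": "nginx:latest",
            "cpu": 128,          # CPU units reserved for the container
            "memory": 256,       # hard memory limit in MiB
            "essential": True,
            "links": ["api"],    # lets "web" reach the "api" container
            "portMappings": [{"containerPort": 80, "hostPort": 80}],
        },
        {
            "name": "api",
            "image": "example/api:1.0",
            "cpu": 128,
            "memory": 256,
            "essential": True,
        },
    ],
}

print(json.dumps(task_definition, indent=2))
```

An ECS service would then keep a given number of copies of this task running, much like an ASG does with instances.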

More info here

12. Random notes

44 AZ, 17 regions
- AZ names are assigned randomly per account!!!
For new AWS accounts > max 20 EC2 instances per region
4 support levels: basic, developer, business, enterprise
By default, max 5 EIPs per region > EIPs stay attached to the instance until you explicitly detach them (they are not detached when the instance is stopped)
CloudTrail records the history of calls to the AWS API
AWS Config stores the history of configuration changes to AWS resources > and can send change notifications via SNS
1 GiB <= EBS size <= 16 TiB
RPO (Recovery Point Objective): data I am willing to lose (e.g. 1h)
RTO (Recovery Time Objective): time to restore service (e.g. 20j)

To wrap up, here is this same summary in PDF format. It includes the same spelling mistakes and language switches, but it looks better when printed:

Summary in PDF

Oh, and as an extra, in case you made it this far, here are also some links with exam questions and free practice tests: