I had the opportunity to attend the seventh installment of AWS re:Invent. It was a large gathering, with more than 50,000 in attendance. Despite the size of the crowds, the conference was very well run. I was not able to reserve a seat in advance for every workshop and session I was interested in, but I was able to attend most of them by queuing up in the walk-up line. Here are some of the takeaways and highlights.
Machine Learning was front and center
AWS provides ML capabilities at three levels of abstraction.
Fully managed services such as Amazon Rekognition, Polly, Amazon Comprehend (NLP), Alexa, etc. AWS also introduced Textract, a smart OCR service.
Managed execution of pre-built or custom ML models. SageMaker fills this role. 150+ machine learning algorithms are being made available in the AWS Marketplace.
Infrastructure for running ML tools such as MXNet, PyTorch, TensorFlow, etc.
Several new sub-services for SageMaker were announced. For models that require a manual effort to train, AWS introduced SageMaker Ground Truth for classification of data via Mechanical Turk, or other sources. AWS also introduced SageMaker RL which performs training of models through rewards over time. To promote this service, AWS introduced DeepRacer, a fully autonomous 1/18th scale race car driven by reinforcement learning to help developers gain hands-on working knowledge of SageMaker RL. Amazon SageMaker Neo enables machine learning models to train once and run anywhere in the cloud and at the edge with optimal performance.
AWS also announced AWS Inferentia, a new inference chip (yes, a custom-built computer chip) that promises to significantly reduce the time it takes to draw inference from an ML model.
ML Insights works with Amazon QuickSight, a BI Service for interactive dashboards. ML Insights adds ML-powered anomaly detection, ML-powered forecasting, and Auto-narratives (add text descriptions automatically) to QuickSight dashboards.
Amazon Personalize is a managed service for building and consuming recommendation models. It does the heavy lifting needed to design, train, and deploy a machine learning model under the covers.
AWS RoboMaker provides a robotics development environment for application development (built on the open-source robotics software framework Robot Operating System, or ROS), a robotics simulation service to accelerate application testing, and a robotics fleet management service for remote application deployment, update, and management.
AWS IoT services matured and can now connect to more things
AWS IoT Core can now ingest data directly, bypassing the MQTT broker, by having the thing publish data to the $aws/rules/ruleName topic. This eliminates the additional time and cost of publishing data to an IoT topic before it reaches the rules engine for the desired processing.
AWS IoT SiteWise opens up AWS IoT to data from industrial devices. It runs on a gateway that resides in the customer's facilities and automates the process of collecting and organizing industrial equipment data.
AWS IoT Events - Managed service to analyze patterns in IoT data and respond accordingly.
AWS IoT Things Graph can be used to connect devices and web services to build IoT applications. With this service, one can define interactions between them to build multi-step automation applications.
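As a small illustration of the direct-ingest topic mentioned above, the sketch below builds the reserved topic name a device would publish to. The rule name is hypothetical, and the helper is only a convenience for this example, not part of any AWS SDK.

```python
# Sketch: building the reserved topic that routes a message straight to a rule.
# The rule name "button_pressed" is a hypothetical example.

BASIC_INGEST_PREFIX = "$aws/rules"


def basic_ingest_topic(rule_name: str) -> str:
    """Return the $aws/rules/<ruleName> topic for the given rule."""
    if not rule_name or "/" in rule_name:
        raise ValueError("rule_name must be a non-empty name without '/'")
    return f"{BASIC_INGEST_PREFIX}/{rule_name}"


# A device would publish its JSON payload to this topic instead of a regular one.
print(basic_ingest_topic("button_pressed"))
```

The device-side publish call itself depends on the MQTT client library in use, so it is left out here.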
Serverless computing saw some important improvements
AWS Lambda now natively supports Ruby.
Lambda now supports custom runtimes (any runtime that can run on Unix) via the Lambda Runtime API. C++ and Rust are now supported on Lambda using this new feature. Some other languages that third parties have enabled on AWS Lambda are Erlang, Elixir, COBOL, and PHP. This feature will certainly encourage migration of legacy code to Lambda.
A new feature of Lambda called Layers allows Lambda functions to share code and data. For example, if several Lambda functions use a common library, that library does not need to be deployed (duplicated) for each of these Lambda functions. Instead, the library can be published once as a layer and referenced by each function that needs it.
Step Functions (orchestration of Lambda functions) can now invoke many AWS managed services (such as DynamoDB, AWS Batch, Amazon SQS, and Amazon SageMaker) directly in a defined flow.
A new service called AWS Serverless Application Repository allows cataloging/discovery/assembly of a serverless application from existing Lambda functions.
Lambda functions can now be placed behind an Application Load Balancer. This allows Lambda functions to be invoked directly via HTTP/HTTPS without having to use the API Gateway.
Firecracker is a lightweight virtualization technology based on KVM. Amazon uses this technology internally for its AWS Lambda offering as well. According to AWS, this technology allows "launching of lightweight micro-virtual machines (microVMs) in non-virtualized environments in a fraction of a second, taking advantage of the security and workload isolation provided by traditional VMs and the resource efficiency that comes along with containers".
AWS App Mesh - For monitoring and controlling communication across microservices on AWS such as ECS, EKS and Kubernetes running on EC2.
API Gateway now supports WebSockets. For single-page apps (SPAs), live updates from the server are usually sent over WebSockets. This makes API Gateway more desirable as a backend for interactive SPAs.
SNS now supports filtering of messages that are published to a given SNS topic. This can help discard undesirable messages at the SNS service level thus reducing traffic to a configured SNS recipient such as AWS Lambda or a microservice.
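To make the SNS filtering idea concrete, here is a simplified sketch of how an attribute-based filter decides which messages reach a subscriber. It models exact string matching only and is not how SNS is implemented; the attribute names and values are hypothetical.

```python
# Simplified illustration of attribute-based message filtering.
# Only exact string matching is modelled; attribute names are hypothetical.

from typing import Dict, List


def matches(filter_policy: Dict[str, List[str]], attributes: Dict[str, str]) -> bool:
    """A message passes when every policy key is present with an allowed value."""
    return all(
        attributes.get(key) in allowed
        for key, allowed in filter_policy.items()
    )


# Hypothetical policy: deliver only placed orders from the EU or US stores.
policy = {"event_type": ["order_placed"], "store": ["eu", "us"]}

messages = [
    {"event_type": "order_placed", "store": "eu"},
    {"event_type": "order_cancelled", "store": "eu"},
    {"event_type": "order_placed"},  # no "store" attribute, so it is dropped
]

delivered = [m for m in messages if matches(policy, m)]
print(delivered)  # only the first message passes the filter
```

Because the filtering happens before delivery, the subscriber (for example a Lambda function) is never invoked for the two discarded messages.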
Databases and Storage
Amazon Aurora, a managed database service compatible with MySQL and PostgreSQL, was featured prominently in Werner Vogels' keynote. Amazon has famously vowed to get rid of all its Oracle databases. I imagine Aurora will replace a good number of these databases.
Amazon Aurora added a Global Database feature that is designed for applications with a global footprint. It allows a single Aurora database to span multiple AWS regions, with fast replication to enable low-latency global reads and disaster recovery from region-wide outages. I imagine one of the main motivations for adding this feature was to match Azure Cosmos DB's globally distributed storage model.
Amazon DynamoDB added ACID-compliant transactions across multiple tables in a given AWS region. This is important for applications that need to store data reliably across multiple tables in a single transaction. DynamoDB also added an On-demand pricing model where the application does not need upfront capacity planning (read/write capacity units).
Amazon Timestream is a new database offering optimized for storing time-series data and is more cost effective than other storage options such as RDS. This is an attractive option for storing large amounts of streaming data such as telemetry data from IoT devices.
Amazon had previously introduced AWS Glue to discover and catalog structured and unstructured data to aid in the building of a data lake. Amazon has now introduced AWS Lake Formation, which sits on top of AWS Glue and makes the job of configuring data sources and governing the source data much simpler.
S3 added intelligent tiering which automatically moves data to different pricing/availability tiers of S3 based on S3 object access patterns.
AWS Transfer for SFTP is a new fully managed SFTP service for S3. It allows access to data stored in S3 buckets through the SFTP protocol.
Amazon introduced Amazon FSx for Lustre, a fully managed file system built on Lustre, a file system used for large-scale cluster computing. Similarly, Amazon FSx for Windows File Server delivers a managed Windows file system (supports SMB, NTFS and AD) for use with workloads on Windows Server.
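As an illustration of the DynamoDB transaction support described above, the sketch below builds the request for a single transaction that touches two tables. The table names, keys, and attributes are hypothetical; the block only constructs the request and does not call AWS.

```python
# Sketch: request shape for one DynamoDB transaction spanning two tables.
# Table names, keys, and attributes are hypothetical.

def build_order_transaction(order_id: str, sku: str, quantity: int) -> list:
    """Return TransactItems that record an order and decrement stock together."""
    return [
        {
            "Put": {
                "TableName": "Orders",
                "Item": {
                    "OrderId": {"S": order_id},
                    "Sku": {"S": sku},
                    "Quantity": {"N": str(quantity)},
                },
                # Fail the whole transaction if the order already exists.
                "ConditionExpression": "attribute_not_exists(OrderId)",
            }
        },
        {
            "Update": {
                "TableName": "Inventory",
                "Key": {"Sku": {"S": sku}},
                "UpdateExpression": "SET Stock = Stock - :q",
                # Fail the whole transaction if there is not enough stock.
                "ConditionExpression": "Stock >= :q",
                "ExpressionAttributeValues": {":q": {"N": str(quantity)}},
            }
        },
    ]


items = build_order_transaction("order-1001", "sku-42", 2)

# With boto3 installed and credentials configured, the call would be:
#   boto3.client("dynamodb").transact_write_items(TransactItems=items)
print(len(items), "operations in one transaction")
```

Either both writes succeed or neither does, which is the guarantee the new feature adds across tables within a region.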
Amazon finally gets into the Blockchain game
Two Blockchain related services were announced.
Amazon Managed Blockchain is a fully managed service that makes it easy to create and manage scalable Blockchain networks using popular open source frameworks Hyperledger Fabric & Ethereum.
Amazon Quantum Ledger Database (QLDB) is a purpose-built ledger database that provides a complete and verifiable history of application data changes. The database is append-only/immutable (can't be edited) and cryptographically verified (to ensure contents have not been tampered with).
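The idea of an append-only, cryptographically verifiable ledger can be sketched with a simple hash chain, where each entry's hash covers the previous entry's hash. This is a generic illustration of the concept under that assumption, not QLDB's actual implementation; the class and field names are made up for the example.

```python
# Sketch: an append-only ledger where each entry is chained to the previous one.
# Generic illustration of tamper evidence; not how QLDB works internally.

import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


class Ledger:
    def __init__(self):
        self._entries = []

    def append(self, data: dict) -> str:
        """Add an entry whose hash covers its data and the previous hash."""
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        payload = json.dumps(data, sort_keys=True)
        digest = hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()
        self._entries.append({"data": payload, "prev": prev, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute every hash; any edited entry breaks the chain."""
        prev = GENESIS
        for entry in self._entries:
            expected = hashlib.sha256((prev + entry["data"]).encode("utf-8")).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


ledger = Ledger()
ledger.append({"account": "A", "balance": 100})
ledger.append({"account": "A", "balance": 80})
print(ledger.verify())  # True

# Tampering with an old entry is detected on the next verification.
ledger._entries[0]["data"] = json.dumps({"account": "A", "balance": 1000}, sort_keys=True)
print(ledger.verify())  # False
```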
Finally, there were new offerings in the area of DevOps and Security
AWS CodeDeploy now supports Blue/Green deployments for AWS Fargate and Amazon ECS.
AWS Security Hub enables AWS customers to centrally view and manage security alerts and automate compliance checks within and across AWS accounts.
AWS Control Tower helps create and maintain secure, well-architected multi-account AWS environments with respect to configuration of organizations, federated access, centralized logging, IAM auditing, and workflows for provisioning new accounts.
AWS Well-Architected Tool can review the state of workloads and compare them to the latest AWS architectural best practices.
AWS Outposts (later in 2019) - An on-premises hardware offering developed jointly by Amazon and VMware. It is fully managed, maintained, and supported by AWS to deliver access to the latest AWS services at the customer's site. It brings native AWS services, infrastructure, and operating models to virtually any data center, co-location space, or on-premises facility.
Well, that's all for this year. I believe we are nowhere near utilizing the full potential of AI, Machine Learning, and IoT data. I have no doubt we will see many more outstanding innovations in these areas in the near future. We are now well beyond dynamic websites (Web 1.0) and mobile computing (Web 2.0). Warp speed to AI, ML and IoT (Web 3.0).
Devices connect to AWS IoT Core using one of the supported protocols. Data from the device is transported to AWS IoT as a JSON document.
The message broker supports the use of the MQTT protocol to publish and subscribe and the HTTPS protocol to publish. The message broker also supports MQTT over the WebSocket protocol.
Protocol, Authentication, Port(s)
MQTT, Client Certificate, 8883 or 443
HTTPS, Client Certificate, 8443
HTTPS, SigV4, 443
MQTT + WebSocket, SigV4, 443
Connectivity from Device using Client Certificate
Resources on Device
X.509 certificate specific to the device (establishes device identity; equivalent to a username in classic authentication). An X.509 certificate is a document used to prove ownership of the public key embedded in it. A CA creates the certificate and signs it with its private key. Anyone can then validate the device certificate by checking its digital signature with the CA’s public key. (.pem.cer file)
Private key corresponding to the device's X.509 certificate (for signing communication)
Root certificate for the AWS IoT server (to verify the authenticity of the certificate returned by AWS IoT to the device; answers the question: Am I talking to the real AWS IoT server?). On the AWS IoT Button, this certificate is already baked in. (.pem file)
Client connectivity to the Internet: Wi-Fi SSID, password
AWS IoT server endpoint (region-specific; one endpoint shared by multiple devices). For example: abc.iot.us-east-1.amazonaws.com
Resources on AWS IoT Server
X.509 certificate specific to the device, with an associated certificate ID.
Each connected device is represented as a thing. It has a unique ARN. For example: arn:aws:iot:us-east-1:995042574424:thing/iotbutton_G030MD045XXXXX
Connecting is a two-step process.
1) Establish secure communication between the device and the AWS IoT server. This is just like connecting to a secure website. The server sends its certificate to the client. The client wants to make sure it is talking to the real AWS IoT, so it verifies that the server certificate is authentic by using the AWS IoT root certificate present on the device. The public key that’s embedded in the root certificate is used to validate the digital signature on the server-provided certificate. The client and server then negotiate and use a shared secret to encrypt communication.
2) Next, the device identifies itself to the AWS IoT server. The device sends a copy of its device certificate to the server. The device computes a hash over the data exchanged with the server, signs it with its private key, and sends the result as the digital signature. AWS IoT is now in possession of the device’s public key (which was in the device certificate) and the digital signature. It uses the device’s public key to verify the digital signature. By using the unique identifier of the certificate, it knows exactly which device is establishing an MQTT session. From then on, all messages between the device and AWS IoT are secured using the shared secret (for efficiency).
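The two steps above map onto a mutually authenticated TLS configuration. The sketch below uses Python's standard ssl module to show which of the three device-side files plays which role; the function names and file paths are hypothetical, and an MQTT client library would normally perform this setup on the device's behalf.

```python
# Sketch: TLS settings for a device that authenticates with a client certificate.
# Function names and file paths are hypothetical.

import ssl


def base_context() -> ssl.SSLContext:
    """Client-side context that requires a valid server certificate and hostname."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def device_context(root_ca: str, device_cert: str, private_key: str) -> ssl.SSLContext:
    """Load the trust anchor and the device identity into one context."""
    ctx = base_context()
    # Step 1: root certificate used to check the certificate the server presents.
    ctx.load_verify_locations(cafile=root_ca)
    # Step 2: device certificate and private key used to prove the device's identity.
    ctx.load_cert_chain(certfile=device_cert, keyfile=private_key)
    return ctx


ctx = base_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)

# On a real device (paths are placeholders):
#   ctx = device_context("root-ca.pem", "device-cert.pem", "device-key.pem")
#   ...then open a TLS connection to the AWS IoT endpoint on port 8883.
```

The private key never leaves the device; only the certificate and the signature produced with the key are sent to the server.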
AWS Resource Access
AWS resources that a device is allowed to access are specified by associating a policy with the device's certificate on the AWS IoT server.
For example, the following policy publishes the data received from a device to an SNS topic.