Exploring AWS Glue Part 1

This is the first post in a series exploring the AWS Glue service.

There is a long and growing list of applications you can use AWS Glue for. It wouldn’t be possible to cover them all in one blog post, so we will explore them over the course of this series. I’ll get into the motivation for the series at the end of the article :)

So… what does that mean exactly? Well, let’s break that down.

Its service page lists the following use cases:

The last two are the most recent use cases added as of late 2020.

I have used AWS Glue several times. In my first encounter with it, we used it to create a serverless database to analyze our usage of AWS S3 as we worked to reduce our costs. In another case, we used it as part of our reporting pipeline, where it served as our data lake with the underlying data sitting on S3. Most recently, I played around with AWS Glue Studio to create a transformation job without writing any code, which was very neat.

To make AWS money of course!

Plus to help make our lives easier!

You can order the above as you see fit…

Machine learning and data analytics projects are very exciting and all the rage these days. At their core, they require the same starting ingredient — data!

Managing data is difficult. Bringing data from multiple systems into one coherent place where your team can access it for reporting, analytics, or machine learning isn’t easy.

To better understand the value that AWS Glue provides, we need to look at what you would have done in the pre-Glue days:

Let’s keep it simple: You would have set up a server to extract data from one or more data sources (databases, third-party services via APIs, etc.). Over time, the number of data sources you needed to extract from would grow, along with the volume of data you were extracting. You would have to adapt your extraction code to handle this, while also maintaining your server. This adds up to a lot of work!
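To make that concrete, here is a minimal sketch of what such hand-rolled extraction code might look like. Everything in it is an assumption for illustration: an in-memory SQLite database stands in for a real source, and the names `extract_table`, `run_extraction`, and `make_demo_source` are made up. A real setup would also need scheduling, retries, credentials, and a server to run on, which is exactly the part that piles up.

```python
import csv
import io
import sqlite3


def extract_table(conn, table, out):
    """Dump every row of `table` into `out` as CSV and return the row count.

    `table` is assumed to be a trusted, hard-coded name (it is interpolated
    into the SQL text), which is fine for a sketch but not for user input.
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out)
    # Header row comes from the cursor's column metadata.
    writer.writerow([column[0] for column in cursor.description])
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count


def run_extraction(sources):
    """Extract each (connection, table) pair; return {table: (rows, csv_text)}."""
    results = {}
    for conn, table in sources:
        buffer = io.StringIO()
        rows = extract_table(conn, table, buffer)
        results[table] = (rows, buffer.getvalue())
    return results


def make_demo_source():
    """Build a throwaway in-memory database standing in for a real source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
    return conn


if __name__ == "__main__":
    extracted = run_extraction([(make_demo_source(), "orders")])
    for name, (rows, _csv_text) in extracted.items():
        print(f"{name}: {rows} rows extracted")
```

Every new data source means another entry in `sources`, and usually another special case in the extraction logic, on top of keeping the machine that runs it alive.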

Wouldn’t it be so nice if we could get away from managing anything except the core transformations we need to do on the data we are extracting?

Under the hood, AWS Glue uses other AWS services to orchestrate our ETL jobs. This involves taking care of provisioning and managing the resources that are required to run our workloads.

This solves the problem highlighted in the last section. Setting up and managing ETL infrastructure is a pain point and a blocker for that exciting machine learning or data analytics project you have jumping around in your head. Being able to focus on the differentiating and exciting work because someone else is managing the underlying infrastructure is very empowering.

Thanks for reading! I started this series because of the lack of articles covering AWS Glue together with AWS CDK. The AWS Cloud Development Kit (CDK) is a more recent open-source project from AWS that lets us write infrastructure as code in the same programming languages we use to write apps (Python, JavaScript, TypeScript, Java, .NET). Under the hood, it converts our code into AWS CloudFormation templates. So I wanted to explore the capabilities of AWS Glue while taking advantage of infrastructure as code to create something repeatable and easily shareable. We will get our hands dirty with code and deployments over the course of this series.
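To give a feel for the "code in, CloudFormation out" idea, here is a rough sketch that does not use the CDK at all. It builds a CloudFormation template for a Glue database as a plain Python dictionary and prints it as JSON. The function names, the logical ID `AnalyticsDatabase`, and the database name `usage_reports` are all hypothetical, and the template the real CDK generates will look different (it adds metadata and generated IDs). Treat this only as an illustration of the kind of output a synthesis step produces.

```python
import json


def glue_database_template(database_name):
    """Return a CloudFormation template (as a dict) declaring one Glue database."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            # Logical ID chosen for this sketch only.
            "AnalyticsDatabase": {
                "Type": "AWS::Glue::Database",
                "Properties": {
                    # The Data Catalog lives in the deploying account.
                    "CatalogId": {"Ref": "AWS::AccountId"},
                    "DatabaseInput": {"Name": database_name},
                },
            }
        },
    }


def synthesize(template):
    """Serialize the template dict to the JSON text CloudFormation accepts."""
    return json.dumps(template, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(synthesize(glue_database_template("usage_reports")))
```

Because the template comes out of ordinary code, the same function can be reused for every environment or team that needs the same setup, which is what makes the result repeatable and shareable.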
