Creating a custom Kafka cluster configuration using Macros in AWS CloudFormation

Krishnaditya Kancharla
May 3, 2020

Introduction

This article explains how to create a cluster configuration in Amazon MSK using AWS CloudFormation.

CloudFormation does not natively support the creation of cluster configurations in MSK as there is no explicit resource which can be defined in your template to model or provision them.

One way to circumvent this obstacle is to leverage template macros in CloudFormation.

Macros extend the functionality of CloudFormation by using AWS Lambda under the hood to run any custom provisioning code you may have written.

The source code can be downloaded from my GitHub repository.

Prerequisites and 101s

If you are unfamiliar with any of the aforementioned technologies, here are some 101s which will help get you up to speed with them.

1. Amazon MSK (Managed Streaming for Apache Kafka):

Amazon MSK is a fully managed service that makes it easy for you to build and run applications that use Apache Kafka to process streaming data. Apache Kafka is an open-source platform for building real-time streaming data pipelines and applications. With Amazon MSK, you can use native Apache Kafka APIs to populate data lakes, stream changes to and from databases, and power machine learning applications.

2. AWS CloudFormation:

AWS CloudFormation provides a common language for you to model and provision AWS and third-party application resources in your cloud environment. This provides a single source of truth for your AWS and third-party resources.

3. Macros in AWS CloudFormation:

Macros act like pre-processors for your CloudFormation templates. After you submit your template, macros are called to transform portions of it before CloudFormation starts provisioning resources. Here’s an informative article which explains the usage of macros.

4. AWS Lambda:

AWS Lambda is an event-driven, serverless computing platform provided by AWS. The custom logic used to provision the cluster configurations will be written as Lambda functions.

Step 1: Creating and Deploying the Boto3 Macro in your AWS account

There are two fundamental steps to processing templates with macros: creating the macro itself, and using the macro to perform processing on your templates.

We will be creating a Boto3 macro, which provisions CloudFormation resources that represent operations performed by boto3; each Boto3 resource represents one function call. Sample source code for such a macro is publicly available in the GitHub repository hosting sample templates made by AWS. However, we need to make certain modifications to ensure that the macro passes the correct parameters for boto3 to execute the CreateConfiguration API call.

So without any further ado, let us get started.

1.1 Creating the Lambda Deployment Package

The deployment package should contain two Python files corresponding to two Lambda functions: one performs the desired template processing, and the other contains the core custom logic; in our case, the code to execute the CreateConfiguration API call from boto3 and relay the results back to CloudFormation.

Lambda function which performs template processing
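Here’s a minimal sketch of what this function could look like, loosely following the publicly available AWS sample; the Custom::Boto3 type name and the Boto3Function export it imports are assumptions made for illustration:

import copy

PREFIX = "Boto3::"

def handler(event, context):
    # CloudFormation passes the macro the template fragment to transform
    fragment = copy.deepcopy(event["fragment"])

    for resource in fragment.get("Resources", {}).values():
        resource_type = resource.get("Type", "")
        if resource_type.startswith(PREFIX):
            # Rewrite Boto3::<Service>.<Action> into a custom resource
            # backed by the boto3-executing Lambda function
            properties = resource.setdefault("Properties", {})
            properties["ServiceToken"] = {"Fn::ImportValue": "Boto3Function"}  # assumed export name
            properties["Action"] = resource_type[len(PREFIX):]
            resource["Type"] = "Custom::Boto3"

    # Return the transformed fragment in the shape CloudFormation expects
    return {
        "requestId": event["requestId"],
        "status": "success",
        "fragment": fragment,
    }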

Lambda function which executes the CreateConfiguration API call using Boto3 and relays the output back to CloudFormation
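A minimal sketch of this second function, assuming the cfnresponse helper is bundled with the deployment package and that the resource passes Name, KafkaVersions and ServerProperties properties (as in the main template shown in Step 2):

import boto3
import cfnresponse  # assumed to be bundled with the deployment package

def handler(event, context):
    try:
        if event["RequestType"] == "Create":
            props = event["ResourceProperties"]
            # This sketch hardcodes CreateConfiguration instead of
            # dispatching on the Action property set by the macro
            client = boto3.client("kafka")
            # CreateConfiguration expects the server properties as bytes
            response = client.create_configuration(
                Name=props["Name"],
                KafkaVersions=props["KafkaVersions"],
                ServerProperties=props["ServerProperties"].encode("utf-8"),
            )
            # Relay the new configuration's ARN back to CloudFormation
            cfnresponse.send(event, context, cfnresponse.SUCCESS,
                             {"Arn": response["Arn"]}, response["Arn"])
        else:
            # Updates and deletes are acknowledged without action in this sketch
            cfnresponse.send(event, context, cfnresponse.SUCCESS, {},
                             event.get("PhysicalResourceId"))
    except Exception as exc:
        cfnresponse.send(event, context, cfnresponse.FAILED, {"Error": str(exc)})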

1.2 Creating a Macro Definition Template

Once the above Python files have been written locally, it’s time to deploy them and make the Lambda functions available to CloudFormation by creating a macro definition template.
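A minimal sketch of such a macro definition template, written with SAM; the handler paths, code locations and logical names below are placeholders:

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31

Resources:
  # Lambda function that performs the template processing
  MacroFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python3.8
      Handler: macro.handler
      CodeUri: ./lambda

  # Lambda function that executes the CreateConfiguration call
  Boto3Function:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python3.8
      Handler: resource.handler
      CodeUri: ./lambda
      Policies:
        - Statement:
            - Effect: Allow
              Action: kafka:CreateConfiguration
              Resource: "*"

  # Registers the macro so other templates can reference it by name
  Boto3Macro:
    Type: AWS::CloudFormation::Macro
    Properties:
      Name: Boto3
      FunctionName: !GetAtt MacroFunction.Arn

Outputs:
  Boto3FunctionArn:
    Value: !GetAtt Boto3Function.Arn
    Export:
      Name: Boto3Function  # matches the Fn::ImportValue in the macro code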

Now that we have all the components required to deploy the Macro function, let’s proceed with it.

1.3 Deploying the Macro Definition Template

1. You will need an S3 bucket to store the CloudFormation artifacts. If you don’t have one already, create one with:

aws s3 mb s3://<bucket name>

2. Package the CloudFormation template. The provided template uses the AWS Serverless Application Model, so it must be transformed before you can deploy it.

aws cloudformation package \
--template-file macro.template \
--s3-bucket <your bucket name here> \
--output-template-file packaged.template

3. Deploy the packaged CloudFormation template to a CloudFormation stack:

aws cloudformation deploy \
--stack-name boto3-macro \
--template-file packaged.template \
--capabilities CAPABILITY_IAM

Once the deployment is complete, the stack will have created the two Lambda functions, their IAM execution roles and the Boto3 macro itself; these can be verified under the Resources tab of the stack in the CloudFormation console.

Step 2: Deploying the main template file

It’s time to use this macro in our main template file, which provisions the MSK cluster configuration. The Amazon MSK documentation describes the various properties that can be set in your custom configuration.
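Here is a minimal sketch of what such a template could look like, assuming the Boto3::<Service>.<Action> resource naming convention from the AWS sample and the property names expected by the resource Lambda from Step 1.1:

AWSTemplateFormatVersion: "2010-09-09"
Transform: Boto3  # the macro registered in Step 1

Parameters:
  ServerPropertiesFileContent:
    Type: String
    Default: |
      auto.create.topics.enable=true
      default.replication.factor=3
      num.partitions=2

Resources:
  # The Boto3 macro rewrites this resource into a Custom::Boto3 resource
  # backed by the boto3-executing Lambda function
  MSKClusterConfiguration:
    Type: Boto3::Kafka.CreateConfiguration
    Properties:
      Name: my-custom-kafka-configuration
      KafkaVersions:
        - "2.2.1"
      ServerProperties: !Ref ServerPropertiesFileContent

Outputs:
  ConfigurationArn:
    Value: !GetAtt MSKClusterConfiguration.Arn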

While deploying the stack using the above template file, you can edit the value of the ‘ServerPropertiesFileContent’ parameter to the properties which best suit your use case.

Once the stack has been deployed successfully, the cluster configuration can be viewed in the ‘Cluster configurations’ section within the MSK console.
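The configuration can also be verified from the CLI:

aws kafka list-configurations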

Conclusion

This implementation comes in handy in use cases where data streaming workloads in an environment are provisioned collectively through CloudFormation. It works around the problem of CloudFormation not natively supporting the provisioning of cluster configurations in MSK.

