Distributed Transactions – Part 1 – Send Emails to a list of Users at least once and only once

Distributed Transactions :

Transactions that span multiple physical systems or computers over a network are simply termed distributed transactions. In the world of microservices, a transaction is distributed across multiple services that are called in sequence to complete the entire transaction. For a detailed explanation, refer to Handling Distributed Transactions in the Microservice world.

How to send a simple email using AWS?

Amazon Simple Email Service (Amazon SES) is a cloud-based, flexible, affordable, and scalable email-sending service. Amazon SES can be integrated into your existing software using the AWS SDK. You can send plain-text emails as well as HTML-encoded emails.
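
For instance, a simple email can be sent with a few lines of the AWS SDK for Node.js (a minimal sketch; the region and addresses are placeholders) :

const aws = require('aws-sdk');
const ses = new aws.SES({ region: 'us-east-1' });

const params = {
    Source: 'sender@example.com',
    Destination: { ToAddresses: ['recipient@example.com'] },
    Message: {
        Subject: { Data: 'Hello from SES' },
        Body: {
            Text: { Data: 'This is a simple text email.' },
            Html: { Data: '<p>This is an <b>HTML</b> email.</p>' }
        }
    }
};

// send a single email through SES
await ses.sendEmail(params).promise();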

Let’s take a use-case :

We wanted to send customised emails to our customers stating their device usage statistics for the past month. SES was chosen as the platform for sending emails, and a serverless architecture was chosen to implement the system because it is operationally lighter.

Before talking about the solution, let’s talk about the problems we had to address:

Scalability: The system should scale with the load, or it could fail when the load increases.

Error handling : The system should be able to catch errors and handle them appropriately.

Idempotency: The system may fail due to an error and restart, sending the same emails again and creating duplicates. So the system must be able to perform deduplication.

Let’s take an example :

  • Suppose we are sending 100 emails on the 1st day of every month. Simple, basic code would run a loop 100 times, sending one email to a specific address in each iteration.
  • Suppose the code fails in between and we restart. The code will start executing again from the first customer and start sending emails.
  • The problem here is that if the code fails at the 55th iteration of the loop, then after restarting it will send those 55 emails again before sending the remaining ones.
  • If errors are not handled properly, this process can continue indefinitely, sending duplicate emails to the same customer again and again.
  • And honestly, no customer wants spam in their inbox.
  • That’s why the system has to be idempotent.

Various Approaches :

There are various ways to implement the system. We tried the implementations mentioned below:

Conventional Way :

  • We can use a Lambda function to run a loop and send the emails to specific addresses using the AWS SES API, as sketched below.
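
A minimal sketch of this approach (assuming a customers array and a hypothetical buildEmailParams helper) :

const aws = require('aws-sdk');
const ses = new aws.SES();

// naive loop: one SES call per customer, with no record of what has already been sent
for (const customer of customers) {
    const params = buildEmailParams(customer);  // hypothetical helper filling Source/Destination/Message
    await ses.sendEmail(params).promise();      // if the Lambda fails here, already-sent emails are resent on restart
}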

     Problems:

  • The problem here was that the system was not idempotent. Duplicate emails are sent if the Lambda fails and restarts.
  • The system is not scalable, as Lambda has restrictions such as timeouts and limited retries.
  • Error handling is poor, as a single error can fail the Lambda and force a retry of the entire procedure.

AWS Simple Queue Service

Using Simple Queue Service to handle the atomicity of the distributed transaction :

  • As seen in the architecture below, our first Lambda function builds emails and sends them to the email queue, which is an SQS queue (see the sketch after this list). The messages added to the queue contain the email data and details. The ‘sendEmail’ Lambda function processes the messages in the queue and sends emails using AWS SES.
  • Here the problem of scalability is solved, as SQS scales itself and multiple instances of the sendEmail function are created.
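
A sketch of how the first Lambda might enqueue the email data (the queue URL is a placeholder) :

const aws = require('aws-sdk');
const sqs = new aws.SQS();

// enqueue one message per email; the sendEmail Lambda consumes these messages and calls SES
await sqs.sendMessage({
    QueueUrl: 'https://sqs.us-east-1.amazonaws.com/123456789012/emailQueue',  // placeholder URL
    MessageBody: JSON.stringify({ emailId: 'recipient@example.com', usage: '99' })
}).promise();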

      Problems:

  • The problem here was deduplication. A standard SQS queue doesn’t have a mechanism to reject duplicates and also doesn’t allow custom deduplication identifiers for messages. Hence, if the main Lambda fails, then after a restart it will start adding duplicates to the queue. The system was therefore not idempotent.

AWS DynamoDB

Using DynamoDB Streams to handle the atomicity of the distributed transaction:

  • As seen in the architecture below, our first Lambda builds emails and updates the DynamoDB table. The table has DynamoDB Streams enabled, which triggers the sendEmail Lambda function on every row update.

Note : Here sendEmail processes a single row update and sends the email accordingly.

  • DynamoDB Streams creates multiple instances of sendEmail according to the number of concurrent updates.
  • The problem of scalability is solved, and if a sendEmail Lambda instance fails, it only fails one email; the others are not affected.
  • The problem of duplication is also solved, as we can create a customized key for every email and check for past executions, for example with a conditional write as sketched below.
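
One way to check for a past execution is a conditional write keyed per email (a minimal sketch; the table name ‘emailTable’ and the key format are assumptions, not the exact schema used here) :

const aws = require('aws-sdk');
const dynamo = new aws.DynamoDB.DocumentClient();

// write the email row only if this key has never been processed before;
// a duplicate insert fails with ConditionalCheckFailedException and can simply be skipped
await dynamo.put({
    TableName: 'emailTable',                              // assumed table name
    Item: { emailKey: 'abc-256-2020-01', usage: '99' },   // e.g. userId + month as the key
    ConditionExpression: 'attribute_not_exists(emailKey)'
}).promise();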

Problems:

  • Lambda comes with restrictions on the number of retries and the back-off rate. Lambda only allows a maximum of 2 retries. If we configure a Dead Letter Queue and rerun from it, there is a possibility of ending up with infinite retries.
  • AWS SES has its own sending limits: the sending quota (the maximum number of recipients you can send email to in a 24-hour period) and the maximum send rate (the maximum number of recipients you can send email to per second). If the send rate crosses the limit, SES throws a ‘Throttling’ error, and to handle that error we need proper exponential back-off in the retries.
  • We need to customize the retry count and back-off rate, and this comes out of the box in AWS Step Functions.

To solve the various problems of the distributed transaction, we came up with the final architecture using AWS Step Functions :

AWS Step Functions

What are Step Functions :

AWS Step Functions lets you coordinate multiple AWS services into serverless workflows. It turns the workflow into a state machine diagram that is easy to understand, easy to explain to others, and easy to change.

(Main Advantage) Step Functions automatically triggers and tracks each step, and retries when there are errors, so your application executes in order and as expected.

Example of workflow design in Step Functions :

Sample Workflow Design

Solution for distributed transaction using Step Functions:

We wanted to send each customer’s device usage every month in the form of a templated email. The system had to be completely serverless and automated.

Architecture using AWS Step Functions

Step Function configuration :

You can create the Step Function for the following workflow using the Step Functions console https://console.aws.amazon.com/states/ :

Step Function definition :

{
  "Comment": "sending Email",
  "StartAt": "sendEmail",
  "States": {
    "sendEmail": {
      "Type": "Task",
      "Resource": "paste the sendEmail lambda ARN",
       "End": true  
    }
  }
}

The workflow will be created as per the definition :

Here we can define error handling in the definition: a retry count, retry interval, and back-off rate for each specific error.
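
For example, a Retry clause can be added to the sendEmail state (a sketch; the error name must match whatever error the Lambda surfaces when SES throttles, and the numbers are illustrative) :

"sendEmail": {
  "Type": "Task",
  "Resource": "paste the sendEmail lambda ARN",
  "Retry": [
    {
      "ErrorEquals": ["Throttling"],
      "IntervalSeconds": 5,
      "MaxAttempts": 5,
      "BackoffRate": 2.0
    }
  ],
  "End": true
}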

Lambda function ‘sendEmail’ :

This Lambda function runs on the trigger of a ‘sendEmailStepFunction’ execution. It uses the AWS SES API to send a single email according to the event created by sendEmailStepFunction. The event contains the email data.

Sample code for sending an email using Simple Email Service :

const aws = require('aws-sdk');

const params = {
   Source: "xyz@gmail.com",
   Template: "emailTemplate",
   Destination: {
      ToAddresses: [emailId]
   },
   TemplateData: "{\"usage\":\"99\"}"
};

// send the templated email using SES
const sendTemplatedEmailPromise = new aws.SES().sendTemplatedEmail(params).promise();
await sendTemplatedEmailPromise;

Lambda function ‘mainFunction’ :

The mainFunction extracts the data from the Athena table using a SQL query.

The data obtained is in JSON format.

Sample Athena response:

athenaQueryOutput = {
    Items:
        [{ userid: 'abc-256', data1: '1', data2: '2' },
        { userid: 'xyz-023', data1: '4', data2: '8' }],

    'Data Scanned In MB': 0,
    QueryCostInUSD: 0.0000,
    EngineExecutionTimeInMillis: 0,
    Count: 0,
    QueryExecutionId: 'testExecutionID',
    'S3 Location':
        'Test_s3_location'
}

For each user ID, the system creates email data and calls the Step Function ‘sendEmailStepfunction’, which triggers the ‘sendEmail’ function with the event as input. The ‘sendEmail’ function then sends the email using AWS SES. AWS creates a new instance of ‘sendEmailStepfunction’ for every execution.

const aws = require('aws-sdk');

var params = {
        stateMachineArn: 'sendEmailStepFunction ARN',
        input: JSON.stringify(emailData),
        name: key  // you can set any name, but it must be unique across executions of this state machine
    };
var stepFunctionPromise = new aws.StepFunctions().startExecution(params).promise();
await stepFunctionPromise;
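
Putting it together, mainFunction loops over the Athena rows and starts one execution per user (a sketch; buildEmailData is a hypothetical helper and the userId-plus-month key format is an assumption) :

const aws = require('aws-sdk');
const stepFunctions = new aws.StepFunctions();

for (const row of athenaQueryOutput.Items) {
    const emailData = buildEmailData(row);   // hypothetical helper mapping a row to email details
    const key = `${row.userid}-2020-01`;     // e.g. userId + month, unique within a monthly run
    await stepFunctions.startExecution({
        stateMachineArn: 'sendEmailStepFunction ARN',
        input: JSON.stringify(emailData),
        name: key
    }).promise();
}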

Solution to the previous example :

  • We took an example of sending 100 emails on the 1st day of every month. Here, we will create 100 instances of sendEmailStepFunction, and each instance will be responsible for delivering its respective email.
  • Suppose the mainFunction fails in between and restarts. Here the function starts executing the Step Functions again from the first customer.

So how do we prevent duplication?

  • The Step Function execution ‘name’ (params.name) always has to be unique. So if there is a re-execution of a Step Function with the same name, it will return the error ‘ExecutionAlreadyExists’, and hence we can skip that execution and move ahead, as sketched below.
  • That’s how we can stop duplication and make the system idempotent.
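
A minimal sketch of skipping an execution that already exists :

try {
    await stepFunctions.startExecution(params).promise();
} catch (err) {
    if (err.code === 'ExecutionAlreadyExists') {
        // this email was already handled by a previous run of mainFunction, so skip it
        console.log(`Skipping duplicate execution: ${params.name}`);
    } else {
        throw err;  // any other error should still fail the function
    }
}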

Cons of using AWS Step Functions:

  • AWS Step Functions counts each execution step and each retry as a separate state transition. AWS charges per state transition, so at scale this can become expensive.
  • So for large scale, we can use an open-source system like Apache Airflow.

Alternative approach (Open Source) :

Using Apache Airflow:

  • The community created Airflow, a platform for programmatically authoring, scheduling, and monitoring workflows.
  • We can programmatically code the desired architecture.
  • We can easily define our own operators & executors and it is highly scalable.

We will come back soon with a blog on the implementation of the above email system using Apache Airflow.

References :

Handling Distributed Transactions in the Microservice world

StepFunction

Lambda Function

Simple Queue Services

Simple Email Services

DynamoDB
