Distributed Transactions – Part 1 – Send Emails to a List of Users Exactly Once

Distributed Transactions

Transactions that span multiple physical systems or computers over the network are termed Distributed Transactions. In the world of microservices, a transaction is distributed across multiple services that are called in sequence to complete the entire transaction. For a detailed explanation, refer to Handling Distributed Transactions in the Microservice world.


Let’s take a use-case

We wanted to send customised emails to our customers with their device usage statistics for the past month. We decided to use AWS SES as the platform for sending emails, and we needed a serverless architecture to implement this system efficiently.

Before talking about the solution, let’s talk about the problems we need to solve:

Scalability: The system should scale with the requirements, else it could fail when the load increases.

Error handling: The system should be able to catch errors and handle them appropriately.

Idempotency: The system may fail due to some error and restart, and then send the same emails again, creating duplicates. So the system must be able to perform deduplication.

Let’s take an example :

  • Suppose we are sending 100 emails on the 1st day of every month. Simple, basic code would run a loop 100 times, sending one email to a specific address in each iteration.
  • Suppose the code fails partway through and we restart it. The code will start executing again from the first customer and send the mails again.
  • The problem is that if the code fails at the 55th iteration of the loop, then after restarting, the code will send those 55 mails again before sending the rest.
  • This can go on endlessly if errors are not handled properly, sending duplicate mails to the same customers again and again.
  • And honestly, no customer wants spam in their inbox.
  • That’s why the system has to be idempotent.

Various Approaches

There are various ways to implement the system. We tried below mentioned implementations:

Conventional Way

  • We can use a Lambda function to run a loop and send the emails to specific addresses using the AWS SES API.


  • The problem here was that the system was not idempotent. Duplicate mails are sent if the Lambda fails and restarts.
  • The system is not scalable, as Lambda has restrictions such as timeouts and limited retries.
  • Error handling is poor, as a single error can fail the Lambda, forcing a retry of the entire procedure.
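The duplication problem can be seen in a small sketch of this naive loop. Here `sendOne` is a hypothetical callback standing in for the real `aws.SES().sendTemplatedEmail` call, and the simulated crash and restart are illustrative assumptions:

```javascript
// Naive loop: if it crashes midway and the whole Lambda restarts,
// every customer processed before the crash receives a duplicate email.
function sendAll(customers, sendOne, failAtIndex = -1) {
  for (let i = 0; i < customers.length; i++) {
    if (i === failAtIndex) throw new Error('lambda timed out'); // simulated crash
    sendOne(customers[i]);
  }
}

// Simulate a crash after 2 of 3 sends, then a restart from scratch:
const sent = [];
const customers = ['a@x.com', 'b@x.com', 'c@x.com'];
try {
  sendAll(customers, (c) => sent.push(c), 2);
} catch (e) { /* the Lambda restarts */ }
sendAll(customers, (c) => sent.push(c));
// 'sent' now holds 5 entries: a@x.com and b@x.com were emailed twice
```
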
AWS Simple Queue Service

Using Simple Queue Service

  • As seen in the architecture below, our first Lambda function builds emails and sends them to the email queue, which is an SQS queue. The messages added to the queue contain the email data and details. These messages are processed by the sendEmail Lambda function, which sends the email using AWS SES.
  • This solves the scalability problem, as SQS scales itself and multiple instances of the sendEmail function are created.


  • The problem here was deduplication. A standard SQS queue doesn’t have a mechanism to reject duplicates and does not allow custom identifiers for messages. Hence, if the main Lambda fails, then after a restart it will start adding duplicates to the queue. The system was therefore not idempotent.
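A sketch of the builder Lambda enqueueing one message per email makes the gap visible. The queue URL and message shape are illustrative assumptions; `sqs` is an aws-sdk SQS client, or any test double exposing the same `sendMessage(...).promise()` API:

```javascript
// Enqueue one SQS message per email. Note that a standard queue's
// sendMessage call carries no deduplication identifier, so a restarted
// builder simply enqueues the same emails again.
async function enqueueEmails(users, sqs, queueUrl) {
  for (const user of users) {
    await sqs.sendMessage({
      QueueUrl: queueUrl,
      MessageBody: JSON.stringify({ to: user.email, usage: user.usage })
    }).promise();
  }
}
```
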
AWS DynamoDB

Using DynamoDB streams

  • As seen in the architecture below, our first Lambda builds emails and updates the DynamoDB table. The table has DynamoDB Streams enabled, which triggers the sendEmail Lambda function on every row update.

Note: Here sendEmail processes a single row update and sends the email accordingly.

  • DynamoDB Streams create multiple instances of sendEmail according to the number of concurrent updates.
  • This solves the scalability problem, and if a sendEmail Lambda instance fails, only that one email fails; the others are not affected.
  • The duplication problem is also solved, as we can create custom keys for every email and check for past executions.


  • Lambda comes with restrictions on the number of retries and the back-off rate. Lambda allows a maximum of 2 retries. So if we configure a Dead Letter Queue and rerun from it, there is a possibility of infinite retries.
  • AWS SES has its own sending limits: the sending quota (the maximum number of recipients you can send email to in a 24-hour period) and the maximum send rate (the maximum number of recipients you can send email to per second). If the send rate crosses the limit, SES throws a ‘Throttling’ error, and to handle that error we need proper exponential back-off in retries.
  • So the retries and back-off rate need to be customizable, and the solution for this comes out of the box in AWS Step Functions.
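The dedup idea from the DynamoDB approach can be sketched as a conditional write: store each email row under a deterministic key, so a re-run of the builder cannot create the row (and hence the stream event) twice. The table name and key shape are illustrative assumptions; `ddb` is a DynamoDB DocumentClient, or a test double exposing the same `put(...).promise()` API:

```javascript
// Write an email row only if it does not already exist; a duplicate
// write fails the ConditionExpression instead of re-triggering the stream.
async function putEmailOnce(ddb, table, emailId, payload) {
  try {
    await ddb.put({
      TableName: table,
      Item: Object.assign({ emailId: emailId }, payload),
      ConditionExpression: 'attribute_not_exists(emailId)' // reject duplicates
    }).promise();
    return true;   // new row written; the stream will trigger sendEmail
  } catch (err) {
    if (err.code === 'ConditionalCheckFailedException') return false; // duplicate: skip
    throw err;
  }
}
```
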

To solve the various problems above, we came up with the final architecture using AWS Step Functions

What are Step Functions?

AWS Step Functions lets you coordinate multiple AWS services into serverless workflows. It turns your workflow into a state machine diagram that is easy to understand, easy to explain to others, and easy to change.

(Main Advantage) Step Functions automatically triggers and tracks each step, and retries when there are errors, so your application executes in order and as expected.

AWS Step Functions

Example of workflow design in Step Functions:

Solution Approach using Step Functions

We wanted to send the device usage of a customer, every month in the form of templated email. The system has to be completely serverless and automated.

Architecture using AWS Step Functions

Step Function configuration

You can create the Step Function for the following workflow using the Step Functions console at https://console.aws.amazon.com/states/:

Step Function definition:

{
  "Comment": "sending Email",
  "StartAt": "sendEmail",
  "States": {
    "sendEmail": {
      "Type": "Task",
      "Resource": "paste the sendEmail lambda ARN",
      "End": true
    }
  }
}

The workflow will be created as per the definition:

Here we can define error states in the definition, and set the retry count, retry interval, and back-off rate for each specific error.
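For example, a Retry block for the SES ‘Throttling’ error could be added to the sendEmail state like this (the interval, attempt count, and back-off rate shown here are illustrative, not the values we used):

```json
"sendEmail": {
  "Type": "Task",
  "Resource": "paste the sendEmail lambda ARN",
  "Retry": [
    {
      "ErrorEquals": ["Throttling"],
      "IntervalSeconds": 2,
      "MaxAttempts": 5,
      "BackoffRate": 2.0
    }
  ],
  "End": true
}
```

With `BackoffRate: 2.0`, each retry waits twice as long as the previous one, which is exactly the exponential back-off SES throttling calls for.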

Lambda function ‘sendEmail’

This Lambda function runs when a ‘sendEmailStepFunction’ execution is triggered. It uses the AWS SES API to send a single mail based on the event created by sendEmailStepFunction; the event contains the email data.

Sample code for sending an email using Simple Email Service:

const params = {
    Source: "xyz@gmail.com",
    Template: "emailTemplate",
    Destination: {
        ToAddresses: [emailId]
    },
    TemplateData: "{\"usage\":\"99\"}"
};

// send email
const sendTemplatedEmailPromise = new aws.SES().sendTemplatedEmail(params).promise();
await sendTemplatedEmailPromise;

Lambda function ‘mainFunction’

The mainFunction extracts the data from the Athena table using a SQL query. The data obtained is in JSON format.

Sample Athena response:

athenaQueryOutput = {
    Items: [
        { userid: 'abc-256', data1: '1', data2: '2' },
        { userid: 'xyz-023', data1: '4', data2: '8' }
    ],
    DataScannedInMB: 0,
    QueryCostInUSD: 0.0000,
    EngineExecutionTimeInMillis: 0,
    Count: 0,
    QueryExecutionId: 'testExecutionID',
    S3Location: ...
}
Then, for every user ID, the email data is created and the Step Function ‘sendEmailStepFunction’ is started, which triggers sendEmail with this data as the input event; sendEmail then sends the email using AWS SES. For every execution, a new instance of sendEmailStepFunction is created by AWS.
var params = {
    stateMachineArn: 'sendEmailStepFunction ARN',
    input: JSON.stringify(emailData),
    name: key  // you can set any name, but it must be unique across all executions
};
var stepFunctionPromise = new aws.StepFunctions().startExecution(params).promise();
await stepFunctionPromise;

Solution to previous example:

  • We took the example of sending 100 emails on the 1st day of every month. Here 100 instances of sendEmailStepFunction will be created, and each will be responsible for delivering its respective email.
  • Suppose the mainFunction fails partway through and restarts. The function starts executing the step functions again from the first one.

So how do we prevent duplication?

  • The step function execution ‘name’ (params.name) must always be unique. So if a step function execution is started again with the same name, AWS returns the error ‘ExecutionAlreadyExists’, and we can catch it, skip that execution, and move ahead.
  • That’s how we stop duplication and make the system idempotent.
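The skip-on-duplicate logic can be sketched as follows. `startExecution` is injected so the logic can be exercised without AWS; with aws-sdk it would be `(params) => new aws.StepFunctions().startExecution(params).promise()`:

```javascript
// Start an execution, treating a duplicate name as "already handled".
async function startOnce(startExecution, params) {
  try {
    await startExecution(params);
    return 'started';
  } catch (err) {
    if (err.code === 'ExecutionAlreadyExists') return 'skipped'; // duplicate name: move ahead
    throw err;
  }
}
```

So when mainFunction restarts and replays the same unique names, every already-started email comes back as 'skipped' instead of being sent twice.
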

Cons of using AWS Step Functions:

  • AWS Step Functions counts each execution and each retry as a separate state transition, and AWS charges per state transition, so this can get expensive.
  • So for large scale we can use an open-source system like Apache Airflow.

Alternative approach (Open Source)

Using Apache Airflow

  • Airflow is a platform created by the community to programmatically author, schedule and monitor workflows.
  • We can programmatically code the desired architecture.
  • We can easily define our own operators & executors and it is highly scalable.

We will come back soon with a blog on implementing the above email system using Apache Airflow.
