Data Modeling in DynamoDB

/ Work

Let's start by saying this, Dynamo is fun. It is for everyone but not everyone is for the dynamo. A large part of the problem starts with inaccurate modeling of the dynamo. This article highlights the points I learned while working with Dynamo and we will use examples so it becomes easier to understand.

Understanding the Basics

Unlike relational databases such as MySQL, Dynamo requires you to constantly ask questions about how the data will be queried. Asking these questions leads you down the path of how the entity is stored and how to split them up in a way that enables speedy queries.

 

The first step is to create primary keys for your items. The primary key is composed of a partition key and a sort key. (Sort key is optional but you will want to leverage a sort key.)

Rather than going into the details for both, let just focus on data modeling.

Let’s say that you are designing an application where you need to store information about the new accounts and their relationships. We could say that each account has users and projects. The accounts would also have basic information like AccountName.

 

When you are designing the dynamo you need to start breaking things into a hierarchy. We know without an Account there won't be users or projects and without users, there won't be projects.

The simplest way to start modeling DynamoDB is to start with two minimal keys Partition Key (PK) and SortKey(SK)

 

Your AccountId should be the ultimate source of truth for all the relations. So we could say AccountId is always your PK.

 

The SK will change based on what information you are sorting, so if you sorting user information then your SK would be UserId, if you are storing Project Information then your SK would be ProjectId.

 

So let's model the minimal table for your structure

As you can see each SK is prefixed with a certain key, that key comes in handy while creating your GSI for data retrieval. Now let's take a step forward and create additional attributes for storing the metadata about each entity.

Once you create the additional attributes, your Dynamo Tables look like this.

Now notice how the key "name" is used for all three entity but the meaning for key changes based on SK.

At this point, your aggregated table looks like this

So considering your AccountId being your primary source of retrieval, You can create the following set of queries.

/** get all user by account **/
 let params = {
      TableName: "feedboard",
      KeyConditionExpression: "PK=:PK And begins_with(SK,:SK)",
      ExpressionAttributeValues: {
        ":PK": accountId,
        ":SK": "user",
      },

};

/** get all projects by account **/
let params = {
      TableName: "feedboard",
      KeyConditionExpression: "PK=:PK And begins_with(SK,:SK)",
      ExpressionAttributeValues: {
        ":PK": accountId,
        ":SK": "project",
      },


};

But the thing is, the system does not usually start handling you over the AccountId, right. The first thing you need is an authorization token where your AccountId and UserId are encoded, so all your subsequent request using the token can easily be decoded.

 

So let's start with Login. Login is usually made using a username and password. So what we will need here is to know how we can retrieve the information based on Username. To do so, we are going to create a GSI on the Username attribute and we will make SK as SortKey for our newly defined GSI.

After defining a GSI on Username, we now have two tables

1) The primary table that you see in Dynamo Console

2) The Virtual table created by your GSI

Your stored data attributes for the GSI looks like this

So, our query to retrieve the user by Username becomes

    let params = {
      TableName: `feedboard`,
      KeyConditionExpression: "username=:username And begins_with(SK,:SK)",
      IndexName: "query_by_username",
      ExpressionAttributeValues: {
        ":username": username,
        ":SK": `user`,
      },

};

You now have AccountId(PK), Prefix@UserId(SK), and User information. You would like to split your SK and just retrieve the userId alone. From here, you can encode the AccountId and UserId in your JWT Token and start building your other queries

This is all for today. I am open to everyone's suggestions.

Let me know your thoughts and queries.

 
Mahendra Rathod
Developer from 🇮🇳
@maddygoround
© 2024 Mahendra Rathod · Source