
Loading JSON Data File into AWS DynamoDB  


In this project, we will create a DynamoDB table in AWS and populate it with data. We will first create the table through the AWS Management Console, then download a JSON data file and use the AWS SDK for Python (boto3) from a notebook to load the data into DynamoDB.

First, let us look at the solution diagram for the process:

Solution Diagram of Dynamodb Data Loading Process

The main steps are as follows:

  1. Create a DynamoDB table from the AWS Management Console
  2. Create an IAM user to connect AWS with the Python notebook
  3. Download the JSON data (GitHub link given) and load it into the Python notebook
  4. Load the data into DynamoDB with boto3

Step 1: Create DynamoDB table 

  1. Go to “DynamoDB” from the search bar.
  2. Click on “Create Table” in the upper right corner.
  3. As our data is related to books, we will name the table “BooksCatalog”. We will name the partition key “BookID” (the same attribute name the loading code in Step 4 writes) and change its data type to Number, since this is the data type in our JSON data. The sort key is optional, so we will leave it blank.
  4. Scroll down and click “Create Table”.
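
The same table can also be described in code rather than through the console. The following is a minimal sketch, assuming the table name and partition key from this step. The helper names `build_table_spec` and `create_books_table` are hypothetical, and on-demand billing (`PAY_PER_REQUEST`) is an assumption made here so that no capacity values are needed; the console defaults may differ.

```python
def build_table_spec(table_name="BooksCatalog", partition_key="BookID"):
    """Return keyword arguments for DynamoDB create_table (hypothetical helper)."""
    return {
        "TableName": table_name,
        # HASH marks the partition key; no RANGE entry means no sort key.
        "KeySchema": [{"AttributeName": partition_key, "KeyType": "HASH"}],
        # "N" declares the partition key as a number.
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "N"}
        ],
        # Assumption: on-demand billing, so no throughput settings are required.
        "BillingMode": "PAY_PER_REQUEST",
    }


def create_books_table(session):
    """Create the table from an existing boto3.Session and wait until it exists."""
    dynamodb = session.resource("dynamodb")
    table = dynamodb.create_table(**build_table_spec())
    table.wait_until_exists()
    return table
```

Calling `create_books_table(session)` requires a boto3 session with valid credentials, such as the one built in Step 4.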

Step 2: Creating an IAM User

  1. Go to “IAM” from the search bar.
  2. Go to “Users” from the dashboard and click on “Add user”.
  3. Enter a username and check the first credential type (the access key used for programmatic access). Click “Next: Permissions”.
  4. In the permissions, select “Add user to group”, then choose an existing group or create a new group and give it “AdministratorAccess”. Click “Next: Tags”.
  5. The tags are optional, so we can skip them for now. Click “Next: Review”.
  6. Review your settings and click “Create User”.
  7. Now download the CSV file containing the access key ID and secret access key (we will need them to connect third-party applications to our DynamoDB table).
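
Instead of copying the keys by hand, the downloaded CSV can be read directly in the notebook. This is a rough sketch, assuming the file has a header row with columns named `Access key ID` and `Secret access key`; check the header of your own file, since the layout may vary. The function name `read_access_keys` is hypothetical.

```python
import csv
import io


def read_access_keys(csv_file):
    """Return (access_key_id, secret_access_key) from an open credentials CSV.

    Assumption: the header row contains the columns 'Access key ID'
    and 'Secret access key'.
    """
    reader = csv.DictReader(csv_file)
    row = next(reader)  # the file holds a single data row for the new user
    return row["Access key ID"].strip(), row["Secret access key"].strip()


# Stand-in for the downloaded file, using obviously fake values.
sample = io.StringIO("Access key ID,Secret access key\nFAKEKEYID,fakeSecretValue\n")
access_key, secret_access_key = read_access_keys(sample)
```

With a real file, open it with `encoding="utf-8-sig"` so that a leading byte-order mark, if present, does not end up in the first column name.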

Step 3: Upload the Sample JSON Data File to a Python Notebook

  1. The dataset used in this tutorial is available at the following link:

https://github.com/yafra1/JSON-Data/blob/main/BookCatalog

Download or copy/paste it. 

  2. The next step is to load this data into our Python notebook. In this tutorial, we are using Google Colaboratory (Colab), a hosted Jupyter notebook environment, for the Python coding.
  3. Go to https://colab.research.google.com/ and click on “New Notebook”.
  4. Give the notebook a name. To load our JSON data, select “Files” from the left vertical pane. We can see a folder named “sample_data”. Click on the three vertical dots to the right of this folder to bring up its options, and choose “New File”.
  5. Name the new file “BookCatalog.json”. Double-clicking this file opens it in an editor on the right side of the notebook, where we can paste our JSON dataset.
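
Before loading anything into DynamoDB, it can help to confirm that the pasted file parses and has the expected shape. The sketch below assumes the file holds a JSON array of objects, which is what the loading loop in Step 4 expects. The function name `load_records` is hypothetical. Floats are parsed as `Decimal` because the boto3 resource interface rejects Python `float` values.

```python
import json
from decimal import Decimal


def load_records(path):
    """Load a JSON array of objects, parsing floats as Decimal for DynamoDB."""
    with open(path, "r", encoding="utf-8") as datafile:
        records = json.load(datafile, parse_float=Decimal)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of book objects")
    for position, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(f"item {position} is not a JSON object")
    return records
```

In Colab this would be called as `load_records('/content/sample_data/BookCatalog.json')`, the same path used in Step 4.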

Step 4: Load Data into DynamoDB

  1. Now we need to write the code that loads our data into the DynamoDB table. The code is as follows:
# Install the boto3 library (notebook shell command)
!pip install boto3

# Import boto3 and json
import boto3
import json

# Specify the access key ID and secret access key from the IAM user's CSV file
access_key = "your access key here"
secret_access_key = "your secret access key here"

# Create a boto3 session; set region_name to the region that holds the table
session = boto3.Session(
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_access_key,
    region_name='us-east-1',
)
dynamodb = session.resource('dynamodb')

# Reference the DynamoDB table created in Step 1
table = dynamodb.Table('BooksCatalog')

# Read the JSON file (a list of book objects)
with open('/content/sample_data/BookCatalog.json', 'r') as datafile:
    records = json.load(datafile)

# Iterate through the JSON items, using the loop index as the BookID partition key
for count, item in enumerate(records):
    item['BookID'] = count
    print(item)
    # table.put_item() inserts one item into the DynamoDB table
    response = table.put_item(Item=item)

  2. In this code, replace the values of the access_key and secret_access_key variables with the credentials from the CSV file we downloaded when creating the IAM user. Also set region_name to the region in which the table was created.
  3. After running the code, the data should be populated in our DynamoDB table. To check this, go to the DynamoDB dashboard, click on the table name and click on “Explore Table Items”.
  4. We can see the attributes and data populated in the table.
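
For larger files, writing one item per request can be slow. One alternative is the table's `batch_writer()`, which buffers items and sends them in batches. The sketch below shows that approach together with a `get_item` lookup as a programmatic version of the console check. The function names `write_records` and `fetch_book` are hypothetical, and both assume a `table` object like the one created in the Step 4 code.

```python
def write_records(table, records, key_name="BookID"):
    """Write records through a batch writer, numbering them from 0 as BookID."""
    written = 0
    with table.batch_writer() as batch:
        for book_id, record in enumerate(records):
            item = dict(record)  # copy so the caller's data is left untouched
            item[key_name] = book_id
            batch.put_item(Item=item)
            written += 1
    return written


def fetch_book(table, book_id, key_name="BookID"):
    """Return the stored item for one BookID, or None if it does not exist."""
    response = table.get_item(Key={key_name: book_id})
    # get_item omits the "Item" key when no matching item is found
    return response.get("Item")
```

For example, `write_records(table, records)` followed by `fetch_book(table, 0)` should return the first book that was loaded.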

I hope this simple lab is useful for anyone looking to load JSON data files quickly into AWS DynamoDB.
