Loading JSON Data File into AWS DynamoDB
In this project, we will look at how to create a DynamoDB table in AWS and populate it with data. We will first create the table through the AWS Management Console, download a JSON data file, and then use the AWS Python SDK (boto3) from a Python notebook to load the data into DynamoDB.
First, let us look at the solution diagram for the process:
Following are the main steps:
- Create a DynamoDB table from the AWS Management Console
- Create an IAM user to connect AWS with a Python notebook
- Download the JSON data (GitHub link given) and load it into the Python notebook
- Load the data into DynamoDB through the Python boto3 SDK
Step 1: Create DynamoDB table
- Go to “DynamoDB” from the search bar.
- Click on “Create table” in the upper right corner.
- As our data is related to books, we will name the table “BooksCatalog”. We will name the partition key “BookID” and change its data type to Number (the IDs we insert from our code will be numeric). The sort key is optional, so we will leave it empty.
- Scroll down and click “Create table”.
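The same table can also be created programmatically with boto3 instead of the console. The sketch below is a hedged alternative, not part of the console flow above: the table name and BookID partition key match this tutorial, while the PAY_PER_REQUEST billing mode and the hard-coded region are assumptions suitable for a small demo table.

```python
# Sketch: creating the BooksCatalog table with boto3 instead of the console.
# Table name and key match this tutorial; BillingMode is an assumption.
TABLE_SPEC = {
    "TableName": "BooksCatalog",
    "KeySchema": [{"AttributeName": "BookID", "KeyType": "HASH"}],  # partition key only, no sort key
    "AttributeDefinitions": [{"AttributeName": "BookID", "AttributeType": "N"}],  # N = Number
    "BillingMode": "PAY_PER_REQUEST",
}

def create_books_table():
    """Create the table and block until it is ACTIVE (requires valid AWS credentials)."""
    import boto3  # imported here so TABLE_SPEC can be inspected without boto3 installed
    dynamodb = boto3.resource("dynamodb", region_name="us-east-1")  # assumed region
    table = dynamodb.create_table(**TABLE_SPEC)
    table.wait_until_exists()
    return table
```

Either way you create the table, the rest of the tutorial is unchanged.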
Step 2: Creating an IAM User
- Go to “IAM” from the search bar.
- Go to “Users” from the dashboard and click on “Add user”.
- Select a username and check the first credential type (programmatic access, which issues an access key ID and secret access key). Click “Next: Permissions”.
- Under permissions, select “Add user to group” and choose an existing group, or create a group with the “AdministratorAccess” policy. Click “Next: Tags”.
- Tags are optional, so we can skip them for now. Click “Next: Review”.
- Review your settings and click “Create user”.
- Now download the .csv file containing the access key ID and secret access key (we will need them to connect third-party applications to our AWS DynamoDB).
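Before moving on, it can save debugging time to confirm the downloaded keys actually work. This is an optional sketch, not part of the original steps; it uses AWS STS to report which identity the keys belong to, and the default region here is an assumption.

```python
# Sketch: verifying the IAM keys from the downloaded .csv before touching DynamoDB.
# Replace the placeholders with your own values; raises an error if the keys are invalid.
def whoami(access_key, secret_access_key, region="us-east-1"):
    """Return the ARN of the caller identity behind these credentials."""
    import boto3
    session = boto3.Session(
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_access_key,
        region_name=region,
    )
    return session.client("sts").get_caller_identity()["Arn"]
```

Calling `whoami("your access key here", "your secret access key here")` should print an ARN ending in the username you just created.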
Step 3: Upload Sample Json data file to Python Notebook
- The dataset used in this tutorial is available at the following link:
Download or copy/paste it.
- The next step is to load this data into our Python notebook. In this tutorial, we are using a Google Colaboratory (Colab) Jupyter notebook for the Python code.
- Go to https://colab.research.google.com/ and click on “New Notebook”
- Give the notebook a name. To load our JSON data, select “Files” from the left vertical pane, where we can see a folder named “sample_data”. Click the three vertical dots to the right of this folder to bring up its options, and choose “New file”.
- Name the new file “BookCatalog.json”. Double-clicking this file opens it on the right side of the notebook, where we can paste our JSON dataset.
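If you prefer to create the file from code rather than pasting into the editor, a few lines of standard-library Python will do. The two records below are hypothetical placeholders standing in for the downloaded dataset, and their field names are illustrative only, not the tutorial's actual schema.

```python
import json

# Hypothetical sample records standing in for the downloaded dataset.
sample_books = [
    {"Title": "Example Book One", "Author": "A. Writer"},
    {"Title": "Example Book Two", "Author": "B. Author"},
]

# Colab path that the loader in Step 4 reads from.
catalog_path = "/content/sample_data/BookCatalog.json"

def write_catalog(records, path):
    """Write the records as one JSON array, the shape json.load() expects later."""
    with open(path, "w") as f:
        json.dump(records, f, indent=2)
```

Running `write_catalog(sample_books, catalog_path)` in Colab creates the same file the manual steps above produce.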
Step 4: Load Data to Dynamodb
- Now we need to write the code that loads our data into the DynamoDB table. Following is the code for that:
```python
# install the boto3 library (run once in the notebook)
!pip install boto3

# import boto3 and json
import boto3
import json

# specify the access key and secret access key
access_key = "your access key here"
secret_access_key = "your secret access key here"

# open a boto3 session with those credentials
session = boto3.Session(
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_access_key,
    region_name="us-east-1",
)
client_dynamo = session.resource("dynamodb")

# dynamodb table created in Step 1
table = client_dynamo.Table("BooksCatalog")

# read the json file uploaded in Step 3
with open("/content/sample_data/BookCatalog.json", "r") as datafile:
    records = json.load(datafile)

# iterate through the JSON items, assigning a sequential BookID to each,
# and insert them one by one with table.put_item()
count = 0
for i in records:
    i["BookID"] = count
    print(i)
    response = table.put_item(Item=i)
    count += 1
```
- In this code, we need to update the access_key and secret_access_key variables with our own credentials from the .csv file we downloaded when creating the IAM user. We also need to update region_name to match the region where the table was created.
- Now we can see the data populated in our DynamoDB table. To check this, go to the DynamoDB dashboard, click on the table name, and click “Explore table items”.
- We can see the attributes and data populated in the table.
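The same check can be done programmatically rather than through the console. This is a hedged sketch, not part of the original steps: it counts the items with a paginated scan, using the same `table` object that Step 4's code creates. A full scan is fine for a small demo table like this one.

```python
# Sketch: counting the items in the table instead of eyeballing the console view.
# Pass in the boto3 Table object from Step 4, e.g. count_items(table).
def count_items(table):
    """Scan the whole table and return how many items it holds."""
    response = table.scan(Select="COUNT")
    total = response["Count"]
    # DynamoDB paginates scans; follow LastEvaluatedKey until the scan completes.
    while "LastEvaluatedKey" in response:
        response = table.scan(Select="COUNT", ExclusiveStartKey=response["LastEvaluatedKey"])
        total += response["Count"]
    return total
```

The returned count should equal the number of records in BookCatalog.json.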
Hopefully this simple lab is useful for anyone looking to quickly load JSON data files into AWS DynamoDB.