![](https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/header-icons/pipeline.png)
Getting Started With CDC Replication from MongoDB
SingleStore's native data replication lets you take a one-time snapshot of your data in MongoDB® and run continuous change data capture (CDC) from MongoDB® to SingleStoreDB. This provides a quick and easy way to replicate data and power up analytics on MongoDB® data.
What you will learn in this notebook:
Set up replication of a collection to SingleStore and watch live updates to the MongoDB® collection replicate to SingleStore.
Install libraries and import modules
In [1]:
!pip3 install pymongo --quiet
import pymongo
import random
Replicate a collection to SingleStore
In [2]:
%%sql
DROP DATABASE IF EXISTS cdcdemo;
CREATE DATABASE cdcdemo;
In [3]:
source_mongo_url = "mongodb+srv://mongo_sample_reader:SingleStoreRocks27017@cluster1.tfutgo0.mongodb.net/?retryWrites=true&w=majority"
Create a link to the source MongoDB
In [4]:
s2client = pymongo.MongoClient(connection_url_kai)  # Initializing client for Kai
s2db = s2client["cdcdemo"]
res = s2db.command("createLink", "mongolink", uri=source_mongo_url)
print(res, res["ok"])
if res["ok"] != 1:
    raise Exception("Failed to create link: %s" % res)
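The `ok` check in the cell above can be factored into a small helper so that any later command reply is validated the same way. This is only a sketch: `ensure_ok` is a hypothetical name, not part of pymongo or SingleStore Kai, and it assumes the reply is a dict-like object with an `ok` field, as printed above.

```python
def ensure_ok(res, action):
    """Raise if a command reply does not report ok == 1.

    `ensure_ok` is a hypothetical helper for this notebook; `res` is assumed
    to be the dict-like reply returned by a call such as `s2db.command(...)`.
    """
    if res.get("ok") != 1:
        raise RuntimeError(f"Failed to {action}: {res}")
    return res


# A plain dict stands in for a real reply here, so no database is needed.
checked = ensure_ok({"ok": 1.0}, "create link")
print(checked)
```

With a live connection, the reply from `s2db.command("createLink", ...)` could be passed to `ensure_ok` in place of the inline `if` check.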
Specify the source database and collection and start replication
In [5]:
create_col_args = {"from": {"link": "mongolink", "database": "cdcdemo", "collection": "scores"}}
res = s2db.create_collection("scores", **create_col_args)
The following command waits until the entire collection from MongoDB has been synced to SingleStore
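If more than one collection is replicated through the same link, the `from` option can be built by a small function instead of being written out each time. The sketch below assumes only the dictionary shape used in the cell above; `replication_options` is a hypothetical helper name.

```python
def replication_options(link, database, collection):
    """Build the keyword arguments that point a new collection at its source.

    Hypothetical helper: the returned shape mirrors `create_col_args` from
    the cell above and nothing more.
    """
    return {"from": {"link": link, "database": database, "collection": collection}}


opts = replication_options("mongolink", "cdcdemo", "scores")
print(opts)

# With the notebook's `s2db` handle, this would be used as:
# s2db.create_collection("scores", **opts)
```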
In [6]:
%%sql
USE cdcdemo;
SYNC PIPELINE scores;
Print some of the replicated documents
In [7]:
s2collection = s2db["scores"]
scores_cursor = s2collection.find().limit(5)
for scores in scores_cursor:
    print(scores)
Total document count
In [8]:
s2collection.count_documents({})
Insert a document in the source MongoDB collection
In [9]:
data = {
    "student_id": random.randint(0, 100),
    "class_id": random.randint(0, 500),
    "exam_score": random.uniform(0, 100),  # Generate random score between 0 and 100 as a double
}
In [10]:
sourceclient = pymongo.MongoClient(source_mongo_url)
sourcecol = sourceclient["cdcdemo"]["scores"]
res = sourcecol.insert_one(data)
In [11]:
sourcecol.count_documents({})
The newly added document is now replicated to SingleStore, increasing the document count by 1 and demonstrating real-time sync.
In [12]:
s2collection.count_documents({})
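Because replication is continuous, the new document may take a moment to appear, so a count taken immediately after the insert might still show the old value. One way to handle this is to poll until the counts match. The sketch below is an assumption-laden illustration: `wait_for_count` is a hypothetical helper, and the timeout and interval defaults are arbitrary.

```python
import time


def wait_for_count(get_count, expected, timeout_s=30.0, interval_s=1.0):
    """Poll get_count() until it returns at least `expected` or time runs out.

    Returns the last observed count. Hypothetical helper; the default
    timeout and interval are arbitrary choices, not recommended values.
    """
    deadline = time.monotonic() + timeout_s
    count = get_count()
    while count < expected and time.monotonic() < deadline:
        time.sleep(interval_s)
        count = get_count()
    return count


# With the notebook's collections, this would be used as:
# target = sourcecol.count_documents({})
# wait_for_count(lambda: s2collection.count_documents({}), target)
```

The helper takes a callable rather than a collection so that it can be exercised without a database connection.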
This native replication capability from SingleStore makes it easy to set up and run continuous data replication from your MongoDB at no additional cost and with no additional infrastructure requirements.
![](https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/singlestore-logo-grey.png)
Details
About this Template
Set up zero-ETL data replication from MongoDB to SingleStore
License
This Notebook has been released under the Apache 2.0 open source license.