MongoDB Indexing Basics with Python

Improve MongoDB performance

Why you should use Indexing in Databases
Faster Searches: Indexes act as efficient lookup structures, enabling rapid retrieval of data matching query criteria.
Optimized Query Execution: With an index, the database can go straight to the matching entries instead of examining every document, which cuts query processing time.
Reduced Disk I/O: Indexes allow the database to locate data directly, minimizing the need for extensive disk scans and improving overall performance.
Improved Concurrency: Queries that finish faster hold resources for less time, which leads to smoother operation when many users access the database at once.
Considerations: Indexing offers significant search performance benefits. However, it requires additional storage space and can slow down write operations, because every insert, update, and delete must also update the index.
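To make the "lookup structure" idea concrete, here is a small pure-Python analogy. It is not MongoDB itself: a sorted list searched with bisect stands in for an index, and a plain loop stands in for a full collection scan. The function names and the step counter are made up for this illustration, and MongoDB's real indexes are B-tree based rather than a sorted list.

```python
import bisect


def scan_lookup(records, name):
    """Check every record in order, like a full collection scan."""
    steps = 0
    for record in records:
        steps += 1
        if record["name"] == name:
            return record, steps
    return None, steps


def build_index(records):
    """Build a sorted list of (name, position) pairs: our stand-in index."""
    return sorted((rec["name"], pos) for pos, rec in enumerate(records))


def index_lookup(records, index, name):
    """Binary-search the sorted keys, then jump straight to the record."""
    i = bisect.bisect_left(index, (name,))
    if i < len(index) and index[i][0] == name:
        return records[index[i][1]]
    return None


def insert_record(records, index, record):
    """Every write must also update the index: this is the maintenance cost."""
    records.append(record)
    bisect.insort(index, (record["name"], len(records) - 1))


records = [{"name": f"user{n:05d}", "age": 20 + n % 50} for n in range(10_000)]
index = build_index(records)

found, steps = scan_lookup(records, "user09999")
print("records examined by the scan:", steps)
print("index finds the same record:", index_lookup(records, index, "user09999") is found)
```

The scan has to walk the whole list to reach the last record, while the binary search needs only a handful of comparisons. The extra list and the insort call on every insert are the storage and write costs mentioned under Considerations.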

Install the Python MongoDB client, pymongo, if you have not already installed it (for example, with pip install pymongo).

I assume you already have a MongoDB server running on your local machine at localhost:27017 (the default host and port).

If you are using MongoDB Atlas, replace the host name with the one Atlas gives you. You may also need to add your IP address to the access list if you have enabled network security.
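For Atlas, the connection string usually has the mongodb+srv:// form and includes a user name and password. Here is a minimal sketch of building one, assuming placeholder credentials and a placeholder cluster host (app_user, p@ss/word and cluster0.example.mongodb.net are all made up, as is the helper name build_atlas_uri). The credentials are percent-escaped with quote_plus so that characters like @ or / do not break the URI.

```python
from urllib.parse import quote_plus


def build_atlas_uri(username, password, cluster_host):
    """Return a mongodb+srv:// URI with percent-escaped credentials."""
    return (
        f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}"
        f"@{cluster_host}/"
    )


# Placeholder values: use the user, password and host from your own Atlas cluster
uri = build_atlas_uri("app_user", "p@ss/word", "cluster0.example.mongodb.net")
print(uri)

# The URI is then passed to the client in place of the localhost one:
# client = pymongo.MongoClient(uri)
```

Everything after the connection step works the same way as with a local server.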

Now connect to the database with the Python MongoDB client

import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017/")  
db = client["mydatabase"] 
collection = db["mycollection"] 

Create the Index

collection.create_index([("name", pymongo.ASCENDING)]) 

Add Data

collection.insert_many([
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Charlie", "age": 35}
])

Test indexing

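import time

# Find "Bob" without the index: drop the index created earlier first
collection.drop_index([("name", pymongo.ASCENDING)])
start_time = time.time()
results = list(collection.find({"name": "Bob"}))  # list() forces the lazy cursor to run
print("Time without index:", time.time() - start_time)

# Find "Bob" with the index: recreate it, then repeat the query
collection.create_index([("name", pymongo.ASCENDING)])
start_time = time.time()
results = list(collection.find({"name": "Bob"}))
print("Time with index:", time.time() - start_time)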

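Run this code more than once to compare the timings. With only three documents the difference will be negligible; the gap becomes visible only on much larger collections.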
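Timing alone can be noisy, so it can also help to look at the query plan. In pymongo, collection.find(...).explain() returns a dictionary describing how the query ran: an IXSCAN stage means an index was used, and COLLSCAN means a full collection scan. Below is a rough sketch of a helper that checks for this. The helper names are my own, and the two sample dictionaries are hand-written, heavily shortened stand-ins for real explain() output, so the code runs without a server.

```python
def plan_stages(plan):
    """Collect every "stage" value found anywhere in a nested explain plan."""
    stages = []
    if isinstance(plan, dict):
        if "stage" in plan:
            stages.append(plan["stage"])
        for value in plan.values():
            stages.extend(plan_stages(value))
    elif isinstance(plan, list):
        for item in plan:
            stages.extend(plan_stages(item))
    return stages


def uses_index(explain_output):
    """Return True if the winning plan contains an index scan (IXSCAN)."""
    winning = explain_output["queryPlanner"]["winningPlan"]
    return "IXSCAN" in plan_stages(winning)


# Hand-written, abbreviated samples (assumed shape, not captured from a server)
with_index = {"queryPlanner": {"winningPlan": {
    "stage": "FETCH",
    "inputStage": {"stage": "IXSCAN", "indexName": "name_1"},
}}}
without_index = {"queryPlanner": {"winningPlan": {"stage": "COLLSCAN"}}}

print("with index:", uses_index(with_index))
print("without index:", uses_index(without_index))

# Against a live server, the real plan would come from:
# uses_index(collection.find({"name": "Bob"}).explain())
```

The exact layout of the explain output varies between MongoDB versions, which is why the helper searches the whole winning plan instead of looking at one fixed key.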

Full code

import pymongo
import time
# Connect to MongoDB
client = pymongo.MongoClient("mongodb://localhost:27017/")
db = client["mydatabase"]
collection = db["mycollection"]

# Create an index on the "name" field
collection.create_index([("name", pymongo.ASCENDING)])

# Insert some documents
collection.insert_many([
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Charlie", "age": 35}
])

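# Query without index (full collection scan): drop the index created above first
collection.drop_index([("name", pymongo.ASCENDING)])
start_time = time.time()
results = list(collection.find({"name": "Bob"}))  # list() forces the lazy cursor to run
print("Query time without index:", time.time() - start_time)

# Query with index: recreate it, then repeat the query
collection.create_index([("name", pymongo.ASCENDING)])
start_time = time.time()
results = list(collection.find({"name": "Bob"}))
print("Query time with index:", time.time() - start_time)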




