Document Stores (NoSQL)

 

What is a Document Store?

A document store is a NoSQL database that stores data as structured documents (rather than simple key–value pairs).

👉 Unlike key-value stores:

  • The value is structured and searchable

  • You can query based on the content inside documents


Core Idea

“Instead of retrieving data only by key, document stores allow you to search and retrieve data based on any field within the document.”
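The difference between the two access styles can be sketched with a small in-memory example. This is only an illustration: the `documents` data and the helper names `get_by_key` and `find_by_field` are assumptions made up for this sketch, not part of any real document store API.

```python
from typing import Any, Optional

_MISSING = object()  # sentinel for "this field does not exist"

# Hypothetical sample data, keyed the way a key-value store would key it.
documents = {
    "p1": {"name": "Dan", "address": {"city": "Minneapolis"}},
    "p2": {"name": "Ann", "address": {"city": "Chicago"}},
    "p3": {"name": "Raj", "address": {"city": "Minneapolis"}},
}


def get_by_key(key: str) -> Optional[dict]:
    """Key-value style access: the caller must already know the key."""
    return documents.get(key)


def find_by_field(path: list, expected: Any) -> list:
    """Document style access: keys of all documents whose field at `path` equals `expected`."""
    matches = []
    for key, doc in documents.items():
        node: Any = doc
        for step in path:
            # Descend one level; stop if the field is absent.
            node = node.get(step, _MISSING) if isinstance(node, dict) else _MISSING
            if node is _MISSING:
                break
        if node is not _MISSING and node == expected:
            matches.append(key)
    return matches


in_minneapolis = find_by_field(["address", "city"], "Minneapolis")
```

The scan here is linear for simplicity; a real document store would normally answer the same query from an index.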


Data Model: Tree Structure

Document stores organize data as a tree-like structure:

  • Root → starting point

  • Branches / Sub-branches → nested elements

  • Leaves → actual data values





📌 Example Structure

Person
├── id: 123
├── name: Dan
└── address
    ├── city: Minneapolis
    └── street
        ├── number: 107
        └── name: Main

👉 Data is stored hierarchically (like XML/JSON)
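As a rough sketch, the same Person tree can be written as JSON and walked from the root down to its leaves. The helper name `leaves` is an assumption for this example only.

```python
import json

# The Person example as a JSON document (root -> branches -> leaves).
person_json = """
{
  "id": 123,
  "name": "Dan",
  "address": {
    "city": "Minneapolis",
    "street": {"number": 107, "name": "Main"}
  }
}
"""


def leaves(node, path=()):
    """Yield (path, value) for every leaf below `node`."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from leaves(child, path + (key,))
    elif isinstance(node, list):
        for position, child in enumerate(node):
            yield from leaves(child, path + (position,))
    else:
        # Anything that is not a dict or list is a data value, i.e. a leaf.
        yield path, node


person = json.loads(person_json)
leaf_list = list(leaves(person))

for path, value in leaf_list:
    print("/".join(str(step) for step in path), "=", value)
```

Each printed line pairs a branch path with the leaf value stored at its end.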



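🔍 Key Feature: Full Content Querying

  • Many document stores index all document content automatically; others (MongoDB, for example) index only the document key by default and need explicit indexes on other fields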

  • You can search by:

    • Any field

    • Any value

    • Any nested structure

📌 Example:

  • Find all documents containing “Robert E. Lee”

  • Retrieve only specific parts of documents
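One way content indexing could work is an inverted index that maps every leaf value to the documents and paths where it occurs. The sketch below is a toy version under that assumption; the sample `docs` and the names `iter_leaves` and `build_index` are hypothetical and do not describe any particular product's index.

```python
from collections import defaultdict

# Hypothetical sample documents.
docs = {
    "d1": {"title": "Civil War generals", "people": ["Robert E. Lee", "Ulysses S. Grant"]},
    "d2": {"title": "Appomattox", "summary": {"surrendered": "Robert E. Lee"}},
    "d3": {"title": "Gettysburg Address", "people": ["Abraham Lincoln"]},
}


def iter_leaves(node, path=()):
    """Yield (path, value) for every leaf value in a nested document."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_leaves(child, path + (key,))
    elif isinstance(node, list):
        for position, child in enumerate(node):
            yield from iter_leaves(child, path + (position,))
    else:
        yield path, node


def build_index(documents):
    """Map each leaf value to the set of (doc_id, path) pairs where it appears."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for path, value in iter_leaves(doc):
            index[value].add((doc_id, path))
    return index


index = build_index(docs)

# "Find all documents containing Robert E. Lee"
hits = sorted({doc_id for doc_id, _ in index["Robert E. Lee"]})
```

Because the index also records the path of each match, a query can return just the matching part of a document instead of the whole thing.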


⚙️ Document Path (Access Mechanism)

Document stores use path expressions to access data:

People/Person[id='123']/Address/Street/StreetName

👉 This acts like a key to locate specific data within a document
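The path above can be tried out with Python's standard `xml.etree.ElementTree` module, which supports a limited subset of XPath. The XML sample is an assumption built to match the path; only the element names that appear in the path come from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data shaped to fit the path expression.
xml_text = """
<People>
  <Person>
    <id>123</id>
    <name>Dan</name>
    <Address>
      <City>Minneapolis</City>
      <Street>
        <StreetNumber>107</StreetNumber>
        <StreetName>Main</StreetName>
      </Street>
    </Address>
  </Person>
  <Person>
    <id>456</id>
    <name>Ann</name>
    <Address>
      <City>Chicago</City>
      <Street>
        <StreetNumber>9</StreetNumber>
        <StreetName>Lake</StreetName>
      </Street>
    </Address>
  </Person>
</People>
""".strip()

root = ET.fromstring(xml_text)  # root is the <People> element

# ElementTree paths are relative to the element they are called on,
# so the leading "People/" step is dropped here.
match = root.find("Person[id='123']/Address/Street/StreetName")
street_name = match.text if match is not None else None
print(street_name)
```

The predicate `[id='123']` selects the Person whose `id` child element has the text 123, and the remaining steps walk down to a single leaf.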





📂 Document Collections

📌 What are Collections?

  • Groups of related documents

  • Similar to:

    • Folders (file systems)

    • Tables (loosely, in RDBMS)


✔ Uses of Collections:

  • Organize documents

  • Manage permissions

  • Define indexes and rules

  • Navigate large datasets

👉 Collections can contain:

  • Documents

  • Other collections (hierarchical structure)
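A minimal sketch of this folder-like nesting, assuming a hypothetical `Collection` class (real stores expose collections through their own APIs):

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """A named group holding documents and nested sub-collections."""
    name: str
    documents: dict = field(default_factory=dict)  # document name -> content
    children: dict = field(default_factory=dict)   # sub-collection name -> Collection

    def child(self, name):
        """Return the named sub-collection, creating it if needed."""
        if name not in self.children:
            self.children[name] = Collection(name)
        return self.children[name]

    def resolve(self, path):
        """Follow a slash-separated path such as 'hr/people' to a sub-collection."""
        node = self
        for step in filter(None, path.split("/")):
            node = node.children[step]
        return node

    def walk(self, prefix=""):
        """Yield the full path of every document at or below this collection."""
        here = f"{prefix}/{self.name}"
        for doc_name in self.documents:
            yield f"{here}/{doc_name}"
        for sub in self.children.values():
            yield from sub.walk(here)


db = Collection("db")
people = db.child("hr").child("people")
people.documents["dan.json"] = {"id": 123, "name": "Dan"}
db.child("catalog").documents["widget.json"] = {"sku": "W-1"}

all_paths = sorted(db.walk())
```

Permissions or index rules could then be attached to a `Collection` node and apply to everything beneath it, which is the role collections play in the list of uses above.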


📦 Application Collections

Document stores can also store entire applications:

  • Code (scripts)

  • Data

  • Configuration

👉 Example:

  • Application packages such as .xar files (the package format used by eXist-db)

✔ This makes document stores:

  • Not just databases

  • But also application platforms


⚙️ Document Store APIs

  • Provide simple query interfaces

  • Use:

    • Path expressions

    • Predicates (filters)

👉 Example:

Person[7]

→ Selects the 7th person
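Positional and filter predicates can be tried with Python's standard `xml.etree.ElementTree` module, whose XPath subset counts positions from 1. The ten generated Person elements are made-up sample data.

```python
import xml.etree.ElementTree as ET

# Build a small <People> document with ten <Person> children.
root = ET.Element("People")
for n in range(1, 11):
    person = ET.SubElement(root, "Person")
    ET.SubElement(person, "id").text = str(100 + n)
    ET.SubElement(person, "name").text = f"Person {n}"

seventh = root.find("Person[7]")         # positional predicate: the 7th Person
last = root.find("Person[last()]")       # positional predicate: the final Person
by_id = root.find("Person[id='103']")    # filter predicate on a child element's text

print(seventh.findtext("name"))
print(last.findtext("name"))
print(by_id.findtext("name"))
```

Note that positions start at 1, so `Person[7]` is the seventh element rather than the eighth.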


🧾 Document Formats

1. JSON (Lightweight)

  • Used in systems like MongoDB

  • Supports nested structures

  • Good for application data


2. XML (Complex Content)

  • Supports:

    • Rich text

    • Attributes (bold, links, etc.)

  • Used in native XML document stores (for example, eXist-db and MarkLogic)
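The contrast between the two formats can be sketched with Python's standard `json` and `xml.etree.ElementTree` modules. Both snippets of sample data are assumptions made for this example.

```python
import json
import xml.etree.ElementTree as ET

# JSON: nested application data, every value sits under a named field.
profile = json.loads('{"name": "Dan", "address": {"city": "Minneapolis"}}')
city = profile["address"]["city"]

# XML: mixed content, where text and markup (bold, links) are interleaved.
para = ET.fromstring(
    '<p>Read the <b>full</b> report <a href="report.html">here</a>.</p>'
)
plain_text = "".join(para.itertext())    # the text with the markup stripped
link_target = para.find("a").get("href")  # an attribute on the link element

print(city)
print(plain_text)
print(link_target)
```

The XML paragraph keeps the running text and its formatting in one structure, which is awkward to express in JSON and is why XML suits rich-text content.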


Comparison with Key-Value Stores

Feature        Key-Value Store        Document Store
Structure      Unstructured (BLOB)    Structured documents
Querying       By key only            By content
Indexing       Limited                Automatic indexing
Flexibility    High                   Very high

Real-World Implementations

🔹 MongoDB

  • Designed for:

    • High-performance applications

    • Real-time systems (e.g., ad serving)

  • Features:

    • Auto-sharding

    • Replication

    • Load balancing
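The idea behind sharding can be sketched as routing each document to one of several partitions based on a hash of a chosen key. This is a simplified illustration and not MongoDB's actual algorithm; the function names, the `user_id` shard key, and the fixed shard count are all assumptions.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Pick a shard from a stable hash of the key (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Three in-memory dicts stand in for three shard servers.
shards = [dict() for _ in range(3)]


def insert(doc: dict) -> None:
    """Store the document on the shard chosen by its user_id."""
    shards[shard_for(doc["user_id"], len(shards))][doc["user_id"]] = doc


def find(user_id: str):
    """Look only at the one shard that can hold this user_id."""
    return shards[shard_for(user_id, len(shards))].get(user_id)


for n in range(5):
    insert({"user_id": f"u{n}", "name": f"User {n}"})

found = find("u3")
```

A lookup touches a single shard, which is what lets reads and writes spread across machines. A real system must also move data when shards are added or removed, which this sketch ignores.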


🔹 CouchDB

  • Focus on:

    • Distributed systems

    • Data synchronization

  • Features:

    • Document versioning

    • Offline sync

    • High reliability
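Document versioning can be illustrated with revision-checked updates: a write succeeds only if the writer names the revision it last read. The sketch is loosely modelled on the revision token (`_rev`) that CouchDB attaches to each document, but the `VersionedStore` class and its integer revisions are simplifying assumptions, not CouchDB's API.

```python
class ConflictError(Exception):
    """Raised when an update is based on a stale revision."""


class VersionedStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (revision number, body)

    def get(self, doc_id):
        """Return (revision, copy of body) for a stored document."""
        rev, body = self._docs[doc_id]
        return rev, dict(body)

    def put(self, doc_id, body, rev=None):
        """Create or update a document; `rev` must match the stored revision."""
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        if rev != current_rev:
            raise ConflictError(f"{doc_id}: expected rev {current_rev}, got {rev}")
        new_rev = (current_rev or 0) + 1
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev


store = VersionedStore()
r1 = store.put("dan", {"city": "Minneapolis"})         # create
r2 = store.put("dan", {"city": "St. Paul"}, rev=r1)    # update based on latest revision

try:
    store.put("dan", {"city": "Duluth"}, rev=r1)       # stale revision
    conflict = False
except ConflictError:
    conflict = True
```

Rejecting stale writes is what allows two replicas that were edited offline to detect, during synchronization, that they changed the same document.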


🎯 Use Cases

1. 📢 Ad Serving Systems

  • Real-time selection of ads

  • Personalized content delivery


2. 📰 Content Management

  • Store web pages, blogs, media


3. 📊 Real-Time Analytics

  • User behavior tracking

  • Social media monitoring


4. 🛒 Product Data Management

  • Complex product catalogs

  • Variable product attributes
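Variable product attributes are easy to picture as documents that share a few common fields and otherwise differ. The catalog entries and the helper `with_attribute` below are hypothetical sample data for this sketch.

```python
# Each product keeps only the attributes that make sense for it.
catalog = [
    {"sku": "TS-1", "type": "shirt", "size": "M", "color": "blue"},
    {"sku": "LP-9", "type": "laptop", "ram_gb": 16, "ports": ["usb-c", "hdmi"]},
    {"sku": "BK-3", "type": "book", "author": "A. Author", "pages": 312},
]


def with_attribute(products, attribute):
    """Return the SKUs of products that define the given attribute."""
    return [p["sku"] for p in products if attribute in p]


sized = with_attribute(catalog, "size")
has_ports = with_attribute(catalog, "ports")

print(sized)
print(has_ports)
```

In a relational table the same catalog would need either many mostly empty columns or a separate attribute table, which is the schema flexibility this use case relies on.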


5. 👤 User Data Management

  • Profiles in:

    • Social networks

    • Gaming systems


6. 📡 High-Volume Data Feeds

  • Streaming data ingestion

  • Event-driven systems


⚖️ Advantages

  • Flexible schema (no fixed structure)

  • Powerful querying on document content

  • Efficient for hierarchical/nested data

  • Scalable and distributed


⚠️ Limitations

  • Large indexes → higher storage cost

  • Complex queries may impact performance

  • Less schema enforcement than relational systems, so validation often falls to the application


Summary

“A document store is a flexible NoSQL database that stores data as structured documents (like JSON or XML), allowing efficient querying of any field within the document and making it ideal for applications with complex, evolving data.”


Key Takeaways

  • Stores structured, hierarchical documents

  • Supports content-based queries

  • Uses collections for organization

  • Ideal for real-time, scalable applications
