WikiGalaxy

Personalize

Database Concepts in System Design

Normalization:

Normalization is the process of organizing data to minimize redundancy. It involves dividing a database into two or more tables and defining relationships between the tables.

Denormalization:

Denormalization is the process of combining tables to improve read performance. This is often used in data warehousing where query speed is crucial.

ACID Properties:

ACID stands for Atomicity, Consistency, Isolation, Durability. These properties ensure reliable processing of database transactions.

CAP Theorem:

CAP Theorem states that a distributed database system can only provide two out of the three guarantees: Consistency, Availability, and Partition Tolerance.

Indexing:

Indexing improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space.

Sharding:

Sharding involves partitioning a database into smaller, faster, more easily managed parts called shards.

Normalization

First Normal Form (1NF):

Ensures that the values in a table are atomic and each column contains unique data.

Second Normal Form (2NF):

Builds on 1NF by ensuring that all non-key attributes are fully functional dependent on the primary key.

Third Normal Form (3NF):

Ensures that all attributes are only dependent on the primary key, removing transitive dependencies.


      // Example of a table in 1NF
      Table: Students
      +---------+-----------+-------------------+
      | ID      | Name      | Courses           |
      +---------+-----------+-------------------+
      | 1       | Alice     | Math, Science     |
      | 2       | Bob       | English, History  |
      +---------+-----------+-------------------+
    

Explanation:

The table above violates 1NF because the 'Courses' column contains multiple values. To convert it into 1NF, we should split the values into separate rows.

Denormalization

Purpose:

Denormalization is used to improve the read performance of a database at the expense of write performance and storage.

Use Cases:

Commonly used in OLAP systems where complex queries are performed on large volumes of data.


      // Denormalized table example
      Table: Orders
      +---------+-----------+-------------------+
      | OrderID | Customer  | ProductDetails    |
      +---------+-----------+-------------------+
      | 101     | John Doe  | TV, 2, $400       |
      | 102     | Jane Doe  | Laptop, 1, $800   |
      +---------+-----------+-------------------+
    

Explanation:

In the denormalized table above, 'ProductDetails' combines multiple columns into one, making it easier to read but harder to update.

ACID Properties

Atomicity:

Ensures that each transaction is treated as a single unit, which either succeeds completely or fails completely.

Consistency:

Ensures that a transaction can only bring the database from one valid state to another, maintaining database invariants.

Isolation:

Ensures that concurrent execution of transactions leaves the database in the same state as if the transactions were executed sequentially.

Durability:

Ensures that once a transaction has been committed, it will remain so, even in the event of a power loss, crash, or error.


      // Pseudo-code illustrating ACID properties
      BEGIN TRANSACTION
        UPDATE Account SET Balance = Balance - 100 WHERE AccountID = 1;
        UPDATE Account SET Balance = Balance + 100 WHERE AccountID = 2;
      COMMIT;
    

Explanation:

The pseudo-code demonstrates a simple bank transfer operation where ACID properties ensure that the transaction is atomic, consistent, isolated, and durable.

CAP Theorem

Consistency:

Every read receives the most recent write or an error.

Availability:

Every request receives a response, without guarantee that it contains the most recent write.

Partition Tolerance:

The system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes.


      // Example of CAP theorem in distributed systems
      // Choose two: Consistency, Availability, Partition Tolerance
    

Explanation:

In distributed databases, it's impossible to achieve all three guarantees simultaneously. Systems must choose which two to prioritize based on their specific needs.

Indexing

Purpose:

Indexes are used to quickly locate data without having to search every row in a database table every time a database table is accessed.

Types:

Common types include B-tree indexes and hash indexes, each suitable for different types of queries.


      // SQL command to create an index
      CREATE INDEX idx_customer_name ON Customers (Name);
    

Explanation:

The SQL command above creates an index on the 'Name' column of the 'Customers' table, improving the speed of queries searching by customer name.

Sharding

Purpose:

Sharding helps in scaling a database by distributing the data across multiple machines, allowing for more efficient querying and storage.

Types:

Horizontal sharding splits data across rows, while vertical sharding splits data across columns.


      // Example of horizontal sharding
      // Shard 1: User data for users with ID 1-1000
      // Shard 2: User data for users with ID 1001-2000
    

Explanation:

The example demonstrates horizontal sharding where user data is distributed across multiple shards based on user IDs, allowing for better load balancing and performance.

logo of wikigalaxy

Newsletter

Subscribe to our newsletter for weekly updates and promotions.

Privacy Policy

 • 

Terms of Service

Copyright © WikiGalaxy 2025