Skip to main content

Chapter 1: DBMS

DATABASE MANAGEMENT SYSTEM

1.1 Introduction to Data, Database, Database System, and DBMS

Data refers to raw facts and figures without context, which can be processed to generate meaningful information.Database is a structured collection of data that is stored and accessed electronically. It allows for efficient data management and retrieval.Database System combines the database and the Database Management System (DBMS), which is software used to manage and manipulate the database.DBMS (Database Management System) is a software tool that facilitates the creation, retrieval, updating, and management of data in a database. It ensures data integrity, security, and consistency while providing a user-friendly interface for interacting with the data.

1.2 Field, Record, Objects, Primary Key, Alternate Key, Candidate Key

  • Field: A field is a single piece of data; it represents a column in a database table. For example, "First Name" in a student database.
  • Record: A record is a collection of related fields that represent a single entity. For instance, a record for a student might include fields for "First Name," "Last Name," "Age," and "Grade."
  • Objects: In the context of databases, objects can refer to any defined entity within the database, such as tables, views, indexes, etc.
  • Primary Key: A primary key is a unique identifier for a record in a table. It ensures that no two records can have the same value for this field, such as a student ID.
  • Alternate Key: An alternate key is any candidate key that is not chosen as the primary key. It can also uniquely identify a record, such as an email address.
  • Candidate Key: A candidate key is a field or combination of fields that can uniquely identify a record in a table. There can be multiple candidate keys in a table.

1.3 Advantages of Using DBMS

  1. Data Independence: Applications are insulated from changes in data representation and storage.
  2. Efficient Data Access: DBMS uses sophisticated algorithms for data storage and retrieval.
  3. Data Integrity and Security: Enforces constraints to maintain data accuracy and restricts unauthorized access.
  4. Data Administration: Centralizes data management, improving consistency and reducing redundancy.
  5. Backup and Recovery: Provides mechanisms for data backup and recovery in case of failures.
  6. Concurrent Access: Allows multiple users to access the database simultaneously without conflicts.

Differences between DBMS and Flat file system :

FeatureFlat File SystemDatabase Management System (DBMS)
Data StructureData is stored in a single table or file, often in plain text.Data is organized into tables with rows and columns.
RelationshipsNo relationships between data entries; data is independent.Supports relationships between tables through keys (e.g., foreign keys).
Data RedundancyHigh redundancy; the same data may be repeated across files.Low redundancy; data is centralized and normalized.
Data IntegrityLow integrity; lacks mechanisms to enforce data accuracy.High integrity; constraints can be applied to ensure data consistency.
Query CapabilityLimited querying capabilities; often requires reading entire files.Advanced querying capabilities using SQL for efficient data retrieval.
ScalabilityLimited scalability; becomes cumbersome with large data sets.Highly scalable; can handle large volumes of data efficiently.
SecurityBasic security; less control over access and permissions.Robust security features, including user authentication and access control.
Backup and RecoveryManual backup processes; recovery can be difficult.Automated backup and recovery options are available.
ConcurrencyLimited or no support for multiple users accessing data simultaneously.Supports multiple users accessing and modifying data concurrently.


Differences between Database and DBMS.

FeatureDatabaseDBMS
DefinitionA collection of organized data that can be easily accessed, updated, and managed.Software used to create, manage, and maintain a database.
Data StructureData is stored in tables, files, or other formats.Provides a structured way to store and manage data.
Data ManipulationUsers directly interact with the data stored in the database.Provides an interface for users to interact with the database.
SecurityRelies on the DBMS for security features.Enforces security measures like user authentication and access control.
Backup and RecoveryDepends on the DBMS for backup and recovery mechanisms.Provides tools for backing up data and recovering from failures.
Concurrency ControlRelies on the DBMS to handle concurrent access to data.Manages concurrent access to data and ensures data consistency.
ExamplesA collection of customer records, financial transactions, or inventory data.MySQL, Oracle, PostgreSQL, SQL Server.


DDL and DML (Data Manipulation Language)

  • DDL (Data Definition Language): A subset of SQL used to define and manage all database objects. Common DDL commands include CREATEALTER, and DROP.
  • DML (Data Manipulation Language): A subset of SQL used for managing data within the database. Common DML commands include SELECTINSERTUPDATE, and DELETE.
  • 1.5 Database Models: Network Model, Hierarchical Model, Relational Database Model

    This section provides a detailed explanation of three major database models: the Network Model, Hierarchical Model, and Relational Database Model, including diagrams, advantages, and disadvantages for each.

    Network Model

    Explanation:
    The Network Model organizes data in a graph structure, allowing multiple relationships between entities. This model supports complex relationships, including one-to-one, one-to-many, and many-to-many associations. Each record can have multiple parent and child records, making it flexible for representing interconnected data.Diagram:


  • Advantages:
    • Flexibility: Supports complex relationships and multiple paths to the same record, enhancing data retrieval.
    • Data Integrity: Relationships are maintained through owner-member pairs, ensuring data consistency.
    • Efficient Access: Faster data retrieval due to multiple access paths.
    • Complex Relationships: Better representation of many-to-many relationships compared to hierarchical models.
    Disadvantages:
    • Complexity: The structure can become complicated, making management and updates challenging.
    • Navigation Difficulty: Requires a deep understanding of pointers for data navigation.
    • Lack of Standardization: No universally accepted query language, leading to potential compatibility issues.
    • Operational Anomalies: Changes in structure may require significant updates to application programs.

    Hierarchical Model

    Explanation:
    The Hierarchical Model organizes data in a tree-like structure, where each record has a single parent and can have multiple children. This model is simple and intuitive but can be inflexible due to its rigid structure.Diagram:


  • Hierarchical Model Diagram

    Advantages:
    • Simplicity: Easy to understand and implement due to its straightforward structure.
    • Data Integrity: Enforces a clear parent-child relationship, reducing data redundancy.
    • Fast Access: Efficient for queries that follow the hierarchical path.
    Disadvantages:
    • Lack of Flexibility: Difficult to reorganize or add new data relationships without significant restructuring.
    • Redundancy Issues: Can lead to data redundancy if the same data is stored in multiple branches.
    • Limited Relationships: Only supports one-to-many relationships, making it unsuitable for complex data scenarios.
    • Difficult Maintenance: Changes in the hierarchy require updates to all affected records.

    Relational Database Model

    Explanation:
    The Relational Database Model organizes data into tables (relations) that can be linked through foreign keys. This model is based on mathematical set theory and allows for a declarative query language (SQL) for data manipulation.Diagram:


  • Relational Model Diagram

    Advantages:
    • Data Independence: Changes in the database structure do not affect application programs.
    • Flexibility: Supports complex queries and relationships through SQL.
    • Normalization: Reduces data redundancy through normalization techniques.
    • Standardization: SQL provides a standardized way to interact with the database.
    Disadvantages:
    • Performance Overhead: Complex queries can lead to slower performance compared to other models.
    • Complexity in Design: Designing a relational database requires careful planning and normalization.
    • Scalability Issues: May face challenges when scaling to very large datasets or high transaction volumes.
    • Dependency on SQL: Requires knowledge of SQL for effective data manipulation and retrieval.
    This overview of the three database models highlights their unique characteristics, advantages, and disadvantages, providing a comprehensive understanding of how they function and their applicability in different scenarios.
  •  Database Security.
  • Database security refers to the measures and practices implemented to protect databases from unauthorized access, misuse, and threats. It encompasses a variety of controls, tools, and procedures designed to safeguard the confidentiality, integrity, and availability of data stored within a database. As data breaches and cyber threats become increasingly sophisticated, robust database security is essential for organizations to protect sensitive information and maintain trust with customers.

    Importance of Database Security

    1. Protects Sensitive Information: Database security measures prevent unauthorized access to personal, financial, and proprietary data, safeguarding it from malicious actors.
    2. Ensures Compliance: Organizations must adhere to regulatory requirements such as GDPR, HIPAA, and PCI DSS. Effective database security helps ensure compliance with these regulations, avoiding potential fines and legal issues.
    3. Maintains Trust: By ensuring the integrity and availability of data, organizations can maintain customer trust and confidence in their services.
    4. Prevents Data Breaches: Strong database security reduces the risk of costly and damaging security incidents that can lead to data loss or theft.

    Common Causes of Database Security Breaches

    1. Weak Authentication: The use of default, weak, or compromised credentials can allow attackers to gain unauthorized access. Many breaches occur due to weak passwords or failure to change default passwords.
    2. Inadequate Access Controls: Over-permissioned accounts and poorly defined role-based access controls increase the risk of accidental or malicious actions.
    3. Unpatched Vulnerabilities: Failure to apply timely software updates can leave databases exposed to known vulnerabilities that attackers can exploit.
    4. SQL Injection Attacks: Attackers can manipulate input fields in web applications to execute arbitrary SQL code, leading to unauthorized access or data manipulation.
    5. Insufficient Encryption: Data that is not encrypted is vulnerable to interception and unauthorized access, especially during transmission or when stored on compromised systems.
    6. Misconfiguration: Poorly configured database settings, such as open database ports or excessive permissions, can expose databases to attacks.
    7. Insider Threats: Malicious or negligent actions by employees or partners can lead to data breaches, whether intentional or accidental.
    8. Human Error: Mistakes such as sharing passwords or failing to follow security protocols can significantly increase the risk of a data breach.
    9. Malware: Software designed to exploit vulnerabilities can be introduced into the database environment through various means, including infected devices.
    10. Denial-of-Service Attacks: Attackers can overwhelm database servers with excessive requests, rendering them unavailable to legitimate users.

    Strategies for Hardening Database Security

    1. Implement Access Controls: Use role-based access control (RBAC) to ensure that users have the minimum necessary permissions to perform their job functions.
    2. Use Strong Authentication: Employ multi-factor authentication (MFA) and enforce strong password policies to enhance security.
    3. Encrypt Data: Encrypt sensitive data both in transit and at rest to protect it from unauthorized access.
    4. Regularly Update Software: Apply security patches and updates promptly to address known vulnerabilities.
    5. Conduct Security Audits: Regularly review and audit database security policies and practices to identify and address potential weaknesses.
    6. Monitor Database Activity: Implement real-time monitoring and logging to detect suspicious activity and respond to potential threats quickly.
    7. Educate Employees: Provide training on data security best practices to reduce the risk of human error and insider threats.
    8. Develop an Incident Response Plan: Prepare a clear plan for responding to security incidents to minimize damage and recover quickly.
    9. Utilize Firewalls and Intrusion Detection Systems: Protect databases from unauthorized access and malicious attacks with network security measures.
    10. Backup Data Regularly: Ensure that data is backed up securely and can be restored in case of data loss or corruption.
  • 1.6 Concept of Normalization: 1NF, 2NF, 3NF

    Normalization is a systematic approach to organizing data in a database to minimize redundancy and improve data integrity. It involves dividing a database into tables and defining relationships between them. The process is typically broken down into several normal forms, with the first three being the most commonly used: First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF).

    First Normal Form (1NF)

    Definition:
    A table is in First Normal Form (1NF) if:
    • All entries in a column are atomic (indivisible).
    • Each column contains only one value per row.
    • Each row is unique, which requires a primary key.
    Example:
    Consider a table of students with courses:
    Student_IDNameCourses
    1AliceMath, Science
    2BobHistory
    3CarolMath, History
    This table is not in 1NF because the "Courses" column contains multiple values (e.g., "Math, Science").To convert to 1NF, we separate the courses into individual rows:
    Student_IDNameCourse
    1AliceMath
    1AliceScience
    2BobHistory
    3CarolMath
    3CarolHistory
    Now, each cell contains atomic values, and the table meets the criteria for 1NF.

    Second Normal Form (2NF)

    Definition:
    A table is in Second Normal Form (2NF) if:
    • It is already in 1NF.
    • All non-key attributes are fully functionally dependent on the primary key (no partial dependency).
    Example:
    Using the previous example, assume we have the following table:
    Student_IDCourseInstructorInstructor_Age
    1MathDr. Smith45
    1ScienceDr. Jones50
    2HistoryDr. Brown40
    3MathDr. Smith45
    3HistoryDr. Brown40
    This table is in 1NF, but it is not in 2NF because the "Instructor_Age" is only dependent on "Instructor" and not on the entire primary key ("Student_ID", "Course").To convert to 2NF, we create two tables:
    1. Student_Course Table:
    Student_IDCourseInstructor
    1MathDr. Smith
    1ScienceDr. Jones
    2HistoryDr. Brown
    3MathDr. Smith
    3HistoryDr. Brown
    1. Instructor Table:
    InstructorInstructor_Age
    Dr. Smith45
    Dr. Jones50
    Dr. Brown40
    Now, all non-key attributes in the "Student_Course" table are fully dependent on the primary key, satisfying 2NF.

    Third Normal Form (3NF)

    Definition:
    A table is in Third Normal Form (3NF) if:
    • It is already in 2NF.
    • There are no transitive dependencies (non-key attributes should not depend on other non-key attributes).
    Example:
    Continuing from the previous example, consider the "Student_Course" table:
    Student_IDCourseInstructorInstructor_Age
    1MathDr. Smith45
    1ScienceDr. Jones50
    2HistoryDr. Brown40
    3MathDr. Smith45
    3HistoryDr. Brown40
    This table is in 2NF, but it is not in 3NF because "Instructor_Age" is dependent on "Instructor," which is a non-key attribute.To convert to 3NF, we can keep the "Instructor" and "Instructor_Age" in the separate "Instructor" table created earlier. The "Student_Course" table would then look like this:
    Student_IDCourseInstructor
    1MathDr. Smith
    1ScienceDr. Jones
    2HistoryDr. Brown
    3MathDr. Smith
    3HistoryDr. Brown
    The "Instructor" table remains the same:
    InstructorInstructor_Age
    Dr. Smith45
    Dr. Jones50
    Dr. Brown40
    Now, the "Student_Course" table is in 3NF because all non-key attributes are directly dependent only on the primary key.

    Summary of Normal Forms

    • 1NF: Ensures atomicity and uniqueness of rows.
    • 2NF: Eliminates partial dependencies on a composite primary key.
    • 3NF: Eliminates transitive dependencies among non-key attributes.
    Normalization is crucial for reducing redundancy, improving data integrity, and ensuring efficient data management in relational databases.
  • 1.7 Centralized Vs. Distributed Database
  • FeatureCentralized DatabaseDistributed Database
    DefinitionA single database located at one central location.Multiple databases spread across different physical locations.
    Data AccessAccessed from a single location, leading to potential bottlenecks.Accessed from multiple locations, improving speed and efficiency.
    ManagementEasier to manage, update, and back up due to a single location.More complex management due to multiple locations and synchronization needs.
    Data ConsistencyGenerally offers higher consistency as all data is in one place.May face challenges with consistency due to data replication across locations.
    Failure ImpactIf the central database fails, all users lose access.If one database fails, others remain accessible, enhancing reliability.
    CostTypically less expensive to set up and maintain.More costly due to the need for multiple systems and complex infrastructure.
    ScalabilityScaling can be difficult as it involves upgrading a single system.Easier to scale by adding more databases or nodes in different locations.
    PerformancePerformance can degrade with high user load due to a single point of access.Generally better performance as data can be retrieved from the nearest location.
    SecurityEasier to secure since data is stored in one location.More challenging to secure due to multiple access points and locations.
    Data RedundancyMinimal data redundancy as all data is centralized.Higher chances of data redundancy due to data being stored in multiple locations.

Comments

Popular posts from this blog

Software Process Model(SPM)

  Explain spiral development model with advantages and disadvantages The Spiral Development Model is a risk-driven software development approach that combines iterative and waterfall models. It involves repeated cycles (or spirals) to incrementally refine the software project. Each spiral consists of four major phases: Planning , Risk Analysis , Engineering , and Evaluation .                                                           Fig: Spiral Software development model Phases in the Spiral Model: Planning : Requirements are gathered and objectives are set for the project. Risk Analysis : Risks are identified, analyzed, and mitigation strategies are developed. Develop/Engineering : The actual development and testing of the product take place. Evaluation : Stakeholders/users review the progress and provide feedback, inf...

Solution of NEB model quetions

 solution to model questions are : 

DBMS-SQL

SQL DDL and DML Examples Examples of DDL (Data Definition Language) and DML (Data Manipulation Language) commands in SQL. DDL (Data Definition Language) Examples: 1. Create a new database: CREATE DATABASE SchoolDB;  2. Create a table: CREATE TABLE Students ( StudentID INT PRIMARY KEY, FirstName VARCHAR(50), LastName VARCHAR(50), Age INT );  3. Alter a table to add a new column: ALTER TABLE Students ADD Email VARCHAR(100);  4. Drop a table: DROP TABLE Students;  5. Rename a table: ALTER TABLE OldTableName RENAME TO NewTableName;  DML (Data Manipulation Language) Examples: 1. Insert data into a table:  INSERT INTO Students (StudentID, FirstName, LastName, Age) VALUES (1, 'John', 'Doe', 20);  2. Update data in a table: UPDATE Students SET Age = 21 WHERE StudentID = 1;  3. Delete data from a table: DELETE FROM Students WHERE StudentID = 1;  4. Select data from a table: SELECT * FROM Students;  5. Select data with conditions: SELECT FirstName...