RDBMS
Relational database management system.
A Relational Database Management System (RDBMS) is a
type of database management system that stores and retrieves data organized in
relational tables. It uses a relational model based on mathematical set theory
to represent and manage data
Example:
Consider a simple database for a library with two
tables: Books and Authors.
Table: Books
BookID |
Title |
AuthorID |
YearPublished |
1 |
"Data
Science 101" |
1 |
2020 |
2 |
"Programming
Basics" |
2 |
2018 |
3 |
"Database
Design" |
1 |
2019 |
Table:
Authors
AuthorID |
Author |
1 |
John
Smith |
2 |
Mary
Johnson |
In this example:
- The
Books table has columns for BookID, Title, AuthorID (foreign key),
and YearPublished.
- The
Authors table has columns for AuthorID and Author.
The AuthorID in the Books table is a
foreign key that relates to the AuthorID in the Authors table,
creating a relationship between the two tables.
Advantages of RDBMS:
- Data
Integrity:
- RDBMS
enforces data integrity through constraints like primary keys, foreign
keys, and unique constraints, ensuring accuracy and reliability of the
data.
- Relationships:
- RDBMS
allows the establishment of relationships between tables, promoting
efficient data organization and reducing data redundancy.
- Structured
Query Language (SQL):
- SQL
provides a standardized language for interacting with RDBMS. It
simplifies data retrieval, manipulation, and management.
- Normalization:
- RDBMS
supports normalization, which helps eliminate data anomalies and ensures
efficient data storage.
- Scalability:
- RDBMS
systems are scalable, allowing organizations to add more data and users
without significant changes to the database structure.
- Data
Security:
- RDBMS
provides access control mechanisms, allowing administrators to define and
control user access to the database. This enhances data security.
Disadvantages of RDBMS:
- Complexity:
- Designing
a normalized relational database can be complex, and it may require a
good understanding of database theory and design principles.
- Performance:
- In
some cases, the performance of complex queries involving multiple tables
can be slower compared to other types of databases, especially in
large-scale systems.
- Scalability
Challenges:
- While
RDBMS is scalable, scaling horizontally (across multiple servers) can be
challenging for some RDBMS systems.
- Not
Ideal for All Data Types:
- RDBMS
may not be the best fit for certain types of data, such as unstructured
or semi-structured data, which can be better handled by NoSQL databases.
- Cost:
- Commercial
RDBMS solutions often involve licensing fees, which can be a significant
cost for organizations.
- Limited
Flexibility:
- RDBMS
may be less flexible when dealing with changes in data models, especially
when compared to NoSQL databases designed for schema-less or
schema-flexible data.
RDBMS is a powerful and widely used technology for
managing structured data with well-defined relationships. It offers strong data
integrity, a standardized query language, and support for normalization.
However, its complexity, potential performance issues in certain scenarios, and
challenges with scalability in some cases are important considerations for
organizations.
Normalization in DBMS and its types
Normalization is a process used in database design to
organize data efficiently, reduce data redundancy, and ensure data integrity.
It involves breaking down a database into smaller, related tables and defining
relationships between them.
Normalization is used to remove anomalies like
insertion, updation and deletion.
Normalization is typically carried out through a
series of steps, each represented by a normal form. The common normal forms are
First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF),
Boyce-Codd Normal Form (BCNF), and Fourth Normal Form (4NF). Let's briefly
discuss each:
- First
Normal Form (1NF):
- In
1NF, data in a table is organized into rows and columns, and each cell
contains a single, atomic value. There should be no repeating groups, and
each column should have a unique name.
Example of First Normal Form (1NF) are
as follows:
Consider a table representing
students and their courses:
StudentID |
Courses |
1 |
Math, Physics |
2 |
Chemistry, Biology, Physics |
This table violates 1NF because the
"Courses" column contains multiple values. To bring it to 1NF, we can
split the data into separate rows:
StudentID |
Course |
1 |
Math |
1 |
Physics |
2 |
Chemistry |
2 |
Biology |
2 |
Physics |
Now, each cell contains a single, atomic
value.
- Second
Normal Form (2NF):
- 2NF
builds on 1NF by addressing partial dependencies.
- A
table is in 2NF if it is in 1NF
- All
non-key attributes are fully functionally dependent on the entire primary
key. This means that there should be no partial dependencies of any
column on the primary key.
Examples of Second Normal Form (2NF):
Consider a table with information
about courses and the instructors who teach them:
CourseID |
Instructor |
Office |
101 |
Dr. Smith |
Room 201 |
102 |
Dr. Johnson |
Room 203 |
103 |
Dr. Smith |
Room 201 |
This table violates 2NF because the
"Office" column depends on only part of the primary key (Instructor,
but not CourseID). To achieve 2NF, we split the table:
Table 1: Courses
CourseID |
Instructor |
101 |
Dr. Smith |
102 |
Dr. Johnson |
103 |
Dr. Smith |
Table 2: Instructors
Instructor |
Office |
Dr. Smith |
Room 201 |
Dr. Johnson |
Room 203 |
- Third
Normal Form (3NF):
- 3NF
builds on 2NF by addressing transitive dependencies.
- A table is in 3NF if it is in 2NF
- There
are no transitive dependencies. In other words, no non-key attribute
should depend on another non-key attribute.
Examples of Third Normal Form (3NF):
Consider a table representing
employees and their departments:
EmployeeID |
EmployeeName |
Department |
Location |
1 |
Alice |
HR |
Building A |
2 |
Bob |
IT |
Building B |
3 |
Charlie |
HR |
Building A |
This table violates 3NF because the
"Location" column depends on the "Department" column, which
is not the primary key. To achieve 3NF, we split the table:
Table 1: Employees
EmployeeID |
EmployeeName |
Department |
1 |
Alice |
HR |
2 |
Bob |
IT |
3 |
Charlie |
HR |
Table 2: Departments
Department |
Location |
HR |
Building A |
IT |
Building B |
These tables are now in 3NF as there are
no transitive dependencies.
- Boyce-Codd
Normal Form (BCNF):
- BCNF
is a stricter form of normalization that addresses certain cases that 3NF
might not fully cover. A table is in BCNF if, for every non-trivial
functional dependency, the determinant is a superkey. In simpler terms,
every determinant (attribute on the left side of a functional dependency)
must be a candidate key.
Examples of Boyce-Codd
Normal Form (BCNF):
Consider a table representing a library
with information about books and authors:
BookID |
Author |
Title |
1 |
John Smith |
"Data Science 101" |
2 |
Mary Johnson |
"Programming Basics" |
3 |
John Smith |
"Database Design" |
This table violates BCNF because the
"Author" column determines the "Title," but
"Author" is not a superkey. To achieve BCNF, we split the table:
Table 1: Authors
Author |
BookID |
John Smith |
1 |
Mary Johnson |
2 |
Table 2: Books
BookID |
Title |
1 |
"Data Science 101" |
2 |
"Programming Basics" |
3 |
"Database Design" |
Now, each table is in BCNF.
- Fourth
Normal Form (4NF):
- 4NF
addresses multi-valued dependencies. A table is in 4NF if it is in BCNF
and it does not contain any multi-valued dependencies. Multi-valued
dependencies arise when one attribute uniquely determines another,
independent of the primary key.
Examples of Fourth Normal Form (4NF):
Consider a table representing a
company's projects and employees assigned to those projects:
ProjectID |
EmployeeID |
Skills |
101 |
1 |
Java, SQL |
101 |
2 |
Python, Java |
102 |
2 |
SQL, JavaScript |
This table violates 4NF due to
multi-valued dependencies in the "Skills" column. To achieve 4NF, we
split the table:
Table 1: Project Employees
ProjectID |
EmployeeID |
101 |
1 |
101 |
2 |
102 |
2 |
Table 2: Employee Skills
EmployeeID |
Skills |
1 |
Java, SQL |
2 |
Python, Java, SQL, JavaScript |
Now, each table is in 4NF, and
multi-valued dependencies are eliminated.
Normalization helps in reducing data redundancy and
anomalies, ensuring data consistency and integrity. However, it's important to note
that normalization can lead to a higher number of tables and more complex
queries. The decision on how far to normalize a database depends on factors
such as the specific requirements of the application and the trade-off between
data redundancy and query complexity.
Comments
Post a Comment