inf3703_summary BEST FOR EXAM.
INF 3703 - DATABASES II
Summary 2013

Ref  Chap  Title
1.   10    Distributed Databases
2.   11    Interacting with Databases Through the Web
3.   12    Database Administration & Security
4.   13    Managing Transactions and Concurrency
5.   14    Managing Database and SQL Performance
6.   15    Databases for Decision Support

Database Principles: Fundamentals of Design, Implementation, and Management, 10th Edition - Coronel C, Morris S, Rob P.
v1.00 September 2013, Ron Barnard

Chapter 10 - Distributed Databases

Summary

• A distributed database stores logically related data in two or more physically independent sites connected via a computer network. The database is divided into fragments, which can be a horizontal set of rows or a vertical set of attributes. Each fragment can be allocated to a different network node.
• Distributed processing is the division of logical database processing among two or more network nodes. Distributed databases require distributed processing. A distributed database management system (DDBMS) governs the processing and storage of logically related data through interconnected computer systems.
• The main components of a DDBMS are the transaction processor (TP) and the data processor (DP). The transaction processor component is the resident software on each computer node that requests data. The data processor component is the resident software on each computer node that stores and retrieves data.
• Current database systems can be classified by the extent to which they support processing and data distribution. Three major categories are used to classify distributed database systems: single-site processing, single-site data (SPSD); multiple-site processing, single-site data (MPSD); and multiple-site processing, multiple-site data (MPMD).
• A homogeneous distributed database system integrates only one particular type of DBMS over a computer network.
A heterogeneous distributed database system integrates several different types of DBMSs over a computer network.
• DDBMS characteristics are best described as a set of transparencies: distribution, transaction, performance, failure, and heterogeneity. All transparencies share the common objective of making the distributed database behave as though it were a centralized database system; that is, the end user sees the data as part of a single, logical centralized database and is unaware of the system's complexities.
• A transaction is formed by one or more database requests. An undistributed transaction updates or requests data from a single site. A distributed transaction can update or request data from multiple sites.
• Distributed concurrency control is required in a network of distributed databases. A two-phase COMMIT protocol is used to ensure that all parts of a transaction are completed.
• A distributed DBMS evaluates every data request to find the optimum access path in a distributed database. The DDBMS must optimize the query to reduce associated access costs, communication costs, and CPU costs.
• The design of a distributed database must consider the fragmentation and replication of data. The designer must also decide how to allocate each fragment or replica to obtain better overall response time and to ensure data availability to the end user. Ideally, a distributed database should evenly distribute data to maximize performance, availability, and location awareness.
• A database can be replicated over several different sites on a computer network. The replication of the database fragments has the objective of improving data availability, thus decreasing access time. A database can be partially, fully, or not replicated. Data allocation strategies are designed to determine the location of the database fragments or replicas.
• The CAP theorem states that a highly distributed data system has three desirable properties: consistency, availability, and partition tolerance. However, a system can provide only two of these properties at a time.

Content

10.1 The Evolution of Distributed Database Management Systems

A distributed database management system (DDBMS) governs the storage and processing of logically related data over interconnected computer systems, in which both data and processing are distributed among several sites.

10.2 DDBMS Advantages and Disadvantages

Advantages
• Data are located near the site of greatest demand - Data are dispersed to match business requirements.
• Faster data access - Users work with the nearest stored data subset.
• Faster data processing - Data are processed at several sites.
• Growth facilitation - New sites can be added without affecting the operation of other sites.
• Improved communications - Local sites are smaller and located closer to customers.
• Reduced operating costs - It is more cost-effective to add nodes to a network than to upgrade a mainframe.
• User-friendly interface - PCs are equipped with an easy-to-use GUI.
• Less danger of single-point failure - If one computer fails, the workload is picked up by other computers.
• Processor independence - The user can access any available copy of the data, and the request is processed by any processor at the data location.

Disadvantages
• Complexity of management and control - Data must be managed at various locations.
• Technological difficulty - There are more technical issues to deal with.
• Security - The probability of security lapses increases when data are stored at multiple sites.
• Lack of standards - There are no standard communication protocols.
• Increased storage and infrastructure requirements - Multiple copies of data are kept at multiple sites.
• Increased training cost - Training costs are higher than in a centralized model.
• Costs - Distributed databases require duplicated infrastructure.
10.3 Distributed Processing and Distributed Databases

Distributed processing - a database's logical processing is shared among two or more physically independent sites that are connected through a network. The database itself can reside on a single computer while several sites access and update it.

Distributed database - stores a logically related database over two or more physically independent sites, connected via a network.
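
The fragments mentioned in the chapter summary (a horizontal set of rows or a vertical set of attributes, each allocated to a network node) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the CUSTOMER table, its column names and values, the site names, and the helper functions are hypothetical and do not come from the textbook or from any DBMS API.

```python
# Hypothetical CUSTOMER table; rows and column names are invented for illustration.
CUSTOMER = [
    {"cus_num": 10, "cus_name": "Customer A", "cus_state": "TN", "cus_balance": 1090.0},
    {"cus_num": 11, "cus_name": "Customer B", "cus_state": "FL", "cus_balance": 0.0},
    {"cus_num": 12, "cus_name": "Customer C", "cus_state": "TN", "cus_balance": 250.0},
]


def horizontal_fragment(rows, predicate):
    """Keep whole rows (all attributes) that satisfy the predicate."""
    return [dict(row) for row in rows if predicate(row)]


def vertical_fragment(rows, key, attributes):
    """Keep every row, but only the key plus the chosen attributes."""
    return [{name: row[name] for name in (key, *attributes)} for row in rows]


def rejoin_vertical(left, right, key):
    """Rebuild full rows by matching two vertical fragments on the shared key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left]


# Horizontal fragmentation: one subset of rows per state.
tn_rows = horizontal_fragment(CUSTOMER, lambda row: row["cus_state"] == "TN")
fl_rows = horizontal_fragment(CUSTOMER, lambda row: row["cus_state"] == "FL")

# Vertical fragmentation: each fragment repeats the key so rows can be rejoined.
contact_cols = vertical_fragment(CUSTOMER, "cus_num", ["cus_name", "cus_state"])
billing_cols = vertical_fragment(CUSTOMER, "cus_num", ["cus_balance"])

# Allocation: each horizontal fragment is assigned to a (hypothetical) network node.
allocation = {"site_tn": tn_rows, "site_fl": fl_rows}

# The original table is recoverable from either set of fragments.
rebuilt_from_rows = sorted(tn_rows + fl_rows, key=lambda row: row["cus_num"])
rebuilt_from_cols = rejoin_vertical(contact_cols, billing_cols, "cus_num")
```

In this sketch the union of the horizontal fragments, or the join of the vertical fragments on the key, gives back the original table. That is why each vertical fragment carries the key attribute.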
Written for
- Institution
- University of South Africa
- Course
- INF3703 - Databases II
Document information
- Uploaded on
- 16 November 2021
- Number of pages
- 90
- Written in
- 2021/2022
- Type
- SUMMARY