
DISTRIBUTED DBMS ARCHITECTURE


Presentation Transcript


  1. DISTRIBUTED DBMS ARCHITECTURE

  2. DBMS STANDARDIZATION • Based on components. The components of the system are defined together with the interrelationships between them. A DBMS consists of a number of components, each of which provides some functionality. • Based on functions. The different classes of users are identified, and the functions that the system will perform for each class are defined. System specifications in this category typically specify a hierarchical structure for the user classes.

  3. DBMS STANDARDIZATION • Based on data. The different types of data are identified, and an architectural framework is specified that defines the functional units that will realize or use the data according to these different views. This approach (also referred to as the datalogical approach) is claimed to be the preferable choice for standardization activities.

  4. DBMS STANDARDIZATION - ANSI/SPARC ARCHITECTURE The ANSI/SPARC architecture is claimed to be based on the data organization. It recognizes three views of data: the external view, which is that of the user, who might be a programmer; the internal view, that of the system or machine; and the conceptual view, that of the enterprise. For each of these views, an appropriate schema definition is required.

  5. DBMS STANDARDIZATION - ANSI/SPARC ARCHITECTURE

  6. DBMS STANDARDIZATION - ANSI/SPARC ARCHITECTURE At the lowest level of the architecture is the internal view, which deals with the physical definition and organization of data. At the other extreme is the external view, which is concerned with how users view the database. Between these two ends is the conceptual schema, which is an abstract definition of the database. It is the "real world" view of the enterprise being modeled in the database.
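
The three levels described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `EMP` relation, its attributes, the sample records, and the function names are hypothetical and do not come from the slides.

```python
# Hypothetical three-level ANSI/SPARC sketch; all names and data are illustrative.

# Internal view: how records are physically stored (here, positional tuples).
INTERNAL_EMP_FILE = [
    (1, "Ana", "Sales", 5200),
    (2, "Budi", "IT", 6100),
]

# Conceptual schema: the abstract definition of the EMP relation.
CONCEPTUAL_EMP = ("emp_no", "name", "dept", "salary")


def conceptual_rows():
    """Internal-to-conceptual mapping: attach attribute names to stored tuples."""
    return [dict(zip(CONCEPTUAL_EMP, record)) for record in INTERNAL_EMP_FILE]


def external_directory_view():
    """Conceptual-to-external mapping: a user view that hides salaries."""
    return [{"name": r["name"], "dept": r["dept"]} for r in conceptual_rows()]


print(external_directory_view())
```

Changing the storage layout only requires changing `conceptual_rows`; the external view is unaffected, which is the kind of independence the layered schemas are meant to provide.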

  7. DBMS STANDARDIZATION - ANSI/SPARC ARCHITECTURE

  8. DBMS STANDARDIZATION - ANSI/SPARC ARCHITECTURE • The square boxes represent processing functions, whereas the hexagons are administrative roles. • The arrows indicate data, command, program, and description flows, whereas the "I"-shaped bars on them represent interfaces. • The major component that permits mapping between the different data organization views is the data dictionary / directory (depicted as a triangle), which is a meta-database. • The database administrator is responsible for defining the internal schema definition. • The enterprise administrator's role is to prepare the conceptual schema definition. • The application administrator is responsible for preparing the external schemas for applications.

  9. DBMS STANDARDIZATION - ANSI/SPARC ARCHITECTURE Systems are characterized with respect to: (1) the autonomy of the local systems, (2) their distribution, (3) their heterogeneity.

  10. ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - Autonomy Autonomy refers to the distribution of control, not of data. It indicates the degree to which individual DBMSs can operate independently. Three alternatives: tight integration, semiautonomous systems, total isolation.

  11. ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - Autonomy • Tight integration. A single image of the entire database is available to any user who wants to share the information, which may reside in multiple databases. From the users' perspective, the data is logically centralized in one database. • Semiautonomous systems. The DBMSs can operate independently. Each of these DBMSs determines which parts of its own database it will make accessible to users of other DBMSs. • Total isolation. The individual systems are stand-alone DBMSs, which know neither of the existence of other DBMSs nor how to communicate with them.

  12. ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - Distribution Distribution refers to the distribution of data. Of course, we are considering the physical distribution of data over multiple sites; the user sees the data as one logical pool. Two alternatives: client / server distribution, peer-to-peer distribution (full distribution).

  13. ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - Distribution • Client / server distribution. Client / server distribution concentrates data management duties at the servers, while the clients focus on providing the application environment, including the user interface. The communication duties are shared between the client machines and the servers. Client / server DBMSs represent the first attempt at distributing functionality. • Peer-to-peer distribution. There is no distinction between client machines and servers. Each machine has full DBMS functionality and can communicate with other machines to execute queries and transactions.

  14. ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - Heterogeneity Heterogeneity may occur in various forms in distributed systems, ranging from hardware heterogeneity and differences in networking protocols to variations in data managers. Representing data with different modeling tools creates heterogeneity because of the inherent expressive powers and limitations of individual data models. Heterogeneity in query languages not only involves the use of completely different data access paradigms in different data models, but also covers differences in languages even when the individual systems use the same data model.

  15. ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES The dimensions are identified as: A (autonomy), D (distribution) and H (heterogeneity). The alternatives along each dimension are identified by numbers as: 0, 1 or 2. • Autonomy: A0 - tight integration, A1 - semiautonomous systems, A2 - total isolation • Distribution: D0 - no distribution, D1 - client / server systems, D2 - peer-to-peer systems • Heterogeneity: H0 - homogeneous systems, H1 - heterogeneous systems

  16. ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES (A0, D0, H0) If there is no distribution or heterogeneity, the system is a set of multiple DBMSs that are logically integrated. (A0, D0, H1) If heterogeneity is introduced, one has multiple data managers that are heterogeneous but provide an integrated view to the user. (A0, D1, H0) The more interesting case is where the database is distributed even though an integrated view of the data is provided to users (client / server distribution).

  17. ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES (A0, D2, H0) The same type of transparency is provided to the user in a fully distributed environment. There is no distinction between clients and servers, each site providing identical functionality. (A1, D0, H0) These are semiautonomous systems, which are commonly termed federated DBMSs. The component systems in a federated environment have significant autonomy in their execution, but their participation in the federation indicates that they are willing to cooperate with others in executing user requests that access multiple databases.

  18. ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES (A1, D0, H1) These are systems that introduce heterogeneity as well as autonomy, what we might call a heterogeneous federated DBMS. (A1, D1, H1) Systems of this type introduce distribution by placing component systems on different machines. They may be referred to as distributed, heterogeneous federated DBMSs. (A2, D0, H0) Now we have full autonomy. These are multidatabase systems (MDBS). The components have no concept of cooperation. Without heterogeneity and distribution, an MDBS is an interconnected collection of autonomous databases.

  19. ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES (A2, D0, H1) This case is realistic, maybe even more so than (A1, D0, H1), in that we always want to build applications which access data from multiple storage systems with different characteristics. (A2, D1, H1) and (A2, D2, H1) These two cases are considered together because of the similarity of the problem. They both represent the case where the component databases that make up the MDBS are distributed over a number of sites - we call this the distributed MDBS.

  20. DISTRIBUTED DBMS ARCHITECTURE • Client / server systems - (Ax, D1, Hy) • Distributed databases - (A0, D2, H0) • Multidatabase systems - (A2, Dx, Hy)
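
As a rough illustration, the three patterns above can be read as a matching rule over points in the (A, D, H) space, where `x` and `y` act as wildcards. The function name `classify` and its return format are assumptions made for this sketch, not part of the slides.

```python
from __future__ import annotations


def classify(autonomy: int, distribution: int, heterogeneity: int) -> list[str]:
    """Return the system classes whose (A, D, H) pattern matches the given point."""
    if autonomy not in (0, 1, 2) or distribution not in (0, 1, 2) or heterogeneity not in (0, 1):
        raise ValueError("expected A in 0..2, D in 0..2, H in 0..1")
    classes = []
    if distribution == 1:
        classes.append("client/server system")   # pattern (Ax, D1, Hy)
    if (autonomy, distribution, heterogeneity) == (0, 2, 0):
        classes.append("distributed database")   # pattern (A0, D2, H0)
    if autonomy == 2:
        classes.append("multidatabase system")   # pattern (A2, Dx, Hy)
    return classes


print(classify(0, 2, 0))
print(classify(2, 1, 1))
```

Because the patterns use wildcards, they are not mutually exclusive: a point such as (A2, D1, H1) matches both the client / server pattern and the multidatabase pattern, and a point such as (A1, D0, H0) matches none of the three.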

  21. DISTRIBUTED DBMS ARCHITECTURE - CLIENT / SERVER SYSTEMS • This provides a two-level architecture which makes it easier to manage the complexity of modern DBMSs and the complexity of distribution. • The server does most of the data management work (query processing and optimization, transaction management, storage management). • The client is the application and the user interface (management of the data that is cached at the client, management of the transaction locks).

  22. DISTRIBUTED DBMS ARCHITECTURE - CLIENT / SERVER SYSTEMS • This architecture is quite common in relational systems where the communication between the clients and the server(s) is at the level of SQL statements.

  23. DISTRIBUTED DBMS ARCHITECTURE - CLIENT / SERVER SYSTEMS • Multiple client - single server: From a data management perspective, this is not much different from centralized databases, since the database is stored on only one machine (the server), which also hosts the software to manage it. However, there are some differences from centralized systems in the way transactions are executed and caches are managed. • Multiple client - multiple server: In this case, two alternative management strategies are possible: either each client manages its own connection to the appropriate server, or each client knows of only its "home server", which then communicates with other servers as required.
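
The two multiple-server strategies can be contrasted with a small in-memory sketch. The classes `Server`, `DirectClient`, and `HomeServerClient`, the table names, and the forwarding logic are all hypothetical; a real system would communicate over a network and at the level of SQL statements.

```python
class Server:
    """A toy server holding some tables and knowing a few peer servers."""

    def __init__(self, name, tables, peers=None):
        self.name = name
        self.tables = tables          # table name -> list of rows
        self.peers = peers or []      # other servers this one may contact

    def query(self, table):
        """Answer locally, or fetch from a peer on the client's behalf."""
        if table in self.tables:
            return self.tables[table]
        for peer in self.peers:
            if table in peer.tables:
                return peer.tables[table]
        raise KeyError(table)


class DirectClient:
    """Strategy 1: the client itself tracks which server holds each table."""

    def __init__(self, servers):
        self.catalog = {t: s for s in servers for t in s.tables}

    def query(self, table):
        return self.catalog[table].query(table)


class HomeServerClient:
    """Strategy 2: the client knows only its home server."""

    def __init__(self, home):
        self.home = home

    def query(self, table):
        return self.home.query(table)


s2 = Server("s2", {"orders": [{"order_no": 10}]})
s1 = Server("s1", {"emp": [{"emp_no": 1}]}, peers=[s2])

direct = DirectClient([s1, s2])
via_home = HomeServerClient(s1)
print(direct.query("orders"), via_home.query("orders"))
```

In this sketch the first strategy puts the location knowledge in every client, while the second keeps clients simple and moves that knowledge into the servers.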

  24. DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS • The physical data organization on each machine may be different. • Local internal schema (LIS) - is an individual internal schema definition at each site. • Global conceptual schema (GCS) - describes the enterprise view of the data. • Local conceptual schema (LCS) - describes the logical organization of data at each site. • External schemas (ESs) - support user applications and user access to the database.

  25. DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS

  26. DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS In this case, the ANSI/SPARC model is extended by the addition of a global directory / dictionary (GD/D) to permit the required global mappings. The local mappings are still performed by a local directory / dictionary (LD/D). The local database management components are integrated by means of global DBMS functions. Local conceptual schemas are mappings of the global schema onto each site.
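
A minimal sketch of the two mapping steps, assuming the GD/D records which fragments of a global relation live at which site and each LD/D records where a fragment is stored locally. The directory contents, site names, and file paths below are invented for illustration.

```python
# Hypothetical directory contents; none of these names come from the slides.
GLOBAL_DIRECTORY = {
    # global relation -> list of (fragment name, site)
    "EMP": [("EMP_1", "site_A"), ("EMP_2", "site_B")],
}

LOCAL_DIRECTORIES = {
    # site -> fragment name -> physical location at that site
    "site_A": {"EMP_1": "/data/emp_1.dat"},
    "site_B": {"EMP_2": "/data/emp_2.dat"},
}


def resolve(relation):
    """Apply the global mapping (GD/D), then the local mapping (LD/D) per site."""
    plan = []
    for fragment, site in GLOBAL_DIRECTORY[relation]:
        plan.append((site, fragment, LOCAL_DIRECTORIES[site][fragment]))
    return plan


print(resolve("EMP"))
```

Keeping the physical locations only in the local directories means a site can reorganize its storage without the global directory having to change.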

  27. DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS The detailed components of a distributed DBMS. Two major components: • user processor • data processor

  28. DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS User processor • user interface handler - is responsible for interpreting user commands as they come in, and formatting the result data as it is sent to the user, • semantic data controller - uses the integrity constraints and authorizations that are defined as part of the global conceptual schema to check if the user query can be processed, • global query optimizer and decomposer - determines an execution strategy to minimize a cost function, and translates the global queries into local ones using the global and local conceptual schemas as well as the global directory, • distributed execution monitor - coordinates the distributed execution of the user request.

  29. DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS Data processor • local query optimizer - is responsible for choosing the best access path to access any data item, • local recovery manager - is responsible for making sure that the local database remains consistent even when failures occur, • run-time support processor - physically accesses the database according to the physical commands in the schedule generated by the query optimizer. This is the interface to the operating system and contains the database buffer (or cache) manager, which is responsible for maintaining the main memory buffers and managing the data accesses.

  30. DISTRIBUTED DBMS ARCHITECTURE - MDBS ARCHITECTURE • Models using a Global Conceptual Schema (GCS): The GCS is defined by integrating either the external schemas of local autonomous databases or parts of their local conceptual schemas. If heterogeneity exists in the system, then two implementation alternatives exist: unilingual and multilingual. • Models without a Global Conceptual Schema (GCS): The existence of a global conceptual schema in a multidatabase system is a controversial issue. There are researchers who even define a multidatabase management system as one that manages "several databases without the global schema".

  31. DISTRIBUTED DBMS ARCHITECTURE - MDBS ARCHITECTURE - models using a GCS

  32. DISTRIBUTED DBMS ARCHITECTURE - MDBS ARCHITECTURE - models using a GCS • A unilingual multi-DBMS requires the users to utilize possibly different data models and languages when both a local database and the global database are accessed. • Any application that accesses data from multiple databases must do so by means of an external view that is defined on the global conceptual schema. • One application may have a local external schema (LES) defined on the local conceptual schema as well as a global external schema (GES) defined on the global conceptual schema.

  33. DISTRIBUTED DBMS ARCHITECTURE - MDBS ARCHITECTURE - models using a GCS • An alternative is the multilingual architecture, where the basic philosophy is to permit each user to access the global database by means of an external schema defined using the language of the user's local DBMS. • The multilingual approach makes querying the databases easier from the user's perspective. However, it is more complicated to implement because queries must be translated at run time.

  34. DISTRIBUTED DBMS ARCHITECTURE - MDBS ARCHITECTURE - models without a GCS

  35. DISTRIBUTED DBMS ARCHITECTURE - MDBS ARCHITECTURE - models without a GCS • The architecture identifies two layers: the local system layer and the multidatabase layer on top of it. • The local system layer consists of a number of DBMSs, which present to the multidatabase layer the part of their local database they are willing to share with users of the other databases. This shared data is presented either as the actual local conceptual schema or as a local external schema definition. • The multidatabase layer consists of a number of external views, each of which may be defined on one local conceptual schema or on multiple conceptual schemas. Thus the responsibility of providing access to multiple databases is delegated to the mapping between the external schemas and the local conceptual schemas.
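
The two layers can be sketched as follows: each local DBMS exports only the part it is willing to share, and an external view in the multidatabase layer is defined directly over several of those exports, with no global conceptual schema in between. The database names, relations, and the view `staff_pay_view` are hypothetical.

```python
# Local system layer: each DBMS exports only what it is willing to share.
# All names and rows here are illustrative assumptions.
LOCAL_EXPORTS = {
    "hr_db": {"staff": [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Budi"}]},
    "payroll_db": {"pay": [{"id": 1, "salary": 5200}, {"id": 2, "salary": 6100}]},
}


def staff_pay_view():
    """Multidatabase-layer external view defined over two local schemas."""
    salary_by_id = {
        row["id"]: row["salary"] for row in LOCAL_EXPORTS["payroll_db"]["pay"]
    }
    return [
        {"name": row["name"], "salary": salary_by_id[row["id"]]}
        for row in LOCAL_EXPORTS["hr_db"]["staff"]
        if row["id"] in salary_by_id
    ]


print(staff_pay_view())
```

All the integration logic lives in the view definition itself, which is what it means for access to multiple databases to be delegated to the mapping between external schemas and local conceptual schemas.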

  36. DISTRIBUTED DBMS ARCHITECTURE - MDBS ARCHITECTURE - models without a GCS The MDBS provides a layer of software that runs on top of these individual DBMSs and provides users with the facilities for accessing various databases. The figure represents a nondistributed multi-DBMS. If the system is distributed, we would need to replicate the multidatabase layer to each site where there is a local DBMS that participates in the system.

  37. DISTRIBUTED DBMS ARCHITECTURE - GLOBAL DIRECTORY ISSUE • The global directory includes information about the location of the fragments as well as the makeup of the fragments. • The directory is itself a database that contains meta-data about the actual data stored in the database. • We have three dimensions: 1. type 2. location 3. replication

  38. DISTRIBUTED DBMS ARCHITECTURE - GLOBAL DIRECTORY ISSUE • Type: A directory may be either global to the entire database or local to each site. In other words, there might be a single directory containing information about all the data in the database, or a number of directories, each containing the information stored at one site. • Location: The directory may be maintained centrally at one site, or in a distributed fashion by distributing it over a number of sites. • Replication: There may be a single copy of the directory or multiple copies.

  39. DISTRIBUTED DBMS ARCHITECTURE - GLOBAL DIRECTORY ISSUE These three dimensions are orthogonal to one another. The unrealistic combinations have been designated by a question mark.
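
Since the dimensions are orthogonal with two alternatives each, the design space has 2 x 2 x 2 = 8 combinations. The sketch below only enumerates them; it does not try to mark which ones are unrealistic, because that information is in the figure and is not recoverable from the transcript. The dictionary keys and value labels are shorthand chosen for this example.

```python
from itertools import product

# Two alternatives along each of the three directory dimensions.
DIMENSIONS = {
    "type": ("global", "local"),
    "location": ("central", "distributed"),
    "replication": ("single copy", "multiple copies"),
}


def directory_alternatives():
    """Enumerate every combination of the three orthogonal dimensions."""
    names = tuple(DIMENSIONS)
    return [dict(zip(names, combo)) for combo in product(*DIMENSIONS.values())]


for alternative in directory_alternatives():
    print(alternative)
```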

  40. DISTRIBUTED DATABASE DESIGN

  41. DISTRIBUTED DATABASE DESIGN The organization of distributed systems can be investigated along three orthogonal dimensions: 1. Level of sharing 2. Behavior of access patterns 3. Level of knowledge on access pattern behavior

  42. DISTRIBUTED DATABASE DESIGN Level of sharing • no sharing - each application and its data execute at one site, • data sharing - all the programs are replicated at all the sites, but data files are not, • data plus program sharing - both data and programs may be shared. Behavior of access patterns • static - access patterns of user requests do not change over time, • dynamic - access patterns of user requests change over time. Level of knowledge on access pattern behavior • complete information - the access patterns can reasonably be predicted and do not deviate significantly from the predictions, • partial information - there are deviations from the predictions.

  43. ALTERNATIVE DESIGN STRATEGIES Two major strategies that have been identified for designing distributed databases are: • the top-down approach • the bottom-up approach

  44. ALTERNATIVE DESIGN STRATEGIES - TOP-DOWN DESIGN PROCESS

  45. ALTERNATIVE DESIGN STRATEGIES - TOP-DOWN DESIGN PROCESS • view design - defining the interfaces for end users, • conceptual design - is the process by which the enterprise is examined to determine entity types and relationships among these entities. One can divide this process into two related activity groups: • entity analysis - is concerned with determining the entities, their attributes, and the relationships among these entities, • functional analysis - is concerned with determining the fundamental functions with which the modeled enterprise is involved.

  46. ALTERNATIVE DESIGN STRATEGIES - TOP-DOWN DESIGN PROCESS • distribution design - designs the local conceptual schemas by distributing the entities over the sites of the distributed system. The distribution design activity consists of two steps: • fragmentation • allocation • physical design - is the process which maps the local conceptual schemas to the physical storage devices available at the corresponding sites, • observation and monitoring - the result is some form of feedback, which may result in backing up to one of the earlier steps in the design.

  47. ALTERNATIVE DESIGN STRATEGIES - BOTTOM-UP DESIGN PROCESS Top-down design is a suitable approach when a database system is being designed from scratch. If a number of databases already exist and the design task involves integrating them into one database, the bottom-up approach is suitable. The starting point of bottom-up design is the individual local conceptual schemas. The process consists of integrating local schemas into the global conceptual schema.

  48. DISTRIBUTION DESIGN ISSUES - REASONS FOR FRAGMENTATION • The important issue is the appropriate unit of distribution. For a number of reasons it is only natural to consider subsets of relations as distribution units. • If the applications that have views defined on a given relation reside at different sites, two alternatives can be followed, with the entire relation being the unit of distribution. The relation is not replicated and is stored at only one site, or it is replicated at all or some of the sites where the applications reside. • The fragmentation of relations typically results in the parallel execution of a single query by dividing it into a set of subqueries that operate on fragments. Thus, fragmentation typically increases the level of concurrency and therefore the system throughput.
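
The last point can be illustrated with a small sketch in which one selection query is divided into subqueries, one per fragment, that run concurrently and whose partial results are merged. The fragment contents, site names, and function names are hypothetical, and threads in one process only stand in for separate sites.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fragments of an EMP relation, one per site.
FRAGMENTS = {
    "site_A": [{"emp_no": 1, "salary": 5200}, {"emp_no": 2, "salary": 6100}],
    "site_B": [{"emp_no": 3, "salary": 4800}],
}


def subquery(rows, min_salary):
    """The subquery executed against a single fragment."""
    return [r for r in rows if r["salary"] >= min_salary]


def parallel_select(min_salary):
    """Run one subquery per fragment concurrently, then merge the results."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(
            lambda rows: subquery(rows, min_salary), FRAGMENTS.values()
        )
        return [row for part in partials for row in part]


print(parallel_select(5000))
```

Merging here is a simple concatenation because the fragments are disjoint subsets of the same relation; other fragmentation schemes would need a different merge step.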

  49. DISTRIBUTION DESIGN ISSUES - REASONS FOR FRAGMENTATION • Fragmentation also has disadvantages: • if the applications have conflicting requirements which prevent decomposition of the relation into mutually exclusive fragments, those applications whose views are defined on more than one fragment may suffer performance degradation, • the second problem is related to semantic data control, specifically to integrity checking.

  50. DISTRIBUTION DESIGN ISSUES - FRAGMENTATION ALTERNATIVES • There are clearly two alternatives: • horizontal fragmentation • vertical fragmentation • The fragmentation may, of course, be nested. If the nestings are of different types, one gets hybrid fragmentation.
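
The alternatives can be sketched on a small in-memory relation. The `EMP` relation, its attributes, and the helper names are hypothetical. The sketch also assumes that every vertical fragment keeps the key attribute so the relation can be rebuilt by a join, which is a common convention but is not stated in the slides.

```python
# Hypothetical relation used only for illustration.
EMP = [
    {"emp_no": 1, "name": "Ana", "dept": "Sales", "salary": 5200},
    {"emp_no": 2, "name": "Budi", "dept": "IT", "salary": 6100},
    {"emp_no": 3, "name": "Citra", "dept": "Sales", "salary": 4800},
]


def horizontal(relation, predicate):
    """Horizontal fragment: the subset of tuples satisfying a predicate."""
    return [row for row in relation if predicate(row)]


def vertical(relation, key, attributes):
    """Vertical fragment: a projection that always keeps the key (assumption)."""
    return [{a: row[a] for a in (key, *attributes)} for row in relation]


def join_on_key(left, right, key):
    """Rebuild tuples from two vertical fragments by joining on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal fragmentation: reconstructed by union.
sales = horizontal(EMP, lambda r: r["dept"] == "Sales")
others = horizontal(EMP, lambda r: r["dept"] != "Sales")

# Vertical fragmentation: reconstructed by join on the key.
names = vertical(EMP, "emp_no", ["name", "dept"])
pay = vertical(EMP, "emp_no", ["salary"])

# Hybrid fragmentation: a vertical fragment nested inside a horizontal one.
hybrid = vertical(sales, "emp_no", ["name"])
print(hybrid)
```

The horizontal fragments here are disjoint and together cover the relation, and the vertical fragments share only the key, so the original relation can be recovered in both cases.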
