Database lectures MSBC...
free content
Database lectures MSBC...
E N D
Presentation Transcript
Whatisa database? • Databasemanagementsystemsorganiseand structuredataforeasyretrievalandmanipulation • Thedatastructuresandaccesstechniquesprovided byaparticularDBMSarecalleditsdatamodel • ThestructuredQueryLanguage(SQL)isa databaselanguageforrelationaldatabasesandit usestherelationaldatamodel
Introduction • Arelationaldatabaseisadatabasewherealldata visibletotheuserisorganisedstrictlyastablesofdata valuesandwherealldatabaseoperationsworkonthese tables
Tables • Theorganisationalprincipleoftherelationaldatabaseis atable • Atableisarectangularrow/columnarrangementofdata values • Eachtableinadatabasehasauniquetablenamethat identifiesitscontents.
Tablesinsampledatabase • Weshalldealwithasampledatabasewithfivetables • 🞑Landoffices:whichstoresdataaboutthelandsoffices • 🞑Beneficiaries:whichstoresdataonbeneficiariesofproducts • 🞑Officers:whichstoresdataofeach officer • 🞑MachineOrders:Whichstoresdataonordersplacedbyeach beneficiary • 🞑MachineProducts:Whichstoresdataoneachproduct availableforsaletobeneficiaries
Tables • Eachhorizontalrow represents a single physicalentity • Inthisexamplesingle landsoffices • Togetherfiverows represent5offices • Theverticalcolumn representsoneitem of datathat isstored inthe databaseforeachoffice
Tables • Eachrowofatable containsexactlyonedata value ineachcolumn • Allthedatavaluesin thecolumnshouldbeone datatypee.g..Numeric, text,currencyetc. • Atablewithzerorowsis perfectlylegal
Primarykey • Becauserowsofarelationaltableareunorderedyou cannotselectaspecificrowbyitspositioninthetable • Thusinawelldesignedrelationaldatabaseeachtable musthaveacolumnoracombinationofcolumnswhose valuesuniquelyidentifieseachrowinthetable. • Thisistheprimarykey
Relationships • Oneof majordifferencesbetweentherelationalmodel andearliermodelsistheexplicitpointersand parent/childrelationshipsarebannedfromtherelational database • Relationshipsexistintherelationaldatabase.
Relationships • Therelationshipsare representedbycommon datavaluesstoredintwo tables • One ofthe goals of SQListoletyouretrieve datafromthesetablesby manipulatingthese relationships
Foreignkeys • Acolumninatable whosevaluematchesthe primarykeyinsomeother tableiscalledaforeign key • Foreignkeysarean importantpartofthe relationalmodel because theycreaterelationships amongtablesinthe database
Summary • Eachtablehasanamewhichuniquelyidentifiesit • Eachtablehasoneormorenamedcolumnswhich are arrangedinlefttorightorder • Eachtablehaszeroormorerows • Alldatavaluesinagivencolumnareofsamedatatype andaredrawnfromadomainofthecolumn • Aprimarykeyisacolumnofacombinationofcolumns whosevalue(s)uniquelyidentifytherow • Aforeignkeyisacolumnorcombinationofcolumns whosevaluesareaprimarykeyofsomeothertable
Summary • Aprimarykey/foreignkeycombinationcreatesa parent/childrelationshipbetweentablesthatcontainthem
SIMPLEQUERIESIN RELATIONALDATABASES Dr.A.Murwira UniversityofZimbab,DepartmentofGeographyand EnvironmentalScience
TheSELECTstatement • The SELECTstatementretrievesdatafromadatabase andreturnstoyouqueryresults
TheSELECTstatement SELECTLANDOFFICES.CITY,LANDOFFICES.TARGET,LAND OFFICES.SALES FROMLANDOFFICES;
TheSELECTstatement SELECTLANDSOFFICES.CITY,LANDSOFFICES.TARGET, LANDSOFFICES.SALES,LANDSOFFICES.REGION FROMLANDSOFFICES WHERE(((LANDSOFFICES.REGION)="eastern"));
TheSELECTstatement SELECTLANDSOFFICES.CITY,LANDSOFFICES.TARGET, LANDSOFFICES.SALES,LANDSOFFICES.REGION, [SALES]>[TARGET]ASExpr1 FROMLANDSOFFICES WHERE(((LANDSOFFICES.REGION)="Eastern"));
DATAINTEGRITY Dr.A.UniversityofZimbabwe,Departmentofphyand EnvironmentalScience
DATAINTEGRITY • Dataintegrityreferstothecorrectnessandcompleteness ofthedatainadatabase • Dataintegritycanbelostindifferentways: • 🞑Invaliddatamaybeaddedtothedatabasesuchasanorder thatspecifiesanon-existentproduct • 🞑Existingdatamaybemodifiedtoanincorrectvalue • 🞑Changesindatabasemaybelostduetosystemerrororpower failure • 🞑Changesmaybepartiallyapplied
WHATISDATAINTEGRITY? • DBMSnormallyimplementdataintegrityconstrainsasa waytopreserveconsistencyandcorrectness.Theseareof differenttypes: • Requireddata:somecolumnsinthedatabasemustcontain validdatavalueineveryrow.Theyarenotallowedto containnullvalues • Validitychecking:Everycolumninthedatabasehasaset ofvaluesthatarelegaltothatcolumncalledadomain • Entityintegrity:Theprimarykeyofatablemustcontaina uniquevalueineachrow.Duplicatevaluesare • notallowed
WHATISDATAINTEGRITY? • Referentialintegrity:Aforeignkeyinarelational databaselinkseachrow inachildtablecontaining the foreignkeytotherowoftheparenttablecontainingthe primarykeyvalue. • Businessrules:Updatestoadatabasemaybe constrainedbybusinessrulesgoverningrealworld business. • Consistency:Manyrealworldtransactionscausemultiple updatestoadatabase.
WHATISFORM • Aformistheinterfacebetweentheuserandthedatabase • Formsdisplaydatafromthedatabaseinafamiliar, attarctive andconvenientlayout
Vectordatamodel:GISdatabase structure • VectorGISdata structureconsistsof the Coordinate databaseandthe Attribute tables connectedvia the uniqueIdentifier commonto boththe coordinatetable and attribute data table
Thespatial database • Create a spatial databaseoffarms • Maketheattribute tablehaveaforeign keythatwillserveasa linktoyourDBMS • Inthiscaseitis CUST_NUM
ConnectingthespatialdatabasewithDBMS • InArcViewGISyou canconnectyour spatialdatabasewith theDBMSthoughSQL connect
ConnectingthespatialdatabasewithDBMS • Youthenselectthe appropriateDBMSto connecttoand itwill exposeallthetables • Youthenselectthe tablewhichhasthe primarykeywhichwill thenenableyoutolink tothe Spatial database withthe use oftheforeign key
ConnectingthespatialdatabasewithDBMS • Youthen selectthe primarykeyinthe DBMStableaswellas theforeignkeyinthe spatialdatabase • Thisisfollowedbythe joincommandwhich joinsthetwotables
ConnectingthespatialdatabasewithDBMS • Thisisfollowedby thejoincommand whichjoinsthetwo tables • Thismeansthe DBMSdatacanbe usedtoquerythe spatialdatabase
2 Spatial data, which means data related tospace. Data that pertains to the space occupied byobjects. Data that define alocation. These are in the form of graphic primitives that usually either points, lines, polygons orpixels. Spatial data includes location, shape, size andorientation. For example: consider a particularsquare: • Its center ( the intersection of its diagonals ) specifiesits location • Its shape issquare • The length of one of its sides specifies itssize • The angle of its diagonals make with, say, x-axis specifies itsorientation. SpatialData
3 The space of interest canbe, • For example, the two-dimensional abstraction of (parts of) the surface of the earth – that is, geographicspace, • The most prominent example – a man-made space like the layout of a VLSIdesign, • A volume containing a model of the human brain,or another 3d-space representing the arrangement of chains of proteinmolecules. SpatialData
4 Non-spatial data ( also called attribute or characteristic data) is that information which is independent of all geometricconsiderations. For example: a person's height, mass, and ageare non-spatial data because they are independent of person'slocation. It's interesting to note that, while mass isnon-spatial data, weight is spatial data in the sense that something's weight is very much dependent on its location. Non-SpatialData
5 Two types of spatial data are particularlyimportant: • Computer-aided-design (CAD) data, which includes spatial information about how objects such as buildings, cars, or aircraft, are constructed. Other important examples of computer-aided-design databases are integrated-circuitand electronic-devicelayouts. • Geographicdata such as road maps, land-usage maps, topographic elevation maps, political maps showing boundaries, land-ownership maps, and so on. Geographic information systems are special-purpose databases tailored for storing geographicdata. SpatialData
6 Coordinates are used to specify location of geographic objects in either two or three dimensional space. The coordinates can be specified as(x,y) in 2D or (x,y,z) in 3D or spherical coordinates (latitude,longitude). Discrete geographic features like points, lines and polygons can beused to represent different types objects. Points might be a house address, a line might be a road and a polygon might be a land parts or buildingfoot- prints. These are also known as vector datatypes. Continuous geographic features describe phenomena that exist continuously in landscape. Examples include: elevation, temperature, relative humidity, gravity, wind, atmospheric pressure and so on. Theseare considered as raster datatypes. The features can also be summarized by ageographicarea. Examples include population, socio-economic characteristics and otherdemographic information. How Are Spatial DataOrganized?
7 There are four main properties of the spatial data that set it apart from traditional relationaldata. • Geometry • Distribution of Objects inSpace • TemporalChanges • DataVolume Properties of SpatialData
8 Geometry deals with the mathematical properties of an object. These properties include measurement (metric), relationships of points, lines, angles, surfaces, and solids (topology), andorder. A simple geometry is usually constructed fromgeometric primitives such as points, lines, curves, andareas. Complex geometries are constructed from collectionsof simple geometries. In addition, there are a number of geometric relationships between geometries that are important in handling spatialdata. Geometry
9 Usually spatial objects are very irregularly distributedin space. Consider the case where we model the town halls of allthe cities in the United States as spatial objects(points). The distribution of cities on the east coast is very dense compared to the distribution of cities in Arizona andNevada, which is sparse. In addition, different objects have largely varying extents. Distribution of Objects inSpace
10 Spatial data often has an associated temporal property. An example is a navigation system that helps travelers find directions from place A to B in amajor city. If there is an accident and some road is temporarily closed, the system has to incorporate this newdata and recompute a suitable path from point A toB. TemporalChanges
11 Several GIS applications deal with very large databases of the order ofterabytes. For example, remote sensing applications gather terabytes of data from satellites everyday. Similarly, data warehousing applications and NASA’s Earth Observation System are other examples ofsystems with terabytes of spatialdata. DataVolume
13 Spatial data types are special data types necessary to model geometry and to suitably represent geometric data in database systems. These datatypesare: point, line, and region but also include more complex types like partitions (maps) and graphs (networks). Conceptually, points, lines, rectangles, surfaces, volumes andetc. Physically, cities, rivers, roads, states, crop coverage,mountain rangesetc. Spatial data types provide a fundamental abstraction formodeling the geometric structure of objects in space, their relationships, properties andoperations. Spatial DataTypes
14 A spatial database system is a full-fledged database systemwith additional capabilities for handling spatialdata. Spatial database system is a database systemwith: • Offersspatialdatatypesinitsmodelandquerylanguage. • Supports spatial data types in its implementation providing at least spatial indexing and efficient algorithms for spatialjoin. • Spatial data types, e.g. POINT, LINE, REGION, provide a fundamental abstraction for modeling the structure of geometric entities in space as well as their relationships (l intersects r), properties (area(r) > 1000), and operations (intersection(l, r) – the part of l lying withinr). • SpatialDatabase
15 In general, a spatial database stores objects that have spatial characteristics that describe them and that have spatial relationships amongthem. The spatial relationships among the objects are important, and they are often needed when querying thedatabase. A spatial database is optimized to store and query data related to objects in space, including points, lines andpolygons. Whereas typical databases process numeric and character data, additional functionalityneeds to be added for databases to process spatial datatypes. Spatialdatabases
16 Queries posed on these spatial data, where predicates for selection deal with spatial parameters, are called spatialqueries. For example, a query such as “List all the customers located within twenty miles of company headquarters” will require the processing of spatial datatypes. Effectively, each customer will be associated to a <latitude, longitude>position. A traditional B+-tree index based on customers’ zip codes or other non-spatial attributes cannot be used to process this query since traditional indexes are not capable of ordering multidimensional coordinatedata. Spatialdatabases
17 Applications of spatial data initially stored data as files in a file system, as did early-generation businessapplications. But as the complexity and volume of the data, and the number of users, have grown, ad hoc approaches to storing and retrieving data in a file system have proved insufficient for the needs of many applications that use spatialdata. Spatial-data applications require facilities offered by a database system— in particular, the ability to store and query large amounts of dataefficiently. Why Spatialdatabases?
18 Therefore, there is a special need for databases tailored for handling spatial data and spatialqueries. Spatial data support in databases is important for efficiently storing, indexing, and querying of data on the basis of spatial locations. Efficient processing of the above query would require special- purpose index structures, such as R-trees for thetask. Why Spatialdatabases?