EXTERNAL SORTING ALGORITHMS AND IMPLEMENTATIONS

EXTERNAL SORTING ALGORITHMS AND IMPLEMENTATIONS. 05011004 Ayhan KARGIN 05011027 Ahmet MERAL. Contents. External Sorting Needs and Usage Areas External Sorting Algorithms Environments for Implementing External Sorting Algorithms Used Technologies Phases of External Sorting

Presentation Transcript


  1. EXTERNAL SORTING ALGORITHMS AND IMPLEMENTATIONS 05011004 Ayhan KARGIN 05011027 Ahmet MERAL

  2. Contents • External Sorting Needs and Usage Areas • External Sorting Algorithms • Environments for Implementing External Sorting Algorithms • Used Technologies • Phases of External Sorting • K-Way Merge Sort • Multi-Step K-Way Merge Sort • Replacement Selection Sort • SimpleDB • Layered Components of SimpleDB • Classes of SimpleDB • Query Layer of SimpleDB • Relational Algebra that SimpleDB Supports • Relational Algebra • Preparatory Work • External Sorting on SimpleDB

  3. External Sorting Needs and Usage Areas • DBMS • Group By, Join, Order By • Data Warehouse (ETL) • Data Mining • Data Processing

  4. External Sorting Algorithms • K-Way Merge Sort • Multi-Step K-Way Merge Sort • Replacement Selection Sort

  5. Environments for Implementing External Sorting Algorithms • MinSQL • PostgreSQL • SimpleDB

  6. Used Technologies • Java VM • Java SE • JDBC • Java RMI • Eclipse

  7. Phases of External Sorting • Run Generation Phase • Merge Phase
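
As a rough illustration of the run generation phase named on this slide, the sketch below cuts the input into chunks that fit in memory and sorts each chunk into a run. It is not the authors' code: the class name `RunGenerationSketch`, the `memoryCapacity` parameter, and the use of in-memory lists in place of disk files are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not part of SimpleDB: lists stand in for files on disk.
class RunGenerationSketch {

    // Splits the input into chunks of at most memoryCapacity records
    // and sorts each chunk in memory, producing one sorted run per chunk.
    static List<List<Integer>> generateRuns(List<Integer> input, int memoryCapacity) {
        if (memoryCapacity <= 0) {
            throw new IllegalArgumentException("memoryCapacity must be positive");
        }
        List<List<Integer>> runs = new ArrayList<>();
        for (int start = 0; start < input.size(); start += memoryCapacity) {
            int end = Math.min(start + memoryCapacity, input.size());
            List<Integer> run = new ArrayList<>(input.subList(start, end));
            Collections.sort(run); // the in-memory sort of one chunk
            runs.add(run);
        }
        return runs;
    }

    public static void main(String[] args) {
        // Seven records with room for three in memory give three runs.
        System.out.println(generateRuns(Arrays.asList(5, 1, 4, 9, 2, 8, 3), 3));
        // prints [[1, 4, 5], [2, 8, 9], [3]]
    }
}
```

The merge phase then combines these runs into one sorted output.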

  8. K-Way Merge Sort
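
The slide gives only the title, so the following is a minimal sketch of a generic k-way merge rather than the presenters' implementation. It assumes the sorted runs are already available as in-memory lists (a stand-in for run files) and uses `java.util.PriorityQueue` to pick the smallest head record among the k runs; the class name `KWayMergeSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of a k-way merge over already sorted runs.
class KWayMergeSketch {

    // Merges k sorted runs into one sorted list.
    static List<Integer> merge(List<List<Integer>> runs) {
        // Each heap entry is {value, run index, position within that run}.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int r = 0; r < runs.size(); r++) {
            if (!runs.get(r).isEmpty()) {
                heap.add(new int[] {runs.get(r).get(0), r, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] head = heap.poll();      // smallest record among all run heads
            out.add(head[0]);
            List<Integer> run = runs.get(head[1]);
            int next = head[2] + 1;
            if (next < run.size()) {       // refill from the run just consumed
                heap.add(new int[] {run.get(next), head[1], next});
            }
        }
        return out;
    }
}
```

With a heap, each output record costs O(log k) comparisons, and only one record per run needs to be held in memory at a time.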

  9. Multi-Step K-Way Merge Sort

  10. Multi-Step K-Way Merge Sort

  11. Replacement Selection

  12. Replacement Selection Test results depending on stage area size. Note: Main memory is 8x.
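
The transcript does not preserve the diagram or the test results, so the sketch below shows only the textbook form of replacement selection for run generation. It is an illustration under assumptions: the class name `ReplacementSelectionSketch` and the `memoryCapacity` parameter are hypothetical, lists stand in for files, and the "stage area" mentioned on the slide is not modeled.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of textbook replacement selection (run generation only).
class ReplacementSelectionSketch {

    static List<List<Integer>> generateRuns(List<Integer> input, int memoryCapacity) {
        if (memoryCapacity <= 0) {
            throw new IllegalArgumentException("memoryCapacity must be positive");
        }
        Iterator<Integer> in = input.iterator();
        PriorityQueue<Integer> current = new PriorityQueue<>(); // records usable in the current run
        List<Integer> held = new ArrayList<>();                 // records deferred to the next run
        while (current.size() < memoryCapacity && in.hasNext()) {
            current.add(in.next());
        }
        List<List<Integer>> runs = new ArrayList<>();
        List<Integer> run = new ArrayList<>();
        while (!current.isEmpty()) {
            int smallest = current.poll();
            run.add(smallest);
            if (in.hasNext()) {
                int next = in.next();
                // A record smaller than the one just written cannot extend this run.
                if (next >= smallest) {
                    current.add(next);
                } else {
                    held.add(next);
                }
            }
            if (current.isEmpty()) { // current run is finished; start the next one
                runs.add(run);
                run = new ArrayList<>();
                current.addAll(held);
                held.clear();
            }
        }
        return runs;
    }
}
```

For the input 5, 1, 4, 9, 2, 8, 3 with room for three records, this sketch yields the runs [1, 4, 5, 8, 9] and [2, 3], whereas sorting fixed chunks of three would yield three runs. Longer runs mean fewer runs to merge.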

  13. SimpleDB • The Client Side • Contains the JDBC interfaces and implements the JDBC driver. • The Basic Server • Provides the complete functionality of a DB but ignores efficiency issues. • Extensions • Additions to the basic server that support efficient query processing.

  14. Layered Components of SimpleDB • Remote • Perform JDBC requests received from clients. • Parse • Extract the tables, fields, and predicate mentioned in an SQL statement. • Planner • Create an execution strategy for an SQL statement, and translate it to a relational algebra plan. • Query • Implement queries expressed in relational algebra. • Metadata • Maintain metadata about the tables in the database, so that its records and fields are accessible.

  15. Layered Components of SimpleDB • Record • Provide methods for storing data records in pages. • Transaction • Support concurrency by restricting page access. Enable recovery by logging changes to pages. • Buffer • Maintain a cache of pages in memory to hold recently accessed user data. • Log • Append log records to the log file, and scan the records in the log file. • File • Read and write between file blocks and memory pages.

  16. Classes of SimpleDB

  17. Classes of SimpleDB

  18. Classes of SimpleDB

  19. Query Layer of SimpleDB • Relational Algebra • Select • Project • Product • Sort

  20. Relational Algebra that SimpleDB Supports • Relational Algebra: Select, Project, Product, Sort • SQL: Where, Select, Join, Order By

  21. Relational Algebra • Select SId, SName, DName from STUDENT, DEPT where MajorId=DId and DName='math' • Q1: Product (STUDENT, DEPT) • Q2: Select (Q1, MajorId=DId) • Q3: Select (Q2, DName='math') • Q4: Project (Q3, {SId, SName, DName})
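
The four steps on this slide can be sketched on in-memory rows as follows. This is only an illustration: it does not use SimpleDB's Plan and Scan classes, the class `RelationalAlgebraSketch` and its helper methods are hypothetical, and the sample STUDENT and DEPT rows are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Predicate;

// Hypothetical in-memory sketch of product, select, and project.
class RelationalAlgebraSketch {

    // Product: every left row paired with every right row.
    static List<Map<String, Object>> product(List<Map<String, Object>> left,
                                             List<Map<String, Object>> right) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> l : left) {
            for (Map<String, Object> r : right) {
                Map<String, Object> row = new LinkedHashMap<>(l);
                row.putAll(r);
                out.add(row);
            }
        }
        return out;
    }

    // Select: keep the rows that satisfy the predicate.
    static List<Map<String, Object>> select(List<Map<String, Object>> rows,
                                            Predicate<Map<String, Object>> pred) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            if (pred.test(row)) out.add(row);
        }
        return out;
    }

    // Project: keep only the listed fields, in the listed order.
    static List<Map<String, Object>> project(List<Map<String, Object>> rows,
                                             List<String> fields) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> projected = new LinkedHashMap<>();
            for (String f : fields) projected.put(f, row.get(f));
            out.add(projected);
        }
        return out;
    }

    // Builds a row from alternating field names and values.
    static Map<String, Object> row(Object... kv) {
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) m.put((String) kv[i], kv[i + 1]);
        return m;
    }

    // Runs Q1..Q4 from the slide on invented sample data.
    static List<Map<String, Object>> mathStudents() {
        List<Map<String, Object>> student = Arrays.asList(
                row("SId", 1, "SName", "joe", "MajorId", 10),
                row("SId", 2, "SName", "amy", "MajorId", 20));
        List<Map<String, Object>> dept = Arrays.asList(
                row("DId", 10, "DName", "compsci"),
                row("DId", 20, "DName", "math"));
        List<Map<String, Object>> q1 = product(student, dept);
        List<Map<String, Object>> q2 = select(q1, r -> Objects.equals(r.get("MajorId"), r.get("DId")));
        List<Map<String, Object>> q3 = select(q2, r -> "math".equals(r.get("DName")));
        return project(q3, Arrays.asList("SId", "SName", "DName"));
    }
}
```

On this sample data, `mathStudents()` returns a single row for the student "amy" in the "math" department.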

  22. Relational Algebra

  23. Preparatory Work • We have to introduce the ORDER BY operator (in SQL) to the parser layer of SimpleDB. • The columns to be sorted must be present in the QueryData class in structured form. • If sorting is required, the sorting plans must be called by the BasicQueryPlanner class. • Calculating the number of random accesses per file in the FileMgr layer. • Calculating the duration of a transaction by adding a start time and an end time to the Transaction class. • Calculating the number of unsorted records in each transaction. • We may have to compare, copy, and exchange records of the same kind on different scans, so a RecordExchange class is written for these purposes. • A StepCalculator class is written. It splits any number into two close integer multipliers. For example: 29 gives 6 and 5; 49 gives 7 and 7; 50 gives 8 and 7.
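
The slide does not state the rule StepCalculator uses, so the sketch below is a guess that happens to reproduce the three examples given: take the ceiling of the square root as the first multiplier, then the smallest second multiplier whose product with the first reaches the number. The class name `StepCalculatorSketch` and the method `split` are hypothetical and are not the authors' class.

```java
// Hypothetical sketch: one rule that reproduces the slide's three examples.
class StepCalculatorSketch {

    // Returns two close integers {first, second} with first * second >= n.
    static int[] split(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        int first = (int) Math.ceil(Math.sqrt(n)); // assumed rule: ceiling of the square root
        int second = (n + first - 1) / first;      // ceiling of n / first in integer arithmetic
        return new int[] {first, second};
    }

    public static void main(String[] args) {
        for (int n : new int[] {29, 49, 50}) {
            int[] s = split(n);
            System.out.println(n + " -> " + s[0] + " and " + s[1]);
        }
        // prints:
        // 29 -> 6 and 5
        // 49 -> 7 and 7
        // 50 -> 8 and 7
    }
}
```

Note that in the slide's own examples the product can exceed the number (6 × 5 = 30 for 29, 8 × 7 = 56 for 50), which this rule also allows.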

  24. External Sorting on SimpleDB • Materialize Plan • Edward Sciore's MergeSortPlan & Scan • K-Way MergeSortPlan & Scan • Multi-Step K-Way MergeSortPlan & Scan • ReplacementSelectionSortPlan & Scan • Multi-Step ReplacementSelectionSortPlan & Scan

  25. Algorithms Choice Dependency
