Technical Design

Technical Design. Technology choices. Sebastien Ponce, Giuseppe Lo Presti, German Cancio CERN / IT. Castor Readiness Review – June 2006. Outline. About languages C/C++, perl/python UML diagrams Component, activity, state, sequence, class Code generation Objects Converters SQL code

Presentation Transcript


  1. Technical Design Technology choices Sebastien Ponce, Giuseppe Lo Presti, German Cancio CERN / IT Castor Readiness Review – June 2006

  2. Outline • About languages • C/C++, perl/python • UML diagrams • Component, activity, state, sequence, class • Code generation • Objects • Converters • SQL code • Database centric software • The database schema as a description of CASTOR

  3. About languages • Object orientation • Java-like inheritance • Single inheritance; multiple inheritance only for interfaces • Heavy use of (pure) abstract interfaces • Each service/component has one • Some services even have several implementations • IGCSvc → RemoteGCSvc and OraGCSvc • Implementation language is C++ • Still some subparts in C (but OO!) • Happy mix, thanks in particular to code generation • Castor 1 is pure C, no OO
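The "one abstract interface, several implementations" pattern can be sketched as follows. This is only an illustration: the class names come from the slide, but the `backend()` method and the `makeGCSvc()` factory are hypothetical and are not taken from the CASTOR code base.

```cpp
#include <memory>
#include <string>

// Pure abstract interface: no data members, only pure virtual methods.
// backend() is a hypothetical method used for illustration only.
class IGCSvc {
 public:
  virtual ~IGCSvc() = default;
  virtual std::string backend() const = 0;
};

// Implementation that would talk to the database directly.
class OraGCSvc : public virtual IGCSvc {
 public:
  std::string backend() const override { return "oracle"; }
};

// Implementation that would forward the calls to a remote service.
class RemoteGCSvc : public virtual IGCSvc {
 public:
  std::string backend() const override { return "remote"; }
};

// Hypothetical factory: callers depend only on the interface,
// the concrete class is chosen at run time.
std::unique_ptr<IGCSvc> makeGCSvc(bool remote) {
  if (remote) return std::unique_ptr<IGCSvc>(new RemoteGCSvc());
  return std::unique_ptr<IGCSvc>(new OraGCSvc());
}
```

Code written against `IGCSvc` does not need to change when another implementation is added, which is presumably why each service gets its own interface.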

  4. Still around languages • Admin scripts are mostly perl • Latest ones in python, future is python • No imposed dev environment • Wide range of tools among developers • vi(m), (x)emacs • jedit, kdevelop • nedit • Even different linux distributions • Documentation • Recent documents are in TeX • Older ones are mostly in Word

  5. Architecture design • Based on UML • Component views for the rough architecture • Activity diagrams per component • State diagrams per object • Sequence diagrams per use case • Class diagrams for implementation details, per component • Using Umbrello • Essentially because of its code generation

  6. Component views • Used for designing the overall architecture and the relations between components • The building block is a component • The relations mean that the 2 components communicate at some stage

  7. Activity diagrams • Used for describing the work flow of a component • The building block is a simple (atomic) operation • Arrows indicate the flow of time

  8. State diagrams • Used for describing the possible states and state transitions of objects • Make it easy to find out the implications of adding a new state or state transition • The building block is a state • Arrows indicate state transitions
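A state diagram of this kind translates naturally into a transition table. The sketch below uses made-up states (`Created`, `Running`, `Finished`, `Failed`); they are assumptions for illustration and not the CASTOR status enums.

```cpp
#include <set>
#include <utility>

// Hypothetical states, for illustration only.
enum class State { Created, Running, Finished, Failed };

// The arrows of the state diagram, stored as (from, to) pairs.
inline const std::set<std::pair<State, State>>& transitions() {
  static const std::set<std::pair<State, State>> table = {
      {State::Created, State::Running},
      {State::Running, State::Finished},
      {State::Running, State::Failed},
  };
  return table;
}

// A transition is legal only if the diagram has an arrow for it.
bool canTransition(State from, State to) {
  return transitions().count(std::make_pair(from, to)) > 0;
}
```

With the arrows kept in one table, adding a state or a transition is a single-line change whose consequences are easy to review.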

  9. Sequence diagrams • Used for analyzing interactions between components and their timing • Useful to avoid deadlocks and inefficiencies due to bad granularity • Blocks are actions • Arrows are communications between components (diagram callout: atomic interaction with the DB)

  10. Class diagrams • Used for fine-grained design • Blocks are the items handled by the code (mapped to classes) • Relations are links from one object to another • This is the input of the code generation

  11. Usage of UML diagrams • For the design • At the project level • At the code level • For the database schema • The class diagram maps to the DB schema • For code generation • Of header files and object implementation (data objects) • Of DB scripts for creating/deleting the schema • Of a DB access layer in the code • Allows storing/retrieving/updating/linking/unlinking objects • Of I/O libraries for sending/receiving objects

  12. Goal of code generation • Saves typing for class declaration/implementation • Automates some facilities (object printing, introspection) • Automates streaming and DB access • Generates C interface to C++ code • Allows easy maintenance
      Core of the Request Handler code:
          castor::IObject* obj = sock->readObject();
          …
          castor::BaseAddress ad;
          ad.setCnvSvcName("DbCnvSvc");
          …
          svcs()->createRep(&ad, obj);

  13. Class implementation
      C++ code (.hpp):
          class MessageAck : public virtual IObject {
            …
           private:
            /// Status of the request
            bool m_status;
            /// Code of the error if status shows one
            int m_errorCode;
      C++ code (.cpp):
          void castor::MessageAck::print(…) const {
            …
            stream << indent << "status : " << m_status << std::endl;
            stream << indent << "errorCode : " << m_errorCode << std::endl;
      And C interface (.h):
          int C_MessageAck_create(struct C_MessageAck_t**);
          int C_MessageAck_print(struct C_MessageAck_t*);
      And C interface (CInt.cpp):
          int C_MessageAck_create(castor::MessageAck** obj) {
            *obj = new castor::MessageAck();
            return 0;
          }
          int C_MessageAck_print(castor::MessageAck* instance) {
            instance->print();
            return 0;
          }

  14. Converters
      Streaming code (Stream*Cnv.h/cpp):
          virtual void createRep(IAddress*, IObject*);

          void … StreamMessageAckCnv::createRep(…) {
            …
            ad->stream() << obj->type();
            ad->stream() << obj->ipAddress();
            ad->stream() << obj->port();
            ad->stream() << obj->id();
      And DB code (Ora*Cnv.h/cpp):
          virtual void createRep(IAddress*, IObject*);

          void … StreamMessageAckCnv::createRep(…) {
            …
            const std::string insertStmStr =
              "INSERT INTO Client (ipAddress, port, id)…";
            …
            insertStmt = createStatement(insertStmStr);
            …
            insertStmt->executeUpdate();
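The streaming converter above writes a type tag followed by each field in a fixed order. A minimal, self-contained sketch of that idea is given here, using a text stream instead of the real CASTOR stream classes. The `Client` struct, its field types, the `kClientType` tag and the `createObj()` reader are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Simplified stand-in for the Client object. Field names follow the
// columns shown in the slides; the types are assumptions.
struct Client {
  std::uint32_t ipAddress;
  std::uint16_t port;
  std::uint64_t id;
};

// Hypothetical type tag, written first so that a reader can pick the
// matching converter.
const int kClientType = 1;

// Write the type tag, then each field, always in the same order.
void createRep(std::ostream& out, const Client& c) {
  out << kClientType << ' ' << c.ipAddress << ' ' << c.port << ' '
      << c.id << '\n';
}

// Read the fields back in the same order. Returns false on a type
// mismatch or a malformed stream.
bool createObj(std::istream& in, Client& c) {
  int type = 0;
  if (!(in >> type) || type != kClientType) return false;
  return static_cast<bool>(in >> c.ipAddress >> c.port >> c.id);
}
```

Because the write and read sides must agree on the field order, generating both from the same class diagram removes a whole class of hand-written mistakes.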

  15. DB scripts
      castor_oracle_create.sql:
          /* SQL statements for type Client */
          CREATE TABLE Client (ipAddress NUMBER, port NUMBER, id INTEGER PRIMARY KEY) INITRANS 50 PCTFREE 50;
      castor_oracle_drop.sql:
          /* SQL statements for type Client */
          DROP TABLE Client;
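How such scripts might be produced from a class description can be sketched in a few lines. This is not the Umbrello-based generator used by the project: `ClassDesc`, `createStatement()` and `dropStatement()` are hypothetical names, and storage clauses such as INITRANS and PCTFREE are left out.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one class from the class diagram.
struct ClassDesc {
  std::string name;
  // (column name, SQL type) pairs, in declaration order.
  std::vector<std::pair<std::string, std::string>> columns;
};

// Build the CREATE TABLE statement for one class.
std::string createStatement(const ClassDesc& c) {
  std::string sql = "CREATE TABLE " + c.name + " (";
  for (std::size_t i = 0; i < c.columns.size(); ++i) {
    if (i > 0) sql += ", ";
    sql += c.columns[i].first + " " + c.columns[i].second;
  }
  return sql + ");";
}

// Build the matching DROP TABLE statement.
std::string dropStatement(const ClassDesc& c) {
  return "DROP TABLE " + c.name + ";";
}
```

Generating the create and drop scripts from the same description keeps them in step with each other and with the class diagram.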

  16. Database centric software • Reliability • No single point of failure • By replicating all components • Locking handled by the DB • Backups handled by the DB • Scalability • All components can be replicated • No catalog in memory • Limited by DB scalability • No risk from the space point of view • CPU is the limit. A lot of tuning done by DB people. • Has to be measured properly once tuning is over • No fear so far (diagram labels: MigHunter, RH, GC, DB, Stager, Recall, LSF plugin)

  17. The database schema • Actually presenting the class diagram • A simplified version • Only stager objects • No core framework • No streaming/DB • Only interfaces • FileRequest • No status enums

  18. Catalogue DB schema (diagram areas: tape oriented, request oriented, disk oriented)

  19. Request oriented classes (diagram areas: tape oriented, request oriented, disk oriented)

  20. Request oriented classes • SvcClass ≈ batch queue. The SvcClass also names the CASTOR file management policies that should be used.

  21. Request oriented classes • Requestor information

  22. Request oriented classes • FileRequests are requests requiring access to resources: • stageGet • stagePut • stageUpdate • prepareToXXX • Streaming mode

  23. File requests (diagram areas: tape oriented, request oriented, disk oriented)

  24. File requests • Each requested file is associated with one or more SubRequests, the SubRequest being the working unit of the new stager

  25. Disk residence (diagram areas: tape oriented, request oriented, disk oriented)

  26. Disk residence • A given CASTOR file may exist as multiple disk copies. This allows load balancing the access to "hot" files. • The first write operation on one of the copies immediately invalidates all other copies. • The maximum number of replicas can be specified in the SvcClass.
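The replica rules on this slide (a per-SvcClass limit, and invalidation of the other copies on the first write) can be sketched as below. `FileCopies`, `CopyStatus` and the method names are hypothetical; in CASTOR this logic lives in the database, not in a C++ struct.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical two-value status; the real schema has its own enums.
enum class CopyStatus { Valid, Invalid };

// Simplified stand-in for a CASTOR file and its disk copies.
struct FileCopies {
  std::size_t maxReplicas;         // would come from the SvcClass
  std::vector<CopyStatus> copies;  // one entry per disk copy

  // Number of copies that can still serve reads.
  std::size_t validCount() const {
    std::size_t n = 0;
    for (CopyStatus s : copies) {
      if (s == CopyStatus::Valid) ++n;
    }
    return n;
  }

  // Add a copy unless the replica limit is already reached.
  bool replicate() {
    if (validCount() >= maxReplicas) return false;
    copies.push_back(CopyStatus::Valid);
    return true;
  }

  // A write to copy `idx` invalidates every other copy.
  void write(std::size_t idx) {
    for (std::size_t i = 0; i < copies.size(); ++i) {
      if (i != idx) copies[i] = CopyStatus::Invalid;
    }
  }
};
```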

  27. Disk residence • The file systems are monitored by the rmnode daemon running on the disk servers. • The filesystem selection policy takes the load monitoring data as input and gives the weight and fsDeviation as output.

  28. Tape migration/recall (diagram areas: tape oriented, request oriented, disk oriented)

  29. Tape migration/recall • Each migration/recall candidate has a TapeCopy • Eligible migration candidates have TapeCopies associated with a Stream • An eligible recall candidate is a TapeCopy associated with a Segment • The nbCopies attribute in the FileClass determines how many TapeCopies are associated with the CastorFile

  30. Tape migration/recall • Streams are containers of migration candidates. • Running Streams are associated with a Tape. • The maximum number of Streams is defined by the nbDrives attribute of the SvcClass. • Finished Streams (no associated TapeCopy) are automatically deleted.
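The stream rules above (a cap given by nbDrives, and automatic deletion of streams with no TapeCopy left) can be sketched as follows. The `Stream` and `SvcClassStreams` structs and their methods are hypothetical, and TapeCopies are reduced to plain integer ids for the purpose of the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// A stream is a container of migration candidates. TapeCopies are
// represented here by integer ids (an assumption for illustration).
struct Stream {
  std::vector<int> tapeCopies;
};

// Simplified stand-in for the streams attached to one SvcClass.
struct SvcClassStreams {
  std::size_t nbDrives;  // upper bound on the number of streams
  std::vector<Stream> streams;

  // Refuse a new stream once nbDrives streams exist.
  bool addStream(const Stream& s) {
    if (streams.size() >= nbDrives) return false;
    streams.push_back(s);
    return true;
  }

  // Drop finished streams, i.e. those with no TapeCopy left.
  void deleteFinished() {
    streams.erase(
        std::remove_if(streams.begin(), streams.end(),
                       [](const Stream& s) { return s.tapeCopies.empty(); }),
        streams.end());
  }
};
```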

  31. Tape migration/recall • A SvcClass should be associated with one or more TapePools to be used for migration. • A SvcClass can be associated with several TapePools (and vice versa) • Adding/changing associations or deleting TapePools are administrative actions
