
Tutorial

Sector & Sphere. Tutorial. Yunhong Gu Univ. of Illinois at Chicago @Booz Allen Hamilton, Aug 6, 2009. Outline. Installation Sector File System Sphere Programming. Installation: System Requirement. Linux (debian recommended, XFS recommended) gcc 3.4 or above openssl development library



  1. Sector & Sphere Tutorial Yunhong Gu Univ. of Illinois at Chicago @Booz Allen Hamilton, Aug 6, 2009

  2. Outline • Installation • Sector File System • Sphere Programming

  3. Installation: System Requirements • Linux (Debian recommended, XFS recommended) • gcc 3.4 or above • OpenSSL development library • FUSE development library (optional)

  4. System Architecture [Diagram: Security Server, Masters, Slaves, and Clients, with links labelled SSL and Data. Files shown in the diagram: security_node.key, security_node.cert, ./users, slave_acl.conf, master_acl.conf, master_node.key, master_node.cert, master.conf, topology.conf, slaves.list, slave.conf, client.conf]

  5. ls ./codeblue2 • Makefile • client • conf • gmp • master • slave • common • doc • lib • security • udt

  6. Configure Security Server • For a testing system, you can use the default configurations • Otherwise, update slave ACL, master ACL, and user accounts

  7. Access Control List (ACL) • Format IP1 IP2 IP3/Mask • Example: 10.0.0.1 192.168.0.0/24
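
To make the ACL format concrete, here is a minimal C++ sketch that tests one address against one ACL entry. It is not Sector's own code: `parseIPv4` and `aclEntryMatches` are hypothetical helpers, and the sketch assumes that `Mask` is a CIDR prefix length (as in `192.168.0.0/24`) and that a bare IP matches only itself.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Parse dotted-quad IPv4 text into a 32-bit integer (first octet in the top byte).
// Returns false unless the text is exactly four decimal fields in 0..255.
bool parseIPv4(const std::string& text, uint32_t& out)
{
    unsigned a, b, c, d;
    char extra;
    // A fifth conversion succeeding means trailing garbage, so require exactly 4.
    if (std::sscanf(text.c_str(), "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4)
        return false;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return false;
    out = (a << 24) | (b << 16) | (c << 8) | d;
    return true;
}

// Hypothetical helper: does `ip` fall inside one ACL entry ("IP" or "IP/Mask")?
// Assumption: Mask is a CIDR prefix length between 0 and 32.
bool aclEntryMatches(const std::string& entry, const std::string& ip)
{
    std::string network = entry;
    int prefix = 32;                        // a bare IP matches only itself
    const std::size_t slash = entry.find('/');
    if (slash != std::string::npos) {
        network = entry.substr(0, slash);
        char extra;
        if (std::sscanf(entry.c_str() + slash + 1, "%d%c", &prefix, &extra) != 1)
            return false;
        if (prefix < 0 || prefix > 32)
            return false;
    }
    uint32_t net = 0, addr = 0;
    if (!parseIPv4(network, net) || !parseIPv4(ip, addr))
        return false;
    assert(prefix >= 0 && prefix <= 32);    // guaranteed by the checks above
    // A /0 prefix needs an all-zero mask; shifting a 32-bit value by 32 is undefined.
    const uint32_t mask = (prefix == 0) ? 0u : (0xFFFFFFFFu << (32 - prefix));
    return (addr & mask) == (net & mask);
}
```

Under these assumptions, `aclEntryMatches("192.168.0.0/24", "192.168.0.17")` is true, while the entry `0.0.0.0/0` admits every address.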

  8. Access Control List (ACL)

  9. User Account • All accounts in ./conf/users • One account per file • Example: ./conf/users/test is the account configuration for account “test”

  10. User Account PASSWORD xxx READ_PERMISSION / WRITE_PERMISSION /test /angle EXEC_PERMISSION TRUE ACL 0.0.0.0/0 QUOTA 1000000
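
The account file above is a sequence of upper-case keys, each followed by one or more values. A minimal C++ sketch of reading such text is shown here. The exact line layout of the real file is not visible in the transcript, so the sketch assumes only that keys and values are separated by whitespace; `parseSettings` is a hypothetical helper, not part of Sector.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

using Settings = std::map<std::string, std::vector<std::string>>;

// Hypothetical helper: split configuration text on whitespace and file each
// value token under the most recent key token. `keys` lists the names that
// count as keys; every other token is treated as a value.
Settings parseSettings(const std::string& text, const std::set<std::string>& keys)
{
    assert(!keys.empty());                  // the caller must name at least one key
    Settings result;
    std::istringstream in(text);
    std::string token, current;
    while (in >> token) {
        if (keys.count(token)) {
            current = token;
            result[current];                // record the key even if no value follows
        } else if (!current.empty()) {
            result[current].push_back(token);
        }
        // Tokens that appear before the first key are ignored.
    }
    return result;
}
```

Because the parser is token-based, it gives the same result whether a value sits on the same line as its key or on the line after it. A path that contains spaces would be split into several values, which is a limitation of this sketch.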

  11. Start the Security Server • ./sserver <port> • Default port is 5000

  12. Configure the Master Server • ./conf/master.conf SECTOR_PORT 6000 SECURITY_SERVER ncdm161.lac.uic.edu:5000 REPLICA_NUM 2 DATA_DIRECTORY /home/u2/yunhong/work/data/

  13. Configure the Slaves • ./conf/slave.conf MASTER_ADDRESS ncdm161.lac.uic.edu:6000 DATA_DIRECTORY /raid/sector/data/

  14. Start masters and slaves • ./start_master • ./start_slave • ./start_all • ./stop_all • Password-free SSH • ./conf/slaves.list

  15. ./conf/slaves.list • Format, one slave per entry: username@slave_ip BLANK/TAB slave_path • Example entries: gu@192.168.136.1 /home/gu/codeblue2/slave/ gu@192.168.136.2 /home/gu/codeblue2/slave/ gu@192.168.136.3 /home/gu/codeblue2/slave/ • slave_path is NOT the slave data directory path! • Sector will automatically restart an offline slave if its address is on this list
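
The slaves.list entry format can be illustrated with a small C++ parser. This is a sketch only: `SlaveEntry`, `parseSlaveLine`, and `parseSlaveList` are hypothetical names, and the code assumes one entry per line with no spaces inside the path.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

struct SlaveEntry {
    std::string user;   // SSH login name
    std::string host;   // slave IP address
    std::string path;   // directory of the slave program, not its data directory
};

// Hypothetical helper: parse one "username@slave_ip <blank or tab> slave_path" line.
// Returns false for lines that do not have that shape.
bool parseSlaveLine(const std::string& line, SlaveEntry& out)
{
    std::istringstream in(line);
    std::string login, path;
    if (!(in >> login >> path))
        return false;                       // need two whitespace-separated fields
    const std::size_t at = login.find('@');
    if (at == std::string::npos || at == 0 || at + 1 == login.size())
        return false;                       // need text on both sides of '@'
    out.user = login.substr(0, at);
    out.host = login.substr(at + 1);
    out.path = path;
    assert(!out.user.empty() && !out.host.empty());  // follows from the checks above
    return true;
}

// Parse a whole list, skipping blank or malformed lines.
std::vector<SlaveEntry> parseSlaveList(const std::string& text)
{
    std::vector<SlaveEntry> entries;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        SlaveEntry e;
        if (parseSlaveLine(line, e))
            entries.push_back(e);
    }
    return entries;
}
```

Splitting on any run of whitespace accepts both the blank and the tab separator that the slide mentions.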

  16. Configure the Client • ./conf/client.conf • Optional, but useful for client tools and examples MASTER_ADDRESS ncdm161.lac.uic.edu:6000 USERNAME test PASSWORD xxx CERTIFICATE /home/gu/codeblue2/conf/master_node.cert

  17. Check System Status • $ cd client • $ cd tools • $ ./sysinfo • Displays system information: list of masters, slaves, available disk space, etc. • Log file: ./master/sector.log

  18. Accessing Sector FS • Tools: ./client/tools • ls, mkdir, stat, rm, download, upload, cp, mv • FUSE: ./client/fuse • make • mount: ./sector-fuse <local dir> • unmount: fusermount -u <local dir>

  19. Programming with Sector • #include <fsclient.h> • Sector::init(master_ip, master_port); • Sector::login(username, password, cert); • Sector::logout(); • Sector::close();

  20. Programming with Sector • Sector::list(path, vector<SNode>& attr) • Sector::stat(path, SNode& attr) • Sector::mkdir(path) • Sector::move(src, dst) • Sector::remove(path) • Sector::copy(src, dst) • Sector::utime(path, ts)

  21. SNode • std::string m_strName; • bool m_bIsDir; • std::set<Address, AddrComp> m_sLocation; • int64_t m_llTimeStamp; • int64_t m_llSize;

  22. Sector Files • SectorFile handle; • handle.open(path, mode); • handle.read(buf, size); • handle.write(buf, size); • handle.close(); • seekp, seekg, tellp, tellg, upload, download

  23. Sphere Programming • Serial pseudocode: for each file F in (SDSS datasets) for each image I in F findBrownDwarf(I, …); • Sphere version: SphereStream sdss; sdss.init("sdss files"); SphereProcess myproc; myproc.run(sdss, "findBrownDwarf", …); myproc.read(result); • Processing function: findBrownDwarf(char* image, int isize, char* result, int rsize);

  24. Record Offset Index • Data: Text1 text1 text1 text1 Text2 text2 Text3 text3 text3 • Index: 0 23 44 61 • The index is a binary file of 64-bit integers, named after the data file with the suffix “.idx” • Example: user.dat / user.dat.idx
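
The slide's index lists four offsets for three records, which suggests that entry i is the start of record i and the last entry is the end of the data. A minimal C++ sketch under that assumption follows. `buildIndex`, `readRecord`, and `serializeIndex` are hypothetical names, and host byte order for the serialized integers is also an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Build the offset index for records stored back to back in one data file.
// Entry i is the byte offset where record i starts; the final entry is the
// total size, so N records produce N + 1 offsets.
std::vector<int64_t> buildIndex(const std::vector<std::string>& records)
{
    std::vector<int64_t> index;
    index.reserve(records.size() + 1);
    int64_t offset = 0;
    index.push_back(offset);
    for (const std::string& r : records) {
        offset += static_cast<int64_t>(r.size());
        index.push_back(offset);
    }
    return index;
}

// Fetch record i from the concatenated data using two adjacent offsets.
std::string readRecord(const std::string& data,
                       const std::vector<int64_t>& index, std::size_t i)
{
    assert(i + 1 < index.size());           // record i needs offsets i and i + 1
    const int64_t begin = index[i];
    const int64_t end = index[i + 1];
    return data.substr(static_cast<std::size_t>(begin),
                       static_cast<std::size_t>(end - begin));
}

// Lay the index out as raw 64-bit integers, as the slide describes for .idx files.
// Assumption: integers are stored in host byte order.
std::string serializeIndex(const std::vector<int64_t>& index)
{
    if (index.empty())
        return std::string();
    return std::string(reinterpret_cast<const char*>(index.data()),
                       index.size() * sizeof(int64_t));
}
```

With this layout the length of record i is simply `index[i + 1] - index[i]`, so a reader can seek straight to any record without scanning the data file.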

  25. Hashing and Bucket Files • Similar to the Reduce process in MapReduce • Each output record is assigned a bucket ID • Records with the same bucket ID will be sent to the same bucket file
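
The bucket idea can be sketched in a few lines of C++. This is an illustration, not Sphere's implementation: the FNV-1a hash is an arbitrary choice, and `bucketOf` and `partition` are hypothetical names. Each entry of the returned map stands in for one bucket file.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// FNV-1a, a simple deterministic 32-bit string hash (an illustrative choice).
uint32_t fnv1a(const std::string& key)
{
    uint32_t h = 2166136261u;               // FNV offset basis
    for (unsigned char c : key) {
        h ^= c;
        h *= 16777619u;                     // FNV prime
    }
    return h;
}

// Map a record key to a bucket ID in the range [0, buckets).
int bucketOf(const std::string& key, int buckets)
{
    assert(buckets > 0);
    return static_cast<int>(fnv1a(key) % static_cast<uint32_t>(buckets));
}

// Group records by bucket ID; records with equal keys always share a bucket.
std::map<int, std::vector<std::string>> partition(
    const std::vector<std::string>& keys, int buckets)
{
    std::map<int, std::vector<std::string>> files;
    for (const std::string& k : keys)
        files[bucketOf(k, buckets)].push_back(k);
    return files;
}
```

Because the bucket ID depends only on the key, every occurrence of a key ends up in the same bucket no matter which node produced it, which is what makes a later per-bucket pass possible.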

  26. User Defined Function (UDF) • int _FUNCTION_(const SInput* input, SOutput* output, SFile* file)

  27. UDF::SInput struct SInput{  char* m_pcUnit;   int m_iRows;   int64_t* m_pllIndex;   char* m_pcParam;   int m_iPSize; };

  28. UDF::SOutput struct SOutput{  char* m_pcResult;   int m_iBufSize;   int m_iResSize;   int64_t* m_pllIndex;   int m_iIndSize;   int m_iRows;   int* m_piBucketID;   int64_t m_llOffset;   string m_strError; };

  29. UDF::SOutput • If m_pcResult or m_pllIndex is not large enough, resize it • When processing a file, if the result is too large, set m_llOffset to record the current file position and the UDF will be called again to restart processing from m_llOffset, until m_llOffset is set to -1.

  30. UDF::SFile struct SFile{ std::string m_strHomeDir; std::string m_strLibDir; std::string m_strTempDir; std::set<std::string> m_sstrFiles; }; • Results can be written to local files; their paths should be added to m_sstrFiles

  31. UDF • _FUNCTION_.cpp #include <sphere.h> extern "C" { int _FUNCTION_(const SInput* input, SOutput* output, SFile* file) { } } • Build it into a shared library, FUNC.so
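
To show how the three structures fit together, here is a sketch of a UDF that copies each input row to the output and assigns it a bucket ID. It rests on several assumptions that the slides do not confirm: the structs below are stand-ins copied from the struct slides so the sketch compiles without `<sphere.h>`; the input index is taken to hold `m_iRows + 1` offsets; `m_iIndSize` is taken to count index slots; the parameter is taken to carry the bucket count as an `int`; and a return value of 0 is taken to mean success. The name `copyRows` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <set>
#include <string>

// Stand-ins for the types declared in <sphere.h>, with fields as on the slides.
struct SInput {
    char* m_pcUnit; int m_iRows; int64_t* m_pllIndex; char* m_pcParam; int m_iPSize;
};
struct SOutput {
    char* m_pcResult; int m_iBufSize; int m_iResSize; int64_t* m_pllIndex;
    int m_iIndSize; int m_iRows; int* m_piBucketID; int64_t m_llOffset;
    std::string m_strError;
};
struct SFile {
    std::string m_strHomeDir; std::string m_strLibDir; std::string m_strTempDir;
    std::set<std::string> m_sstrFiles;
};

extern "C" int copyRows(const SInput* input, SOutput* output, SFile* /*file*/)
{
    assert(input != nullptr && output != nullptr && input->m_iRows >= 0);
    const int64_t base = input->m_pllIndex[0];
    const int64_t total = input->m_pllIndex[input->m_iRows] - base;
    // The slides say a UDF may resize these buffers; the allocation scheme is
    // not shown, so this sketch reports an error instead.
    if (total > output->m_iBufSize || input->m_iRows + 1 > output->m_iIndSize) {
        output->m_strError = "output buffers too small";
        return -1;
    }
    int buckets = 1;                        // default when no parameter is passed
    if (input->m_pcParam != nullptr && input->m_iPSize == static_cast<int>(sizeof(int)))
        std::memcpy(&buckets, input->m_pcParam, sizeof(int));
    if (buckets <= 0)
        buckets = 1;

    output->m_pllIndex[0] = 0;
    for (int i = 0; i < input->m_iRows; ++i) {
        const int64_t begin = input->m_pllIndex[i] - base;
        const int64_t len = input->m_pllIndex[i + 1] - input->m_pllIndex[i];
        std::memcpy(output->m_pcResult + output->m_pllIndex[i],
                    input->m_pcUnit + begin, static_cast<std::size_t>(len));
        output->m_pllIndex[i + 1] = output->m_pllIndex[i] + len;
        // Bucket by the first byte of the row; empty rows go to bucket 0.
        output->m_piBucketID[i] =
            (len > 0) ? static_cast<unsigned char>(input->m_pcUnit[begin]) % buckets : 0;
    }
    output->m_iRows = input->m_iRows;
    output->m_iResSize = static_cast<int>(total);
    return 0;
}
```

A real UDF would include `<sphere.h>` rather than redeclare the structs, and would be compiled into a shared library as the slide describes.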

  32. A Sphere Program #include <dcclient.h> Sector::init(); Sector::login(…); SphereStream input; SphereStream output; SphereProcess myProc; myProc.loadOperator("func.so"); myProc.run(input, output, "func", 0); myProc.read(result); myProc.close(); Sector::logout(); Sector::close();

  33. Sphere Stream • Input: vector<string> files; files.insert(files.end(), "/html"); SphereStream s; s.init(files); • Output: SphereStream temp; temp.setOutputPath("/result", "bucket"); temp.init(256);

  34. Upload UDF and related files • SphereProcess::loadOperator(path) • Sends the UDF to all slaves selected for the current process • Can also send any other files (applications, parameter data, etc.) • The path will be stored in SFile::m_strLibDir

  35. Run a Sphere Process • int run(const SphereStream& input, SphereStream& output, const string& op, const int& rows, const char* param = NULL, const int& size = 0); • rows: number of rows to pass to UDF each time • N > 0: N rows • 0: the whole segment • -1: the whole file

  36. Read Result and Check Progress • SphereProcess::read(SphereResult*& res, const bool& inorder = false, const bool& wait = true); • If the output stream is initialized with output.init(0), results are sent back to the client • int checkProgress();

  37. Demo
