
File Systems and Storage

Rich Sudlow and Paul Brenner, University of Notre Dame Center for Research Computing


Presentation Transcript


  1. File Systems and Storage
     Rich Sudlow and Paul Brenner
     University of Notre Dame Center for Research Computing

  2. Overview
     • File System Concepts
       • Aspects and Types
       • Why do we have so many?
     • Redundancy and Performance
       • RAID
       • Examples on X4500 “thumper”
     • CRC Supported File Systems
       • Centralized (ext3, FAT32/NTFS)
       • Distributed (AFS, NFS)
       • Comparison – capabilities
       • AFS – crc.nd.edu, nd.edu cells
     • Backup Storage
       • Software, Cache, Tape Silo
     • Using CRC Storage
       • Scratch space, User workspace, Backup

  3. Disclaimer
     • This is:
       • A broad overview
       • An operational viewpoint
       • A starting point
     • This is not:
       • Comprehensive
       • An authoritative analysis of the industry/technology
     • For more info consider:
       • ND courses relevant to this topic
       • Contacting CRC for specific individual requirements

  4. File System Concepts
     • Aspects
       • Filenames
       • Metadata (size, # blocks, time, security)
       • Hierarchical vs. flat
       • Secure access
       • Capabilities/Facilities (move, delete, append)
     • Why so many? Why not use just one?
     • Types
       • Disk/Flash
       • Database & Transactional
       • Network/Distributed
       • Special Purpose

  5. Wikipedia – List of File Systems http://en.wikipedia.org/wiki/List_of_file_systems

  6. Redundancy and Performance
     • File system design is strongly influenced by the target feature set
       • Bandwidth, security, latency, distributed access, fault tolerance, etc.
     • File systems can be tuned and tiered to exploit the features of the operating system and of the underlying file system, e.g. Solaris vs. Linux
       • Underlying ZFS or ext3 file system for NFS/AFS
     • Scalability – ability to run across multiple nodes, by multiple users; local vs. distributed

  7. Wikipedia – Comparison of File Systems http://en.wikipedia.org/wiki/Comparison_of_file_systems

  8. RAID
     • Redundant Array(s) of Inexpensive/Independent Disks
     • Utilize multiple/many disks to provide
       • Capacity
       • Reliability/Redundancy
       • Performance – configurable based on user requirements: IOPS, bandwidth, various block sizes
     • Hardware and Software Implementations
       • Performance, Flexibility, Boot (the ‘chicken and egg’ problem)
     • RAID ‘Levels’
       • Disk utilization configurations to optimize cost vs. capability
     • Tiered/Nested RAID Levels
       • One RAID level on top of another: RAID 0+1 vs. RAID 10

  9. RAID Levels
     • RAID 0
       • Striped: Performance and Capacity
     • RAID 1
       • Mirrored: Read Performance and Fault Tolerance (FT)
     • RAID 3 & 4
       • Striped with Dedicated Parity
     • RAID 5
       • Striped with Distributed Parity: Performance, Capacity, N+1 FT
     • RAID 6
       • Striped with Dual Distributed Parity: Performance, Capacity, N+2 FT
     • RAID 0+1
       • Striped sets in a mirrored set
     • RAID 1+0 (generally just called RAID 10)
       • Mirrored sets in a striped set
     • RAID 50
       • Striped (0) across distributed-parity RAID (5) sets
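
The capacity and fault-tolerance trade-offs listed on this slide can be sketched as a small calculation. This is a minimal illustration rather than a sizing tool: the function name `raid_summary`, the 6-disk set, and the 0.5 TB disk size are hypothetical, identical disks are assumed, the nested levels assume two-way mirrors, and RAID 50 is left out for brevity.

```python
def raid_summary(level, disks, disk_tb=1.0):
    """Return (usable_tb, guaranteed_failures_survived) for one RAID set.

    Assumes identical disks; nested levels assume two-way mirrors.
    "Guaranteed" is the worst case: the number of disk failures the set
    survives no matter which disks fail.
    """
    if level == "0":                      # striping only, no redundancy
        minimum, data_disks, survives = 2, disks, 0
    elif level == "1":                    # every disk holds the same data
        minimum, data_disks, survives = 2, 1, disks - 1
    elif level in ("3", "4", "5"):        # one disk's worth of parity
        minimum, data_disks, survives = 3, disks - 1, 1
    elif level == "6":                    # two disks' worth of parity
        minimum, data_disks, survives = 4, disks - 2, 2
    elif level in ("0+1", "10"):          # half the disks hold mirror copies
        if disks % 2:
            raise ValueError("nested mirror levels need an even disk count")
        minimum, data_disks, survives = 4, disks // 2, 1
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks")
    return data_disks * disk_tb, survives


# Hypothetical example: six 0.5 TB disks at each level.
for level in ("0", "1", "5", "6", "10"):
    usable, survives = raid_summary(level, 6, disk_tb=0.5)
    print(f"RAID {level:>3}: {usable:.1f} TB usable, survives {survives} failure(s)")
```

RAID 0+1 and RAID 10 give the same usable capacity and the same worst-case figure here; they differ in how many additional failures they can absorb in favorable cases, which this sketch does not model.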

  10. RAID Reference http://en.wikipedia.org/wiki/RAID

  11. RAID example - Sun Thumper X4500

  12. [Figure: X4500 disk layout; diagram labels C5, C4, C7, C6, C1, C0]
      • vicepb – single disk
      • vicepc – 2 disk stripe
      • vicepd – 3 disk stripe
      • vicepe – 4 disk stripe
      • vicepf – 6 disk stripe
      • vicepg – single disk mirror
      • viceph – 2 disk stripe mirror
      • vicepi – 3 disk stripe mirror
      • vicep{ j, k, l } – 3 disk stripe – only used for a read problem encountered in the multiclient test

  13. Links to RAID examples: Solaris/Red Hat on Sun X4500 (thumper)
      • UFS tests on Solaris 10 using Sun Volume Manager
        http://www.nd.edu/~rich/afsbpw2007/thumper01-solaris-ufs-tests
      • ext3 tests on Linux (RH4U4)
        http://www.nd.edu/~rich/afsbpw2007/thumper02-linux-ext3-tests

  14. CRC Supported File Systems: Why do we have so many?
      • Centralized
        • ext3 (Linux) – Red Hat 4 & 5
        • FAT32/NTFS (Windows)
      • Distributed
        • NFS
        • AFS
      • Others
        • ZFS

  15. CRC Supported File Systems
      http://www.nd.edu/~rich/CRC_filesystems.html
      • Scratch Space
      • User Workspace
      • Storage Backup – available for backup of CRC and research machines on campus

  16. Scalability
      • Scalability of filesystem
      • Bottlenecks
      • Scalability of network
      • Scalability of codes
      • Simple testing tools
        • http://ndt.hpcc.nd.edu:7123 (simple but not always accurate)
        • nuttcp – /opt/und/local/bin/nuttcp – firewalls
          nuttcp -t (-r) opteron.hpcc.nd.edu (don’t abuse)
        • diskrate – diskrate -n 10m -f trash
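
For readers without access to the local tools, the idea behind a simple disk-rate check can be sketched in a few lines. This is not a reimplementation of `diskrate`, whose exact behavior the slide does not describe; the function name `write_throughput`, the 10 MiB size, and the 1 MiB block size are assumptions chosen for illustration.

```python
import os
import tempfile
import time


def write_throughput(path, total_bytes=10 * 1024 * 1024, block_size=1024 * 1024):
    """Write total_bytes to path and return (bytes_written, MB_per_second).

    The file is flushed and fsync'ed before the clock stops so that the
    figure reflects data handed to the storage layer, not just buffered.
    """
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as handle:
        while written < total_bytes:
            chunk = block[: total_bytes - written]
            handle.write(chunk)
            written += len(chunk)
        handle.flush()
        os.fsync(handle.fileno())
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero
    return written, written / elapsed / 1e6


# Demo in a throwaway directory; point `path` at the file system under test.
with tempfile.TemporaryDirectory() as scratch:
    written, rate = write_throughput(os.path.join(scratch, "trash"))
    print(f"wrote {written} bytes at {rate:.1f} MB/s")
```

A single small write like this is dominated by caching and startup effects, so treat the number as a rough sanity check and repeat it with larger sizes before drawing conclusions about a file system.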

  17. File Permissions (Linux)
      • What the heck does this mean?
        • drwxr-xr--
      • File permissions for user, group, and all
        • 10 characters; the first indicates whether the entry is a directory
        • Triples of rwx indicate read, write, and execute for user, group, and all
      • Change file permissions with ‘chmod’
        • Examples:
          • chmod a+r filename
          • chmod go+w filename
          • chmod 1777 directory
          • chmod 700 directory
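
The mapping between the ten-character string and the octal numbers passed to chmod can be checked with a short sketch. The helper name `to_octal` is hypothetical, and it deliberately ignores the setuid, setgid, and sticky bits; `stat.filemode` from the Python standard library performs the reverse conversion.

```python
import stat


def to_octal(mode_string):
    """Convert an ls-style string such as 'drwxr-xr--' to its permission bits.

    Only the nine rwx characters are read; any character other than '-'
    counts as a set bit. Special bits (setuid, setgid, sticky) are ignored.
    """
    if len(mode_string) != 10:
        raise ValueError("expected 10 characters, e.g. 'drwxr-xr--'")
    value = 0
    for char in mode_string[1:]:
        value = (value << 1) | int(char != "-")
    return value


example = "drwxr-xr--"
print(example[0] == "d")          # first character: is it a directory?
print(example[1:4], example[4:7], example[7:10])   # user, group, all
print(oct(to_octal(example)))     # the number chmod would take

# Reverse direction for the numeric chmod examples on the slide.
print(stat.filemode(stat.S_IFDIR | 0o1777))
print(stat.filemode(stat.S_IFDIR | 0o700))
```

In the 1777 case the leading 1 is the sticky bit, which `stat.filemode` shows as a trailing `t`.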

  18. File Permissions (AFS)
      • fs setacl -dir $HOME -acl pat all terry none
        • fs is the command suite.
        • setacl is the operation code, which directs the File Server process to set an access control list.
        • -dir $HOME and -acl pat all terry none are arguments. (The terry none entry implies that terry previously had access.)
        • -dir and -acl are switches; -dir indicates the name of the directory on which to set the ACL, and -acl defines the entries to set on it.
        • $HOME and pat all terry none are instances of the arguments. $HOME defines a specific directory for the directory argument. The -acl argument has two instances specifying two ACL entries: pat all and terry none.
      • Command abbreviations
        • fs listacl (full command), fs lista (abbreviation), fs la (alias)
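
The anatomy of the command described above (suite, operation code, switches, instances) can be made explicit by assembling it piece by piece. This is a minimal sketch that only builds and prints the argument list; the helper name `setacl_command` and the directory path are hypothetical, and nothing is executed.

```python
def setacl_command(directory, entries):
    """Build the argument list for: fs setacl -dir DIR -acl WHO RIGHTS ...

    entries is a list of (user_or_group, rights) pairs, one per ACL entry.
    """
    argv = ["fs", "setacl"]          # command suite, operation code
    argv += ["-dir", directory]      # switch and its instance
    argv.append("-acl")              # switch followed by its instances
    for who, rights in entries:
        argv += [who, rights]
    return argv


# Hypothetical home directory; the slide uses $HOME.
command = setacl_command("/afs/example.edu/user/pat", [("pat", "all"), ("terry", "none")])
print(" ".join(command))
```

On a machine with an AFS client, the list could be passed to `subprocess.run(command, check=True)` to apply the ACL.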

  19. File Permissions (AFS)
      • AFS gives each user permission to create their own groups
      • Common to use the syntax owner:group
        • pts creategroup cvrl:cvrl_group
        • pts adduser rich cvrl:cvrl_group
        • pts membership cvrl:cvrl_group
      • To recursively set permissions:
        • find ./ -type d -print -exec fs setacl {} cvrl:cvrl_group read \;
      • Special groups: system:anyuser, system:authuser, nd_campus
      • IP-based Access Control Lists
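
AFS ACLs are set per directory, which is why the recursive recipe visits directories only. The same walk can be sketched as a dry run that lists the commands it would issue; the function name `recursive_setacl_commands` and the temporary directory tree are assumptions for illustration, and no `fs` command is actually run.

```python
import os
import tempfile


def recursive_setacl_commands(top, who, rights):
    """Yield one `fs setacl` argument list per directory under top.

    Like `find top -type d`, the walk includes top itself and does not
    follow symbolic links.
    """
    for dirpath, _dirnames, _filenames in os.walk(top):
        yield ["fs", "setacl", "-dir", dirpath, "-acl", who, rights]


# Dry run against a throwaway tree: top, top/a, top/a/b.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "a", "b"))
    commands = list(recursive_setacl_commands(top, "cvrl:cvrl_group", "read"))
    for argv in commands:
        print(" ".join(argv))
```

Printing the commands first makes it easy to review which directories will be touched before applying the change for real.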

  20. AFS References
      • AFS Reference Links
        http://crcmedia.hpcc.nd.edu/wiki/index.php/AFS_References_and_Resources
      • Some AFS / NFS storage comparisons
        http://crcmedia.hpcc.nd.edu/wiki/index.php/CRC_Storage_Comparisons
      • Sometimes the system is more than just storage: features are important, but they need to be the ones users actually use.

  21. AFS – crc.nd.edu – nd.edu cell
      • nd.edu is the campus legacy OpenAFS cell
        • Started May 1990; uses the ND.EDU Kerberos realm; run by OIT staff
        • Currently the default cell for most CRC logins and the batch system (opteron, opterona, stats)
      • crc.nd.edu is the “new” cell run by CRC staff
        • Started October 2007; uses CRC.ND.EDU
        • The future cell for CRC logins and the batch system; target for rollout June 2008
        • Hardware and administrative differences
      • Kerberos 4 EOL scheduled for 12/2008 for the nd.edu cell

  22. AFS – crc.nd.edu – nd.edu cell
      • CRC Wiki Links
        http://crcmedia.hpcc.nd.edu/wiki/index.php/CRC_AFS_Cell
      • Accessing multiple cells
        http://crcmedia.hpcc.nd.edu/wiki/index.php/Automatic_CRC/ND_AFS_cell_setup
      • Recommendations on cells for primary access – interactive use
      • Methods for migrating data: tar, up, cp, vos dump/restore, start fresh
      • Issues with interactive use: references to nd.edu that you don’t know about, e.g. mozilla

  23. CRC Storage Backup – B023 Malloy Hall
      • Software
        • Teradactyl Inc. – True incremental Backup System (TiBS) http://www.teradactyl.com
        • Available for backup of CRC and any research machines in colleges
        • On-site training June 16-20, 2008
        • Supported architectures include OpenAFS, Solaris, Linux, Windows, MacOSX
      • Hardware
        • Backup server – Dell PowerEdge 6950 server utilizing 10 Gb Ethernet and Fibre Channel interfaces
        • Cache – 16 TB Infortrend Fibre Channel array

  24. Storage Backup
      • Sony Consolidated Storage Management System (CSM 200)
      • Capacity of 604 tapes
      • 3 LTO4 drives with 1 TB tapes; 2 TB per tape with 2:1 compression
      • Library will hold > 1 PB without reloading tapes
      • Expands to 2,988 tapes with 96 drives
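
The "> 1 PB" figure follows from the tape count and per-tape capacity quoted on the slide. A short worked calculation, taking the slide's 1 TB per tape and 2:1 compression at face value and assuming decimal units (1 PB = 1000 TB); the function name `library_capacity_tb` is hypothetical:

```python
def library_capacity_tb(tapes, native_tb_per_tape=1.0, compression=2.0):
    """Return (native_tb, compressed_tb) for a fully loaded tape library."""
    native = tapes * native_tb_per_tape
    return native, native * compression


native, compressed = library_capacity_tb(604)
expanded_native, expanded_compressed = library_capacity_tb(2988)

print(f"604 tapes:  {native:.0f} TB native, {compressed / 1000:.2f} PB at 2:1")
print(f"2988 tapes: {expanded_native:.0f} TB native, {expanded_compressed / 1000:.2f} PB at 2:1")
```

The compressed figure depends entirely on how compressible the data is, so the native number is the safer planning value; substitute the cartridge's rated capacity if it differs from the slide's rounded figure.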

  25. References
      • Wikipedia: File Systems
      • Advanced File Systems Issues – Andy Wang, FSU http://www.cs.fsu.edu/~awang/courses/cop5611_s2004/
      • ND CRC wiki http://crc.nd.edu/wiki
      • OpenAFS User Guide http://www.openafs.org/doc/index.htm
      • OpenAFS Best Practices 2007 – Sudlow http://crc.nd.edu/facilities/documents/afsbpw2007.pdf

  26. Questions?
      • How can we improve this class?
        • Additional topics?
        • Cover one topic more thoroughly?
        • Remove topics?
      • Thanks for the feedback!
