Kevin Coffman

kwc@citi.umich.edu

Center for Information Technology Integration

University of Michigan

Challenges Running an NFSv4-backed OSG Cluster
Overview
  • Basic NFSv4 in production
  • Open Science Grid (OSG) Overview
  • OSG Installation
  • OSG Configuration
  • Submitting a job!
  • Authentication differences (AFS vs. NFSv4)
  • Authentication futures
Basic NFSv4 file service in production
  • Basic file storage
  • User name mappings
  • Home directories
  • Kernel builds, etc.
Open Science Grid Overview
  • Architecture
    • Head node & worker nodes
    • Core is NSF Middleware Initiative (including Globus, Condor, kx.509)
  • Authentication
    • X.509, kx.509, proxy certs
  • No cluster file-system required
    • “Home”, Base, Data, Apps, Temp, Worker node temp
OSG Installation
  • New Linux kernels, new NFSv4 code, new OSG releases, repeat!
  • Base installation is done solely on head node
  • Credentials needed
    • Root access assumed for local file system access
      • Mapping machine cred now necessary
    • Kerberos credentials for NFS file system access
  • Name-to-UID mapping issues
    • Found the need for tools/scripts for flushing mappings
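
The need to flush name-to-UID mappings can be pictured with a toy cache. This is only a sketch under stated assumptions: `IdMapCache`, its TTL, and the `osg@citi.umich.edu` account are hypothetical and are not the actual idmapper or any tool from the talk. It shows how a cached mapping keeps returning an old UID until it is explicitly flushed.

```python
import time


class IdMapCache:
    """Toy cache of NFSv4 'name@domain' -> local UID mappings (hypothetical)."""

    def __init__(self, resolver, ttl_seconds=600.0, clock=time.monotonic):
        self._resolver = resolver      # callable: name -> uid, e.g. a passwd lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # name -> (uid, expiry time)

    def lookup(self, name):
        now = self._clock()
        hit = self._entries.get(name)
        if hit is not None and hit[1] > now:
            return hit[0]              # served from cache, possibly stale
        uid = self._resolver(name)
        self._entries[name] = (uid, now + self._ttl)
        return uid

    def flush(self, name=None):
        """Drop one mapping, or every mapping when name is None."""
        if name is None:
            self._entries.clear()
        else:
            self._entries.pop(name, None)


# Hypothetical account table that changes while a mapping is still cached.
accounts = {"osg@citi.umich.edu": 20010}
cache = IdMapCache(accounts.__getitem__)

first = cache.lookup("osg@citi.umich.edu")
accounts["osg@citi.umich.edu"] = 20011      # account recreated with a new UID
stale = cache.lookup("osg@citi.umich.edu")  # still answered from the cache
cache.flush("osg@citi.umich.edu")
fresh = cache.lookup("osg@citi.umich.edu")  # re-resolved after the flush
```

During repeated reinstallation, accounts are created and recreated often, which is the situation where a flush step like this becomes useful.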
OSG Configuration
  • Daemons (e.g., MonALISA and Condor) on head node and worker nodes require authentication for file system access
    • Keytabs
    • More name to UID mapping required
  • Virtual Organization (VO) accounts
    • DN to UNIX account name via grid-mapfile
    • Name to UID mappings required for file system access
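
The two mapping steps for VO accounts (DN to UNIX account name, then account name to UID) can be sketched as follows. This assumes the common grid-mapfile layout of a quoted DN followed by a local account name. The DNs, the account names, and the `passwd` table are hypothetical stand-ins, not entries from the site described in the talk.

```python
import shlex


def parse_grid_mapfile(text):
    """Return {distinguished name: local account} from grid-mapfile text."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = shlex.split(line)   # keeps the quoted DN, which may contain spaces, whole
        if len(fields) != 2:
            raise ValueError(f"unrecognized grid-mapfile line: {raw!r}")
        dn, account = fields
        mapping[dn] = account
    return mapping


def uid_for_dn(dn, gridmap, passwd):
    """Step 1: DN -> account via grid-mapfile. Step 2: account -> UID."""
    return passwd[gridmap[dn]]


# Hypothetical sample entries and account table.
SAMPLE = '''
# VO members
"/DC=org/DC=example/OU=People/CN=Jane Doe 12345" osgvo
"/DC=org/DC=example/OU=People/CN=John Roe 67890" othervo
'''
passwd = {"osgvo": 30001, "othervo": 30002}

gridmap = parse_grid_mapfile(SAMPLE)
jane_uid = uid_for_dn("/DC=org/DC=example/OU=People/CN=Jane Doe 12345", gridmap, passwd)
```

With NFSv4, the second step has to give the same answer on the client and the server, since owners travel over the wire as names rather than numeric IDs.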
Submitting a job!
  • Job submission uses X.509 authentication
    • Need Kerberos authentication for file-system access
    • Gatekeeper forks a job manager process for each job
      • Job manager process runs as the original user and needs user’s credentials
  • Verified that job submission works as expected with AUTH_SYS, which does not require Kerberos credentials
MGRID Architecture
  • [Diagram not recoverable from the transcript; only its labels survive]
  • Labeled hosts: User Workstation, MGRID Portal, Grid Resource
  • Labeled components: Browser, libpkcs11, kx509, kinit, KCT, KCA, KDC, Apache (mod_ssl, mod_kct, mod_kx509, mod_jk, mod_php), Tomcat, CHEF, Authorization, GateKeeper, Resource Mgr, Resource, LDAP
  • Labeled connections: SSL (client certificate required), Kerberos V5, Kerberos, GSI, SASL
  • Steps are numbered 1 through 8
Grid job authentication issues
  • Jobs scheduled to run in the future
  • Long-running jobs (refreshing credentials)
  • Combination of both (future and long-running)
  • Distribution of user credentials to worker nodes for file system access
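
The first three issues come down to comparing a credential's expiry time with the window in which the job actually runs. A minimal sketch of that arithmetic, assuming a 10-hour credential lifetime and an arbitrary submission time (both values are illustrative, not from the talk):

```python
from datetime import datetime, timedelta


def credential_gap(cred_expires, job_start, job_runtime):
    """Return how long the job would run without valid credentials (zero if none)."""
    job_end = job_start + job_runtime
    if cred_expires >= job_end:
        return timedelta(0)
    # Credentials may already have lapsed before the job starts.
    uncovered_from = max(cred_expires, job_start)
    return job_end - uncovered_from


submitted = datetime(2006, 1, 1, 9, 0)       # arbitrary submission time
expires = submitted + timedelta(hours=10)    # assumed 10-hour credential lifetime

# Runs immediately and finishes within the credential lifetime.
immediate_short = credential_gap(expires, submitted, timedelta(hours=2))
# Scheduled to start after the credential has expired.
future_job = credential_gap(expires, submitted + timedelta(hours=12), timedelta(hours=2))
# Starts immediately but outlives the credential.
long_job = credential_gap(expires, submitted, timedelta(hours=36))
```

Any non-zero gap means the credentials have to be refreshed or re-delivered to the worker node while the job is queued or running.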
Current Architecture
  • [Diagram not recoverable from the transcript; only its labels survive]
  • Labels: KDC (AS, TGS), client, server, user process, user, kernel, GSSD, SVC GSSD, gss context cache (shown twice), NFS, NFSD, Credentials on Disk, keytab
  • Steps are numbered 1 through 13
Authentication futures
  • SPKM3
    • Allows us to stay in X.509 world
    • Anonymous (DH)
      • Certificate on server to prevent man-in-the-middle (MITM) attacks
    • X.509 Certificates
  • LIPKEY
    • Built on top of SPKM3
    • Allows TLS-like password authentication
Linux kernel keys support (a.k.a. keyring)
  • General credential storage in-kernel
    • thread-specific keyring
    • process-specific keyring
    • session-specific keyring (PAG-like via JOIN_SESSION_KEYRING)
  • Different key types: ‘user’, ‘rpcsec_gss context’
  • Create, delete, link, search, revoke, etc.
  • Quotas and permissions
  • Referenced by serial # and description
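
A toy model of these ideas may help: keys carry a type, a description, and a serial number, and a lookup walks the thread, process, and session keyrings in that order. The classes below are hypothetical illustrations in user space, not the kernel interface, and the `krb5:tgt` description is made up for the example.

```python
from dataclasses import dataclass, field
from itertools import count

_serials = count(1)


@dataclass
class Key:
    type: str
    description: str
    payload: bytes
    revoked: bool = False
    serial: int = field(default_factory=lambda: next(_serials))


@dataclass
class Keyring:
    name: str
    keys: list = field(default_factory=list)

    def link(self, key):
        self.keys.append(key)

    def search(self, type_, description):
        """Return the first live key matching type and description, else None."""
        for key in self.keys:
            if key.type == type_ and key.description == description and not key.revoked:
                return key
        return None


def request_key(type_, description, thread, process, session):
    """Search thread, then process, then session keyring; first match wins."""
    for ring in (thread, process, session):
        found = ring.search(type_, description)
        if found is not None:
            return found
    return None


thread_ring, process_ring, session_ring = Keyring("thread"), Keyring("process"), Keyring("session")

tgt = Key("user", "krb5:tgt", b"session-level ticket")
session_ring.link(tgt)
found_in_session = request_key("user", "krb5:tgt", thread_ring, process_ring, session_ring)

override = Key("user", "krb5:tgt", b"process-level ticket")
process_ring.link(override)
found_after_override = request_key("user", "krb5:tgt", thread_ring, process_ring, session_ring)

override.revoked = True   # a revoked key no longer matches
found_after_revoke = request_key("user", "krb5:tgt", thread_ring, process_ring, session_ring)
```

The session keyring acts as the fallback shared by everything in a login session, which is what makes it PAG-like.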
MIT Kerberos ccache using keyring as backing storage
  • Assumes a single “active” credentials cache
  • Can store more than one ccache in same session keyring
  • All user-level code

Session
|
+---> krb5_cc_active (key: contains 0x00004f12)
|
+---> /tmp/krb5cc_20010_XF45C2 (keyring: id is 0x000023cd)
|     |
|     +---> kwc@CITI.UMICH.EDU (principal info)
|     +---> krbtgt/CITI.UMICH.EDU@CITI.UMICH.EDU
|     +---> nfs/screamer.citi.umich.edu@CITI.UMICH.EDU
|     +---> nfs/troy.citi.umich.edu@CITI.UMICH.EDU
|     +---> pop/citi.umich.edu@CITI.UMICH.EDU
|     +---> afs@CITI.UMICH.EDU
|
+---> /tmp/krb5cc_20010_umich (keyring: id is 0x00004f12)
      |
      +---> kwc@UMICH.EDU (principal info)
      +---> krbtgt/UMICH.EDU@UMICH.EDU
      +---> imap/tremors.itd.umich.edu@UMICH.EDU
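
The layout in the tree can be read as a pointer plus a set of named keyrings: `krb5_cc_active` holds the id of whichever ccache keyring is currently active. A minimal sketch of that lookup, using the ids and names from the tree. The dictionary layout and the helper functions are illustrative assumptions, not the MIT Kerberos implementation.

```python
ACTIVE_POINTER = "krb5_cc_active"

# Keyring ids, names, principals and tickets copied from the tree in the slide.
session = {
    ACTIVE_POINTER: 0x00004F12,
    "keyrings": {
        0x000023CD: {
            "name": "/tmp/krb5cc_20010_XF45C2",
            "principal": "kwc@CITI.UMICH.EDU",
            "tickets": [
                "krbtgt/CITI.UMICH.EDU@CITI.UMICH.EDU",
                "nfs/screamer.citi.umich.edu@CITI.UMICH.EDU",
                "nfs/troy.citi.umich.edu@CITI.UMICH.EDU",
                "pop/citi.umich.edu@CITI.UMICH.EDU",
                "afs@CITI.UMICH.EDU",
            ],
        },
        0x00004F12: {
            "name": "/tmp/krb5cc_20010_umich",
            "principal": "kwc@UMICH.EDU",
            "tickets": [
                "krbtgt/UMICH.EDU@UMICH.EDU",
                "imap/tremors.itd.umich.edu@UMICH.EDU",
            ],
        },
    },
}


def active_ccache(session):
    """Follow the krb5_cc_active pointer to the active ccache keyring."""
    return session["keyrings"][session[ACTIVE_POINTER]]


def switch_active(session, ccache_name):
    """Point krb5_cc_active at the ccache keyring with the given name."""
    for ring_id, ring in session["keyrings"].items():
        if ring["name"] == ccache_name:
            session[ACTIVE_POINTER] = ring_id
            return ring_id
    raise KeyError(ccache_name)


before = active_ccache(session)["principal"]
switch_active(session, "/tmp/krb5cc_20010_XF45C2")
after = active_ccache(session)["principal"]
```

Switching realms then only means rewriting one small key; both sets of tickets stay in the same session keyring.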

Mount using keyring support
  • Mount program will use keytab to set up machine credentials in keyring
  • /sbin/request-key invoked and finds machine credentials
  • Context is negotiated and “rpcsec_gss context” key instantiated
User access using keyring support
  • Assumes they have credentials in keyring via kinit or PAM
    • No more looking around blindly for creds in filesystem
    • /sbin/request-key invoked and finds user’s session-specific credentials
Keyring issues
  • Upcalls from asynchronous events
  • Still need to tie “rpcsec_gss context” keys to Kerberos credentials
Future Architecture
  • [Diagram not recoverable from the transcript; only its labels survive]
  • Labels: KDC (AS, TGS), client, server, user process, user, kernel, request-key handler, SVC GSSD, TGT, gss context cache, gss context cache (in keyring), NFS, NFSD, keytab
  • Steps are numbered 1 through 11
Questions / Discussion

http://www.citi.umich.edu/projects

References
  • Open Science Grid
    • http://www.opensciencegrid.org
  • MonALISA
    • http://monalisa.cacr.caltech.edu
  • Condor
    • http://www.cs.wisc.edu/condor
  • Keyring
    • Kernel Source: /Documentation/keys.txt
Krb5: Obtaining gss context
  • TGT: currently stored in file system
  • Per NFSD service ticket: currently stored in file system
  • GSSD locates user credentials by convention (/tmp/krb5cc_uid)
  • Synchronizing the gss context and the credential is problematic
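
Locating credentials "by convention" amounts to scanning a directory for files whose names match the expected pattern. The sketch below illustrates that idea only. The function name and the newest-file tie-break are assumptions, and a real daemon would presumably also need to check file ownership and permissions, which this example omits.

```python
import os
import tempfile


def find_ccache_by_convention(uid, directory="/tmp"):
    """Return the newest file named krb5cc_<uid> or krb5cc_<uid>_* in directory, or None."""
    prefix = f"krb5cc_{uid}"
    candidates = []
    for entry in os.scandir(directory):
        name = entry.name
        # Match the exact uid only, so uid 20010 does not pick up krb5cc_200100.
        if name == prefix or name.startswith(prefix + "_"):
            if entry.is_file(follow_symlinks=False):
                candidates.append((entry.stat().st_mtime, entry.path))
    if not candidates:
        return None
    return max(candidates)[1]


# Demonstration in a scratch directory rather than the real /tmp.
with tempfile.TemporaryDirectory() as scratch:
    for name, mtime in [
        ("krb5cc_20010_XF45C2", 1000),
        ("krb5cc_20010_umich", 2000),
        ("krb5cc_200100", 3000),     # belongs to a different (hypothetical) uid
    ]:
        path = os.path.join(scratch, name)
        open(path, "w").close()
        os.utime(path, (mtime, mtime))
    chosen = os.path.basename(find_ccache_by_convention(20010, scratch))
    missing = find_ccache_by_convention(4242, scratch)
```

The guesswork is visible here: with two caches for the same user, the scan has to pick one by a heuristic, and nothing ties the choice to the process that triggered the request.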
Linux credential interface
  • New system calls for kernel credential placement
  • Available for upcoming PAG implementation
  • Passed via upcall to GSSD
  • Credential vs. gss context management no longer a problem
Linux Krb5 kernel credential
  • Pass TGT to kernel as credential
  • Stored in user process (PAG)
  • Passed to GSSD via gss_init_sec_context upcall
  • GSSD manages Krb5 NFSD service tickets
  • Multiple in-kernel TGTs vs. cross-realm authentication
Client: LIPKEY with SPKM3
  • Initiator
    • Anonymous SPKM3 client
  • Credential:
    • LIPKEY username and password
    • Sent to the server encrypted under the SPKM3 session key
  • Context
    • per <user, nfsd> LIPKEY(?) and SPKM3 gss context
Linux LIPKEY kernel credential
  • LIPKEY credential (username and password) is per server.
  • Not stored in kernel
  • Instead, store information to be passed to GSSD, which will prompt the user for the LIPKEY password for each NFSD.
Client: SPKM with X.509
  • Initiator
    • password for long term user X.509 private key
  • Credential
    • Short-term proxy X.509 credential and private key (grid-proxy-init)
  • Context
    • per <user, nfsd> SPKM gss context
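
Keeping one context per <user, nfsd> pair can be sketched as a cache keyed on that pair. The class, the fake negotiation callback, and the integer timestamps below are hypothetical. In particular, tying the context's expiry to the short-term credential's expiry is an assumption made for the example.

```python
class GssContextCache:
    """Toy cache holding one security context per (user, NFS server) pair."""

    def __init__(self, negotiate):
        self._negotiate = negotiate    # callable: (user, server) -> (context, expiry)
        self._contexts = {}            # (user, server) -> (context, expiry)

    def get(self, user, server, now):
        key = (user, server)
        cached = self._contexts.get(key)
        if cached is not None and cached[1] > now:
            return cached[0]
        # No live context for this pair: negotiate a new one and remember it.
        context, expiry = self._negotiate(user, server)
        self._contexts[key] = (context, expiry)
        return context


negotiations = []


def fake_negotiate(user, server):
    """Stand-in for a real context negotiation; records each call."""
    negotiations.append((user, server))
    # Assumption: the context expires with the short-term credential, here at t=100.
    return f"ctx-{len(negotiations)}", 100


contexts = GssContextCache(fake_negotiate)
a = contexts.get("kwc", "screamer", now=0)     # first use: negotiates
b = contexts.get("kwc", "screamer", now=50)    # reuses the cached context
c = contexts.get("kwc", "troy", now=50)        # different server: negotiates again
d = contexts.get("kwc", "screamer", now=100)   # expired: negotiates again
```

Each new server or expired context costs another negotiation, so the credential has to remain reachable for the whole time the user accesses the file system.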
Linux SPKM kernel credential
  • Pass proxy (short-term) X.509 credential and private key to kernel as credential
  • Stored in user process (PAG)
  • Passed to GSSD via gss_init_sec_context upcall
  • GSSD manages CA hierarchy and credential checking