LCG Incident Response

Presentation Transcript


  1. LCG Incident Response
     Ian Neilson, LCG Security Officer, Grid Deployment Group, CERN
     GGF12 – 20 Sept 2004

  2. Background
     • LCG – Large Hadron Collider (LHC) Computing Grid
     • Computing environment for the 4 LHC experiments
       • ALICE, ATLAS, CMS, LHCb
     • LHC operation in 2007
     • Required: 12-14 PetaBytes/year; compute equivalent to 70,000 PCs
     • LCG1/2003, LCG2/2003-4, EGEE
     • 70+ sites in Europe, USA, Asia, S. America …
     • 7000+ CPUs
     • 6000GB+ Storage
     • Software certification, testing, deployment group
     • Distributed GOCs
       • UK: http://goc.grid-support.ac.uk/gridsite/gocmain/monitoring/
       • Taiwan: http://goc.grid.sinica.edu.tw/goc/
     www.cern.ch/lcg

  3. Grid monitoring

  4. EGEE - Enabling Grids for E-science in Europe
     • 12 federations with 70 partner institutions
     • 2 year + 2 project
     • Operate a service grid facility for e-science
       • Initially built on LCG2 infrastructure
     • Re-engineer a robust middleware layer
       • gLite
     • Attract new users
       • Research and Industry
       • Broader focus than HEP: Biomedical, Earth Science …
     www.cern.ch/egee

  5. Policy – the Joint Security Group
     • Security & Availability Policy
     • GOC Guides
     • Incident Response
     • Certification Authorities
     • Audit Requirements
     • Usage Rules
     • Application Development & Network Admin Guide
     • User Registration
     http://cern.ch/proj-lcg-security/documents.html

  6. Incident Response Policy
     • Agreement on Incident Response
       • June 2003 for LCG1
     • What is an incident?
       • Security investigation causing service interruption
       • Suspected misuse of resources beyond site
       • “Reasonable possibility” of stolen credentials
         • Not to expire or be revoked within 3 days
     • Classifications
       • Identity theft
       • Suspected / Probable / Confirmed
     • Actions
       • Misuse / Enforcement / Restoration / Escalation
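
The stolen-credential criterion above can be sketched in code. This is a minimal illustration, not the LCG policy text: it assumes the 3-day bullet means that credentials expiring or being revoked within 3 days do not by themselves make an incident, and the names `TheftStatus`, `CredentialReport` and `is_incident` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class TheftStatus(Enum):
    """Classification levels named on the slide."""
    SUSPECTED = "suspected"
    PROBABLE = "probable"
    CONFIRMED = "confirmed"


# Assumed reading of the slide: credentials that expire or are revoked
# within 3 days are not treated as an incident.
EXPIRY_WINDOW = timedelta(days=3)


@dataclass
class CredentialReport:
    """Hypothetical record of a possibly stolen credential."""
    status: TheftStatus
    expires_at: datetime
    revocation_due_at: Optional[datetime] = None  # None: no revocation scheduled


def is_incident(report: CredentialReport, now: datetime) -> bool:
    """Return True if the credential stays usable beyond the 3-day window."""
    cutoff = now + EXPIRY_WINDOW
    if report.expires_at <= cutoff:
        return False
    if report.revocation_due_at is not None and report.revocation_due_at <= cutoff:
        return False
    return True
```

The classification level is carried along for reporting only; in this sketch it does not change the outcome, since the slide says a "reasonable possibility" is already enough.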

  7. Incident Response - Communications
     • Site enrolment collects 2 entries per site
       • Registration questionnaire
     • Site Contacts mail list
       • Closed list of named individuals
       • email, telephone
     • CSIRT mail list
       • List-of-lists (Open)
       • 1 entry per site
     • Updated list circulated to contacts list as sites enrol
       • Pointers to policy documents for responsibilities
     • Channels
       • Users - local site contacts (& GOC)
       • Contacts - discussion and information exchange
       • CSIRT - incident notification, update
       • Roll-out - system administrators

  8. Incident Response – management issues
     • LCG “community” known at CERN, EGEE community is broader
     • User enrolment is well controlled, site enrolment is not
       • Incomplete questionnaires
       • Personal instead of list
       • List instead of personal
       • Undeliverable addresses
       • Delayed delivery
       • Moderated delivery
       • Enrolment information not circulated
       • SPAM, SPAM, SPAM, SPAM
     • Lists need active management!
     • Can we “see” all the sites?
       • CERN/GOC view
       • VO “private” information systems
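
Several of the enrolment problems listed above could be caught by an automated check of each site's registration entry. The sketch below is illustrative only: the record layout (`Address`, `SiteEnrolment`), the `is_list` and `deliverable` flags, and the function `enrolment_problems` are assumptions, not part of any LCG tool. It assumes site contacts should be personal addresses and the CSIRT entry should be a single list address.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Address:
    email: str
    is_list: bool             # True for a mailing list, False for a personal mailbox
    deliverable: bool = True  # outcome of an earlier delivery test (hypothetical)


@dataclass
class SiteEnrolment:
    site: str
    contacts: List[Address] = field(default_factory=list)  # named individuals
    csirt: List[Address] = field(default_factory=list)     # expected: one list address


def enrolment_problems(entry: SiteEnrolment) -> List[str]:
    """Return the problems found in one site's enrolment entry."""
    problems = []
    if not entry.contacts or not entry.csirt:
        problems.append("incomplete questionnaire")
    if any(a.is_list for a in entry.contacts):
        problems.append("list instead of personal")
    if any(not a.is_list for a in entry.csirt):
        problems.append("personal instead of list")
    if len(entry.csirt) > 1:
        problems.append("more than one CSIRT entry")
    if any(not a.deliverable for a in entry.contacts + entry.csirt):
        problems.append("undeliverable address")
    return problems
```

Running such a check whenever a site enrols or updates its entry is one way to approach the "active management" the slide asks for; delayed and moderated delivery would still need a real test message to detect.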

  9. Incident response – operational issues
     • Recognising and reporting
     • What is a local CSIRT?
       • Scale of coverage
       • 24x7 site/campus network operations team
       • Department Security Officer
       • LCG system administrator
     • Who is a security contact?
       • as above
     • Intersection with local CSIRT procedures
       • Local quarantine and analysis
     • Keeping emergency channels clear
       • Discussions, cross-postings

  10. Incident response – near-term
     • JSG, EGEE MWSG/JRA3, OSG, …
     • Site and VO registration policy and process
       • Control gathering, distribution and management of data
       • Sites need to understand requirements and responsibilities
       • Coverage, access, audit
       • Needs to be actively managed (self-managed?)
     • Operational Security Co-ordination Team (OSCT)
       • Ownership of security incidents
         • From notification to resolution
         • Liaise with national/institute CERTs
       • Ownership of known problems
         • Liaise with development & deployment groups
       • Co-ordination of monitoring
       • Post-mortem analysis
       • Team of experts

  11. Security Co-ordination
     • How does OSCT map onto EGEE operations structures?
       • Resource Centres (lots)
       • Regional Operations Centres - ROC (~9)
       • Core Infrastructure Centres - CIC (~5)
       • Operations Management Centre - OMC (1)
     • Co-ordination with Open Science Grid …
       • Adopt same co-ordinating model

  12. 2004 Security Service Challenges
     • Objectives
       a) Evaluate the effectiveness of current procedures by simulating a small and well defined set of security incidents.
       b) Use the experiences of a) in an iterative fashion (during the challenges) to update procedures.
       c) Formalise the understanding gained in a) & b) in updated incident response procedures.
       d) Provide feedback to middleware development and testing activities to inform the process of building security test components.
     • Exercise response procedures in controlled manner
       • Non-intrusive
         • Compute resource usage trace to owner
           • Run a job to send an email
         • Storage resource trace to owner
           • Run a job to store a file
       • Disruptive
         • Disrupt a service and map the effects on the service and grid
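
As a rough illustration of the non-intrusive "run a job to send an email" challenge, the payload below builds a message carrying a challenge identifier, so that responders can be asked to trace the job back to its owner. Everything here is an assumption made for the sketch: the function name, the message fields and the addresses are hypothetical, and the slides do not say how the real challenge jobs were written. The message is only constructed, not sent.

```python
import socket
from datetime import datetime, timezone
from email.message import EmailMessage


def build_challenge_message(challenge_id: str, sender: str, recipient: str) -> EmailMessage:
    """Build the email a challenge job would send from a worker node."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Security service challenge {challenge_id}"
    # Record where and when the job ran; the site is then asked to work
    # back from these details to the identity that submitted the job.
    msg.set_content(
        f"Challenge: {challenge_id}\n"
        f"Worker host: {socket.gethostname()}\n"
        f"Sent (UTC): {datetime.now(timezone.utc).isoformat()}\n"
    )
    return msg
```

Actually delivering the message would need a mail relay reachable from the worker node, for example through the standard library's `smtplib`; that step is left out because it depends on each site's setup.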

  13. LCG/EGEE Incident Response
     Thank You
     Thank you to UK PPARC