
Incident Response in EGEE - Creating a Capability/Service for Effective Incident Management

This presentation discusses the goal and motivation behind creating an Incident Response capability/service in EGEE. It covers definitions of incidents and incident response, possible steps for establishing a capability, and standards and practices to follow. The presentation also explores the specifics of incidents in the Grid and the relationship between incident response and intrusion detection.


Presentation Transcript


  1. MWSG2, June 16, 2004 • www.eu-egee.org • JRA3 - Incident Response: General Issues • Yuri Demchenko <demch@science.uva.nl> • EGEE is a project funded by the European Union under contract IST-2003-508833

  2. Outline • Goal and motivation • Incidents and Incident Response – Definitions • Creating Incident Response Capability/Service • Incident Response in EGEE • Possible steps - Discussion

  3. Goal and Motivation The goal of this presentation is to introduce the Incident Response problem area and raise awareness of it • How to create an Incident Response Capability? • What to respond to? • What standards and practices to follow? • What may be the first steps?

  4. Incident Response – Definitions • Incident • Specifics of perceived Grid Incidents • Incident Response • Incident Response vs Intrusion Detection

  5. Incident • A computer/ITC security incident is defined as any real or suspected adverse event in relation to the security of a computer or computer network. Typical security incidents within the ITC area are: a computer intrusion, a denial-of-service attack, information theft or data manipulation, etc. • An incident can be defined as a single attack or a group of attacks that can be distinguished from other attacks by the method of attack, identity of attackers, victims, sites, objectives or timing, etc. • An Incident in general is defined as a security event that involves a security violation. This may be an event that violates a security policy, UAP, laws and jurisdictions, etc. • A security incident may be logical, physical or organisational, for example a computer intrusion, loss of secrecy, information theft, fire or an alarm that doesn't work properly. A security incident may be caused on purpose or by accident. The latter may happen if somebody forgets to lock a door or forgets to activate an access list in a router.

  6. Incident – any specifics for Grid? • Depends on the scope and range of the Security Policy, ULA, or SLA • Should be based on threat analysis and a vulnerability model • Should be based on Grid processes/workflow analysis • Is there a definite model and clear vision of these processes? • LCG definition of the Grid Job/Task submission • Job submission will normally progress from a User Interface (UI) machine, through a Resource Broker (RB) to a Computing Element (CE) and hence to the compute resource (usually a batch system). In some cases the RB is not used and the UI submits the job directly to the CE. Data access is through a Storage Element (SE) service • Q: Should we distinguish between Incidents with the Grid applications and processes and those with the underlying infrastructure? • Who will handle each of them?
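
The LCG job submission path quoted on this slide can be sketched as a small model, which may help when asking which components an incident touches. This is a minimal illustration only: the component abbreviations (UI, RB, CE) come from the slide, while the function names `job_path` and `affected_components` are hypothetical and not part of any LCG or EGEE software.

```python
from typing import Iterable, List


def job_path(use_broker: bool = True) -> List[str]:
    """Return the components a job passes through, as described on the slide.

    With a Resource Broker: UI -> RB -> CE -> batch system.
    Without one, the UI submits directly to the CE.
    """
    path = ["UI"]
    if use_broker:
        path.append("RB")
    path += ["CE", "batch system"]
    return path


def affected_components(path: List[str], compromised: Iterable[str]) -> List[str]:
    """List the components on a job's path that are flagged as compromised."""
    flagged = set(compromised)
    return [component for component in path if component in flagged]


if __name__ == "__main__":
    print(job_path())                  # path through the Resource Broker
    print(job_path(use_broker=False))  # direct submission to the CE
    print(affected_components(job_path(), {"RB"}))
```

A model like this only covers the job submission path; data access through the Storage Element (SE) would need to be added separately.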

  7. Grid risks and threats analysis • LCG Risk Analysis – is a good starting point • http://proj-lcg-security.web.cern.ch/proj-lcg-security/RiskAnalysis/risk.html • Classified by Misuse, Confidentiality and Data integrity, Infrastructure disruption and Accidental categories • Known analyses of the nature of Grid Security Incidents mostly focus on vulnerabilities of AuthN/Z and Certificate compromise • E.g., Dane Skow’s “A walk through a Grid Security Incident” • However, the question remains: • How to determine at an early stage that a PKC has been compromised?

  8. Incident response Incident response includes three major groups of actions/services • Incident Triage • Assessing and verifying incoming Incident Reports (IR) • Incident Coordination • Categorising Incident information, forwarding IRs and arranging interaction with other CSIRTs, ISPs and sites • Incident Resolution • Helping a local site (victim) to recover from an incident - in most cases offered as an optional service.
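
The three groups of actions on this slide can be read as stages that an incident report moves through. The sketch below is one possible way to model that flow, under the assumption that a report must be verified in triage before it is categorised and forwarded. All class, field and function names are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterable, List


class Stage(Enum):
    TRIAGE = "triage"
    COORDINATION = "coordination"
    RESOLUTION = "resolution"


@dataclass
class IncidentReport:
    reporter: str
    summary: str
    verified: bool = False
    category: str = ""
    forwarded_to: List[str] = field(default_factory=list)
    stage: Stage = Stage.TRIAGE


def triage(report: IncidentReport, is_credible: bool) -> IncidentReport:
    """Assess and verify an incoming report; only credible reports move on."""
    report.verified = is_credible
    if is_credible:
        report.stage = Stage.COORDINATION
    return report


def coordinate(report: IncidentReport, category: str,
               contacts: Iterable[str]) -> IncidentReport:
    """Categorise the report and record who it was forwarded to."""
    if not report.verified:
        raise ValueError("report must pass triage before coordination")
    report.category = category
    report.forwarded_to = list(contacts)
    report.stage = Stage.RESOLUTION
    return report


if __name__ == "__main__":
    ir = IncidentReport("site-admin@example.org", "suspicious logins on a CE")
    triage(ir, is_credible=True)
    coordinate(ir, "computer intrusion", ["other-csirt", "site-security-officer"])
    print(ir.stage.value, ir.forwarded_to)
```

Resolution is left as a final stage without its own function, since the slide describes it as an optional service offered to the affected site.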

  9. Incident Response and Intrusion Detection • Intrusion Detection normally is a component of the network infrastructure/services • Intrusion Detection Systems (IDS) or Sensors are installed on or close to Firewalls, Routers, Switches or run as a special program on logfiles • ID produces alerts to prevent suspected activity from escalating to an Incident • ID is rather a proactive service • Incident Response is a complex of designated people, policies and procedures • Incident Response is a reactive function • Q: Do we need to tackle Intrusion Detection in JRA3? • ID/Network protection is a responsibility of the Network Operator or Team • May be outsourced to a network provider or hosting organisation • A CSIRT often has an influence on network security policy and IDS policy/criteria

  10. Incident Response Infrastructure/Components • CSIRTs • Organisational form depends on type of organisation and required level of support to community • Security Policy • Defines what is required/allowed/acceptable • Incident Response Policy • What is provided, who receives it and who provides support • Incident Response Plan • Which incidents will be responded to and how • RFC 2350 – defines template for Incident Response Policy

  11. Types of CSIRTs • Security Group • Not formally a CSIRT but may be a first step to create a CSIRT • Distributed (Internal) CSIRT • Has well defined constituency, central office and (minimum) designated staff • Most staff share responsibility or are on duty • Maintains common Security and Incident Response policy • Publishes Advisories, Warnings, Reports, Recommendations • Coordinating CSIRT • Coordinates wide range of Incident Response activities • Creates and maintains common Security and Incident Response policy • Publishes Advisories, Warnings, Reports, Recommendations

  12. Incident Response Policy • Types of Incidents and Level of Support • List of Incident categories ordered by severity • Co-operation, Interaction and Disclosure of Information • Based on organisation’s Security Policy • Availability of information and ordered list of information being considered for release, both personal and vendor’s • Communication and Authentication • Information protection during communication • Mutual authentication between communicating parties • Also depending on information category
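
The idea of a severity-ordered list of incident categories, each tied to a level of support, could be captured in a small table. The sketch below is illustrative only: the category names are borrowed from the incident examples on slide 5, while the severity assignments and support levels are invented for the example and do not reflect any actual EGEE or LCG policy.

```python
from enum import IntEnum
from typing import Dict, List


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical mapping of incident categories to severity.
CATEGORIES: Dict[str, Severity] = {
    "denial-of-service attack": Severity.MEDIUM,
    "computer intrusion": Severity.HIGH,
    "accidental misconfiguration": Severity.LOW,
}

# Hypothetical level of support offered per severity.
SUPPORT: Dict[Severity, str] = {
    Severity.HIGH: "immediate coordination",
    Severity.MEDIUM: "next working day",
    Severity.LOW: "best effort",
}


def by_severity(categories: Dict[str, Severity]) -> List[str]:
    """Return category names, most severe first (ties broken alphabetically)."""
    return sorted(categories, key=lambda name: (-categories[name], name))


def support_level(category: str) -> str:
    """Look up the support level promised for a given category."""
    return SUPPORT[CATEGORIES[category]]


if __name__ == "__main__":
    for name in by_severity(CATEGORIES):
        print(f"{name}: {support_level(name)}")
```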

  13. Incident Response Procedures Should be documented in full or in critical parts • Initial Incident Reporting and Assessment • Progress Recording • Identification and Analysis • Notification – initial and in progress • Escalation – by Incident type or service level • Containment • Evidence collection • Removal and Recovery
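
One way to make these procedures concrete is to treat them as ordered stages of a case, with progress recording as a log kept alongside every stage. The sketch below assumes a strictly linear order, which the slide does not state, and it leaves out escalation because that depends on incident type or service level. The names `STAGES` and `IncidentCase` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Stage names follow the slide; the strict ordering is an assumption.
STAGES = [
    "reporting and assessment",
    "identification and analysis",
    "notification",
    "containment",
    "evidence collection",
    "removal and recovery",
]


@dataclass
class IncidentCase:
    case_id: str
    stage_index: int = 0
    # Progress record: (stage name, note) pairs in the order they were written.
    log: List[Tuple[str, str]] = field(default_factory=list)

    @property
    def stage(self) -> str:
        return STAGES[self.stage_index]

    def record(self, note: str) -> None:
        """Progress recording: attach a note to the current stage."""
        self.log.append((self.stage, note))

    def advance(self, note: str) -> str:
        """Record a closing note for the current stage and move to the next.

        The case stays in the last stage once it has been reached.
        """
        self.record(note)
        if self.stage_index < len(STAGES) - 1:
            self.stage_index += 1
        return self.stage


if __name__ == "__main__":
    case = IncidentCase("demo-1")
    case.advance("report received from site security officer")
    case.record("log files requested")
    print(case.stage, case.log)
```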

  14. Incident Response in EGEE • Actual Incident Response will be done at GOC • By Security Groups or Internal/External CSIRTs • Incident Coordination for EGEE • Coordinating Central or Distributed CSIRT servicing EGEE infrastructure • To start this activity • Inventory and Taxonomy • Contacting GOC/sites and building awareness • Training and Education • First CSIRT Training workshop at 2nd EGEE (or even around GGF12?) • Establishing central EGEE coordinating CSIRT • Staffing • Defining policies and procedures, formats and forms • Promoting and building network of contacts

  15. What do we have? LCG documents for sites – good starting point and initial framework • Organisation of security on LCG-1 • To implement the LCG-1 security procedures and to respond to security incidents, each LCG-1 Regional Centre and each LCG-1 site must designate a security officer • Rem: Need to be structured according to common CSIRT practices • LCG Security Policy specifies (not detailed) • Physical Security • Network Security • Access Control • Rem: Refers to site Policies but are they defined?

  16. Standards and Practices • Incident Response and Incident Handling • Standards and Recommendations on Incident Response procedures and CSIRT operation • IETF, NIST, TI/TF-CSIRT (TERENA), CERT/CC • Formats and Protocols • IDMEF – Intrusion Detection Message Exchange Format • IODEF – Incident Object Description and Exchange Format • Emerging RID – Real-time Internetwork Defense (supported by US AFC) • Trace Security Incidents to the Source • Stop or Mitigate the Effects of an Attack or Security Incident • CSIRT community and CSIRT certification • Important component of creating world-wide Incident Response infrastructure
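
To give a feel for what an XML-based incident exchange format looks like, the sketch below builds a very small incident document with Python's standard library. The element and attribute names are loosely modelled on IODEF, but the output is not validated against any IODEF schema, and namespace and version attributes are deliberately left out. The function name `build_report` and all sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_report(incident_id: str, csirt_name: str,
                 report_time: str, description: str) -> str:
    """Build a minimal, IODEF-like incident document and return it as text.

    Illustrative only: real IODEF documents carry a namespace, a version
    and further required elements that are omitted here.
    """
    doc = ET.Element("IODEF-Document")
    incident = ET.SubElement(doc, "Incident", {"purpose": "reporting"})
    # The identifier is scoped by the name of the CSIRT that assigned it.
    ident = ET.SubElement(incident, "IncidentID", {"name": csirt_name})
    ident.text = incident_id
    ET.SubElement(incident, "ReportTime").text = report_time
    ET.SubElement(incident, "Description").text = description
    return ET.tostring(doc, encoding="unicode")


if __name__ == "__main__":
    print(build_report("demo-0001", "csirt.example.org",
                       "2004-06-16T12:00:00+00:00",
                       "Suspected credential compromise"))
```

A shared format of this kind is what lets one CSIRT forward a report to another without re-keying it, which is the integration benefit the slide points to.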

  17. Tools • Intrusion Detection automation • Snort with IDMEF support (by Silicon Defense) • Benefits in simple integration, information exchange and easy outsourcing • Implemented also by CERT/CC in their AirCERT distributed System • Incident Handling • Mostly proprietary systems with growing move to standardisation of exchange format based on IODEF • IODEF Pilot implementation • CERT/CC AirCERT Automated Incident Reporting - http://www.cert.org/kb/aircert/ and http://aircert.sourceforge.net/ • JPCERT/CC: Internet Scan Data Acquisition System (ISDAS) - http://www.jpcert.or.jp/isdas/index-en.html • eCSIRT.net: The European CSIRT Network - http://www.ecsirt.net

  18. Summary – next steps • Inventory and Taxonomy • Contact with GOC/ROC • Decide on organisational structure for EGEE Incident Response Capability/Infrastructure • Prepare 1st CSIRT Workshop
