Pattern mining in system logs: opportunities for process improvement. Dolev Mezebovsky, Pnina Soffer, and Ilan Shimshoni. BPMDS, Amsterdam, June 2009. Background . The implementation of enterprise systems is often a driver for business process change.

Pattern mining in system logs: opportunities for process improvement

Dolev Mezebovsky, Pnina Soffer, and Ilan Shimshoni

BPMDS, Amsterdam, June 2009

Background
  • The implementation of enterprise systems is often a driver for business process change.
    • System implementation as an opportunity for redesigning business processes
    • Changes motivated by the need to adapt the enterprise to the system rather than the other way around
  • “Vanilla” implementations:
    • Implement basic functionality without modifications and make improvements afterwards
    • Where the system only partially supports existing processes, people are forced to use workarounds and work inefficiently for the process to achieve its goal.
Example
  • Process: change a student’s study program
The problem addressed
  • Many such cases may exist in an organization
  • At first: all users complain
  • With time, some users may get used to the inefficient way of working
  • The question: How to identify the inefficient processes and prioritize their improvement?
Solution approach
  • The cases we are looking for include some repetition of a set of operations, as part of one “logical” task
  • These situations should be reflected in the event log of the system
  • Solution approach: mine for recurrent patterns of operations
Defining a pattern: basic concepts
  • Log entry = <User, Timestamp, Operation, ORSO>
    • ORSO: an ordered set of operands
    • Example: <YPRESS, 13:50, Detach course, Fredrick, Linear Algebra, CS Minor>
  • For two entries in a log:
    • Invariant set: set of entry elements whose values are equal for the two entries
    • Variant set: set of entry elements whose values are different for the two entries
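
The invariant and variant sets can be illustrated with a minimal Python sketch. The `LogEntry` class, the `Operand0`, `Operand1`, ... labels, and the function names are hypothetical, introduced here only for illustration; the two sample entries are taken from the example slide of this deck.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class LogEntry:
    user: str
    timestamp: str            # kept as text, e.g. "13.45.52"
    operation: str
    orso: Tuple[str, ...]     # ordered set of operands


def elements(entry: LogEntry) -> Dict[str, str]:
    """Name every element of an entry so two entries can be compared position by position."""
    named = {"User": entry.user, "Timestamp": entry.timestamp, "Operation": entry.operation}
    for position, operand in enumerate(entry.orso):
        named[f"Operand{position}"] = operand   # hypothetical labels, not from the slides
    return named


def invariant_and_variant(a: LogEntry, b: LogEntry) -> Tuple[Set[str], Set[str]]:
    """Split the element names into those with equal values and those with different values."""
    ea, eb = elements(a), elements(b)
    names = set(ea) | set(eb)
    invariant = {name for name in names if ea.get(name) == eb.get(name)}
    return invariant, names - invariant


first = LogEntry("YPRESS", "13.45.52", "Attach course", ("Fredrick", "Linear Algebra", "MIS Major"))
second = LogEntry("YPRESS", "13.46.26", "Attach course", ("Fredrick", "Algorithms", "MIS Major"))
invariant, variant = invariant_and_variant(first, second)
print(sorted(invariant))  # ['Operand0', 'Operand2', 'Operation', 'User']
print(sorted(variant))    # ['Operand1', 'Timestamp']
```

Comparing operands by position is a design choice of this sketch: it treats the ORSO as ordered, so the student, course, and program slots are only ever compared with their counterparts.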
Pattern identification
  • Two entries are potentially in the same pattern if:
    • User ∈ {Invariant}
    • Timestamp ∈ {Variant}; |TS(1) - TS(2)| < Timeframe
    • {Operation, ORSO} ∩ {Invariant} ≠ ∅
    • {Operation, ORSO} ∩ {Variant} ≠ ∅
  • Potential pattern entry: <User, TimeRange, Operations, ORSOs>
  • The algorithm dynamically aggregates entries into potential pattern entries, seeking the largest possible patterns.
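
The pairwise test above might be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the last two conditions are read as "at least one of the operation and operands is unchanged, and at least one differs", the timestamp layout follows the `13.45.52` style of the example slide, and the function names and the timeframe values are hypothetical.

```python
from datetime import datetime, timedelta


def parse_ts(text):
    # Timestamp layout copied from the slide example ("13.45.52").
    return datetime.strptime(text, "%H.%M.%S")


def may_share_pattern(a, b, timeframe):
    """a and b are (user, timestamp, operation, orso) tuples; orso is a tuple of operands."""
    user_a, ts_a, op_a, orso_a = a
    user_b, ts_b, op_b, orso_b = b
    if user_a != user_b:                       # User must be invariant
        return False
    gap = abs(parse_ts(ts_a) - parse_ts(ts_b))
    if not timedelta(0) < gap < timeframe:     # Timestamp must vary, within the timeframe
        return False
    # Compare the operation and the operands position by position.
    same = [op_a == op_b] + [x == y for x, y in zip(orso_a, orso_b)]
    differs = len(orso_a) != len(orso_b) or not all(same)
    return any(same) and differs               # something invariant and something variant


e1 = ("YPRESS", "13.45.52", "Attach course", ("Fredrick", "Linear Algebra", "MIS Major"))
e2 = ("YPRESS", "13.46.26", "Attach course", ("Fredrick", "Algorithms", "MIS Major"))
print(may_share_pattern(e1, e2, timedelta(minutes=2)))    # True: 34 seconds apart
print(may_share_pattern(e1, e2, timedelta(seconds=30)))   # False: outside the timeframe
```

The timeframe is passed in as a parameter because the slides treat it as a tunable value whose sensitivity is still to be studied.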
Example
  • First iteration:
  • [(1),(2)] = [(1): < YPRESS, 13.45.52, Attach course, Fredrick, Linear Algebra, MIS Major>, (2): < YPRESS, 13.46.26, Attach course, Fredrick, Algorithms, MIS Major>]
    • (1, 2) : < YPRESS, (13.45.52, 13.46.26), Attach course, Fredrick, (Linear Algebra, Algorithms), MIS Major>
  • Second iteration: 
  • [(1, 2), (3)] = [(1, 2) : < YPRESS, (13.45.52, 13.46.26), Attach course, Fredrick, (Linear Algebra, Algorithms), MIS Major>, (3): < YPRESS, 13.47.44, Attach course, Fredrick, Data Structures, MIS Major>]
    • (1, 2, 3): < YPRESS, (13.45.52, 13.47.44), Attach course, Fredrick, (Linear Algebra, Algorithms, Data Structures), MIS Major>
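
The iterations above can be reproduced with a simplified greedy pass. The deck lists finalizing the overall algorithm as future work, so this is only a sketch under assumptions: it keeps one open pattern per user, requires the operation to stay the same, and lets an operand position become variant only when the second entry joins. The dictionary layout, the function names, and the two-minute timeframe are hypothetical.

```python
from datetime import datetime, timedelta


def parse_ts(text):
    return datetime.strptime(text, "%H.%M.%S")


def fits(pattern, ts, op, orso, timeframe):
    """Decide whether an entry can extend a user's open potential pattern."""
    if op != pattern["operation"] or len(orso) != len(pattern["orsos"]):
        return False
    gap = parse_ts(ts) - parse_ts(pattern["end"])
    if not timedelta(0) < gap < timeframe:
        return False
    matches = [seen == [value] for seen, value in zip(pattern["orsos"], orso)]
    if pattern["size"] == 1:
        return not all(matches)          # the second entry must differ somewhere
    # Later entries may only change positions that already vary.
    return all(m or len(seen) > 1 for m, seen in zip(matches, pattern["orsos"]))


def aggregate(entries, timeframe):
    """Fold time-ordered (user, timestamp, operation, orso) tuples into potential patterns."""
    patterns, open_by_user = [], {}
    for user, ts, op, orso in sorted(entries, key=lambda e: parse_ts(e[1])):
        current = open_by_user.get(user)
        if current is not None and fits(current, ts, op, orso, timeframe):
            current["end"] = ts
            current["size"] += 1
            for seen, value in zip(current["orsos"], orso):
                if value not in seen:
                    seen.append(value)
        else:
            current = {"user": user, "start": ts, "end": ts, "operation": op,
                       "orsos": [[value] for value in orso], "size": 1}
            patterns.append(current)
            open_by_user[user] = current
    return patterns


log = [
    ("YPRESS", "13.45.52", "Attach course", ("Fredrick", "Linear Algebra", "MIS Major")),
    ("YPRESS", "13.46.26", "Attach course", ("Fredrick", "Algorithms", "MIS Major")),
    ("YPRESS", "13.47.44", "Attach course", ("Fredrick", "Data Structures", "MIS Major")),
]
result = aggregate(log, timedelta(minutes=2))
print(result[0]["size"], result[0]["start"], result[0]["end"])  # 3 13.45.52 13.47.44
print(result[0]["orsos"][1])  # ['Linear Algebra', 'Algorithms', 'Data Structures']
```

Entries that never merge remain as groups of size 1 in the returned list; a caller would filter those out before treating the rest as potential patterns.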
From potential pattern to pattern type
  • Pattern type definition: <I, V>.
  • I: a set of invariant element types (Operation, operand type)
  • V: a set of variant element types (Operation, operand type)
  • Example:
    • I = {Operation, Student, Program}
    • V = {Course}
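
One way to derive a pattern type from a potential pattern is to replace operand values with their types and check which positions took more than one value. The sketch below assumes a lookup from operation to operand type names (`OPERAND_TYPES`, a hypothetical schema that the slides do not define) and assumes the operation itself is always invariant.

```python
# Hypothetical schema: operand type names per operation, in ORSO order.
OPERAND_TYPES = {"Attach course": ("Student", "Course", "Program")}


def pattern_type(operation, orsos):
    """orsos holds, for each operand position, the list of distinct values seen in the pattern."""
    invariant = {"Operation"}            # assumption: the operation does not change
    variant = set()
    for type_name, seen in zip(OPERAND_TYPES[operation], orsos):
        if len(seen) == 1:
            invariant.add(type_name)     # one value throughout the pattern
        else:
            variant.add(type_name)       # several values: this element varies
    return frozenset(invariant), frozenset(variant)


I, V = pattern_type(
    "Attach course",
    [["Fredrick"], ["Linear Algebra", "Algorithms", "Data Structures"], ["MIS Major"]],
)
print(sorted(I))  # ['Operation', 'Program', 'Student']
print(sorted(V))  # ['Course']
```

Returning frozensets makes the pair hashable, so it can serve directly as a dictionary key when counting patterns per type.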
Pattern metrics
  • The count $C_P$ of a pattern type $P$: the number of patterns of this type in the log file.
  • The average size $AS_P$ of a pattern type $P$: the average number of entries in patterns of type $P$. Let $P$ occur $C_P$ times in a log file, where occurrence $i$ includes $n_i$ entries. Then: $AS_P = \frac{1}{C_P} \sum_{i=1}^{C_P} n_i$
  • The average time $AT_P$ of a pattern type $P$: the average time range (difference between the maximal and minimal timestamps) in patterns of type $P$.
Identifying and prioritizing process improvement requirements
  • Find out which of the identified patterns reflect inefficient processes
    • By interviewing users
  • Prioritize patterns to be automated
    • By size-weighted count: $SC_P = AS_P \cdot C_P$
    • By time-weighted count: $TC_P = AT_P \cdot C_P$
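
The count, average size, average time, and the two weighted counts can be computed in a few lines. The sketch below assumes each pattern occurrence is already summarized as a (pattern type, number of entries, time range in seconds) triple; the type labels and the numbers are invented purely for illustration and are not results from the study.

```python
from collections import defaultdict


def pattern_metrics(occurrences):
    """occurrences: iterable of (pattern_type, n_entries, time_range_seconds)."""
    grouped = defaultdict(list)
    for ptype, n_entries, seconds in occurrences:
        grouped[ptype].append((n_entries, seconds))
    metrics = {}
    for ptype, items in grouped.items():
        count = len(items)                                   # count of the pattern type
        avg_size = sum(n for n, _ in items) / count          # average size
        avg_time = sum(t for _, t in items) / count          # average time
        metrics[ptype] = {
            "count": count,
            "avg_size": avg_size,
            "avg_time": avg_time,
            "size_weighted": avg_size * count,               # size-weighted count
            "time_weighted": avg_time * count,               # time-weighted count
        }
    return metrics


# Invented sample data, for illustration only.
occurrences = [
    ("attach courses", 3, 112),
    ("attach courses", 5, 200),
    ("update address", 2, 30),
]
metrics = pattern_metrics(occurrences)
ranking = sorted(metrics, key=lambda p: metrics[p]["size_weighted"], reverse=True)
print(metrics["attach courses"]["size_weighted"])  # 8.0
print(ranking)  # ['attach courses', 'update address']
```

Because the size-weighted count multiplies an average by the number of occurrences, it equals the total number of log entries spent in patterns of that type, and the time-weighted count likewise equals the total time spent.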
Conclusions
  • We address a situation where technology drives processes in an undesirable way
  • We utilize mining technology to identify and prioritize requirements for automating inefficient processes.
  • Our solution identifies recurrent patterns in the system log and provides metrics for prioritization.
Future research
  • Finalize the overall algorithm
  • Experiment with the university log to evaluate the proposed method
    • Is it capable of identifying patterns that are known a priori?
    • Ratio of real problems identified vs. patterns that reflect “normal” processes
    • Sensitivity to the timeframe parameter
  • Experiment with logs from other domains