1 / 33

Process-oriented System Analysis

Process-oriented System Analysis. Process Mining. BPM Lifecycle. Motivation. Up until now : Designed or pre-defined models Assumption that they are appropriate Process Mining Consideration of information from the execution of proceses This is covered in log data Logs

akiva
Download Presentation

Process-oriented System Analysis

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Process-orientedSystem Analysis Process Mining

  2. BPM Lifecycle

  3. Motivation • Upuntilnow: • Designedorpre-definedmodels • Assumptionthattheyareappropriate • Process Mining • Considerationofinformationfromtheexecutionofproceses • This iscovered in log data • Logs • Sequenceof log entries, whichcaptureevents in a companythatrelatetoprocesses

  4. Log entries • Examples of log entries • Check Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:19:57 • Function StoreCustomerData(„Müller“, c1987, „Bad Bentheim“) completed on 12.11.2010 at 9:22:24 • Send Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:23:18 • Function ContactCustomer(c1987, PromoMailing) completed on 12.11.2010 at 9:24:10 • Function StoreCustomerData(„Miller“, c1988, „Osnabrück“) completed on 12.11.2010 at 9:26:08 • Check Invoice for Invoice No. 4568 completed on 12.11.2010 at 9:26:38 • Function ContactCustomer(c1988, PromoMailing) completed on 12.11.2010 at Send 9:27:32

  5. Logs bearvaluableinformation Logs bear valuable information to answer questions like • When and how many process instances have been executed? • Are there recurring patterns in the execution of activities? • Can process models be derived from the data? • Which paths of execution are used how often in the process models? • Are there paths which are never taken?

  6. Process Discovery • Process Discovery is a technique for deriving a process model from log data • Input: execution logs as ordered lists of activities with time stamp and case id • Output: process model which could have generated the execution logs • The caseidisoften not directlycovered in thedata, andneedstobegenerated in pre-processing

  7. Process Conformance • Process Conformance is a technique to analyze the relationship between log data and process models • Input: Logs and process model • Output: information on the relationship, e.g. fitness

  8. Overview

  9. Execution Logs • Assumption • Execution log definescompleteorderofevents, whichcan all berelatedtoprocessactivities • All events in theexecution log relatetoprocessinstancesoftheconsideredprocess • Hint • Often log entriesreferto different processmodels • This warrantsfilteringactivities • Abstraction • Techniquesoftenwork on abstractionoflogs • Focus on caseidandactivities

  10. Execution Log Format • Log format • (caseID, activity) • Example • Check Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:19:57 • Function StoreCustomerData(„Müller“, c1987, „Bad Bentheim“) completed on 12.11.2010 at 9:22:24 • Send Invoice for Invoice No. 4567 completed on 12.11.2010 at 9:23:18 • Resulting Log • (4567, Check Invoice), (c1987, StoreCustomerData), (4567, Send Invoice), etc.

  11. Execution Log • Further abstraction • A‘sandB‘s • (caseid, taskid) • Additional information • Event type, time, resource, data • Not consideredhere • Assumption • Activityexecutioncapturedbyoneevent • No intermediate activities case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D

  12. The Alpha Algorithm

  13. Process Discovery Algorithms • SimplestAlgorithm: The α – Algorithm • Relatively simple, somepropertiescanbeproofed • AffectedbyNoise, therefore not firstchoice in practice • Noise referstoincompleteorerroneouslogs • Furthermore, the α+(+) – Algorithms • α+ and α++ areextensionstothe α – Algorithmforrecognizingmorefine-granular structure in theprocess model • Also affectedbyNoise • Finally, techniquesfordealingwithNoise

  14. Definitions • LetTbe a setofactivities (Tasks) andT *thesetof all sequencesofarbitrarylengthover T, thenwehave: • σ T * iscalledexecutionsequence, if all activitiesin σ belongtothe same processinstance • W T * iscalledexecution log (workflow log) • Assumptions • In eachprocessmodel, eachactivityappearsatmostonce • Eachdirectneighborrelationbetweenactivitiesisrepresentedat least once

  15. Execution Logs case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D

  16. Execution Logs Execution sequences: Case 1: ABCD Case 2: ACBD Case 3: ABCD Case 4: ACBD Case 5: EF Resultingworkflow log: W = {ABCD, ACBD, EF} case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D

  17. Order relations Log based order relationsforpairs of activitiesa, b  T in a workflow log W: • Directsuccessora>w bi.e. in an executionsequencebdirectlyfollows a • Causalityaw bi.e. a >w band not b >w a • Concurrencya║w bi.e.a >w bandb >w a • Exclusivenessaw bi.e. not a >w band notb >w a • Activitypairswhichneversucceedeachother

  18. Execution log analysis case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D • W = {ABCD, ACBD, EF} • Directsuccessor • Causality • Concurrency

  19. Execution log analysis • W = {ABCD, ACBD, EF} • Directsuccessor • Causality • Concurrency case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D 1) A>B A>C B>C B>D C>B C>D E>F 2) AB AC BD CD EF 3) B||C C||B

  20. α-Algorithm • The ideaistoutilize order relationsforderiving a workflownetthatiscompliantwiththeserelations • Precisely, each order relationresults in a petrinetfragment, whichimposestherespectiverelationship

  21. α-Algorithm • Idea (a)  a b

  22. α-Algorithm • Idea (b)  a b, a c andb #c

  23. α-Algorithm • Idea (c)  b d, c d andb #c

  24. α-Algorithm • Idea (d)  a b, a c andb || c

  25. α-Algorithm • Idea (e)  b d, c d andb || c

  26. The Alpha-Algorithm (simplified) • 1. Identify the set of all tasks in the log as TL. • 2. Identify the set of all tasks that have been observed as the first task in some case as TI. • 3. Identify the set of all tasks that have been observed as the last task in some case as TO. • 4. Identify the set of all connections to be potentially represented in the process model as a set XL. Add the following elements to XL: • a. Pattern (a): all pairs for which hold a→b. • b. Pattern (b): all triples for which hold a→(b#c). • c. Pattern (c): all triples for which hold (b#c)→d. • Note that triples for which Pattern (d) a→(b||c) or Pattern (e) (b||c)→d hold are not included in XL.

  27. The Alpha-Algorithm (cont.) • 5. Construct the set YL as a subset of XL by: • a. Eliminating a→band a→cif there exists some a→(b#c). • b. Eliminating b→cand b→dif there exists some (b#c)→d. • 6. Connect start and end events in the following way: • a. If there are multiple tasks in the set TI of first tasks, then draw a start event leading to an XOR-split, which connects to every task in TI. Otherwise, directly connect the start event with the only first task. • b. For each task in the set TO of last tasks, add an end event and draw an arc from the task to the end event.

  28. The Alpha-Algorithm (cont.) • 7. Construct the flow arcs in the following way: • a. Pattern (a): For each a→bin YL, draw an arc a to b. • b. Pattern (b): For each a→(b#c) in YL, draw an arc from a to an XOR-split, and from there to b and c. • c. Pattern (c): For each (b#c)→d in YL, draw an arc from b and c to an XOR-join, and from there to d. • d. Pattern (d) and (e): If a task in the so constructed process model has multiple incoming or multiple outgoing arcs, bundle these arcs with an AND-split or AND-join, respectively. • 8. Return the newly constructed process model.

  29. α-AlgorithmExample case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D

  30. α-AlgorithmExample α-Algorithm case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D a(W):

  31. Log Completeness • Level ofcompletenessrequiredfor a log • Assumefortheexecutionsequence EF, thereis a log missing • Then, thecorrectprocessmodelcannotbederived • Basic assumption: eachexecutionsequence must bepartofthe log • Consequence: thecompletebehaviourisvisible • Problem: amountofrequiredinstancesgrowsdramatically • Example: • 10 activitiesareexecuted in parallel • Amountof potential executionsequences:10! = 3.628.800

  32. Log Completeness • Result • Fortheα-Algorithmitissufficienttohavecompleteness in termsofthesuccessorrelationship (>w) • Reason • All otherrelationsarederivedfromdirectsuccessorship • Interpretation • Each time twoactivitiesmaysucceedeachother, this must bevisible in at least oneexecutionsequence • Hint • In caseofhighlyconcurrentprocessmodels, thisreducestheamountofrequiredexecutionsequencesdramatically

  33. Summary • Execution Logs • Process Mining usingthe Alpha-Algorithm

More Related