1 / 28

A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows

A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows. Claudio Di Ciccio and Massimo Mecella. Process Mining. Definition.

dior
Download Presentation

A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows Claudio Di Ciccio and Massimo Mecella Claudio Di Ciccio (cdc@dis.uniroma1.it) IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2013) Wednesday, April the 17th, Singapore

  2. Process Mining Definition A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows • Process Mining [Aalst2011.book], also referred to as Workflow Mining, is the set of techniques that allow the extraction of process descriptions, stemming from a set of recorded real executions (logs). • ProM [AalstEtAl2009] is one of the most used plug-in based software environment for implementing workflow mining (and more) techniques. • www.processmining.org 2

  3. A different context for process mining Artful processes and knowledge workers • Artful processes [HillEtAl06] • informal processes typically carried out by those people whose work is mental rather than physical (managers, professors, researchers, engineers, etc.) • “knowledge workers”[ACTIVE09] • Knowledge workers create artful processes “on the fly” • Though artful processes are frequently repeated, they are not exactly reproducible, even by their originators, nor can they be easily shared • Loosely structured • Highly flexible A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 3

  4. MINERful++ Our mining algorithm A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows • MINERful++ is the workflow discovery algorithm of MailOfMine • Its input is a collection of strings T and an alphabet ΣT • Each stringt is a trace • Each character is an event (enacted task) • The collection represents the log • Its output is a declarative process model • What is a declarative process model? 4

  5. On the modeling of processes The imperative model • Represents the whole process at once • The most used notation is based on a subclass of Petri Nets (namely, the Workflow Nets) A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 5

  6. On the modeling of processes If A is performed, B must be perfomed, no matter before or afterwards (responded existence) The declarative model Whenever B is performed, C must be performed afterwards and B can not be repeated until C is done (alternate response) The notation here is based on [Pesic08,MaggiEtAl11] (ConDec, Declare) A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows • Rather thanusing a procedural language for expressing the allowed sequence of activities, it is based on the description of workflows through the usage of constraints • the idea is that every task can be performed, except those which do not respect such constraints • this technique fits for processes that are highly flexible and subject to changes, such as artful processes 6

  7. Declare constraint templates Constraint templates as Regular Expressions (REs) A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 7

  8. On the modeling of processes Imperative vs. declarative Declarative Declarative models work better in presence of a partial specification of the process scheme Imperative A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 8

  9. A real discovered process model “Spaghetti process” [Aalst2011.book] A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 9

  10. The declarative specification of an artful process (e.g.) Scenario A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows • A project meeting is scheduled • We suppose that a final agenda will be committed (“confirmAgenda”) after that requests for a new proposal (“requestAgenda”), proposals themselves (“proposeAgenda”) and comments (“commentAgenda”) have been circulated. • Shortcuts for tasks (process alphabet): • p (“proposeAgenda”) • r (“requestAgenda”) • c (“commentAgenda”) • n (“confirmAgenda”) 10

  11. The declarative specification of an artful process (e.g.) Constraints on activities • Existence constraints • Participation(n) • Uniqueness(n) • End(n) • Relation constraints • Response(r, p) • RespondedExistence(c, p) • Succession(p, n) • The agenda • must be confirmed, • only once: • it is the last thing to do. • During the compilation: • the proposal follows a request; • if a comment circulates, there has been / will be a proposal; • after the proposal, there will be a confirmation, and there can be no confirmation without a preceding proposal. A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 11

  12. MINERful++ Workflow discovery by constraints inference A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows • MINERful++ is a two-stepalgorithm • Construction of a Knowledge Base • Constraints inference by means of queries evaluated on the KB • This allows the discovery of constraints through a fasterprocedure on data which are smaller in size than the whole input • Returned constraints are weighted with their support • the normalized fraction of cases in which the constraint is verified over the set of input traces 12

  13. MINERful++ by example Workflow discovery by constraints inference A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows • In order to see how it works now • we see a run of MINERful++ over a string, compliant with the previous example:r r p c r p c r c p c n • We start with the construction of the algorithm’s Knowledge Base 13

  14. MINERful++ by example Building the “ownplay” of p and n r r p c r p c r c p c n r r p c r p c r c p c n A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows • p • p occurred 3 times in 1 stringγp(3) = 1 • For each m ≠ 3γp(m) = 0 • p did not occur as the first nor as the last charactergi(p) = 0gl(p) = 0 • n • γn(1) = 1 • For each m ≠ 1, γn(m) = 0 • n occurred as the last character in 1 stringgi(n) = 0gl(n) = 1 14

  15. MINERful++ by example Building the “interplay” of p and n • With respect to the occurrence of p, n occurred… • Never before: 3 timesδp,n(-∞) = 3 • 2 char’s after: 1 timeδp,n(2) = 1 • 6 char’s after: 1 timeδp,n(6) = 1 • 9 char’s after: 1 timeδp,n(9) = 1 • Repetitions in-between: • onwards: 2 timesb→p,n = 2 • backwards: neverb←p,n = 0 • Looking at the string • r r p c r p c r c p c n • r r p c r p c r c p c n • r r p c r p c r c p c n • r r p c r p c r c p c n • r r p c r p c r c p c n • r r p c r p c r c p c n A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 15

  16. MINERful++ by example Building the “interplay” of r and p b→r,p = 1 b←r,p = 0 r rp c r p c r c p c n A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 16

  17. MINERful++ Computing the support of constraints A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows • Let Gr and Gp be the number of times in which r and p respectively appear in the log we have, e.g., • support for Response(r, p) • hint: how many times p was not read in the traces after r occurred?In those cases, Response(r, p) does not hold • support for Succession(r, p) • how many times p was not read in the traces after r occurred, nor r was read before p occurred?In those cases, Succession(r, p) does not hold • … 17

  18. MINERful++ Computing the support of constraints A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 18

  19. MINERful++ by example Computing the support of constraints • r r p c r p c r c p c n • r c c r p p c p c r c p p n • c c c c c r r p c r r p r r p r p p p n • c p r p n • r p p n • r r r c r c p n • r p c n • c r p r r c c p p n • p c c n • c r r c c c r r p c p c c p n • Support for… • Response(r, p)1.0 • Succession(r, p)0.96364 • Precedence(c, n)0.9 • The support can be used to prune out those constraints falling under a given threshold • E.g., 0.95 • A thresholdequal to 1.0 selects those constraints which are always validon the log A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 19

  20. Relation constraint templates subsumption Constraint templates are not independent of each other A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows • E.g., • A trace likea b a b c a b c csatisfies (w.r.t. b and a): • RespondedExistence(a, b), RespondedExistence(b, a),CoExistence(a, b), CoExistence(b, a), Response(a, b), AlternateResponse(a, b), ChainResponse(a, b), Precedence(a, b), AlternatePrecedence(a, b), ChainPrecedence(a, b),Succession(a, b), AlternateSuccession(a, b),ChainSuccession(a, b) • The mining algorithm would have to show the most strict constraint only (ChainSuccession(a, b)) • MINERful++faces this issue, by pruning the returned constraints on the basis of the subsumption hierarchy of constraints 20

  21. Relation constraint templates subsumption Constraint templates are not independent of each other A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 21

  22. Relation constraint templates subsumption A hint on the pruning procedure ✖ ✖ 1.0 1.0 ✖ ✖ ✖ 1.0 1.0 1.0 ✖ ✖ ✖ 1.0 1.0 1.0 ✖ ✖ 1.0 1.0 1.0 A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 22

  23. Relation constraint templates subsumption A hint on the pruning procedure Support ✖ ✖ 0.9 0.9 ✖ ✖ 0.9 0.9 0.9 ? ✖ ✖ ? ✖ 0.9 0.7 0.8 ✖ ? ✖ ? 0.9 0.7 0.8 A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 23

  24. Evaluation Characteristics of MINERful++ Sony VAIO VGN-FE11H Intel Core Duo T2300 1.66 GHz 2 MB L2 cache 2 GB of DDR2 RAM @ 667 Mhz A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows • MINERful++ is • Independent on the formalism used for expressing constraints • Modular (two-phase) • Capable of eliminating redundancy in the process model • Fast 24

  25. Evaluation On the complexity of MINERful++ • Linear w.r.t. the number of traces in the log|T| • Quadratic w.r.t. the size of traces in the log|tmax| • Quadratic w.r.t. the size of the alphabet|ΣT| • Hence, polynomial in the size of the inputO(|T|·|tmax |2·|ΣT |2) A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 25

  26. Preliminary results MINERful++ on real data • Logs extracted from 2 mailboxes • [DiCiccioMecella2012] • 5 traces • 34.75 events each on average • 139 events read in total • Evaluation of inferred constraints conducted with an expert domain • Precision ≈ 0.794 A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows 26

  27. Future work Research in progress A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows • Integrate MINERful++ with ProM • MINERful++ is already capable of reading/writing XES logs • Study the effects of errors in logs on the inferred workflow • Error-injected synthetic logs can help us conduct an automated analysis on the quality of results • Refine the estimation calculi for support • Auto-tune the support threshold • Depending on the constraint template, which can be more or less “robust” • Enlarge the set of discovered constraint templates to the branching Declare constraints 27

  28. References Cited articles and resources, in order of appearance A Two-Step Fast Algorithm for the Automated Discovery of Declarative Workflows • [Aalst2011.book] van der Aalst, W.M.P.: Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer (2011). • [AalstEtAl2009] van der Aalst, W.M.P., van Dongen, B.F., Güther, C.W., Rozinat, A., Verbeek, E., Weijters, T.: Prom: The process mining toolkit. In de Medeiros, A.K.A., Weber, B., eds.: BPM (Demos). Volume 489 of CEUR Workshop Proceedings., CEUR-WS.org (2009) • [HillEtAl06] Hill, C., Yates, R., Jones, C., Kogan, S.L.: Beyond predictable workflows: Enhancing productivity in artful business processes. IBM Systems Journal 45(4), 663–682 (2006) • [ACTIVE09] Warren, P., Kings, N., et al.: Improving knowledge worker productivity - the active integrated approach. BT Technology Journal 26(2), 165–176 (2009) • [Pesic08] Pesic, M.: Constraint-based Workflow Management Systems: Shifting Control to Users. Ph.D. Thesis. Technische Universiteit Eindhoven, 2008. • [MaggiEtAl11] Maggi, F.M., Mooij, A.J., van der Aalst, W.M.P.: User-guided discovery of declar- ative process models. In: CIDM, IEEE (2011) 192–199 • [DiCiccioEtAl2012] Di Ciccio, C., Mecella, M., Scannapieco, M., Zardetto, D., Catarci, T.: MailOfMine – analyzing mail messages for mining artful col- laborative processes. In: Data-Driven Process Discovery and Analysis. Springer, 55-81 (2012). 28

More Related