RWS Provenance Experiments in Kepler (Kepler + PR + RWS)


1 Kepler+PF+RWS, Kepler+PF+RWS, Podhorszki, Altintas et al. Provenance Challenge @ GGF18
RWS Provenance Experiments in Kepler (Kepler + PR + RWS)
Norbert Podhorszki, Ilkay Altintas, Bertram Ludaescher
in collaboration with Shawn Bowers, Timothy McPhillips

2 Initial Provenance Framework (IPAW’06, Altintas et al.)
Vision:
– Modeled as a separate concern in the system
  Optional drag-and-drop feature
– Listen to execution and save information (customizable):
  Context: the who, what, where, when, and why associated with the run
  Input data and its associated metadata
  Workflow outputs and intermediate data products
  Workflow definition (entities, parameters, connections): a specification of what exists in the workflow, which can have a context of its own
  Information about the workflow evolution (workflow trail)

3 Kepler System Architecture (IPAW’06, Altintas et al.)
[Architecture diagram. Labels recovered from the slide: Authentication; GUI; Vergil; SMS; Kepler Core Extensions; Ptolemy; Kepler GUI Extensions; Actor&Data SEARCH; Type System Ext; Provenance Recorder; Kepler Object Manager; Documentation; Smart Re-run / Failure Recovery]

4 Kepler Provenance Recorder (IPAW’06, Altintas et al.)
Parametric and customizable
– Different report formats
– Variable levels of verbosity: all, some, medium, on error
– Multiple cache destinations
Saves information on
– User name, date, run, etc.

5 Read-Write-ReSet Model (IPAW’06, McPhillips et al.)
Event sequence of an actor: r, r, …, r, w, w, …, w, r, …, r, w, …, w, … (firing)
What about actor state? What about “real” dependencies?
The reset event s defines when the actor “cuts off” dependencies
– a semantic notion, known to the actor [developer] (or part of a higher-order scheme)
With reset: r, r, …, r, w, w, …, w, [s!] r, …, r, w, …, w, …
[Diagram residue: A3 r … rw…w [s!] PS ???]
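The effect of the reset event can be illustrated with a minimal Python sketch. It assumes one reading of the model: a written token depends on every token the same actor has read since its most recent state-reset. The function name `rws_dependencies`, the tuple layout, and the token ids are hypothetical and are not taken from the Kepler code.

```python
from collections import defaultdict

def rws_dependencies(events):
    """Derive token dependencies from an ordered RWS event sequence.

    events: list of (actor, kind, token), kind is "r", "w" or "s".
    Assumption: a written token depends on all tokens the actor has
    read since its last state-reset (or since the start of the run).
    """
    reads_since_reset = defaultdict(set)   # actor -> tokens read so far
    deps = {}                              # written token -> set of read tokens
    for actor, kind, token in events:
        if kind == "r":
            reads_since_reset[actor].add(token)
        elif kind == "w":
            deps[token] = set(reads_since_reset[actor])
        elif kind == "s":
            reads_since_reset[actor].clear()   # the actor "cuts off" dependencies
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return deps

# Hypothetical trace of one actor: r, r, w, [s!], r, w
events = [
    ("A", "r", "t1"), ("A", "r", "t2"), ("A", "w", "t3"),
    ("A", "s", None),
    ("A", "r", "t4"), ("A", "w", "t5"),
]
deps = rws_dependencies(events)
no_reset = rws_dependencies([e for e in events if e[1] != "s"])
```

With the reset, `t5` depends only on `t4`; if the reset is dropped, `t5` also inherits `t1` and `t2`, which is the over-approximation the reset event is meant to prevent.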

6 Goals of the PR+RWS Experiments
Use the RWS model for Kepler workflows
– both single-level and nested workflows (fun starts here :-)
Extend the Kepler Provenance Recorder
– Modify the methods of the provenance listener class
– Classes to store execution data about the workflow
  To generate the send-receive relations of the tokens correctly
  To count actor firings correctly
Disclaimer: initially only one workflow run is targeted
– (but the approach can handle multiple actor firings due to pipeline parallelism)
– future: queries over several runs and workflow provenance
– (others in Kepler are already doing this → merge efforts in the future)

7 Implementation: Data Model
Port-actor relationship
– portTable(Port, Actor, type)
  type is r (real) or v (virtual, i.e. transparent)
Token-object relationship
– tokenTable(Token, Object)
Object-value relationship
– objectTable(Object, Value, Type)
  Type is currently not recorded
RWS trace
– traceTable(Port, Event, Token, FiringCounter)
  Event is r (read), w (write), or s (state-reset)
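As a rough illustration of how the four tables fit together, the sketch below models each table row as a Python named tuple and joins them to list the values an actor wrote. The row classes, field names, port names, and ids are hypothetical; the slides give only the table signatures, not the storage format.

```python
from typing import NamedTuple, Optional

class PortRow(NamedTuple):       # portTable(Port, Actor, type)
    port: str
    actor: str
    port_type: str               # "r" = real, "v" = virtual (transparent)

class TokenRow(NamedTuple):      # tokenTable(Token, Object)
    token: str
    obj: str

class ObjectRow(NamedTuple):     # objectTable(Object, Value, Type)
    obj: str
    value: str
    value_type: Optional[str] = None   # not recorded, per the slide

class TraceRow(NamedTuple):      # traceTable(Port, Event, Token, FiringCounter)
    port: str
    event: str                   # "r", "w" or "s"
    token: str
    firing_counter: int

def values_written_by(actor, ports, tokens, objects, trace):
    """Values of all tokens written on ports of `actor`, in trace order."""
    actor_ports = {p.port for p in ports if p.actor == actor}
    obj_of = {t.token: t.obj for t in tokens}
    value_of = {o.obj: o.value for o in objects}
    return [value_of[obj_of[row.token]]
            for row in trace
            if row.event == "w" and row.port in actor_ports]

# Hypothetical rows for one AlignWarp -> Reslice hand-over
ports = [PortRow(".pc.AlignWarp1.output", ".pc.AlignWarp1", "r"),
         PortRow(".pc.Reslice1.input", ".pc.Reslice1", "r")]
tokens = [TokenRow("t1", "o1")]
objects = [ObjectRow("o1", "warp1.warp")]
trace = [TraceRow(".pc.AlignWarp1.output", "w", "t1", 1),
         TraceRow(".pc.Reslice1.input", "r", "t1", 1)]

written = values_written_by(".pc.AlignWarp1", ports, tokens, objects, trace)
```

Keeping tokens, objects, and values in separate tables means the same object can travel under several token ids without duplicating its value.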

8 Implementation: Class Hierarchy
Extends the existing provenance execution listener with
– Methods
– More event listeners
– Supporting classes
RWSPortInfo, RWSActorInfo
– Data structures for building and containing info about the workflow (and counters for event records)
RWSEvent
– Handles RWS events

9 Execution: Initialization
Initialization phase: initialize()
– Generate RWS portMap: RWSPortInfo for each port (first the info locally known at a port, then build the connection info)
– Generate RWS actorMap: RWSActorInfo for each actor
– Record static workflow info (portTable)
– Create new RWS event list

10 Execution: Event Handling and Modifications
Just before run: validate()
– Called before the model is executed
– Subscribe to token listeners: TokenSend, TokenGet
When the workflow is modified: changeExecuted()
– Called when something is changed in the workflow
– Re-generate RWS portMap
The event handling methods are extended here.

11 Execution: During the workflow run
When a token event occurs:
TokenSendEvent()
– New RWS event w
– Print sent token’s info (token id, object id, value)
– Generate a virtual TokenGet event for each connected transparent port
TokenGetEvent()
– New RWS event r
– Generate a virtual TokenSend event if it is a transparent port
Output: tokenTable, traceTable, objectTable
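The send/get handling can be sketched in Python as follows. This is an illustration of the idea only, not the Kepler listener code: the class `RWSRecorderSketch`, its method names, and the port names are hypothetical, firing counters are left out, and the sketch assumes that a send reaching a transparent port is recorded there as a virtual read followed by a virtual write.

```python
class RWSRecorderSketch:
    """Records w/r events and propagates them through transparent ports."""

    def __init__(self, connections, transparent_ports):
        self.connections = connections          # port -> list of connected ports
        self.transparent = set(transparent_ports)
        self.trace = []                         # (port, event, token)
        self.tokens = {}                        # token id -> object id
        self.objects = {}                       # object id -> value

    def token_send(self, port, token, obj, value):
        self.trace.append((port, "w", token))   # new RWS event w
        self.tokens[token] = obj                # record the sent token's info
        self.objects[obj] = value
        for dest in self.connections.get(port, ()):
            if dest in self.transparent:        # virtual TokenGet
                self.token_get(dest, token, obj, value)

    def token_get(self, port, token, obj, value):
        self.trace.append((port, "r", token))   # new RWS event r
        if port in self.transparent:            # virtual TokenSend
            self.token_send(port, token, obj, value)

# Hypothetical wiring: A sends into composite S, whose inner actor S.C reads.
rec = RWSRecorderSketch(
    connections={"A.output": ["S.in"], "S.in": ["S.C.input"]},
    transparent_ports=["S.in"],
)
rec.token_send("A.output", "t1", "o1", "1")   # real send by A
rec.token_get("S.C.input", "t1", "o1", "1")   # real get by S.C
```

The transparent port `S.in` never fires on its own, so without the virtual events the trace would jump from `A.output` straight to `S.C.input` and the composite S would be missing from any lineage built on it.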

12 A Kepler Workflow Implementation
RWS TRACE
Table         # of elements   Size in KB
portTable     81              4
tokenTable    30              2
objectTable   30              3
traceTable    86              6

13 Query 1.a
Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed, etc.
Answer a. List of actors that contributed to the result (21 actors). They appear in the reverse of their execution order.
?- q1b_actors('"/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif"', ActorList), print(ActorList).
[.pc.Convert_x,.pc.Slicer_x,.pc.SoftMean,.pc.Reslice3,.pc.Reslice2,.pc.Reslice4,.pc.Reslice1,.pc.AlignWarp3,.pc.RefImg,.pc.RefHdr,.pc.InputHdr3,.pc.InputImg3,.pc.AlignWarp2,.pc.InputHdr2,.pc.InputImg2,.pc.AlignWarp4,.pc.InputHdr4,.pc.InputImg4,.pc.AlignWarp1,.pc.InputImg1,.pc.InputHdr1 ]
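The Prolog rules behind the query are not shown in the slides. As a hedged sketch of the underlying idea, the Python function below walks token dependencies backwards from a target token and collects the contributing actors and tokens, most recent first. The function `lineage`, the two input mappings, and the shortened names are assumptions made for illustration.

```python
def lineage(target, written_by, depends_on):
    """Transitive lineage of `target`.

    written_by: token -> actor that wrote it
    depends_on: token -> set of tokens it directly depends on
    Returns (actors, tokens), both starting from the target and
    moving backwards towards the workflow inputs.
    """
    actors, tokens, seen = [], [], set()
    stack = [target]
    while stack:
        token = stack.pop()
        if token in seen:
            continue
        seen.add(token)
        tokens.append(token)
        actor = written_by.get(token)
        if actor is not None and actor not in actors:
            actors.append(actor)
        # Push parents in reverse so they are visited in sorted order.
        stack.extend(sorted(depends_on.get(token, ()), reverse=True))
    return actors, tokens

# Hypothetical last three stages of the challenge workflow
written_by = {"atlas-x.gif": "Convert", "atlas-x.pgm": "Slicer", "atlas.img": "SoftMean"}
depends_on = {"atlas-x.gif": {"atlas-x.pgm"}, "atlas-x.pgm": {"atlas.img"}}

actors, tokens = lineage("atlas-x.gif", written_by, depends_on)
```

Because the walk starts at the result and moves upstream, the actors come out in reverse execution order, which matches the ordering noted in the answer above.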

14 Query 1.b
Answer b. List of intermediate values created by the workflow (26 values).
?- q1b_values('"/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif"', ValueList), print(ValueList).
["/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage4/atlas-x.pgm",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage3/atlas.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage3/atlas.hdr",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced3.hdr",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced2.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced4.hdr",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced1.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced2.hdr",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced3.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced4.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced1.hdr",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage1/warp3.warp",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/reference.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/reference.hdr",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy3.hdr",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy3.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage1/warp2.warp",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy2.hdr",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy2.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage1/warp4.warp",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy4.hdr",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy4.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage1/warp1.warp",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy1.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy1.hdr" ]

15 Improved PC workflow (cf. COMAD workflow)
RWS TRACE
Table         # of elements   Size in KB
portTable     42              2
tokenTable    51              3
objectTable   39              4
traceTable    150             9
A more generic workflow that accepts any number of images
Smaller number of actors
This affects the number of values, as it requires additional array operations
cf. also the COMAD approach and the Taverna approach (but we fire AlignWarp individually here)

16 Improved PC workflow

17 Query 1
Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed, etc.
Answer a. List of actors that contributed to the result (15 actors). They appear in the reverse of their execution order.
?- q1b_actors('"/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif"', ActorList), print(ActorList).
[.pca.Convert,.pca.Slicer,.pca.hdrrepeat,.pca.seqXYZ,.pca.imgrepeat,.pca.SoftMeanArray,.pca.imgarray,.pca.hdrarray,.pca.Reslice,.pca.AlignWarp,.pca.RefHdr,.pca.InputHdr,.pca.InputImg,.pca.RefImg,.pca.Ramp ]

18 Query 1
Answer b. List of intermediate values created by the workflow (33 values). In addition to the original file names, it includes internal data values (arrays).
?- q1b_values('"/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif"', ValueList), print(ValueList).
[ "/usr/home/pnorbert/Provenance/ProvCh/data/output/atlas-x.gif",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage4/atlas-x.pgm",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage3/atlas.hdr",
"x",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage3/atlas.img",
{ "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced1.img", "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced2.img", "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced3.img", "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced4.img" },
{ "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced1.hdr", "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced2.hdr", "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced3.hdr", "/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced4.hdr" },
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced1.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced2.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced3.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage2/resliced4.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/out-stage1/warp1.warp",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/reference.hdr",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy1.hdr",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/anatomy1.img",
"/usr/home/pnorbert/Provenance/ProvCh/data/input/reference.img",
1, etc...

19 Nested workflow: tricky example
[Workflow diagram; recovered label: S]

20 The trick
Multi-port of Ptolemy
– two distinct channels going into S and out from S
– A’s output is delivered to S.C
– B’s output is delivered to S.D
– S.C’s output is delivered to E
– S.D’s output is delivered to F
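Why the channels matter for lineage can be shown with a small Python sketch. The channel indices 0 and 1, the `CHANNELS` table, and both function names are assumptions for illustration; the slide only states which output is delivered where.

```python
# Assumed wiring of the multi-port of composite S:
# channel index -> (upstream actor, inner actor of S, downstream actor)
CHANNELS = {
    0: ("A", "S.C", "E"),
    1: ("B", "S.D", "F"),
}

def contributors(consumer):
    """Actors on the channel that ends at `consumer`, nearest first."""
    for upstream, inner, downstream in CHANNELS.values():
        if downstream == consumer:
            return [inner, "S", upstream]
    raise KeyError(consumer)

def naive_upstream():
    """Channel-blind view: every actor feeding S appears to reach every consumer."""
    return sorted({upstream for upstream, _, _ in CHANNELS.values()})

to_e = contributors("E")
to_f = contributors("F")
```

If S were treated as a single port without channels, both A and B would appear to contribute to E and to F; tracking the channel keeps the two paths apart.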

21 Lineage of actors and values
Who contributed to the value D.2 that arrived at F?
?- q1('"D.2"', ActorList, ValueList).
ActorList = ['.WF15.S.D', '.WF15.S', '.WF15.B']
ValueList = ['"D.2"', '2', '2']
Who contributed to the value C.1 that arrived at E?
?- q1('"C.1"', ActorList, ValueList).
ActorList = ['.WF15.S.C', '.WF15.S', '.WF15.A']
ValueList = ['"C.1"', '1', '1']

22 Single-level lineage of actors and values
Who contributed to the value D.2 that arrived at F?
?- q1b('"D.2"', ActorList, ValueList).
ActorList = ['.WF15.S', '.WF15.B']
ValueList = ['"D.2"', '2']
Who contributed to the value C.1 that arrived at E?
?- q1b('"C.1"', ActorList, ValueList).
ActorList = ['.WF15.S', '.WF15.A']
ValueList = ['"C.1"', '1']
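How q1b restricts the result to one level is not shown in the slides. One simple way to obtain the same actor lists, sketched here in Python as an assumption, is to post-filter the full lineage and keep only actors that sit directly under the workflow name. The function `single_level` and its default argument are hypothetical, and the sketch covers the actor list only, not the value list.

```python
def single_level(actors, workflow=".WF15"):
    """Keep only actors directly under `workflow`, dropping nested ones."""
    prefix = workflow + "."
    return [a for a in actors
            if a.startswith(prefix) and "." not in a[len(prefix):]]

full_d2 = [".WF15.S.D", ".WF15.S", ".WF15.B"]   # ActorList for D.2 from q1
full_c1 = [".WF15.S.C", ".WF15.S", ".WF15.A"]   # ActorList for C.1 from q1

top_d2 = single_level(full_d2)
top_c1 = single_level(full_c1)
```

The composite S stays in the result while its inner actors S.C and S.D are hidden, so the nested workflow is seen as a black box.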

23 Conclusions
First attempt at combining the Kepler PR and the Kepler RWS provenance model
– Both published at IPAW 2006
Query 1 was successfully answered.
Queries 2 and 3 are answerable, but have not been implemented yet.
Queries on multiple runs and on workflow design provenance are out of the scope of this initial prototype.
– Other groups in Kepler are focusing on this.

24 Some related references
Provenance Framework/Recorder:
– I. Altintas, O. Barney, E. Jaeger-Frank. Provenance Collection Support in the Kepler Scientific Workflow System. IPAW 2006, Chicago, Illinois, May 2006.
RWS Model:
– Shawn Bowers, Timothy McPhillips, Bertram Ludaescher, Shirley Cohen, Susan B. Davidson. A Model for User-Oriented Data Provenance in Pipelined Scientific Workflows. International Provenance and Annotation Workshop (IPAW'06), Chicago, Illinois, USA, May 3-5, 2006.

