Using Provenance to Support Real-Time Collaborative Design of Workflows
Workflow evolution provenance and OPM
Tommy Ellkvist and Juliana Freire
Workflow Evolution
[Figure labels: Version Tree, Workflows, Data Products]
Action-based representation of workflows
• Nodes represent workflows
• Edges represent actions
• Actions are transformations on workflows
• Actions are performed by users
[Figure: version tree with edges Add Module(0), Add Module(1), Add Connection(0,1)]
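The action-based representation can be sketched roughly as follows. This is a minimal illustration, not the Vistrails implementation: the class names `Workflow`, `Action`, and `VersionTree`, the method names, and the user names are all hypothetical. Each version stores only the action on the edge from its parent, and a workflow is rebuilt by replaying the actions from the root.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """A workflow reduced to module ids and (source, target) connections."""
    modules: set = field(default_factory=set)
    connections: set = field(default_factory=set)


@dataclass(frozen=True)
class Action:
    """A transformation on a workflow, performed by a user (hypothetical model)."""
    kind: str    # "add_module" or "add_connection"
    args: tuple
    user: str

    def apply(self, wf: Workflow) -> None:
        if self.kind == "add_module":
            wf.modules.add(self.args[0])
        elif self.kind == "add_connection":
            src, dst = self.args
            if src not in wf.modules or dst not in wf.modules:
                raise ValueError("connection refers to a missing module")
            wf.connections.add((src, dst))
        else:
            raise ValueError(f"unknown action kind: {self.kind}")


class VersionTree:
    """Nodes are workflow versions; each edge carries one action."""

    def __init__(self):
        self.parent = {0: None}  # version id -> parent id (0 is the empty root)
        self.action = {}         # version id -> action on the edge from its parent
        self._next_id = 1

    def add_version(self, parent: int, action: Action) -> int:
        if parent not in self.parent:
            raise KeyError(f"unknown parent version: {parent}")
        version = self._next_id
        self._next_id += 1
        self.parent[version] = parent
        self.action[version] = action
        return version

    def materialize(self, version: int) -> Workflow:
        """Rebuild a workflow by replaying the actions from the root."""
        path = []
        while self.parent[version] is not None:
            path.append(self.action[version])
            version = self.parent[version]
        wf = Workflow()
        for action in reversed(path):
            action.apply(wf)
        return wf


# The three actions shown on the slide, attributed to made-up users.
tree = VersionTree()
v1 = tree.add_version(0, Action("add_module", (0,), "alice"))
v2 = tree.add_version(v1, Action("add_module", (1,), "alice"))
v3 = tree.add_version(v2, Action("add_connection", (0, 1), "bob"))
wf = tree.materialize(v3)
```

Because every node keeps a pointer to its parent, branching comes for free: adding a second child to `v1` creates a sibling design without copying the workflow.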
OPM XML schema: Example of OPM (The OPM, 2007)
OPM XML schema: Translated OPM Example
[XML listing omitted]
Vistrails XML Model
Vistrails XML Model: Translated to OPM
<Used ProcessId = "1" Role = "in" ArtifactId = "0" stopTimeBegin = " :35:39" stopTimeEnd = " :35:39">
<WasGeneratedBy ArtifactId = "1" Role = "out" ProcessId = "1" stopTimeBegin = " :35:39" stopTimeEnd = " :35:39">
<WasControlledBy ProcessId = "1" AgentId = "concat.xml" startTimeBegin = " :35:39" startTimeEnd = " :35:39" stopTimeBegin = " :35:39" stopTimeEnd = " :35:39">
[Figure: graph for concat.xml]
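As a rough sketch of how such a translation might be produced, the following builds the three edge elements for one executed module using Python's standard `xml.etree.ElementTree`. The element and attribute names are taken from the slide; the function `opm_edges`, its parameters, and the timestamp strings are hypothetical, and each timestamp is written as a degenerate interval (begin equals end) only because the slide's values appear to do the same.

```python
import xml.etree.ElementTree as ET


def opm_edges(process_id, agent_id, inputs, outputs, start_time, stop_time):
    """Build Used / WasGeneratedBy / WasControlledBy elements for one process.

    All arguments are strings (or lists of strings); this helper and its
    signature are illustrative assumptions, not part of OPM or Vistrails.
    """
    stop = {"stopTimeBegin": stop_time, "stopTimeEnd": stop_time}
    edges = []
    for artifact_id in inputs:
        edges.append(ET.Element("Used", {
            "ProcessId": process_id, "Role": "in",
            "ArtifactId": artifact_id, **stop}))
    for artifact_id in outputs:
        edges.append(ET.Element("WasGeneratedBy", {
            "ArtifactId": artifact_id, "Role": "out",
            "ProcessId": process_id, **stop}))
    edges.append(ET.Element("WasControlledBy", {
        "ProcessId": process_id, "AgentId": agent_id,
        "startTimeBegin": start_time, "startTimeEnd": start_time, **stop}))
    return edges


# Placeholder timestamps; the real values are truncated on the slide.
edges = opm_edges("1", "concat.xml", inputs=["0"], outputs=["1"],
                  start_time="T_START", stop_time="T_STOP")
for edge in edges:
    print(ET.tostring(edge, encoding="unicode"))
```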
Observations
• General model
  – Only contains enough information to traverse the provenance graph
  – No additional information stored
• Different ways of representing workflow design provenance
  – Edges as actions
  – Edges as version differences
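To make the second alternative concrete, here is a minimal sketch of "edges as version differences". It assumes a workflow can be flattened into a set of elements such as `("module", 0)` or `("connection", 0, 1)`; that encoding and the function names `diff` and `patch` are illustrative choices, not something the slides prescribe.

```python
def diff(parent: frozenset, child: frozenset) -> tuple:
    """Edge label as a version difference: (elements added, elements removed)."""
    return child - parent, parent - child


def patch(parent: frozenset, edge: tuple) -> frozenset:
    """Apply a version-difference edge to a parent to recover the child."""
    added, removed = edge
    return (parent - removed) | added


# Two consecutive versions of a small workflow (hypothetical contents).
version_a = frozenset({("module", 0), ("module", 1)})
version_b = frozenset({("module", 0), ("module", 2), ("connection", 0, 2)})

edge = diff(version_a, version_b)
recovered = patch(version_a, edge)
```

A difference records only the net change between two versions. Unlike an action sequence, it does not retain the order of the individual edits or who performed each one, which is one way the two representations differ in what they can answer.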
Observations
• What is the time?
  – How to interpret a time T of a process?
  – Does interpretation affect querying?
  – Semantics of intervals
• Who is the Agent?
  – Users
  – Workflow system
  – The session
  – Workflow specification
• "OPM Level 2"?
  – Are there workflow specifics we want to express?
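One way to see why the interpretation of time affects querying is to read each Begin/End pair as bounds on an unknown instant: the event happened no earlier than Begin and no later than End. Under that assumed reading, which is only one possible semantics, an ordering query can have three answers. The `Interval` class and `happened_before` function below are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interval:
    """Bounds on an unknown instant: begin <= true time <= end."""
    begin: float
    end: float

    def __post_init__(self):
        if self.begin > self.end:
            raise ValueError("interval begin must not exceed end")


def happened_before(a: Interval, b: Interval) -> Optional[bool]:
    """Did the instant bounded by `a` occur strictly before the one bounded by `b`?

    Returns True or False when the bounds decide the answer,
    and None when the intervals overlap and the order is unknown.
    """
    if a.end < b.begin:
        return True      # every possible instant in a precedes every one in b
    if a.begin >= b.end:
        return False     # no possible instant in a precedes any instant in b
    return None          # overlapping bounds: the order cannot be determined


certain = happened_before(Interval(0, 1), Interval(2, 3))
impossible = happened_before(Interval(2, 3), Interval(0, 1))
unknown = happened_before(Interval(0, 2), Interval(1, 3))
```

A query engine that treats the `None` case as false gives different results from one that treats it as "possibly true", so the chosen semantics needs to be stated alongside the data.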
Interoperability