DataWarp: Making Progress Despite Inconsistent Data Stephen Crouch Peter Henderson Robert John Walters School of Electronics and Computer Science, University.


1 DataWarp: Making Progress Despite Inconsistent Data Stephen Crouch Peter Henderson Robert John Walters School of Electronics and Computer Science, University of Southampton, UK

2 Outline
- Background
- Traditional Philosophy
- DataWarp
- Example
- Conclusion

3 Modern Systems:
- No longer exist in private environments
- Are connected to each other
- Use data which
  - Is (at least partially) replicated
  - Can be out of date
  - Contains errors
  - They don’t own

4 Traditional approach
- “Everything must have a correct value”
- We must drive out the imperfections
- Implement systems to make sure data remains consistent
- Don’t do anything unless sure it is right

5 Examples
- Transactions
  - Elaborate schemes which ensure data remains consistent
- Compensations
  - Less elaborate and restrictive
  - Relax some restrictions of transactions but expose intermediate states

6 Single Datum World
- Transactional systems never leave the leftmost column
- Compensation systems can, but
  - Temporarily
  - Make sure they know how to get back

7 But we can never achieve full consistency
- “Inconsistencies” which are deliberate
- Different notions of consistency
- Ownership
- Cost
- The accumulated body of data is too big

8 DataWarp, an alternative
- We can’t “fix” the data so:
  - We have to “fix” the applications
- DataWarp
  - Can’t give up when inconsistency found
  - Do the best you can with what you have
  - Be prepared to make corrections
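The last three points can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Reading` type, the "most recent replica wins" policy, and the numeric correction are not taken from the slides, which do not say how a DataWarp application chooses or corrects values.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    value: float
    timestamp: int  # larger means more recent (hypothetical convention)

def best_available(readings):
    """Pick the most recent replica value instead of failing on disagreement."""
    if not readings:
        raise ValueError("no data at all")
    return max(readings, key=lambda r: r.timestamp)

def correction(used, later):
    """Adjustment to apply once a newer value supersedes the one already used."""
    return later.value - used.value

# Three replicas disagree; carry on with the freshest one.
replicas = [Reading(10.0, 1), Reading(12.0, 3), Reading(11.0, 2)]
used = best_available(replicas)
# Later a newer value arrives, so the earlier result is corrected.
delta = correction(used, Reading(12.5, 4))
```

The point of the sketch is only the shape of the behaviour: the application proceeds on disagreeing data and keeps enough information to adjust its result later.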

9 Single Datum World DataWarp: Accepts being in leftmost column is unlikely

10 Grid Scheduling Example
- Classical approach to any workflow
  - Find and execute the first task
  - Wait for it to complete
  - Execute the next task
  - …
- Works, but time wasted waiting

11 Example Workflow as Text

    Data DI              # Input
    Data DJK             # Output J or K
    Data DA,DB,DC,DH     # Other Output
    Job A,B,C,H,J,K      # Tasks

    A.submitJob(DI)
    A.waitFor()
    DA = A.getResults()
    parallel {
        B.submitJob(DA)
        B.waitFor()
        DB = B.getResults()
    } and {
        H.submitJob(DA)
        H.waitFor()
        DH = H.getResults()
        if ( some_predicate(DH) ) {
            J.submitJob(DH)
            J.waitFor()
            DJK = J.getResults()
        } else {
            K.submitJob(DH)
            K.waitFor()
            DJK = K.getResults()
        }
    }
    C.submitJob(DB, DJK)
    C.waitFor()
    DC = C.getResults()
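The same workflow can be sketched in Python, assuming each grid job can be modelled as an ordinary function call. `run_job` is a hypothetical stand-in for the submitJob/waitFor/getResults sequence, and `some_predicate` is a placeholder because the slides do not define it.

```python
from concurrent.futures import ThreadPoolExecutor

def run_job(name, *inputs):
    """Hypothetical stand-in for submitJob/waitFor/getResults: returns a label."""
    return f"{name}({','.join(inputs)})"

def some_predicate(dh):
    """Placeholder decision (not defined in the slides); here it always picks J."""
    return dh.startswith("H")

def branch_h(da):
    # H runs first, then exactly one of J or K depending on H's output.
    dh = run_job("H", da)
    return run_job("J", dh) if some_predicate(dh) else run_job("K", dh)

def run_workflow(di):
    da = run_job("A", di)
    with ThreadPoolExecutor(max_workers=2) as pool:
        fb = pool.submit(run_job, "B", da)   # B branch
        fjk = pool.submit(branch_h, da)      # H, then J or K
        db, djk = fb.result(), fjk.result()  # C needs both outputs
    return run_job("C", db, djk)

dc = run_workflow("DI")
```

The returned label records which jobs fed into C, which makes the dependency structure easy to inspect.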

12 Example Workflow as Diagram

13 Notice
- Both B and H can start as soon as A completes and can run at the same time
- Whether we do J or K depends on result of H
- C needs output from B and J or K
- Processing time for each job includes waiting in the queue

14 Execution times:

15 Optimisations 1
- Anticipation
  - Put jobs in the queue so they come to the head of the queue just as we have the data to execute them
  - Run more than one job at a time
- Users do this manually
  - Jobs put in slow moving queues ready for when needed
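The timing idea behind anticipation can be sketched as follows. The function names and the simple model (a fixed queue delay per job, times as plain numbers) are assumptions made for illustration and do not come from the slides.

```python
def placeholder_submit_time(data_ready_at, predicted_queue_delay):
    """When to enqueue a placeholder so it reaches the queue head as its input arrives."""
    return max(0, data_ready_at - predicted_queue_delay)

def start_time(submit_at, actual_queue_delay, data_ready_at):
    # A job starts once it is at the head of the queue AND its input exists.
    return max(submit_at + actual_queue_delay, data_ready_at)

# Hypothetical numbers: input ready at t=30, queue delay of 12.
anticipated = start_time(placeholder_submit_time(30, 12), 12, 30)
naive = start_time(30, 12, 30)  # submit only once the data exists
```

With these made-up numbers the anticipated job starts as soon as its data is ready, while the naive submission pays the full queue delay afterwards.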

16 The Schedule
Process | Execution Time | Delay for placeholder job
A70 B207 C4327 H57 J1112 K8

17 Features
- Start B, H together
- Sequentially C finishes at 116
- By running B in parallel with H,J,K this improves to 88
- Anticipating need for jobs this is improved to 76
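The gain from running B alongside H and J or K can be sketched as a small calculation. The durations below are hypothetical and are not the figures behind the 116, 88 and 76 quoted on the slide; each is assumed to include queue waiting.

```python
def sequential_finish(t, branch):
    # Jobs run one after another: A, B, H, then J or K, then C.
    return t["A"] + t["B"] + t["H"] + t[branch] + t["C"]

def parallel_finish(t, branch):
    # B overlaps with H followed by J or K; C waits for the slower side.
    return t["A"] + max(t["B"], t["H"] + t[branch]) + t["C"]

# Hypothetical per-job times, chosen only to make the arithmetic easy to follow.
times = {"A": 10, "B": 30, "H": 8, "J": 15, "K": 12, "C": 20}
seq = sequential_finish(times, "J")
par = parallel_finish(times, "J")
```

In this model the saving is the length of the shorter of the two overlapping paths, here H followed by J.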

18 Optimisations 2
- Suppose queue prediction is too pessimistic
  - Jobs for J,K arrive at head of queue while H still working
  - Start both
  - Abandon one when H completes
- Suppose H fails/still working when B finishes and C ready
  - Pick output from one of J,K
  - Complete the workflow
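Both cases can be sketched with Python's `asyncio`, under several assumptions that are not in the slides: jobs are simulated with sleeps, H's result is simply the name of the branch to keep, `deadline` stands for the moment B has finished and C is ready, and the fallback choice of J is arbitrary. The sketch also leaves open what input J and K run on before H has finished.

```python
import asyncio

async def job(name, seconds, result):
    """Hypothetical job: takes `seconds` to run, then returns `result`."""
    await asyncio.sleep(seconds)
    return result

async def speculate(h_seconds, h_choice, deadline):
    h = asyncio.create_task(job("H", h_seconds, h_choice))
    # Start both candidate branches before H's verdict is known.
    branches = {
        "J": asyncio.create_task(job("J", 0.05, "DJK from J")),
        "K": asyncio.create_task(job("K", 0.05, "DJK from K")),
    }
    try:
        choice = await asyncio.wait_for(h, timeout=deadline)
    except asyncio.TimeoutError:
        choice = "J"  # H is late: carry on with an arbitrarily picked branch
    for name, task in branches.items():
        if name != choice:
            task.cancel()  # abandon the branch that is not needed
    return await branches[choice]

on_time = asyncio.run(speculate(0.01, "K", 0.5))  # H answers in time
late = asyncio.run(speculate(0.5, "K", 0.01))     # H misses the deadline
```

In the late case the workflow completes with J's output even though H would have chosen K, so the result is provisional and may need correcting once H's answer is known.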

19 Conclusion
- Applications have to manage in connected environment
- Insisting on complete, consistent data is no longer acceptable
- DataWarp applications can live with uncertain data
- They continue where others fail
