How To Address Rapidly Changing Data Representations in an Evolving Scientific Domain Using Aspect-oriented Programming Techniques


1 How To Address Rapidly Changing Data Representations in an Evolving Scientific Domain Using Aspect-oriented Programming Techniques + Overview of Bioinformatics at NEU.
Karl Lieberherr (lieber@ccs.neu.edu), College of Computer and Information Science, Northeastern University, Boston
3/7/2003, Bioinformatics

2 Motivation
• From: Computational Challenges in Structural and Functional Genomics by J. Head-Gordon, IBM Systems Journal, Vol. 40, No. 2, 2001.

3 Some Quotes From Head-Gordon.
• Although warehousing techniques are as vital in the sciences as in business, functional warehouses tailored for specific scientific needs are few and far between.
• A key technical reason for this discrepancy is that our understanding of the concepts being explored in an evolving scientific domain changes constantly, leading to rapid changes in data representation.

4 Some Quotes From Head-Gordon (Refinement).
• … evolving scientific domain changes constantly, leading to rapid changes in data representation.
• Not only data representations change but also interfaces: we need protection against changes in interfaces.
• Examples: additional or modified fields or arguments; additional or modified types.

5 More Quotes From Head-Gordon.
• When the format of source data changes, the warehouse must be updated to read that source or it will not function properly. The bulk of these modifications involve extremely tedious, low-level translation and integration tasks that typically require the full attention of both database and domain experts. Given the lack of the ability to automate this work, warehouse maintenance costs are prohibitive, and warehouse “up-times” severely restricted.

6 Protect Against Changes.
• We need protection against changes in both data representation and interfaces. The traditional technique, information hiding, protects against changes in data representation but does not help with changes to interfaces.
• Protecting against interface changes needs more than information hiding: restriction through shy programming, also called Adaptive Programming (AP).
[Figure: Implementation, Interface, Client. Information hiding separates Implementation from Interface; shy programming separates Interface from Client.]

7 Problem with Information Hiding
• Shy Programming builds on the observation that traditional black-box composition is not restricting enough. We use the slogan: information hiding is not hiding enough. Black-box composition isolates the implementation from the interface, but does not decouple the interface from its clients.

8 Cover unimportant parts of the interface
• To permit interfaces to evolve, self-discipline is required to avoid programming extensively against the interface. Certain parts of the interface are best treated as if they were covered.
[Figure: Implementation, Interface, Client; information hiding vs. shy programming]

9 Shy Programming = Adaptive Programming
• This disciplined programming is referred to as shy programming. Shy programming lets the program recover from (or adapt to) interface changes. Shy programming is also called Adaptive Programming (AP). This is similar to the shyness metaphor in the Law of Demeter (LoD): structure evolves over time, so an object should communicate with only a subset of the objects visible to it.

10 Decoupling of Interface
• We summarize the commonalities and differences between black-box composition and Shy Programming in two principles.
– Black-box Principle: the representation of objects can be changed without affecting clients.
– Shy-Programming Principle: the interface of objects can be changed within certain parameters without affecting clients.
• It is important to notice that the Shy-Programming Principle builds on top of the Black-box Principle.

11 Manager Metaphor.
• A manager M is managing a set of group leaders G, each one managing a set of workers W. We consider issues related to informing M and requesting information from M. We use this example to illustrate three points.
– Micromanager: no information restriction.
– Shyness: helps information restriction.
– Complex requests: help information restriction and optimization.
• We want to learn from this about organizing bioinformatics knowledge.
[Figure: hierarchy M, G, W]

12 Manager Metaphor.
• Micromanager: no information restriction.
– If the manager is a micromanager (a manager who wants to know about and rely on all the details of the workers’ projects), the managing approach is brittle: whenever a detail of one worker’s project changes, the manager needs to be notified.
[Figure: hierarchy M, G, W]

13 Manager Metaphor.
• Micromanager: no information restriction (continued).
– An object-oriented program written in the usual way corresponds to the manager who likes to micromanage. It is full of detailed knowledge of the class graph. An alternative way of formulating the same idea is to observe that it is good when the workers are shy. A shy worker shares only minimal, high-level information with the group leader. This prevents a brittle situation in which the group leaders and the manager rely on too much detail.
[Figure: hierarchy M, G, W]

14 Manager Metaphor.
• Shyness: helps information restriction.
– It is good for the workers to be shy and talk only to their group leader, not to the manager directly. (Shyness has two facets: talk only to a few friends AND share minimal information with them. Here we use the first facet, while in the previous point we used the second.) The group leader abstracts the information from the workers and passes only the abstract information on to the manager. This prevents the manager from micromanaging. This variant can be viewed as an application of the Law of Demeter (LoD), which states that an object should talk only to closely related objects. The closely related object for a worker is the group leader, not the manager.
[Figure: hierarchy M, G, W]

15 Manager Metaphor.
• Shyness: helps information restriction (continued).
– The motivation is that when things change at the worker level, the manager does not necessarily have to be informed. The group leader is informed and decides whether the information needs to be passed up.
[Figure: hierarchy M, G, W; M is shielded from W]
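
The shyness point can be illustrated with a small Java sketch. All class and method names here (`ShyWorker`, `ShyGroupLeader`, `ShyManager`, `openTasks`, and so on) are hypothetical and chosen only for this illustration; they do not come from the slides or from any library. The manager talks only to group leaders, and each group leader passes up an abstraction of its workers' details.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes for the manager metaphor (illustration only).
class ShyWorker {
    private final int openTasks;
    ShyWorker(int openTasks) { this.openTasks = openTasks; }
    int openTasks() { return openTasks; }
}

class ShyGroupLeader {
    private final List<ShyWorker> workers = new ArrayList<>();
    void add(ShyWorker w) { workers.add(w); }

    // The group leader abstracts worker details into one high-level number.
    int openTasksInGroup() {
        int total = 0;
        for (ShyWorker w : workers) total += w.openTasks();
        return total;
    }
}

class ShyManager {
    private final List<ShyGroupLeader> leaders = new ArrayList<>();
    void add(ShyGroupLeader g) { leaders.add(g); }

    // Law of Demeter: the manager calls only its direct collaborators
    // (group leaders) and never reaches through them to the workers.
    int openTasksOverall() {
        int total = 0;
        for (ShyGroupLeader g : leaders) total += g.openTasksInGroup();
        return total;
    }
}
```

If the way a worker tracks tasks changes, only `ShyGroupLeader.openTasksInGroup` may need to change; `ShyManager` is shielded.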

16 Manager Metaphor.
• Complex requests: help information restriction and optimization.
– The manager does not want to be bothered by many simple requests from the many workers. Instead, the manager prefers to get a complex request from time to time from a group leader. A complex request lets the manager see all the requests as a whole and optimize the overall result. That is not possible if simple requests arrive one by one and each must be satisfied immediately, before the totality of the requests is seen.

17 Manager Metaphor.
• Complex requests: help information restriction and optimization (continued).
– The same point applies to programming: instead of sending an object a lot of individual data access requests, it is better to send one complex request that can be treated as a whole and optimized accordingly.
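
A minimal Java sketch of this point, under stated assumptions: `SequenceStore`, `lengthOf`, and `lengthsOf` are hypothetical names invented for the illustration, and the `lookups` counter merely stands in for an expensive data access. Because the complex request sees all ids at once, it can skip duplicates, which a stream of individual requests cannot do.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical store of named sequences (illustration only).
class SequenceStore {
    private final Map<String, String> sequences = new HashMap<>();
    int lookups = 0; // counts accesses to the underlying data

    void put(String id, String sequence) { sequences.put(id, sequence); }

    // Simple request: one data access per call.
    int lengthOf(String id) {
        lookups++;
        String s = sequences.get(id);
        return s == null ? 0 : s.length();
    }

    // Complex request: all ids are visible at once, so each distinct id
    // is looked up only once.
    Map<String, Integer> lengthsOf(List<String> ids) {
        Map<String, Integer> result = new HashMap<>();
        for (String id : new HashSet<>(ids)) {
            result.put(id, lengthOf(id));
        }
        return result;
    }
}
```

Asking for the ids `p1, p2, p1` one at a time costs three lookups; the single complex request costs two.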

18 Aspect-oriented Programming (AOP).
• AOP is programming with aspects. An aspect is a complex request to modify the execution of a program. It may expose a large interface. An aspect can be implemented efficiently by inserting code into the program at compile time. An aspect should be shy with respect to the program it modifies.

19 AOSD: not every concern fits into a component: crosscutting.
• Goal: find new component structures that encapsulate “rich” concerns.

20 A Reusable Aspect.
    import java.rmi.RemoteException;

    // Abstract, reusable aspect: it says what to log, not where.
    // Assumption: `log` is a PrintStream-like object defined elsewhere.
    abstract public aspect RemoteExceptionLogging {
        abstract pointcut logPoint();

        after() throwing (RemoteException e): logPoint() {
            log.println("Remote call failed in: " + thisJoinPoint.toString() + "(" + e + ").");
        }
    }

    // Concrete aspect: binds the abstract pointcut to one application.
    public aspect MyRMILogging extends RemoteExceptionLogging {
        pointcut logPoint():
            call(* RegistryServer.*.*(..)) ||
            call(private * RMIMessageBrokerImpl.*.*(..));
    }

21 Good Aspects Are Shy.
    // Caller, Service, Worker and Task are application types defined elsewhere.
    abstract aspect CapabilityChecking {
        pointcut invocations(Caller c):
            this(c) && call(void Service.doService(String));

        pointcut workPoints(Worker w):
            target(w) && call(void Worker.doTask(Task));

        // Work done anywhere in the control flow of a caller's invocation.
        pointcut perCallerWork(Caller c, Worker w):
            cflow(invocations(c)) && workPoints(w);

        before (Caller c, Worker w): perCallerWork(c, w) {
            w.checkCapabilities(c);
        }
    }

22 Lessons From Manager Metaphor.
• Information hiding does not hide enough. Information hiding makes all public interfaces available, whereas the Micromanager point shows that only an abstraction of those interfaces should be visible at higher levels.

23 Lessons From Manager Metaphor (Continued).
• In Shy Programming, only high-level information about the class or call graph is visible at the (shy) programming level. This shields the program from many changes to the class or call graph, in the same way as the manager is shielded from many of the changes in the workers’ projects. The role of the group leader is played by the glue code that maps high-level information to low-level information and vice versa. Shy Programming is graph-shy.

24 Application to Bioinformatics Knowledge
• Need shy programming and shy knowledge representation techniques for bioinformatics.
• Need domain-specific languages to define functions in a structure-shy way.

25 Another Good Example of AOP.
• Task: find all persons waiting at any bus stop on a bus route.
• OO solution: one method for each red class.
[Class diagram: BusRoute, BusStopList, BusStop, BusList, Bus, PersonList, Person; edges busStops, buses, waiting, passengers; multiplicity 0..*]

26 Traversal Strategy.
• Task: find all persons waiting at any bus stop on a bus route.
• Strategy: from BusRoute through BusStop to Person (a complex request).
[Class diagram: BusRoute, BusStopList, BusStop, BusList, Bus, PersonList, Person; edges busStops, buses, waiting, passengers; multiplicity 0..*]

27 Robustness of Strategy.
• Task: find all persons waiting at any bus stop on a bus route.
• Strategy: from BusRoute through BusStop to Person. The strategy is unchanged after VillageList and Village (edge villages) are added to the class graph: the complex request is class-graph shy.
[Class diagram: BusRoute, VillageList, Village, BusStopList, BusStop, BusList, Bus, PersonList, Person; edges villages, busStops, buses, waiting, passengers; multiplicity 0..*]

28 Writing Aspect-oriented Programs With Strategies.
    class BusRoute {
        // The traversal strategy: a complex request that is class-graph shy.
        static final String WPStrategy = "from BusRoute through BusStop to Person";

        int countWaitingPersons() {
            Integer result = (Integer) Main.cg.traverse(this, WPStrategy, new Visitor() {
                int r;
                public void before(Person host) { r++; }   // called for each Person reached
                public void start() { r = 0; }
                public Object getReturnValue() { return Integer.valueOf(r); }
            });
            return result.intValue();
        }
    }
• The strategy string is a complex request.
• Complex request plays role of manager.
• Complex request is class-graph shy.

29 Writing Aspect-Oriented Programs With Strategies.
    // Strategy used by countWaitingPersons()
    String WPStrategy = "from BusRoute through BusStop to Person";

    // Prepare current class graph
    Main.cg = new ClassGraph();
    int r = aBusRoute.countWaitingPersons();
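
To make the idea of a class-graph-shy request concrete without the `ClassGraph` library, here is a rough stand-alone sketch in plain Java reflection. It is not how the library on the slides works: `ShyTraversal` and `collect` are hypothetical names, the search is exhaustive instead of being pruned by a class graph, inherited fields are ignored, and the object graph is assumed to be acyclic. The `Person`, `Bus`, `BusStop` and `BusRoute` classes are simplified stand-ins for the classes in the diagrams.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the classes in the diagrams.
class Person { final String name; Person(String name) { this.name = name; } }
class Bus { final List<Person> passengers = new ArrayList<>(); }
class BusStop { final List<Person> waiting = new ArrayList<>(); }
class BusRoute {
    final List<BusStop> busStops = new ArrayList<>();
    final List<Bus> buses = new ArrayList<>();
}

// Hypothetical helper: "from root through via to target".
class ShyTraversal {
    // Collects every `target` instance reachable from `root` along a path
    // that passes through a `via` instance. Assumes an acyclic object graph.
    static <T> List<T> collect(Object root, Class<?> via, Class<T> target) {
        List<T> found = new ArrayList<>();
        walk(root, via, target, false, found);
        return found;
    }

    private static <T> void walk(Object node, Class<?> via, Class<T> target,
                                 boolean seenVia, List<T> found) {
        if (node == null) return;
        if (node instanceof Iterable) {              // step into collections
            for (Object element : (Iterable<?>) node) {
                walk(element, via, target, seenVia, found);
            }
            return;
        }
        if (seenVia && target.isInstance(node)) {    // reached a target
            found.add(target.cast(node));
            return;
        }
        // Do not reflect into JDK classes such as String.
        if (node.getClass().getName().startsWith("java.")) return;
        boolean seen = seenVia || via.isInstance(node);
        for (Field f : node.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()
                    || f.getType().isPrimitive()) continue;
            f.setAccessible(true);
            try {
                walk(f.get(node), via, target, seen, found);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }
}
```

A call such as `ShyTraversal.collect(route, BusStop.class, Person.class)` names only three classes. It never mentions `busStops`, `waiting`, or any list class, so inserting an intermediate class between `BusRoute` and `BusStop` would not require changing the call.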

30 ObjectGraph: in UML Notation.
[Object diagram: Route1:BusRoute; busStops: a BusStopList containing CentralSquare:BusStop; waiting: a PersonList containing Paul:Person and Seema:Person; buses: a BusList containing Bus15:Bus; passengers: a PersonList containing Joan:Person and Eric:Person]

31 ObjectGraphSlice.
[Object diagram, with the traversed slice highlighted: Route1:BusRoute; busStops: a BusStopList containing CentralSquare:BusStop; waiting: a PersonList containing Paul:Person and Seema:Person; buses: a BusList containing Bus15:Bus; passengers: a PersonList containing Joan:Person and Eric:Person]

32 Summary So Far.
• Aspect-oriented software development helps to create software that:
– is more flexible and adapts easily to rapidly changing interfaces;
– is easier to understand and also shorter;
– supports the Shy Programming Principle.

33 Institute for Complex Scientific Software
Institute Home Page: http://www.icss.neu.edu/

34 What?
• Problem driving institute:
– Complexity of building software systems to enable scientific research.
• Objective:
– Develop general methodologies for building complex scientific software using the latest computer science research.

35 Goals.
[Diagram labels: Applications; Computer Science; The Institute; Scientific Software Solutions; New Methodologies]

36 Applicable Computer Science Research.
• Aspect-Oriented Software Development
• Software Components
• Parallelism
• Domain Specific Languages
• Visualization
• Knowledge-Based Support Systems

37 Three Testbeds.
• THEMATICS (M. Ondrechen; protein function from structure; high external visibility)
– Proc. Nat. Academy of Science publication
– Featured in popular scientific magazines: Nature, American Chemical Society, Science Daily
• Subsurface Sensing and Imaging (many Institute participants from this area)
• Parallel Geant4 (CERN; Cooperman, Reucroft and Swain; particle matter interaction; million-line program)

38 Some Other Faculty Highlights.
• Valentin Ilyin.
– Protein structure analysis: novel structural alignment method which produces high-quality alignments.
– Visual analytical bioinformatics interface (Friend).
• Roger Giese.
– The long-term goal is to learn whether the measurement of DNA adducts in people can help to individualize cancer prevention, analogous to the measurement of cholesterol as a biomarker for risk of a heart attack.

39 Some Other Faculty Highlights.
• Bob Futrelle.
– I'm particularly interested in the relations between bio-ontologies and text and diagrams.

40 Conclusions
• Northeastern University and the Institute for Complex Scientific Software create knowledge of significant interest to bioinformatics.
• Aspect-Oriented Software Development is a useful technology for the rapidly evolving area of bioinformatics.

41 The End

42 PathSet Algorithm
• We have developed an efficient graph search algorithm that solves the following problem:
• Input:
– Graph G1 = (V1, E1) with source s and target t.
– Graph G2 = (V2, E2) where V1 is a subset of V2.
• Question: Does G2 contain a path that is an expansion of a path in G1 from s to t? (The algorithm works even if s and t are sets of nodes.)

43 Explanation.
• Given a path p, a path p' is called an expansion of p if p' can be obtained by inserting one or more elements between elements of p.
• More generally, we can find a third graph that succinctly represents all possible such paths in G2.
• Do you see applications of such an algorithm in biology?
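
The definition of an expansion can be checked directly on two node sequences. The sketch below uses hypothetical names (`PathExpansion`, `isExpansion`) and makes two assumptions that the slide does not settle: insertions happen only strictly between elements, so both endpoints must coincide, and a path counts as an expansion of itself (zero insertions). For the strict "one or more" reading, additionally require `expanded.size() > path.size()`. The check only compares sequences; it does not verify that either sequence is a path in a particular graph.

```java
import java.util.List;

// Hypothetical helper (illustration only).
class PathExpansion {
    // True if `expanded` can be obtained from `path` by inserting zero or
    // more elements strictly between elements of `path`.
    static <T> boolean isExpansion(List<T> path, List<T> expanded) {
        int k = path.size();
        int n = expanded.size();
        if (k == 0 || n < k) return k == 0 && n == 0;
        // Nothing may be inserted before the first or after the last element.
        if (!path.get(0).equals(expanded.get(0))) return false;
        if (!path.get(k - 1).equals(expanded.get(n - 1))) return false;
        if (k == 1) return n == 1;
        // The interior of `path` must appear, in order, in the interior of `expanded`.
        int i = 1;
        for (int j = 1; j < n - 1 && i < k - 1; j++) {
            if (path.get(i).equals(expanded.get(j))) i++;
        }
        return i == k - 1;
    }
}
```

For example, `s, x, a, y, t` is an expansion of `s, a, t`, with `x` and `y` as inserted nodes, while `s, x, t` is not, because `a` is missing.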

44 Motivation.
• G1 is a “small” graph that lists “important” nodes.
• G2 is a “large” graph in which we want to recognize paths that are expansions of paths in the “small” graph.
• Expansions of paths may contain additional nodes that are “noise” nodes.

45 Notes
• There is such a path in G2 iff the traversal graph of G1 and G2 is not empty.
• G1 may have exponentially many paths from s to t.
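
The decision problem can be answered without enumerating the possibly exponentially many paths of G1. The following is a naive reachability-based sketch, not the PathSet algorithm from the slides, and it does not build the traversal graph. Its assumptions: graphs are adjacency maps over string node names, paths may repeat nodes, inserted nodes may be any nodes of G2, and zero insertions are allowed. The names `ExpansionSearch`, `reachable`, and `hasExpandedPath` are hypothetical. An edge (u, v) of G1 is treated as usable when v can be reached from u in G2 by at least one edge; the question then reduces to reaching t from s in G1 over usable edges.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (naive sketch, illustration only).
class ExpansionSearch {
    // Nodes reachable from `start` using at least one edge of `g`.
    static Set<String> reachable(Map<String, Set<String>> g, String start) {
        Set<String> seen = new HashSet<>();
        Deque<String> queue =
            new ArrayDeque<>(g.getOrDefault(start, Collections.emptySet()));
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (seen.add(node)) {
                queue.addAll(g.getOrDefault(node, Collections.emptySet()));
            }
        }
        return seen;
    }

    // Does g2 contain an expansion of some g1 path from s to t?
    static boolean hasExpandedPath(Map<String, Set<String>> g1,
                                   Map<String, Set<String>> g2,
                                   String s, String t) {
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(s);
        while (!queue.isEmpty()) {
            String u = queue.poll();
            if (!visited.add(u)) continue;
            if (u.equals(t)) return true;
            Set<String> inG2 = reachable(g2, u);
            // Follow a g1 edge only if g2 can realize it by some path.
            for (String v : g1.getOrDefault(u, Collections.emptySet())) {
                if (inG2.contains(v)) queue.add(v);
            }
        }
        return false;
    }
}
```

Each node of G1 is expanded at most once and each expansion runs one search over G2, so the work is polynomial in the sizes of both graphs.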

46 Topic Switch.

47 Lessons From Manager Metaphor (Continued).
• AOP is related to the Micromanager point through the observation that aspects should be loosely coupled to the base programs they modify. The aspect should not be brittle with respect to the detailed calling structure of the base program, in the same way as the manager should not rely on the details of the workers’ projects. There is an intermediary, called glue code, that maps the aspect to the detailed usage context. AOP is call-graph shy.

