1
Regions of Interest
2
What’s in a ROI?
Use cases
Requirements
Current Storage System
Problems
Alternative Storage
3
ROI
Geometry
Measurements
ROI on Channel
Annotations
▪ ROI
▪ Measurement
▪ Links
4
User created
▪ ROI
▪ Measurement tools
HCS generated
▪ ROI
▪ Automatic
External
▪ External analysis
▪ Particle Tracking
Other
▪ Templates
▪ ROIs without images
5
Human generated
More interactions
▪ Merge, Propagate, Split, Delete
Measurements
▪ Geometry
▪ Intensity
▪ Path
ROI/ROI Links
Tags mostly on ROI
Write Many/Read Many
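The geometry and intensity measurements listed above can be illustrated with a minimal sketch for a single elliptical shape. The field names `cx`, `cy`, `rx`, `ry` follow the ellipse documents shown later in the deck; the function names, the list-of-rows image plane and the sample values are assumptions made only for this illustration, not part of any existing API.

```python
import math


def ellipse_area(rx, ry):
    """Geometry measurement: area of an axis-aligned ellipse with semi-axes rx, ry."""
    return math.pi * rx * ry


def ellipse_contains(cx, cy, rx, ry, x, y):
    """True when the pixel at (x, y) lies inside or on the ellipse boundary."""
    return ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0


def mean_intensity(plane, cx, cy, rx, ry):
    """Intensity measurement: mean pixel value inside the ellipse.

    `plane` is a list of rows (plane[y][x]); returns None if no pixel is covered.
    """
    values = [
        value
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if ellipse_contains(cx, cy, rx, ry, x, y)
    ]
    return sum(values) / len(values) if values else None


# Hypothetical 5x5 plane whose pixel value is x + 5*y.
plane = [[x + 5 * y for x in range(5)] for y in range(5)]
area = ellipse_area(17, 17)
mean = mean_intensity(plane, cx=2, cy=2, rx=1, ry=1)
```

With `rx = ry = 1` the ellipse covers the centre pixel and its four direct neighbours, so the mean is taken over five values.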
6
HCS Generated ROI
Lots of ROI
Attached to Channel
Measurements Attached
▪ Multiple measurements
Tags on ROI, Measurements
▪ Analysis, results and meta.
Write Once, Read Many
7
External Tool can Generate ROI (+ scripts)
Can be tagged
Links (ROI/ROI, ROI/Image)
Results can be in any format
8
ROI need not be attached to image
Template to define other ROI
9
N-Dimensional Data
Storage of Image data simple
ROI more complex
▪ Database entry, file format
We don’t just want to store in HDF
10
Database
▪ ROI
▪ ROI Annotations
PyTables
▪ Mask
▪ ROI Measurements
11
PyTables
ROI are heterogeneous
Concurrency
Python behind a core service call
Measurements are optimal
Tagging is an issue
▪ Inside file
▪ Multiple annotations reported to be slow
12
ROI can be stored in database
Mask data can be an issue
Tagging in RDBMS not best
Many more annotations than we’d like
Link to external source for measurements
13
Key-Value Pair Stores
▪ Berkeley DB
▪ Project Voldemort
▪ Tokyo Cabinet
Document DB
▪ MongoDB
▪ CouchDB
Graph DB
▪ Neo4J
▪ InfoGrid
Table DB
▪ Cassandra
▪ Hypertable
▪ HBase
14
Other opinions on the storage solutions
▪ MongoDB vs CouchDB, Cassandra,..
▪ CouchDB vs MongoDB
▪ Pros and cons of MongoDB
▪ Digg on Cassandra
▪ What is a supercolumn
▪ Cassandra talk
▪ Indexing nodes in Neo4J
15
Document Database
NOSQL movement
Schemaless
No Tables
▪ Collections of like data
No Joins
▪ Document is equivalent of row of data
▪ Distributed file system (GridFS)
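The "schemaless, no tables, no joins" points can be pictured with plain Python data structures. This is only an in-memory analogy, not MongoDB itself: the collection is modelled as a list of dicts, and the labels, coordinates and the helper `shape_types` are hypothetical values chosen for the example.

```python
# A "collection" is modelled here as a plain list of dict documents.
rois = []

# Each document embeds its shapes, so reading an ROI needs no join.
rois.append({
    "id": 1,
    "label": "nucleus",  # hypothetical label
    "shapes": [{"type": "Ellipse", "cx": 3, "cy": 75, "rx": 17, "ry": 17}],
})

# Schemaless: this document has no "label" but carries a "tags" field instead.
rois.append({
    "id": 2,
    "shapes": [{"type": "Point", "x": 4, "y": 9}],
    "tags": [{"tag": "mitosis"}],
})


def shape_types(collection):
    """Return the sorted set of shape types found across all documents."""
    return sorted(
        {shape["type"] for doc in collection for shape in doc.get("shapes", [])}
    )


types_present = shape_types(rois)
```

Documents of different shape live side by side in one collection; the cost, as the cons on the next slide note, is that integrity checks move into application code.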
16
Pros
▪ It has bindings to numerous languages (C++, C#, Java, Python, ...)
▪ Allows storage, indexing and linking of any user data
▪ Annotations are now very easy and efficient
▪ Has mechanisms for schema upgrade
▪ Dynamic queries
▪ Replication
▪ Sharding
▪ Map-Reduce framework
▪ Fast
▪ GridFS is a distributed file storage mechanism within Mongo
▪ Easy to install
Cons
▪ Schemaless: data integrity will need to be worked on
▪ Graph structures not inherently supported
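To make the Map-Reduce item concrete, here is a small sketch of the idea applied to ROI tags: a map step emits one `(tag, 1)` pair per tag and a reduce step sums the pairs per tag. It runs in plain Python over a list of dicts and does not use MongoDB's own map-reduce facility; the function names and sample documents are assumptions for illustration.

```python
from collections import defaultdict


def map_roi(roi):
    """Map step: emit (tag, 1) for every tag on the ROI and on each of its shapes."""
    for tag in roi.get("tags", []):
        yield tag["tag"], 1
    for shape in roi.get("shapes", []):
        for tag in shape.get("tags", []):
            yield tag["tag"], 1


def reduce_counts(key, values):
    """Reduce step: total number of occurrences for one tag."""
    return sum(values)


def map_reduce(collection, mapper, reducer):
    """Group the mapper output by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in collection:
        for key, value in mapper(doc):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


# Hypothetical ROI documents.
rois = [
    {"tags": [{"tag": "foofoo"}],
     "shapes": [{"tags": [{"tag": "foo1"}]}, {"tags": [{"tag": "foo2"}]}]},
    {"tags": [], "shapes": [{"tags": [{"tag": "foo1"}]}]},
]
tag_counts = map_reduce(rois, map_roi, reduce_counts)
```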
17
DEPLOYMENTS
▪ SourceForge http://sourceforge.net/
▪ BusinessInsider http://www.businessinsider.com/
▪ New York Times http://www.nytimes.com/
▪ Disqus http://www.disqus.com/
18
Human Interaction
▪ Merge, Propagate, Split ✓
▪ Geometry ✓
▪ Intensity ✓
▪ Path ✓
▪ ROI/ROI Links ✓
▪ Tags ✓
HCS
▪ Many ROI ✓
▪ Tags on ROI ✓
▪ Tags on Measurement ✓
▪ Tables of Measurements ✓
Externally Generated
▪ Tags ✓
▪ ROI/ROI Links, ROI/Image Links
▪ Many formats, unknown types ✓
Other
▪ N-Dimensional ROI ✓
▪ Hierarchical Structures ✓
19
from pymongo import MongoClient  # the slide used the older Connection class

connection = MongoClient()
db = connection['databaseName']
collection = db['collectionName']

# Insert one ROI document with its shapes embedded
# (the slide called the older collection.insert).
collection.insert_one({
    "tags": [],
    "label": "MyROI",
    "shapes": [
        {"tags": [{"tag": "foo1", "namespace": "bob"}],
         "rx": 17, "ry": 17, "label": None, "cy": 75, "cx": 3,
         "t": 0, "z": 0, "type": "Ellipse", "id": 3},
        {"tags": [{"tag": "foo2", "namespace": "bob"}],
         "rx": 10, "ry": 16, "label": None, "cy": 82, "cx": 45,
         "t": 0, "z": 0, "type": "Ellipse", "id": 5},
    ],
    "type": "Roi",
    "id": 565,
})
20
import re
from pymongo import MongoClient  # the slide used the older Connection class

connection = MongoClient()
db = connection['databaseName']
collection = db['collectionName']

# Find ROI shapes with a tag containing "mitosis" (case-insensitive).
# In Python the quoted '/.*mitosis.*/i' would be matched as a literal string,
# so a compiled pattern is passed instead.
collection.find({"shapes.tags.tag": re.compile("mitosis", re.IGNORECASE)})

# Find ROI with tag foofoo and shapes with tag foo1.
collection.find({"shapes.tags.tag": "foo1", "tags.tag": "foofoo"})
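The dotted keys in these queries (`shapes.tags.tag`) reach through embedded lists of sub-documents. The sketch below emulates that matching in plain Python so the behaviour can be tried without a server. It is a simplified approximation of the real query semantics, and the helpers `values_at`, `matches` and `find`, along with the sample documents, are hypothetical.

```python
import re


def values_at(doc, path):
    """Return every value reachable through a dotted path, descending into lists."""
    keys = path.split(".")

    def walk(node, depth):
        if isinstance(node, list):
            for item in node:
                yield from walk(item, depth)
        elif depth == len(keys):
            yield node
        elif isinstance(node, dict) and keys[depth] in node:
            yield from walk(node[keys[depth]], depth + 1)

    return list(walk(doc, 0))


def matches(doc, query):
    """True when every path in the query has at least one matching value."""
    for path, wanted in query.items():
        found = values_at(doc, path)
        if isinstance(wanted, re.Pattern):
            ok = any(isinstance(v, str) and wanted.search(v) for v in found)
        else:
            ok = any(v == wanted for v in found)
        if not ok:
            return False
    return True


def find(collection, query):
    """Return the documents of the collection that satisfy the query."""
    return [doc for doc in collection if matches(doc, query)]


# Hypothetical documents in the style of the insert example.
rois = [
    {"id": 565, "tags": [{"tag": "foofoo"}],
     "shapes": [{"id": 3, "tags": [{"tag": "foo1", "namespace": "bob"}]},
                {"id": 5, "tags": [{"tag": "foo2", "namespace": "bob"}]}]},
    {"id": 566, "tags": [],
     "shapes": [{"id": 7, "tags": [{"tag": "Early Mitosis", "namespace": "bob"}]}]},
]
by_tags = find(rois, {"shapes.tags.tag": "foo1", "tags.tag": "foofoo"})
by_regex = find(rois, {"shapes.tags.tag": re.compile("mitosis", re.IGNORECASE)})
```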
21
Graph Database
Uses nodes to represent objects
User specifies relationship between nodes
Allows complex traversal of node structures
22
PROS
▪ Handles graph structures nicely
▪ Transactional
▪ Supported by Gremlin
▪ Native RDF http://components.neo4j.org/neo-rdf-sail/
▪ Easy to install
CONS
▪ No C++ language binding
▪ Not distributed
▪ Tables are not so easily modeled
▪ Difficult to query on node contents
23
DEPLOYMENTS
▪ The Swedish Defence forces http://www.mil.se
▪ Windh Technologies http://www.windh.com
▪ Flextoll http://www.flextoll.se
24
// Relationship types used to link OMERO objects in the graph.
public enum OMERORelations implements RelationshipType {
    ASSOCIATE, DERIVE, AGGREGATE, COMPOSE
}

// Node for the source image; imageI is the OMERO image object.
Node image = neo.createNode();
image.setProperty("IObject", imageI);
image.setProperty("id", imageI.getId().getValue());
image.setProperty("name", imageI.getName().getValue());

// Node for the image derived from it.
Node derivedImage = neo.createNode();
derivedImage.setProperty("IObject", derivedImageI);
derivedImage.setProperty("id", derivedImageI.getId().getValue());
derivedImage.setProperty("name", derivedImageI.getName().getValue());

// Record that derivedImage was derived from image by cropping with an ROI.
Relationship relationship = image.createRelationshipTo(
    derivedImage, OMERORelations.DERIVE);
relationship.setProperty("type", "ROI");
relationship.setProperty("operation", "crop");
relationship.setProperty("roi", cropRoiI);
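The traversal that typed relationships make possible can be sketched in a few lines of Python. This is an in-memory analogue, not the Neo4j API: the `Graph` class, its methods and the sample images are hypothetical, and only the relationship names (`DERIVE`, `ASSOCIATE`) are taken from the enum above.

```python
from collections import deque


class Graph:
    """Tiny in-memory property graph with typed, directed relationships."""

    def __init__(self):
        self.nodes = {}          # node id -> property dict
        self.relationships = []  # (start id, relationship type, end id, properties)

    def create_node(self, node_id, **props):
        self.nodes[node_id] = props
        return node_id

    def relate(self, start, rel_type, end, **props):
        self.relationships.append((start, rel_type, end, props))

    def traverse(self, start, rel_type):
        """Breadth-first ids reachable from `start` over outgoing `rel_type` edges."""
        seen, order, queue = {start}, [], deque([start])
        while queue:
            current = queue.popleft()
            for source, kind, target, _props in self.relationships:
                if source == current and kind == rel_type and target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
        return order


# Hypothetical chain: image 1 is cropped into 2, which is cropped into 3.
graph = Graph()
for image_id, name in [(1, "original"), (2, "crop"), (3, "crop of crop"), (4, "other")]:
    graph.create_node(image_id, name=name)
graph.relate(1, "DERIVE", 2, type="ROI", operation="crop")
graph.relate(2, "DERIVE", 3, type="ROI", operation="crop")
graph.relate(1, "ASSOCIATE", 4)
derived = graph.traverse(1, "DERIVE")
```

Following only `DERIVE` edges returns the whole derivation chain of an image while ignoring unrelated links such as `ASSOCIATE`.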
25
Human Interaction
▪ Merge, Propagate, Split ✓
▪ Geometry
▪ Intensity
▪ Path ✓
▪ ROI/ROI Links ✓
▪ Tags
HCS
▪ Many ROI ✓
▪ Tags on ROI ✓
▪ Tags on Measurement ✓
▪ Tables of Measurements
Externally Generated
▪ Tags ✓
▪ ROI/ROI Links, ROI/Image Links ✓
▪ Many formats, unknown types
Other
▪ N-Dimensional ROI
▪ Hierarchical Structures ✓
26
An implementation of Google’s BigTable: a complex key/value store used to represent a table. A sophisticated toolset is required to get the most out of this solution; for instance, Google has created Sawzall to query its system, and Digg has released LazyBoy, a Python library for working with Cassandra. It works by creating a table whose columns are linked together into column families; like data will exist in the same column family (e.g. Ellipse ROI).
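The column-family layout described above can be pictured as nested dictionaries: row key, then column family, then columns. The sketch below is an in-memory illustration only and does not use any Cassandra client; the row keys, the `Tags` family and the helper names are assumptions.

```python
# row key -> column family -> {column name: value}
table = {}


def put(row_key, family, column, value):
    """Store one column value, creating the row and family on demand."""
    table.setdefault(row_key, {}).setdefault(family, {})[column] = value


def get_family(row_key, family):
    """Return a copy of all columns stored for one row in one column family."""
    return dict(table.get(row_key, {}).get(family, {}))


# Like data lives in the same family: both rows keep geometry under "Ellipse".
for column, value in {"cx": 3, "cy": 75, "rx": 17, "ry": 17}.items():
    put("shape:3", "Ellipse", column, value)
for column, value in {"cx": 45, "cy": 82, "rx": 10, "ry": 16}.items():
    put("shape:5", "Ellipse", column, value)

# Rows need not share columns: only shape:5 has anything in the "Tags" family.
put("shape:5", "Tags", "foo2", "bob")
```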
27
Pros
▪ Quick
▪ Handles heterogeneous data well
▪ Different rows can have different columns
▪ Can manage distributed data
▪ Map/Reduce
▪ Focus on writes not reads
▪ Scales nicely
▪ Easy to install
Cons
▪ Not simple to work with
▪ Building hierarchical structures
▪ Sorting
▪ Querying: ad hoc queries are bad, Digg still use MySQL for certain queries
▪ Have to manage secondary indexes (K/V)
▪ Version 0.5
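The "have to manage secondary indexes" point means the application must keep its own lookup table in step with every write. A minimal sketch of that bookkeeping in plain Python follows; the two dictionaries stand in for two key/value tables, and every name and value here is hypothetical.

```python
primary = {}    # row key -> ROI record
tag_index = {}  # tag -> set of row keys (the hand-maintained secondary index)


def put_roi(key, record):
    """Write a record and keep the tag index consistent with it."""
    old = primary.get(key)
    if old is not None:
        # Remove index entries left over from the previous version of the row.
        for tag in old.get("tags", []):
            tag_index.get(tag, set()).discard(key)
    primary[key] = record
    for tag in record.get("tags", []):
        tag_index.setdefault(tag, set()).add(key)


def find_by_tag(tag):
    """Look up records through the index instead of scanning every row."""
    return [primary[key] for key in sorted(tag_index.get(tag, set()))]


put_roi("roi:1", {"label": "a", "tags": ["mitosis"]})
put_roi("roi:2", {"label": "b", "tags": ["mitosis", "foo1"]})
put_roi("roi:1", {"label": "a", "tags": ["foo1"]})  # overwrite: roi:1 loses "mitosis"
```

Forgetting the clean-up step on overwrite would leave stale keys in the index, which is the kind of burden the slide lists as a drawback.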
28
Deployments
▪ Facebook (MAYBE!!) http://www.facebook.com
▪ Digg http://www.digg.com
29
Human Interaction
▪ Merge, Propagate, Split ✓
▪ Geometry ✓
▪ Intensity ✓
▪ Path
▪ ROI/ROI Links
▪ Tags ✓
HCS
▪ Many ROI ✓
▪ Tags on ROI ✓
▪ Tags on Measurement ✓
▪ Tables of Measurements ✓
Externally Generated
▪ Tags ✓
▪ ROI/ROI Links, ROI/Image Links ✓
▪ Many formats, unknown types
Other
▪ N-Dimensional ROI ✓
▪ Hierarchical Structures
30
An implementation of Google’s BigTable: a complex key/value store used to represent a table. A sophisticated toolset is required to get the most out of this solution; for instance, Google has created Sawzall to query its system. HyperTable has a query language called HQL. It works by creating a table whose columns are linked together into column families; like data will exist in the same column family (e.g. Ellipse ROI).
31
Pros
▪ Quick
▪ Handles heterogeneous data well
▪ Different rows can have different columns
▪ Can manage distributed data
▪ Map/Reduce
▪ Scales nicely
▪ Easy to install
Cons
▪ GPL License
▪ Building hierarchical structures
▪ Docs are weak
▪ HQL works for simple queries only
▪ Map/Reduce for other work
▪ Limit of 255 column families
▪ Secondary keys
32
Deployments
▪ Rediff http://www.rediff.com
▪ Zvents http://www.zvents.com/
33
Human Interaction
▪ Merge, Propagate, Split ✓
▪ Geometry ✓
▪ Intensity ✓
▪ Path
▪ ROI/ROI Links
▪ Tags ✓
HCS
▪ Many ROI ✓
▪ Tags on ROI ✓
▪ Tags on Measurement ✓
▪ Tables of Measurements ✓
Externally Generated
▪ Tags ✓
▪ ROI/ROI Links, ROI/Image Links ✓
▪ Many formats, unknown types
Other
▪ N-Dimensional ROI ✓
▪ Hierarchical Structures
34
Why do we have an RDBMS?
We don’t normalise the data
Each import will normalise on:
▪ Image, ObjectiveSettings, LogicalChannel, LightSettings, DetectorSettings
Object Penalty
Difference between normalisation and view