Published by Charlene Nelson. Modified over 9 years ago.
1
(def functor BeeSpace v3)
2
Core BSv3 Features
Personalized Collections
– All functions operate on virtual collections.
Gene Analysis Functions
– Gene annotation, summarization, etc.
Topic Exploration
– Evolve/extract/expand/map/compare topics.
3
Challenges
New problems:
– Modifying all functions to operate on virtual collections.
– Building Intelligent Gene Retrieval functionality.
– DB-supported apps.
– Optimizing and parallelizing implementations.
– Modifying indexing strategies.
Old problems:
– Better tokenization needed.
– Teaming and code sharing.
– Upgrade of Lemur & Indri.
– Structured queries.
– Multiple languages; diverse skill sets.
Constraint: 5-month timeline.
4
Big Hill, Little Time
5
Accomplishments
– Gene-focus tokenization scheme implemented.
– Intelligent Gene Retrieval function near completion.
– Optimized & parallelized (EM) Theme Clustering.
– Developed DB infrastructure for application support: multiple DBs, tables, DAO access.
– Developed Common Library (4K lines of C++) for sharing across applications.
6
Accomplishments cont…
– Upgraded Lemur/Indri and normalized indexing.
– Developed six collection operations and Boolean query support.
9
Mining & Analysis of EAR Graphs
– Need to be able to quickly analyze and mine knowledge nets and EAR graphs with ad-hoc operations.
– User-driven exploration via a 4GL query language.
– Many data models already fit within an Entity/Attribute/Relationship (EAR) model. Very flexible design.
– The 4GL approach to operations increases the target audience and allows for query optimization. Example:
select src, sum(wt) from edge group by src order by src;
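The kind of aggregation this slide has in mind (summing edge weights per node) can also be written directly against a C++ STL core layer. What follows is a minimal sketch under stated assumptions: the `Edge` record and the functions `sumWeightBySource` and `heaviestSource` are hypothetical names chosen for illustration, not part of BeeSpace, and the grouping key is assumed to be the source node.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical edge record for an EAR graph: source node, target node, weight.
struct Edge {
    std::string src;
    std::string trg;
    double wt;
};

// Sum edge weights per source node. std::map iterates in key order,
// so walking the result yields rows already ordered by src.
std::map<std::string, double> sumWeightBySource(const std::vector<Edge>& edges) {
    std::map<std::string, double> totals;
    for (const Edge& e : edges) {
        totals[e.src] += e.wt;  // operator[] value-initializes new entries to 0.0
    }
    return totals;
}

// Return the (source, total) pair with the largest total weight.
// Precondition: totals is non-empty.
std::pair<std::string, double> heaviestSource(
        const std::map<std::string, double>& totals) {
    assert(!totals.empty());
    auto best = totals.begin();
    for (auto it = totals.begin(); it != totals.end(); ++it) {
        if (it->second > best->second) best = it;
    }
    return *best;
}
```

A declarative query leaves the choice of loop and container to an optimizer, whereas the hand-written version fixes both; that trade-off is one reading of the slide's point about query optimization.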
11
Approach
Adopt a layered system (stack) approach:
– Data layer, core software layer, interpreter, GUI. Possibility for an administration client.
Data Layer:
– Currently implemented on top of an RDBMS. Achieves flexibility, outreach/reuse, and is often cluster-compatible.
Core Software Layer:
– C++ STL implementation with a functional programming paradigm.
Best-of-Worlds Effect:
– Combines salient features of relational modeling with functional power and adaptive-object modeling methodology.
12
Motivation #1
(define (deriv exp var)
  (cond ((constant? exp) 0)
        ((variable? exp) (if (same-variable? exp var) 1 0))
        ((sum? exp) (make-sum (deriv (addend exp) var)
                              (deriv (augend exp) var)))
        ….))
Ref: “Structure and Interpretation of Computer Programs”, Abelson et al.
13
Motivation #2
C++ STL:
// std::unary_function was deprecated in C++11 and removed in C++17.
class myFunctor : public std::unary_function<int, double> {
public:
  double operator()(int x) const { return 2.0*x; }
};
std::for_each(list.begin(), list.end(), myFunctor());
std::set_intersection(a.begin(), a.end(), b.begin(), b.end(), result.begin());
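The functor-plus-algorithm style on this slide can be sketched in a self-contained form. This is an illustration rather than the presenters' code: `Doubler`, `doubleAll`, and `intersectSorted` are hypothetical names, and `std::transform` is swapped in for `std::for_each` so that the functor's return value is actually collected.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Function object that doubles its argument, like the slide's functor.
// No base class is needed for it to work with STL algorithms.
struct Doubler {
    double operator()(int x) const { return 2.0 * x; }
};

// Apply Doubler to every element and collect the results.
std::vector<double> doubleAll(const std::vector<int>& xs) {
    std::vector<double> out;
    out.reserve(xs.size());
    std::transform(xs.begin(), xs.end(), std::back_inserter(out), Doubler());
    return out;
}

// Intersect two ranges. std::set_intersection requires both inputs
// to be sorted; std::back_inserter grows the output as needed.
std::vector<int> intersectSorted(const std::vector<int>& a,
                                 const std::vector<int>& b) {
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    std::vector<int> result;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(result));
    return result;
}
```

For example, `intersectSorted({1, 3, 5, 7}, {3, 4, 5, 6})` yields the elements common to both inputs. Writing through `std::back_inserter` avoids having to pre-size the output, which a bare `result.begin()` would require.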
14
Applications
– Concept switching.
– Theme extraction.
– Theme expansion, shrinking, morphing.
– Path finding; net flow analysis.
– Support for propagation nets, belief nets.
– Clustering, clique finding, etc.
*** Not just standard CS/statistical algorithms, but utilizing semantic information and user directives.
15
Modeling a Concept Space
Alternative definitions:
– Case 1: powerset: implies F is a monoid with respect to function composition (*).
– Case 2: finite vector space: F is R^n (choose generalized velocities).
– Case 3: random vectors: F can be modeled with functions over random variables.
Theme => point; Region => set of subsets / compact set / distribution => need to capture variances ~ N(mu, sigma)
16
Other ideas…
– EM ~ separation: w independent of d given c => matrix factorization. Golub discusses Jacobi iterations (parallelizable).
– Cons inference Markov net with Gibbs over maximal cliques.
– What is the minimum number of operators needed for mining of EAR…