COMP9321 Web Application Engineering, Semester 2, 2016. Dr. Amin Beheshti, Service Oriented Computing Group, CSE, UNSW Australia. Week 6.

1 COMP9321 Web Application Engineering, Semester 2, 2016
Dr. Amin Beheshti, Service Oriented Computing Group, CSE, UNSW Australia
Week 6 (COMP9321, 16s2, Week 6)
http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2445

2 We are Generating Vast Amounts of Data!
- Healthcare: remote patient monitoring
- Manufacturing: product sensors
- Location-based services: real-time location data
- Retail: social media
- …
- Digitalization of artefacts: books, music, videos, etc.

3 We are Generating Vast Amounts of Data!
- Airbus A380: generates 10 TB every 30 min.
- Twitter: generates approximately 12 TB of data per day. http://www.internetlivestats.com/twitter-statistics/
- Facebook: data grows by over 500 TB daily.
- New York Stock Exchange: 1 TB of data every day.
https://www.brandwatch.com/2016/03/96-amazing-social-media-statistics-and-facts-for-2016/

4 Challenge
How do we store and access this data over the web?

5 Challenge: e-commerce website
- Data operations are mainly transactions (reads and writes).
- Operations are mostly online.
- Response time should be quick, but maintaining the security and reliability of transactions is important.
- ACID properties are important.

6 Challenge: e-commerce website (continued)
http://www.techtweet.org/

7 Challenge: image-serving website
- Data operations are mainly fetching large files (reads).
- ACID requirements can be relaxed.
- Operations are mainly online.
- High bandwidth requirement.

8 Challenge: search website
- Data operations are mainly reading index files to answer queries (reads).
- ACID requirements can be relaxed.
- Index compilation is performed offline because of the large size of the source data (the entire Web).
- Response times must be as fast as possible.

9 Persistence
(Hibernate, pp.5-29)

10 Persistence
Persistence is “the continuance of an effect after its cause is removed”.
In the context of storing data in a computer system, this means that “the data survives after the process with which it was created has ended”.
In other words, for a data store to be considered persistent, “it must write to non-volatile storage”.
(Hibernate, pp.5-29)

11 Persistence
- Persistence is a fundamental concept in application development.
- In an object-oriented application, persistence allows an object to outlive the process that created it.
- The state of the object may be stored to disk, and an object with the same state re-created at some point in the future.
- Sometimes entire graphs of interconnected objects may be made persistent and later re-created in a new process.
(Hibernate, pp.5-29)
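
The idea of storing an object's state to disk and re-creating it later can be sketched with Java's built-in serialization. This is only an illustration of the concept, not the relational approach the later slides focus on; the class PersistentNote and its fields are hypothetical.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical persistent class; the name and fields are illustrative only.
class PersistentNote implements Serializable {
    private static final long serialVersionUID = 1L;
    final String title;
    final int priority;

    PersistentNote(String title, int priority) {
        this.title = title;
        this.priority = priority;
    }

    // Write the object's state to non-volatile storage.
    static void save(PersistentNote note, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(note);
        }
    }

    // Re-create an object with the same state, possibly in a later process.
    static PersistentNote load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (PersistentNote) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("note", ".ser");
        save(new PersistentNote("week 6", 1), file);
        PersistentNote restored = load(file);
        System.out.println(restored.title + " / " + restored.priority);
        Files.delete(file);
    }
}
```

The restored object is a different instance with the same state, which is exactly the distinction the identity slides come back to later.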

12 Persistence
- Not all objects are persistent:
  o some (transient objects) have a limited lifetime that is bounded by the life of the process that instantiated them.
- Almost all Java applications contain a mix of persistent and transient objects.
- This means we need a subsystem that manages our persistent objects.
(Hibernate, pp.5-29)

13 Data Persistence
(Hibernate, pp.5-29)

14 Data Persistence
When we talk about persistence in Java, we normally mean storing data in a relational database using SQL.
Relational technology:
- is a common denominator for many disparate systems and technology platforms.
- provides a way of sharing data across different applications, or across technologies that form part of the same application.
The relational data model is often the common enterprise-wide representation of business entities.
(Hibernate, pp.5-29)

15 Data Persistence
- When you work with a relational database in a Java application, the Java code issues SQL statements to the database via the JDBC API.
- The Java Database Connectivity (JDBC) API provides universal data access from the Java programming language.
- Using the JDBC API, you can access virtually any data source, from relational databases to spreadsheets and flat files.
JDBC API: https://docs.oracle.com/javase/8/docs/technotes/guides/jdbc/
(Hibernate, pp.5-29)

17 Relational Databases
(Hibernate, pp.5-29)

18 Relational Databases
- Data is stored as a collection of tuples that group attributes, e.g. (student-id, name, birthdate, courses).
- Data is visualized as tables, where the tuples are the rows and the attributes form the columns.
- Tables can be related to each other through specific columns.
- Each row in a table has at least one unique attribute.
(Hibernate, pp.5-29)

19 Structured Query Language (SQL)

20 Structured Query Language (SQL)

21 Database Concepts

22 Database Concepts

23 Accessing DB from an Application (JDBC)

24 Accessing DB from an Application

25 Java DataBase Connectivity

26 JDBC Concepts (Barish, p.310)
When developers use JDBC, they construct SQL statements that can be executed. A template-like query string such as
SELECT name FROM employee WHERE age = ?
can be combined with local data structures so that regular Java objects are mapped to the bindings in the string. For example, a java.lang.Integer object with the value 42 can be mapped to give
SELECT name FROM employee WHERE age = 42
The results of execution, if any, are combined in a set returned to the caller. For example, the query above may return a set of employee names, and we can browse this result set as necessary.
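
The binding and browsing steps described above can be sketched with the standard java.sql interfaces. The class and method names (EmployeeQueries, namesByAge) are hypothetical, and the caller is assumed to supply an open Connection to a database that has the employee table from the slide.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; only the java.sql calls are real API.
class EmployeeQueries {
    // Binds an age to the placeholder and collects the names in the result set.
    static List<String> namesByAge(Connection conn, int age) throws SQLException {
        String sql = "SELECT name FROM employee WHERE age = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setInt(1, age);                 // placeholders are numbered from 1
            try (ResultSet rs = ps.executeQuery()) {
                List<String> names = new ArrayList<>();
                while (rs.next()) {            // advance the cursor one row at a time
                    names.add(rs.getString("name"));
                }
                return names;
            }
        }
    }
}
```

A call such as EmployeeQueries.namesByAge(conn, 42) corresponds to the bound query on the slide; how conn is obtained depends on the driver and connection URL in use.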

27 JDBC Interfaces

28 Typical JDBC Scenario

29 PreparedStatement object
- A more realistic case is that the same kind of SQL statement is processed over and over (rather than a single static SQL statement).
- In a PreparedStatement, a placeholder (?) is bound to an incoming value before execution, so the statement is not recompiled each time.
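
A minimal sketch of the "over and over" case: the statement is prepared once and then re-bound for each row. The helper name insertAll is hypothetical, and the employee table is assumed to have name and age columns, as the query on the JDBC Concepts slide suggests.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

// Hypothetical helper class; only the java.sql calls are real API.
class EmployeeInserts {
    // Prepares the statement once, then rebinds and executes it per employee.
    static int insertAll(Connection conn, Map<String, Integer> agesByName) throws SQLException {
        String sql = "INSERT INTO employee (name, age) VALUES (?, ?)";
        int inserted = 0;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (Map.Entry<String, Integer> e : agesByName.entrySet()) {
                ps.setString(1, e.getKey());
                ps.setInt(2, e.getValue());
                inserted += ps.executeUpdate();   // returns the affected row count
            }
        }
        return inserted;
    }
}
```

Binding values through setString and setInt, rather than concatenating them into the SQL text, also keeps user-supplied data out of the statement itself.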

30 Transaction Management
- By default, JDBC commits each update when you call executeUpdate().
- Committing after each update can be suboptimal in terms of performance.
- It is also unsuitable if you want to manage a series of operations as a single logical operation (i.e., a transaction).
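
One way to group several updates into a single transaction is to switch off auto-commit on the Connection and commit or roll back explicitly. The sketch below assumes a hypothetical account table with id and balance columns; the class and method names are also illustrative.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper class; the account table is an assumption.
class AccountTransfer {
    // Moves an amount between two rows as one logical operation.
    static void transfer(Connection conn, int fromId, int toId, int amount) throws SQLException {
        boolean previous = conn.getAutoCommit();
        conn.setAutoCommit(false);              // stop committing after every update
        String sql = "UPDATE account SET balance = balance + ? WHERE id = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setInt(1, -amount);
            ps.setInt(2, fromId);
            ps.executeUpdate();
            ps.setInt(1, amount);
            ps.setInt(2, toId);
            ps.executeUpdate();
            conn.commit();                      // both updates take effect together
        } catch (SQLException e) {
            conn.rollback();                    // undo any partial work
            throw e;
        } finally {
            conn.setAutoCommit(previous);       // restore the caller's setting
        }
    }
}
```

If the second update fails, the rollback also discards the first one, so the two rows never disagree about the transferred amount.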

31 Data Access Objects (DAO)

32 Data Access Objects (DAO)

33 Data Access Objects (DAO)

34 Data Access Objects (DAO)
http://onewebsql.com/

35 Data Access Objects (DAO)
Example: Cars Database

36 Data Access Objects (DAO)
Example: Cars Database
DTO (Data Transfer Object)

37 Data Access Objects (DAO)
Example: Cars Database
DTO (Data Transfer Object) carries the actual data...

38 Data Access Objects (DAO)
Example: Cars Database

39 Data Access Objects (DAO)
Example: Cars Database

40 Data Access Objects (DAO)
Example: Cars Database

41 Data Access Objects (DAO)
Example: Cars Database
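
The code for the Cars example is not part of this transcript, so the following is only a rough sketch of the DAO/DTO split under assumed names: CarDTO, CarDAO, InMemoryCarDAO and the make/model fields are all hypothetical. The in-memory map stands in for the database so the example runs without a driver.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// DTO: a plain object that only carries the data of one car.
class CarDTO {
    final int id;
    final String make;
    final String model;

    CarDTO(int id, String make, String model) {
        this.id = id;
        this.make = make;
        this.model = model;
    }
}

// DAO interface: the rest of the application depends only on this.
interface CarDAO {
    Optional<CarDTO> findById(int id);
    List<CarDTO> findAll();
    void save(CarDTO car);
}

// Stand-in implementation backed by a map; a JDBC-backed class could
// implement the same interface without changing its callers.
class InMemoryCarDAO implements CarDAO {
    private final Map<Integer, CarDTO> rows = new LinkedHashMap<>();

    public Optional<CarDTO> findById(int id) { return Optional.ofNullable(rows.get(id)); }
    public List<CarDTO> findAll() { return new ArrayList<>(rows.values()); }
    public void save(CarDTO car) { rows.put(car.id, car); }
}

class CarDemo {
    public static void main(String[] args) {
        CarDAO dao = new InMemoryCarDAO();
        dao.save(new CarDTO(1, "Toyota", "Corolla"));
        System.out.println(dao.findAll().size() + " car(s) stored");
    }
}
```

Because callers hold a CarDAO reference, swapping the storage technology means writing a new implementation class rather than touching business logic.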

42 Object-Relational Impedance Mismatch Problems

43 Object-Relational Impedance Mismatch Problems

44 Object-Relational Impedance Mismatch Problems
https://docs.oracle.com/cd/E16162_01/user.1112/e17455/img/mismatch.gif

45 Object-Relational Impedance Mismatch Problems

46 Impedance (or Paradigm) Mismatch Problem

47 Impedance (or Paradigm) Mismatch Problem
The problem of granularity
(Hibernate, pp.5-29)

48 Impedance (or Paradigm) Mismatch Problem
The problem of granularity
The granularity of data refers to the size in which data fields are sub-divided. For example, a postal address can be recorded with:
1- Coarse granularity, as a single field:
   address = 200 2nd Ave. South #358, St. Petersburg, FL 33701-4313 USA
(Hibernate, pp.5-29)

49 Impedance (or Paradigm) Mismatch Problem
The problem of granularity (continued)
2- Fine granularity, as multiple fields:
   street address = 200 2nd Ave. South #358
   city = St. Petersburg
   postal code = FL 33701-4313
   country = USA
(Hibernate, pp.5-29)

50 Impedance (or Paradigm) Mismatch Problem
The problem of granularity (continued)
or even finer granularity:
   street = 2nd Ave. South
   address number = 200
   suite/apartment number = #358
   city = St. Petersburg
   state = FL
   postal-code = 33701
   postal-code-add-on = 4313
   country = USA
(Hibernate, pp.5-29)

51 Impedance (or Paradigm) Mismatch Problem
The problem of granularity
Observation:
- Classes in your OO-based model come in a range of different levels of granularity (coarse-grained entity classes like User, finer-grained classes like Address, a simple String like Postcode).
- There are just two levels of granularity in an RDB: tables, and columns with scalar types (i.e., not as flexible as the Java type system).
- Coarse-grained means a single call will do more work; fine-grained means it might take several calls to get the same work done.
- Coarse-grained is often better in distributed systems, because calls between distributed components can be expensive and time-consuming.
(Hibernate, pp.5-29)
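
To make the granularity gap concrete, here is a small sketch in which a coarse-grained User entity contains a finer-grained Address object, yet both end up as scalar columns of one row. The column names and the toRow helper are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Fine-grained value class: it has no table of its own in this sketch.
class Address {
    final String street;
    final String city;
    final String postcode;

    Address(String street, String city, String postcode) {
        this.street = street;
        this.city = city;
        this.postcode = postcode;
    }
}

// Coarse-grained entity class.
class User {
    final String username;
    final Address address;

    User(String username, Address address) {
        this.username = username;
        this.address = address;
    }

    // Flattens the nested object into the scalar columns of one hypothetical USERS row.
    Map<String, String> toRow() {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("USERNAME", username);
        row.put("ADDRESS_STREET", address.street);
        row.put("ADDRESS_CITY", address.city);
        row.put("ADDRESS_POSTCODE", address.postcode);
        return row;
    }

    public static void main(String[] args) {
        User u = new User("amin", new Address("200 2nd Ave. South", "St. Petersburg", "33701"));
        System.out.println(u.toRow().keySet());
    }
}
```

The Java model has three levels (User, Address, String), while the row has only table and column, so the mapping code has to carry the structure that the schema cannot express.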

52 Impedance (or Paradigm) Mismatch Problem
The problem of subtypes
(Hibernate, pp.5-29)

53 Impedance (or Paradigm) Mismatch Problem
The problem of identity
(Hibernate, pp.5-29)

54 Impedance (or Paradigm) Mismatch Problem
The problem of identity
While on the subject of identity:
- Modern object persistence solutions recommend using a surrogate key.
- A surrogate key in a database is a unique identifier for either an entity in the modelled world or an object in the database.
- The surrogate key is not derived from application data, unlike a natural (or business) key, which is.
(Hibernate, pp.5-29)
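
A short sketch of how a surrogate key interacts with Java's two notions of sameness: == compares object identity, while equals can be defined on the key. The Customer class, its email field, and the in-process counter that stands in for a database sequence are all assumptions.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical entity; the counter stands in for a database-generated key.
class Customer {
    private static final AtomicLong SEQUENCE = new AtomicLong();

    final long id;   // surrogate key: carries no business meaning
    String email;    // natural (business) key candidate: may change over time

    // New customer: a fresh surrogate key is generated.
    Customer(String email) {
        this(SEQUENCE.incrementAndGet(), email);
    }

    // Customer re-created from an existing row with a known key.
    Customer(long id, String email) {
        this.id = id;
        this.email = email;
    }

    // Two instances denote the same row when their surrogate keys match.
    @Override
    public boolean equals(Object o) {
        return o instanceof Customer && ((Customer) o).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }

    public static void main(String[] args) {
        Customer first = new Customer("old@example.com");
        Customer reloaded = new Customer(first.id, "new@example.com");
        System.out.println((first == reloaded) + " " + first.equals(reloaded));
    }
}
```

Because the key is independent of the email address, the row keeps its identity even after the business data changes.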

55 Impedance (or Paradigm) Mismatch Problem
The problem of association
(Hibernate, pp.5-29)

56 Impedance (or Paradigm) Mismatch Problem
The problem of association
(Hibernate, pp.5-29)

57 Impedance (or Paradigm) Mismatch Problem
The problem of object graph navigation
(Hibernate, pp.5-29)

58 Impedance (or Paradigm) Mismatch Problem
The problem of object graph navigation
Considering the following example:
(Hibernate, pp.5-29)

59 Impedance (or Paradigm) Mismatch Problem
N+1 selects problem:
- The N+1 query problem is a common performance issue.
- It arises when code first calls load_cats(), which boils down to one query, and then calls load_hats_for_cat($cat) for each cat, which issues one more query per cat.
- When the code executes, you will issue "N+1" queries, where N is the number of cats.
https://secure.phabricator.com/book/phabcontrib/article/n_plus_one/
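
The effect can be simulated without a database by counting how many "queries" each access pattern issues. Everything below is hypothetical: the in-memory map stands in for the hats table, and the method names mirror the cats and hats example in Java style.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CatHatDemo {
    static int queries = 0;  // counts simulated round trips to the database

    // Hypothetical data: hats keyed by cat id.
    static final Map<Integer, List<String>> HATS = new HashMap<>();
    static {
        HATS.put(1, Arrays.asList("fez"));
        HATS.put(2, Arrays.asList("beret", "cap"));
        HATS.put(3, Collections.<String>emptyList());
    }

    // One query that returns the ids of all cats.
    static List<Integer> loadCats() {
        queries++;
        return Arrays.asList(1, 2, 3);
    }

    // One query per cat.
    static List<String> loadHatsForCat(int catId) {
        queries++;
        return HATS.get(catId);
    }

    // One query for the hats of all given cats, e.g. using WHERE cat_id IN (...).
    static Map<Integer, List<String>> loadHatsForCats(List<Integer> catIds) {
        queries++;
        Map<Integer, List<String>> result = new HashMap<>();
        for (int id : catIds) {
            result.put(id, HATS.get(id));
        }
        return result;
    }

    // Naive pattern: 1 query for the cats plus N queries for their hats.
    static int naive() {
        queries = 0;
        for (int cat : loadCats()) {
            loadHatsForCat(cat);
        }
        return queries;
    }

    // Batched pattern: 1 query for the cats plus 1 query for all hats.
    static int batched() {
        queries = 0;
        loadHatsForCats(loadCats());
        return queries;
    }

    public static void main(String[] args) {
        System.out.println(naive() + " vs " + batched()); // prints "4 vs 2"
    }
}
```

With three cats the naive loop issues four queries and the batched version two; the gap grows linearly with the number of cats.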

60 Impedance (or Paradigm) Mismatch Problem
The cost of mismatch problems:
The DAO pattern helps isolate the mismatch problems by separating the interfaces from the implementation, but someone (usually the application developers) still has to provide the implementation classes!
(Hibernate, pp.5-29)

61 Object-Relational Mapping (ORM)

62 Object-Relational Mapping (ORM)

63 Hibernate

64 Hibernate

65 Hibernate

66 Continuing with the Cars example...

67 Continuing with the Cars example...

68 Continuing with the Cars example...

69 Continuing with the Cars example...

70 Continuing with the Cars example...

71 Continuing with the Cars example...

72 To use Hibernate, you need:
- Hibernate packages (hibernate*.jar)
- A set of mapping files (each mapping between a table and an object)
- A Hibernate configuration file (e.g., database connection details)

73 Hibernate Example
See course material, week 6

74 NoSQL

75 What is NoSQL?
- Stands for No-SQL or Not Only SQL?
- A class of non-relational data storage systems, e.g. BigTable, Dynamo, PNUTS/Sherpa, ...
- Usually do not require a fixed table schema, nor do they use the concept of joins.
- Distributed data storage systems.
- All NoSQL offerings relax one or more of the ACID properties (we will talk about the CAP theorem).
Chapter 19: Distributed Databases

76 RDBMS vs NoSQL
RDBMS                  NoSQL
Scale up               Scale out
Structured data        Semi-/unstructured data
Atomic transactions    Eventual consistency
Impedance mismatch     Relaxed..

77 CAP Theorem
Three properties of a (distributed computer) system:
- Consistency (all copies have the same value)
- Availability (the system can run even if parts have failed), via replication
- Partitions (the network can break into two or more parts, each with active systems that cannot talk to the other parts)
Brewer’s CAP “Theorem”: you can have at most two of these three properties for any system.
Very large systems will partition at some point.

78 Why NoSQL?
NoSQL data storage systems make sense for applications that need to deal with very large semi-structured data, e.g. social networking feeds.

79 Why NoSQL?
share, comment, review, crowdsource, etc.

80 Types and examples of NoSQL databases
- Column: HBase, Accumulo, Cassandra, Druid, Vertica
- Document: MongoDB, Apache CouchDB, Clusterpoint, DocumentDB
- Key-value: Dynamo, Aerospike, Couchbase, FairCom c-treeACE, FoundationDB, HyperDex, MemcacheDB, MUMPS, Oracle NoSQL Database, OrientDB, Redis, Riak, Berkeley DB
- Graph: Neo4J, AllegroGraph, InfiniteGraph, Giraph, MarkLogic, OrientDB, Virtuoso
- Multi-model: Alchemy Database, ArangoDB, CortexDB, FoundationDB, MarkLogic, OrientDB

81 Types and examples of NoSQL databases
Column (data store): a tuple (a key-value pair) consisting of three elements:
- Unique name: used to reference the column
- Value: the content of the column
- Timestamp: the system timestamp used to determine the valid content
In relational databases, a column is a part of a relational table that can be seen in each row of the table. This is not the case in distributed data stores.
Example (in JSON-like notation):
{ street: {name: "street", value: "1234 x street", timestamp: 123456789} }
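
The three-element column tuple maps naturally onto a small immutable class. The resolve method below assumes a last-write-wins rule (the version with the later timestamp is the valid one), which is one plausible reading of "timestamp used to determine the valid content"; real column stores may resolve conflicts differently.

```java
// Hypothetical model of a column in a column data store.
class Column {
    final String name;      // unique name used to reference the column
    final String value;     // content of the column
    final long timestamp;   // used to decide which version is valid

    Column(String name, String value, long timestamp) {
        this.name = name;
        this.value = value;
        this.timestamp = timestamp;
    }

    // Assumed rule: of two versions of the same column, the later timestamp wins.
    static Column resolve(Column a, Column b) {
        return a.timestamp >= b.timestamp ? a : b;
    }

    public static void main(String[] args) {
        Column older = new Column("street", "1234 x street", 123456789L);
        Column newer = new Column("street", "99 y street", 123456790L);
        System.out.println(resolve(older, newer).value);
    }
}
```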

82 Types and examples of NoSQL databases
Document-oriented database:
- Designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data.
- XML databases are a subclass of document-oriented databases that are optimized to work with XML documents.
- Document databases store all information for a given object in a single instance in the database, and every stored object can be different from every other.
Example: MongoDB, a free and open-source cross-platform document-oriented database. MongoDB avoids the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas. https://www.mongodb.com/

83 Types and examples of NoSQL databases
Key-value database:
- A key-value store, or key-value database, is a data storage paradigm designed for storing, retrieving, and managing associative arrays, a data structure more commonly known today as a dictionary or hash.
- Dictionaries contain a collection of objects, or records, which in turn have many different fields within them, each containing data.
- These records are stored and retrieved using a key that uniquely identifies the record and is used to quickly find the data within the database.
Example: Amazon's Dynamo, which was created to help address some scalability issues and is used to power parts of Amazon Web Services, such as S3 (Simple Storage Service, which provides developers and IT teams with secure, durable, highly scalable cloud storage).
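
At its core the key-value paradigm is a dictionary lookup. The sketch below is a single-process stand-in with hypothetical names; it shows only the put/get/delete-by-key interface and none of the distribution or replication a real store provides.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory key-value store; each record is a bag of named fields.
class KeyValueStore {
    private final Map<String, Map<String, String>> records = new HashMap<>();

    // Stores a copy of the record under its unique key.
    void put(String key, Map<String, String> record) {
        records.put(key, new HashMap<>(record));
    }

    // Looks a record up directly by key; no query language is involved.
    Optional<Map<String, String>> get(String key) {
        return Optional.ofNullable(records.get(key));
    }

    // Removes the record; returns true if the key was present.
    boolean delete(String key) {
        return records.remove(key) != null;
    }

    public static void main(String[] args) {
        KeyValueStore store = new KeyValueStore();
        Map<String, String> record = new HashMap<>();
        record.put("name", "Ann");
        store.put("user:1", record);
        System.out.println(store.get("user:1").isPresent());
    }
}
```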

84 Graph Database
[Figure: example graphs. Netflix users and movies (collaborative filtering); Wiki docs and words (text analysis); social network (probabilistic analysis).]

85 Graph Database
[Figure as on the previous slide.]
- …Beheshti, et al. “Large Scale Graph Processing Systems: Survey and An Experimental Evaluation”, Cluster Computing Journal, 2015
- …Beheshti, et al. “On Characterizing the Performance of Distributed Graph Computation Platforms”, TPCTC Conference, 2014. VLDB
- …, Beheshti S.M.R. et al. "DREAM: Distributed RDF Engine with Adaptive Query Planner and Minimal Communication", VLDB (2015)

86 Graph Stores
- Use a graph structure: a labeled, directed, attributed multi-graph
  o Label for each edge
  o Directed edges
  o Multiple attributes per node
  o Multiple edges between nodes
- Relational DBs can model graphs, but an edge requires a join, which is expensive.
- Example: Neo4j, neo4j.com/
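
The structure listed above can be sketched as adjacency lists in memory. This is not how Neo4j is implemented; PropertyGraph and its methods are hypothetical names, and the point is only that following an edge is a direct lookup rather than a join.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical labeled, directed, attributed multi-graph.
class PropertyGraph {
    // A directed, labeled edge with its own attributes.
    static final class Edge {
        final String from;
        final String label;
        final String to;
        final Map<String, String> attributes;

        Edge(String from, String label, String to, Map<String, String> attributes) {
            this.from = from;
            this.label = label;
            this.to = to;
            this.attributes = attributes;
        }
    }

    private final Map<String, Map<String, String>> nodes = new HashMap<>();  // node id -> attributes
    private final Map<String, List<Edge>> outgoing = new HashMap<>();        // node id -> outgoing edges

    void addNode(String id, Map<String, String> attributes) {
        nodes.put(id, attributes);
        outgoing.putIfAbsent(id, new ArrayList<>());
    }

    // A list (not a set) lets several edges connect the same pair of nodes.
    void addEdge(String from, String label, String to, Map<String, String> attributes) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>())
                .add(new Edge(from, label, to, attributes));
    }

    // Follows stored adjacency directly instead of joining tables.
    List<String> neighbours(String from, String label) {
        List<String> result = new ArrayList<>();
        for (Edge e : outgoing.getOrDefault(from, Collections.<Edge>emptyList())) {
            if (e.label.equals(label)) {
                result.add(e.to);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        PropertyGraph g = new PropertyGraph();
        g.addNode("alice", Collections.singletonMap("type", "User"));
        g.addNode("matrix", Collections.singletonMap("type", "Movie"));
        g.addEdge("alice", "RATED", "matrix", Collections.singletonMap("stars", "5"));
        System.out.println(g.neighbours("alice", "RATED"));
    }
}
```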

87 Advantages of NoSQL
- Cheap, easy to implement
- Data are replicated and can be partitioned
- Easy to distribute
- Don't require a schema
- Can scale up and down
- Quickly process large amounts of data
- Relax the data consistency requirement (CAP)
- Can handle web-scale data, whereas relational DBs cannot

88 Disadvantages of NoSQL
- New and sometimes buggy
- Data is generally duplicated, with potential for inconsistency
- No standardized schema
- No standard format for queries
- No standard language
- Difficult to impose complicated structures
- Depend on the application layer to enforce data integrity
- No guarantee of support
- Too many options: which one, or ones, to pick?

89 More?
Search NoSQL documents:
- Elasticsearch can be used to search all kinds of documents.
- Elasticsearch uses Lucene (an indexing and search library) and tries to make all its features available through the JSON and Java API.
- https://www.elastic.co/products/elasticsearch https://lucene.apache.org/
Database service:
- Dozens of new DBs! How do we choose which DB to use?
- Solution:
  o Manage multiple database technologies and weave them together at the app layer.
  o Make this service accessible through a single API. Example: https://orchestrate.io/
  o Developers will automatically have access to a flexible key-value store that works with time-ordered events, graph relationships, and geospatial data.

90 References
- (Hibernate) Hibernate in Action, Christian Bauer and Gavin King, Manning Publications
- (HibernateDOC) http://www.hibernate.org/hib_docs/reference/en/html/
- Some examples originate from Dr. David Edmond, School of Information Systems, QUT, Brisbane, and S. Sudarshan, IIT Bombay.
