Introduction to SAP HANA. Ali Safari Khatouni, November 2018
What are Enterprise Resource Planning (ERP) Systems? Incredibly large, extensive software packages used to manage a firm’s business processes. Standard software packages that must be configured to meet the needs of a company. Database programs with the following functions: input, storage/retrieval, manipulation, and output.
Typical scenario in an ERP system: http://sbbjitsolutions.com/erp-software.html
ERP players in the market: https://www.appsruntheworld.com/top-10-erp-software-vendors-and-market-forecast/
SAP clients
What is SAP HANA compared to ERP?
SAP S/4HANA. SAP S/4HANA is a full enterprise management suite and SAP's next-generation Business Suite. It is a completely new product and code line that covers finance, sourcing and procurement, asset management, manufacturing, and supply chain. Key points: instant, real-time insight for better decisions (in-memory technology, embedded analytics); reinvented processes for higher performance; SAP Fiori user experience for higher productivity (intuitive on all devices, mobile first); simplified architecture.
What are enterprise applications like? The workload of enterprise applications consists mainly of read queries: 83% of queries in online transaction processing (OLTP) and 94% in online analytical processing (OLAP). Many queries access large sets of data. https://slideplayer.org/slide/2329349/
SAP HANA DB processes https://slideplayer.org/slide/2329349/
SAP HANA DB architecture (diagram): Business Applications; Connection and Session Management; Authorization Manager; SQL, SQLScript, MDX, …; Transaction Manager; Optimizer and Plan Generator; Calculation Engine; Execution Engine; Metadata Manager; In-Memory Processing Engines (Column Engine, Row Engine, Text Engine); Persistency (Logging and Recovery, Data Storage).
SAP HANA technology: hybrid data storage (row store). The SAP HANA Row Store stores tables by row, so the attributes Att1 to Att5 of Tuple 1 are followed by those of Tuple 2, Tuple 3, and so on up to Tuple n. It fits when: the application often processes single records at once (many selects and/or updates of single records); the application typically accesses the complete record; columns contain mainly distinct values; aggregations and fast searching are not required; the table has a small number of rows (e.g. configuration tables).
SAP HANA technology: hybrid data storage (column store). The SAP HANA Column Store stores tables by column, so all values of Att1 for Tuple 1 to Tuple n are stored together, followed by all values of Att2, and so on up to Att5. It fits when: searches and calculations run on the values of a few columns; the table has a large number of columns; the table has a large number of rows and needs columnar operations (aggregate, scan, etc.); high compression rates are possible; most columns contain only a few distinct values.
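The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of the idea, not how SAP HANA is implemented; the sample records and the attribute names (id, region, amount) are made up for the example.

```python
from typing import Any

# Hypothetical sample records; the attribute names are illustrative only.
rows = [
    {"id": 1, "region": "Europe", "amount": 120.0},
    {"id": 2, "region": "USA", "amount": 80.0},
    {"id": 3, "region": "Europe", "amount": 200.0},
]


def to_columns(records: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot a list of row records into one list per attribute."""
    columns: dict[str, list[Any]] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row-oriented access: fetch one complete record (typical OLTP pattern).
record_2 = rows[1]

# Column-oriented access: aggregate a single attribute (typical OLAP pattern).
# Only the "amount" list is read; the other attributes are never touched.
total_amount = sum(columns["amount"])

print(record_2)
print(total_amount)
```

Reading a whole record is a single lookup in the row layout, while the aggregation only has to walk one contiguous list in the column layout.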
SAP HANA technology: dictionary compression. Classical Row Store vs. HANA Column Store. The example table has the columns Company [CHAR50], Region [CHAR30], and Group [CHAR5]. In the column store each column gets a dictionary that maps a small integer ID to each distinct value. Company: 0 = INTEL, 1 = Siemens, 2 = SAP, 3 = IBM. Group: 0 = A, 1 = B, 2 = C. Region: 0 = Europe, 1 = USA. The column itself is then an index vector of these IDs, stored in one memory chunk => data locality for fast scans.
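A minimal Python sketch of dictionary compression for a single column follows. It is a simplification under stated assumptions: IDs are assigned in order of first occurrence, whereas real column stores typically keep the dictionary sorted and bit-pack the index vector. The function names and the sample column are hypothetical.

```python
def dictionary_encode(values):
    """Return (dictionary, index_vector) for one column.

    dictionary: list of distinct values; the list position is the value ID.
    index_vector: one value ID per row, in row order.
    """
    dictionary = []
    value_ids = {}
    index_vector = []
    for value in values:
        if value not in value_ids:
            value_ids[value] = len(dictionary)
            dictionary.append(value)
        index_vector.append(value_ids[value])
    return dictionary, index_vector


def scan_equals(dictionary, index_vector, wanted):
    """Return the row positions whose column value equals `wanted`."""
    if wanted not in dictionary:
        return []
    wanted_id = dictionary.index(wanted)
    # The scan compares small integers instead of full strings.
    return [row for row, vid in enumerate(index_vector) if vid == wanted_id]


# Hypothetical contents of a "Region" column.
region = ["USA", "Europe", "Europe", "USA", "Europe"]
dictionary, index_vector = dictionary_encode(region)

print(dictionary)
print(index_vector)
print(scan_equals(dictionary, index_vector, "Europe"))
```

Each CHAR30 value is stored once in the dictionary, and the rows only carry integer IDs, which is where both the compression and the fast scans come from.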
SAP HANA main advantages. Database services: use in-memory database services to process high-speed transactions and analytics. Advanced analytics processing: data processing capabilities for text, predictive, spatial, graph, streaming, and time series data help answer business questions and support decisions in real time. App development: develop next-generation applications that combine analytics and transactions, and deploy them on any device. Data access: gain a complete and accurate view of your business by accessing data from any source, internal or external. Administration: simplify system administration and IT operations. Security: keep your communications, data storage, and application services secure with robust identity and access management controls.
Future evolution: Apache Kafka, Hadoop, Spark. Apache Kafka: a distributed streaming platform. Hadoop: a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. Spark: a unified analytics engine for large-scale data processing.
Practical example. Is there a database on our laptop, PC, mobile phone, etc.? Yes, many applications store their data in a database.
Available SQL database engines https://medium.com/@yangforbig/sqlite-vs-mysql-vs-postgresql-a-comparison-of-relational-database-management-systems-afd5afd6566
Let’s use our SQL knowledge. Tool: SQLite, an in-process library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine. It runs on Android, Linux, Mac OS X, and Windows. Database: the history database of Firefox. Ubuntu: ~/.mozilla/firefox/luvtp4j4.default/places.sqlite … https://www.sqlite.org/about.html
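What "in-process, serverless, zero-configuration, transactional" means in practice can be tried with Python's standard sqlite3 module. The table name and the rows below are made up for the illustration; nothing here touches a real Firefox profile.

```python
import sqlite3

# ":memory:" creates a private in-process database: no server, no setup.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (url TEXT UNIQUE, visit_count INTEGER)")

# Using the connection as a context manager commits on success ...
with conn:
    conn.execute(
        "INSERT INTO visits VALUES (?, ?)",
        ("https://www.sqlite.org/about.html", 3),
    )

# ... and rolls back if an exception escapes the block.
try:
    with conn:
        conn.execute("INSERT INTO visits VALUES (?, ?)", ("https://example.org/", 1))
        # This violates the UNIQUE constraint, so the whole transaction is undone,
        # including the example.org row inserted just above.
        conn.execute(
            "INSERT INTO visits VALUES (?, ?)",
            ("https://www.sqlite.org/about.html", 9),
        )
except sqlite3.IntegrityError:
    pass

stored = conn.execute("SELECT url, visit_count FROM visits ORDER BY url").fetchall()
print(stored)
```

Only the first row survives, which shows the transactional behaviour without any database server being installed or configured.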
places.sqlite Database. moz_places: the main table of URIs, managed by the history service. Any time a Places component wants to reference a URL, whether visited or not, it refers to this table. Each entry has an optional reference to the moz_favicons table to identify the favicon of the page. No two entries may have the same value in the url column. https://developer.mozilla.org/en-US/docs/Mozilla/Tech/Places/Database
places.sqlite Database tables: moz_places, moz_hosts, moz_historyvisits, moz_bookmarks, moz_bookmarks_roots, moz_keywords, moz_anno_attributes, moz_annos, moz_items_annos, moz_favicons. https://developer.mozilla.org/en-US/docs/Mozilla/Tech/Places/Database
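The following Python sketch shows one way to explore such a database: list its tables and ask moz_places for the most visited URLs. So that it runs anywhere, it builds a small in-memory stand-in; that stand-in schema is heavily simplified and its rows are invented. The helper names (list_tables, most_visited, demo_connection) are hypothetical.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def most_visited(conn, limit=5):
    """Return (url, visit_count) pairs from moz_places, most visited first."""
    query = (
        "SELECT url, visit_count FROM moz_places "
        "ORDER BY visit_count DESC LIMIT ?"
    )
    return conn.execute(query, (limit,)).fetchall()


def demo_connection():
    """Build a tiny in-memory stand-in so the sketch runs without Firefox.

    Assumption: the real tables have many more columns than shown here.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE moz_places ("
        "id INTEGER PRIMARY KEY, url TEXT UNIQUE, title TEXT, visit_count INTEGER)"
    )
    conn.execute(
        "CREATE TABLE moz_bookmarks (id INTEGER PRIMARY KEY, fk INTEGER, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO moz_places (url, title, visit_count) VALUES (?, ?, ?)",
        [
            ("https://www.sqlite.org/about.html", "About SQLite", 3),
            ("https://developer.mozilla.org/en-US/docs/Mozilla/Tech/Places/Database",
             "Places database", 7),
            ("https://example.org/", "Example", 1),
        ],
    )
    conn.commit()
    return conn


conn = demo_connection()
tables = list_tables(conn)
top_two = most_visited(conn, limit=2)
print(tables)
print(top_two)
```

To try the same queries on real data, replace demo_connection() with sqlite3.connect() on a copy of places.sqlite; working on a copy avoids the lock Firefox holds on the file while it is running.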
places.sqlite Database
Thanks for your attention! Questions?
A streaming platform has three key capabilities: publish and subscribe to streams of records, similar to a message queue or enterprise messaging system; store streams of records in a fault-tolerant, durable way; process streams of records as they occur. Kafka is generally used for two broad classes of applications: building real-time streaming data pipelines that reliably get data between systems or applications, and building real-time streaming applications that transform or react to the streams of data.