<PETE> Shape Programmability


<PETE> Shape Programmability As part of our research, we defined an N-dimensional array class with shape, in order to support mechanization operations such as these (slide 2). The new array class extends the support for array operations in PETE. The essence of our Array class is the notion of shape, which represents the N-dimensional array's shape: the size of each of the Array's dimensions. In our class, this is passed in as an STL vector. The Array class transforms the underlying linear storage container into an N-dimensional Array representation and facilitates unary and binary operations on the shaped Array, integrating many PETE constructs to do so. PETE is a library that facilitates loop unrolling for computations, which greatly improves speed, as it removes most temporary arrays from the computation. This slide gives you a glimpse of the non-materialization optimizations facilitated by PETE. If we have an N-dimensional Array computation A+B+C (the result assigned to another like Array), it is represented as a left-associative expression tree (or abstract syntax tree) such as this (point to slide). (Show how it is evaluated on the tree.) The nodes of the tree are just templated types; they are not actually evaluated. The actual evaluation is done by the overloaded assignment operator rather than by cascading operator overloading. The expression is represented as embedded C++ template types as shown (point to slide). The primary expression, shown in white, is of type Expression and is comprised of many sub-types. It is essentially a binary node which is composed of an operation (addition in this case), a reference type with a templated array sub-type, and another binary node which resolves to an array reference type. Thus, it represents the addition of two arrays. The second sub-expression, shown in green, represents the A+B portion of the expression tree.
The point of all this is to represent the expression as templated types which are ultimately resolved by the overloaded assignment operator. Normally, C++ resolves these operations at each step; PETE, however, forces them to be resolved only when assigned to another array, thus eliminating intermediate storage objects. Array Class: As I said before, our Array class uses these constructs and facilitates these operations on N-dimensional arrays. We tested the performance of our class; a simplified graph of our results is shown above. Essentially, the red line is the performance of hand-coded C on the above expression. The black line is the performance of C++ (done normally) on the above expression. The brown line is the performance of our Array class on the above expression, and the blue line (which almost covers the brown line) is the performance of PETE with a manual implementation (without the use of our class) on the expression. It shows that our class performs almost as well as hand-coded C, much better than normally implemented C++, and as well as a manual PETE implementation. Our experimental results show promising performance on a C++ platform that is more programmable. The benefit here is fast computation with more programmability by using object-oriented constructs. This also provides a platform for implementing Psi Calculus rules to further improve the computational speed (to be done in the future). The Psi Calculus rules rewrite the AST, often reducing operations to mere index manipulations, as well as facilitating memory and processor mapping. In summary, the system will allow N-dimensional computations, unroll loops (reducing the number of computations performed), remove temporaries (via PETE and psi-ops), and map the problem to memory hierarchies.
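The deferred-evaluation mechanism just described can be sketched as follows. This is a minimal 1-D illustration under assumed names (Expr, AddNode), not the real PETE API: operator+ only builds nodes holding references to its operands, and the overloaded assignment operator performs the whole computation in one fused loop, so no intermediate array is ever materialized for A+B:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base so the templated operator+ applies only to expression types.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

struct Array : Expr<Array> {
    std::vector<double> data;
    explicit Array(std::size_t n, double v = 0.0) : data(n, v) {}
    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }

    // The overloaded assignment operator is where evaluation finally happens:
    // one loop over the whole tree, no temporary Array for sub-expressions.
    template <typename E>
    Array& operator=(const Expr<E>& e) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = e.self()[i];
        return *this;
    }
};

// A node stores references to its operands; indexing it evaluates lazily.
template <typename L, typename R>
struct AddNode : Expr<AddNode<L, R>> {
    const L& l; const R& r;
    AddNode(const L& l_, const R& r_) : l(l_), r(r_) {}
    double operator[](std::size_t i) const { return l[i] + r[i]; }
};

// Cascading operator+ builds the tree; it performs no arithmetic.
template <typename L, typename R>
AddNode<L, R> operator+(const Expr<L>& a, const Expr<R>& b) {
    return AddNode<L, R>(a.self(), b.self());
}
```

With this in place, D = A + B + C compiles down to the equivalent of a single hand-written loop, for (i) D[i] = A[i] + B[i] + C[i], which is why the templated version can approach the hand-coded C curve on the performance graph.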

<PETE> Shape C++

<PETE> Shape

Array.h

<PETE> Shape Mechanization

Time Domain Analysis

SAR Signal Processing

N-dimensional Array Capability inteagrates with <PETE> platform for Psi-Calculus As part of our research, we defined an N-dimensional array class with shape. In order to support the mechanization operations such as these (slide 2). The new array class extends the support for array operations in PETE. The essence of our Array class is the shape notion. The shape notion represents the N-dimensional array shape; the size of each of the Array dimensions. In our class, this is passed in as an STL vector. The array class transforms the underlying linear storage container into an N-dimensional Array representation and facilitates Unary and Binary operations on the shaped Array. The Array class integrates many PETE constructs to facilitate this. PETE is a library that facilitates loop unrolling for computations… which greatly improves speed … as it removes most temporary arrays from the computation. This slide then gives you a glimpse of the non-materialization optimizations facilitated by PETE. If we have an N-dimensional Array computation A+B+C (the result assigned to another like Array) It is represented as a left associative expression tree (or abstract syntax tree) such as this (point to slide). Show how it is evaluated (tree) The nodes of the tree are just templated types … they are not actually evaluated. The actual evaluation is done by the overloaded assignment operator rather than cascading operator overloading. The expression is represented as embedded C++ template types as shown (point to slide). The primary expression, shown in white, is of type Expression, and is comprised of many sub-types. It is essentially a binary node which is compressed of an operation (addition in this case), a reference type with a templated array sub-type, and another binary node which resolves to an array reference type. Thus, it represents the addition of two arrays. The second sub-expression (represented in green, represents the A+B portion of the expression tree. 

<PETE> Shape Performance

<PETE> Shape

<PETE> Shape

Comparable with C