REALIZING C++11 LAMBDA EXPRESSIONS in open64


1 REALIZING C++11 LAMBDA EXPRESSIONS in open64
Javed Absar, Anitha Boyapati & Dibyendu Das
Open64 Workshop, 15th June 2012, Beijing

2 Contents
Introduction
Origin of Lambda
Why Lambda – function pointers, functors, lambda
Translating Lambda – in OPEN64
Efficiency considerations
Conclusion

3 Introduction
New C++11 standard
Guidelines followed
  Prefer new features through library
  Prefer changes that can evolve programming technique
Core language & C++ standard library extensions
Lambda Expression
  What Lambda? Why Lambda? How Lambda? (the rest of this presentation)

4 Origins of Lambda
Alonzo Church, λ-Calculus [Notion of function OR set]
Passing functions as arguments to higher order functions
Why? Do complicated stuff concisely
Higher Order Function, Operators and First Class Function
Lambda Expression and Anonymous functions
LISP 1958
Functional Languages - Haskell, Scheme, ML
Object Oriented Languages – C#, Java, JavaScript, PHP

5 Why Lambda?
Problem of Definite Integral
F(1) = 1 F(2) = 1/4 F(3) = 1/9 F(4) = 1/16

6 Why Lambda?
Problem of Definite Integral
C solution using function pointers
Limitation: the function is defined separately from the context in which it is used, and it relies on globals (here u, v and N).

    double integrate(double a, double b, double (*ptr2func)(double))
    {
        int i;
        double sum = 0, dt = (b - a) / N;  // N is number of segments
        for (i = 0; i < N; i++)
            sum += ptr2func(a + i * dt) * dt;
        return sum;
    }

    double func_inverse(double x)
    {
        return u / x + v;  // u and v are globals
    }

7 Why Lambda?
Problem of Definite Integral
C++ solution using function objects (functors)
Limitation: syntactic overheads and …

    class CFoo {
    public:
        double u, v;
        double operator()(double x) { return u / x + v; }
    };

    double integrate(double a, double b, CFoo f)
    {
        double sum = 0, dt = (b - a) / N;
        for (int i = 0; i < N; i++)
            sum += f(a + i * dt) * dt;
        …
        return sum;
    }

    int main()
    {
        CFoo f_inv_x;
        f_inv_x.u = …;
        f_inv_x.v = …;
        double a = …, b = …;
        double t = integrate(a, b, f_inv_x);
        …

8 How Lambda? C++11 Approach

    template<class F>
    double integrate(double a, double b, F f)
    {
        double sum = 0, dt = (b - a) / N;
        for (int i = 0; i < N; i++)
            sum += f(a + i * dt) * dt;
        return sum;
    }

    int main()
    {
        int u = 3, v = 4;  // u and v are local variables
        int a = ..., b = …;
        double t  = integrate(a, b, [u, v](double x) { return u / x + v; });
        double t2 = integrate(a, b, [](double x) { return x * x; });
    }
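A minimal, compilable sketch of the templated integrate idea follows. It makes a few assumptions that are not on the slide: the segment count is passed as a parameter n instead of the global N, u and v are doubles, and the helper names integrate_inverse and integrate_square are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Left Riemann sum of f over [a, b] using n segments.
// Assumption: n is an explicit parameter here; the slide uses a global N.
template <class F>
double integrate(double a, double b, int n, F f) {
    double sum = 0.0;
    const double dt = (b - a) / n;
    for (int i = 0; i < n; ++i)
        sum += f(a + i * dt) * dt;
    return sum;
}

// Integrates u/x + v over [1, 2]; u and v are locals captured by value.
double integrate_inverse(int n) {
    double u = 3.0, v = 4.0;
    return integrate(1.0, 2.0, n, [u, v](double x) { return u / x + v; });
}

// Integrates x*x over [0, 1]; the lambda needs no captures.
double integrate_square(int n) {
    return integrate(0.0, 1.0, n, [](double x) { return x * x; });
}
```

Because F is a template parameter, each lambda's closure type is passed directly and the call to f can be inlined; no function pointer or named functor class is needed at the call site.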

9 How Lambda? Idiomatic Uses
STL – Summation of elements

    int sum = 0;
    for_each(myvector.begin(), myvector.end(), [&sum](int i) { sum += i; });
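The summation idiom can be wrapped in a small function to make it testable. This is only a sketch; the function name sum_elements is an assumption introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sums the elements of a vector using for_each and a lambda that
// captures the local accumulator by reference.
int sum_elements(const std::vector<int>& values) {
    int sum = 0;
    std::for_each(values.begin(), values.end(), [&sum](int i) { sum += i; });
    return sum;
}
```

Capturing sum by reference matters here: for_each receives a copy of the closure, and a by-value capture would leave the outer sum untouched.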

10 How Lambda? Idiomatic Uses
STL – Searching for an element

    int ID = …;
    find_if(employeeRec.begin(), employeeRec.end(),
            [=](const employee& e) { return e.ID == ID; });

    template<class InputIterator, class Predicate>
    InputIterator find_if(InputIterator first, InputIterator last, Predicate pred)
    {
        for ( ; first != last; first++)
            if (pred(*first)) break;
        return first;
    }
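The search idiom might look like this in a self-contained form. The Employee layout, the sample data and the helper names are hypothetical; only the find_if call with a [=] lambda comes from the slide.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type; the slide only shows that it has an ID field.
struct Employee {
    int ID;
    std::string name;
};

// Returns the name of the employee with the given ID, or "" if absent.
// ID is captured by value through the [=] default capture.
std::string name_for_id(const std::vector<Employee>& records, int ID) {
    auto it = std::find_if(records.begin(), records.end(),
                           [=](const Employee& e) { return e.ID == ID; });
    return it == records.end() ? std::string() : it->name;
}

// Illustrative data set used by the checks.
std::vector<Employee> sample_records() {
    std::vector<Employee> r;
    r.push_back(Employee{7, "Ada"});
    r.push_back(Employee{9, "Grace"});
    return r;
}
```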

11 Syntax of C++11 Lambda - CAPTURE
[ ] The lambda-expression cannot access any external variables in its body.
    e.g. [ ](int i){ return i+j; } // error as j is external and not captured
[&] Any external variable is implicitly captured by reference if it is used in the lambda function.
    e.g. [&](int i){ j++; return i+j; } // changes made to j are visible upon return
[=] Any external variable is implicitly captured by value if it is used in the lambda function.
    e.g. [=](int i) mutable { j++; return i+j; } // j is locally incremented; mutable is needed to modify a by-value capture
[&,j] Any external variable is implicitly captured by reference, other than j which is captured by value.
    e.g. [&,j](int i) mutable { j++; k++; return i+j+k; } // j is locally incremented; the increment of k changes the external value
[=,&j] Any external variable is implicitly captured by value, other than j which is captured by reference.
    e.g. [=,&j](int i) mutable { j++; k++; return i+j+k; } // the increment of j changes the external value; k is locally incremented
[this] Special case. Refers to the enclosing class's this pointer
[&,&j] Error. j is already captured by reference by default
[=,this] Error. With =, this is captured by default
[i,i] Error. i repeated.
[&this] Error. Cannot take address of this
[42] Error. Expects identifier or & or = or this
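The difference between the capture modes can be observed from outside the lambda. The sketch below uses hypothetical helper names and adds an explicit `-> int` return type so that the multi-statement bodies are also accepted by a strict C++11 compiler.

```cpp
#include <cassert>

// [=]: the closure increments its own copy; the outer j is unchanged.
int outer_j_after_value_capture() {
    int j = 1;
    auto f = [=](int i) mutable -> int { j++; return i + j; };
    f(0);
    return j;
}

// [&]: the closure increments the outer j itself.
int outer_j_after_reference_capture() {
    int j = 1;
    auto f = [&](int i) -> int { j++; return i + j; };
    f(0);
    return j;
}

struct MixedResult { int j; int k; int returned; };

// [&, j]: k is captured by reference, j by value.
MixedResult mixed_capture() {
    int j = 1, k = 10;
    auto f = [&, j](int i) mutable -> int { j++; k++; return i + j + k; };
    int r = f(0);  // inside the call: copy of j becomes 2, outer k becomes 11
    return MixedResult{j, k, r};
}
```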

12 Translating Lambda Expression in OPEN64

    class T {
    public:
        int i;
        void set(int v) { this->i = v; }
        int get() { return [this]() { return this->i; }(); }
    };

    int main()
    {
        T t;
        t.set(5);
        assert(t.get() == 5);
        return 1;
    }

The trailing ( ) in get( ) invokes the lambda expression immediately after it is created.

13 Translating Lambda Expression

    class T {
    public:
        int i;
        …. get() {
            struct Lambda1 {
                T* enclosureThisPointer;  // closure
                int operator()() { return enclosureThisPointer->i; }
            } lam;
            lam.enclosureThisPointer = this;  // copy this of T to closure
            return lam();  // call lambda
        }  // get()
    };

In operator( )( ), the first pair of brackets belongs to the name of the overloaded call operator and the second pair is the empty parameter list.
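To check that the lambda and its hand-lowered functor behave the same, both can be placed side by side in one class. This is an illustrative sketch: the member names get_with_lambda and get_with_functor and the two free helper functions are assumptions, not names used by the compiler.

```cpp
#include <cassert>

class T {
public:
    int i;
    void set(int v) { i = v; }

    // Lambda form: capture this, then invoke the closure immediately.
    int get_with_lambda() { return [this]() { return this->i; }(); }

    // Hand-lowered form: a local struct holding the enclosing this pointer.
    int get_with_functor() {
        struct Lambda1 {
            T* enclosureThisPointer;  // the only closure member
            int operator()() const { return enclosureThisPointer->i; }
        } lam;
        lam.enclosureThisPointer = this;  // prepare the closure
        return lam();                     // call the closure
    }
};

// Helpers that set a value and read it back through each form.
int read_back_with_lambda(int v) { T t; t.set(v); return t.get_with_lambda(); }
int read_back_with_functor(int v) { T t; t.set(v); return t.get_with_functor(); }
```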

14 Translating Lambda EXPRESSION

    int _ZN1T3getEv(struct T *const this) {
        struct Lambda1 lam;  // create closure
        lam.enclosureThisPointer = this;  // prepare closure
        return _ZZN1T3getEvEN7Lambda1clEv(&lam);  // call
    }

    int _ZZN1T3getEvEN7Lambda1clEv(Lambda1 *const closure) {
        return closure->enclosureThisPointer->i;
    }

    int main() {
        T t;
        _ZN1T3setEi(&t, 5);
        assert(_ZN1T3getEv(&t) == 5);
        return 1;
    }

This is from gcc 4.7

15 TranslatED Lambda Expression - WHIRL

    FUNC_ENTRY <1,53,_ZN1T3getEv>
    IDNAME 0 <2,4,this>
    BODY
    BLOCK {line: 1/8}
    PRAGMA <null-st> 0 (0x0) # PREAMBLE_END
    U8U8LDID 0 <2,4,this> T<56,anon_ptr.,8,C>
    U8STID 0 <2,5,lam> T<69,Lambda1,8> <field_id:1>
    BLOCK {line: 0/0}
    U8LDA 0 <2,5,lam> T<71,anon_ptr.,8>
    U8PARM 2 T<71,anon_ptr.,8> # by_value
    I4CALL 126 <1,57,_ZZN1T3getEvEN7Lambda1clEv>
    END_BLOCK
    I4I4LDID -1 <1,49,.preg_return_val> T<4,.predef_I4,4>
    I4COMMA
    I4RETURN_VAL
    END_BLOCK

16 Implementation
[Diagram: Lambda Expression -> FE -> Abstract Syntax Tree (AST) -> LIBSPIN -> Spin output -> WHIRL-Convert -> Whirl output]
SPIN: Open64 converts the GCC Abstract Syntax Tree (AST) to a form that is readily consumable by the Whirl transformer, using the SPIN library in the process. A few implementation changes to the SPIN library are therefore required during the porting process.
WHIRL: Interestingly, because an LE gets transformed to a structure, no further changes to Whirl are required. Whirl already handles structures, and the remaining compilation continues as usual.

17 Efficiency considerations
The next 3 slides present approaches to optimize the closure for size.
When all variables of an LE are captured by reference and the enclosing scope is a class/struct, pass ‘this’ instead.
The size of the closure is reduced to 8 bytes (from 32 bytes, assuming a pointer is 8 bytes long) and the number of assignments is reduced to 1 (from 4).
However, when the variables are captured by value, this transformation becomes invalid due to side-effects.

Example 2
LE before optimization

    struct foo {
        int i, j, k, l, m;
        int get() { [&i,&j,&k,&l]() mutable { i = …; return i+j+k; }(); }
        ….

LE after optimization

    int i, j, k, l, m;
    [this]() mutable { this->i = …; return this->i + this->j + this->k; };
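The size argument can be checked with hand-written closure layouts. The sketch below is an assumption-laden model, not compiler output: ClosureByReference and ClosureByThis are hypothetical structs that use pointers to stand in for the per-variable references and for the captured this, and the value 1 assigned to i is arbitrary.

```cpp
#include <cassert>
#include <cstddef>

struct foo {
    int i, j, k, l, m;

    // Model of the closure for [&i, &j, &k, &l]: one pointer per capture.
    struct ClosureByReference { int* i; int* j; int* k; int* l; };
    // Model of the closure for [this]: a single pointer.
    struct ClosureByThis { foo* self; };

    int get_by_reference() {
        ClosureByReference c = { &i, &j, &k, &l };  // four initializations
        *c.i = 1;
        return *c.i + *c.j + *c.k;
    }

    int get_by_this() {
        ClosureByThis c = { this };  // one initialization
        c.self->i = 1;
        return c.self->i + c.self->j + c.self->k;
    }
};

// Helpers that run each variant on the same starting values.
int run_by_reference() { foo f = {0, 2, 3, 4, 5}; return f.get_by_reference(); }
int run_by_this() { foo f = {0, 2, 3, 4, 5}; return f.get_by_this(); }
```

With 8-byte pointers the two layouts occupy 32 and 8 bytes respectively, which matches the figures quoted on the slide, and both variants compute the same result because every access still reaches the members of the enclosing object.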

18 Efficiency considerations
For all other cases, a conservative approach is taken in which only those captured variables that are used in the lambda function body get declared as closure members. This can be applied to all cases.
The size of the closure is reduced from 24 bytes to 16 bytes. Also note that the assignment to ‘lam.__l’ (were it created) is saved.

Example 4
LE before optimization

    struct foo {
        int i, j, k, l, m;
        int get() { [&i,j,k,&l]() mutable { i = …; return i+j+k; }(); }
        ….

LE after optimization, represented in GCC's intermediate form

    struct __lambda0 {
        int & __i;  // data member corresponding to capture ‘i’
        int __j;
        int __k;
        int operator()() { __i = …; return __i + __j + __k; }
    };
    __lambda0 lam;
    lam.__i = &i;  // initialize member to capture
    lam.__j = j;
    lam.__k = k;
    lam();  // operator call
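A rough model of this pruning in ordinary C++ is shown below. The struct names, the constructors and the value 10 written to i are assumptions made for illustration; the real transformation happens inside the compiler on its intermediate form.

```cpp
#include <cassert>
#include <cstddef>

// Model of the unoptimized closure for [&i, j, k, &l]: every capture is a member.
struct ClosureAllCaptures {
    int& i; int j; int k; int& l;
    ClosureAllCaptures(int& i_, int j_, int k_, int& l_)
        : i(i_), j(j_), k(k_), l(l_) {}
    int operator()() { i = 10; return i + j + k; }  // l is never used
};

// Model of the optimized closure: the unused capture l has no member.
struct ClosureUsedOnly {
    int& i; int j; int k;
    ClosureUsedOnly(int& i_, int j_, int k_) : i(i_), j(j_), k(k_) {}
    int operator()() { i = 10; return i + j + k; }
};

struct Outcome { int result; int outer_i; };

// Calls the optimized closure and reports the result and the outer i.
Outcome run_used_only() {
    int i = 1, j = 2, k = 3;
    ClosureUsedOnly lam(i, j, k);
    int r = lam();  // writes through the reference member to the outer i
    return Outcome{r, i};
}
```

Dropping the member for l changes neither the result nor the write-through to i, since the body never touches l.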

19 Efficiency considerations
When all variables of an LE are captured by copy, it can be transformed to use default-capture, so that only the variables actually used in the body are captured.
The size of the closure is reduced to 12 bytes (from 16 bytes, assuming int is 4 bytes long) and the number of assignment operations is reduced to 3 (from 4).

Example 3
LE before optimization

    void foo() {
        int i, j, k, l, m;
        m = [i,j,k,l]() mutable { i = …; return i+j+k; }();
        ….

LE after optimization

    m = [=]() mutable { i = …; return i+j+k; }();
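The effect of switching from an explicit capture list to the default capture can be observed with sizeof on the closure objects. This is a sketch under assumptions: the function names are hypothetical, the value 10 is arbitrary, and the standard does not fix closure sizes, so the exact numbers depend on the compiler (some compilers may also warn about the unused capture l).

```cpp
#include <cassert>
#include <cstddef>

// Explicit capture list: l gets a closure member even though it is unused.
std::size_t explicit_capture_size() {
    int i = 1, j = 2, k = 3, l = 4;
    auto lam = [i, j, k, l]() mutable -> int { i = 10; return i + j + k; };
    return sizeof(lam);
}

// Default capture: only i, j and k are used, so only they are captured.
std::size_t default_capture_size() {
    int i = 1, j = 2, k = 3, l = 4;
    (void)l;  // l is intentionally unused
    auto lam = [=]() mutable -> int { i = 10; return i + j + k; };
    return sizeof(lam);
}
```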

20 Conclusion
Lambda expression provides a concise and powerful form of expression
Idiomatic use – STL, small function
Implementation via function object
Implementation can be optimized further

21 Trademark Attribution AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. Other names used in this presentation are for identification purposes only and may be trademarks of their respective owners. ©2011 Advanced Micro Devices, Inc. All rights reserved.

22 Syntax of C++11 Lambda
primary-expression: literal | this | ( expression ) | id-expression | lambda-expression
lambda-expression: lambda-introducer lambda-declarator_opt compound-statement
lambda-introducer: [ lambda-capture_opt ]
lambda-capture: capture-default | capture-list | capture-default , capture-list
capture-default: & | =
capture-list: capture | capture-list , capture
capture: identifier | & identifier | this
lambda-declarator: ( parameter-declaration-clause ) attribute-specifier_opt mutable_opt exception-specification_opt trailing-return-type_opt
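A few lambdas that exercise different productions of this grammar are sketched below. The function name grammar_examples and the specific values are hypothetical and chosen only so that the result is easy to check by hand.

```cpp
#include <cassert>

int grammar_examples() {
    // lambda-introducer + compound-statement; the lambda-declarator is omitted.
    auto a = [] { return 1; };

    // lambda-declarator with a trailing-return-type.
    auto b = [](int x) -> long { return x * 2L; };

    // capture-list with one identifier, plus mutable.
    int n = 0;
    auto c = [n]() mutable { return ++n; };

    // capture-default followed by a capture-list: total by reference, n by value.
    int total = 0;
    auto d = [&, n](int x) { total += x + n; };
    d(5);

    // a() is 1, b(2) is 4, c() is 1, total is 5.
    return a() + static_cast<int>(b(2)) + c() + total;
}
```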

23 Problem statement (conTD)
For example, the following table compares the size and operations for efficient and inefficient implementations.

    std::vector<int> myvector(10);
    int myarray[1000];
    auto lam = [myvector, myarray]() {
        if (myvector.size() == 0) return 0;
        else …
        // no use of myarray
    };  // end lambda

Example 1: ‘myarray’ is not used in the compound statement of lambda

                                   Inefficient Implementation   Efficient Implementation
Closure size                       4040 bytes (minimum)         40 bytes
Number of assignment operations    1010                         10
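The cost of carrying an unused array in the closure can be measured directly. In this sketch the function names are hypothetical and the else branch, elided on the slide, is replaced by a placeholder return of 1. Exact closure sizes depend on the implementation, since sizeof(std::vector<int>) varies between standard libraries; the checks therefore only rely on the array adding at least 1000 * sizeof(int) bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Closure that captures both the vector and the (unused) array by value.
std::size_t closure_size_with_array() {
    std::vector<int> myvector(10);
    int myarray[1000] = {};
    auto lam = [myvector, myarray]() { return myvector.size() == 0 ? 0 : 1; };
    return sizeof(lam);
}

// Closure that captures only the vector, which is all the body needs.
std::size_t closure_size_without_array() {
    std::vector<int> myvector(10);
    auto lam = [myvector]() { return myvector.size() == 0 ? 0 : 1; };
    return sizeof(lam);
}
```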

24 PRIOR work
Timeline for the last three iterations of C++ language standardization, shown in “[year] standard” format: [1998] C++98, [2003] C++03, [2011] C++11 (C++0x)
Biggest changes in C++11: multithreading and usability enhancements. Lambda expression is a part of the usability enhancements.
Current version of Open64 (4.5.1) does not support lambda-expressions yet. It uses GCC 4.2 as front-end.
GCC 4.5 [4] has experimental support for lambda expressions [3], which is complete as per N2927 [2]. State of the implementation is as follows:
  Complete support for parsing and building the closure type of a lambda expression
  Return type deduction
  Support for a synthesized move constructor that will be used only by lambda closures
  Support for capture of all types including arrays
However, gcc creates non-static data members for all explicitly captured variables, leading to large closures.

25 Translating Lambda Expression in OPEN64
We now present a strategy to support lambda expressions, followed by approaches to optimize it. The following table shows how the compiler transforms each lambda expression.
The closure type is declared in the smallest block/class/namespace scope that contains the LE. The closure object is made callable by overloading the function call operator.

Original code

    sum = …
    for_each(vec.begin(), vec.end(), [&sum](int i) { sum += i; });

Internal transformation by compiler

    sum = …
    class __lambda {
    public:
        int &__sum;  // reference member, because sum is captured by reference
        __lambda(int &s) : __sum(s) {}
        // function call overloading
        inline void operator()(int i) const { __sum += i; }
    };
    __lambda mylam(sum);
    for_each(vec.begin(), vec.end(), mylam);
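The equivalence between the lambda and a hand-written closure class can be verified with a small test. The class name SumClosure and the two wrapper functions are hypothetical stand-ins for the compiler-generated names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hand-written stand-in for the closure type of [&sum](int i) { sum += i; }.
class SumClosure {
public:
    explicit SumClosure(int& sum) : sum_(sum) {}
    // const is fine: the call modifies the referenced int, not the member.
    void operator()(int i) const { sum_ += i; }
private:
    int& sum_;  // reference member, since sum is captured by reference
};

int sum_with_lambda(const std::vector<int>& vec) {
    int sum = 0;
    std::for_each(vec.begin(), vec.end(), [&sum](int i) { sum += i; });
    return sum;
}

int sum_with_closure_class(const std::vector<int>& vec) {
    int sum = 0;
    SumClosure mylam(sum);
    std::for_each(vec.begin(), vec.end(), mylam);
    return sum;
}
```

for_each takes its function object by value, but the copy still refers to the same sum, so both variants accumulate into the caller's variable.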

26 Implementation
The existing implementation in GCC is taken as reference and optimizations are built on it. Each stage of compilation is described very briefly.
Front-End:
  The lambda-introducer token ‘[’ triggers the lambda parsing.
  All explicit captures are added to the capture list and their capture type (reference/copy) is recorded.
  For a default capture, id-expressions used in the compound statement are verified for the reaching scope and are then added to the capture list.
  An unnamed (internal name generated), non-aggregate class is declared within the smallest scope containing the LE.
  For each variable in the capture list, a corresponding non-static data member is declared and initialized with the value of the capture.
  The LE’s compound statement yields the function body.
  Optimizations to reduce the size of the closure (described in earlier slides) are applied by traversing the capture list.
  The class is now lowered and is represented as a structure in GCC's intermediate form.
  The intermediate form (AST) is dumped to a spin file.

27 Efficiency considerations
Closure size: optimize the size of the closure when the input capture set is too large. E.g.,

    std::vector<int> myvector(10);
    int myarray[1000];
    auto lam = [myvector, myarray]() {
        if (myvector.size() == 0) return 0;
        else …
        // no use of myarray
    };  // end lambda

In the above example, ‘myarray’ is not used in the compound statement of the lambda.
An inefficient implementation leads to a closure size of 4040 bytes and 1010 additional assignment operations required to initialize the captured set.


