C# Language Design
Peter Hallam
Software Design Engineer, C# Compiler
Microsoft Corporation

Overview
- Introduction to C#
- Design Problems
- Future Directions
- Questions

Hello World

    using System;

    class Hello {
        static void Main() {
            Console.WriteLine("Hello, world!");
        }
    }

C# Program Structure
- Namespaces
  - Contain types and other namespaces
- Type declarations
  - Classes, structs, interfaces, enums, and delegates
- Members
  - Constants, fields, methods, operators, constructors, destructors
  - Properties, indexers, events
- Organization
  - No header files; code is written “in-line”

Program Structure Example

    namespace System.Collections
    {
        using System;

        public class Stack: Collection
        {
            Entry top;

            public void Push(object data) {
                top = new Entry(top, data);
            }

            public object Pop() {
                if (top == null) throw new InvalidOperationException();
                object result = top.data;
                top = top.next;
                return result;
            }
        }
    }

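The slide's Stack depends on an Entry type and a Collection base class that are not shown. The following is a minimal self-contained sketch, assuming Entry is a plain linked-list node; the names Entry (as defined here), SimpleStack, and StackDemo are illustrative and not taken from the original library.

```csharp
using System;

// Hypothetical linked-list node; the slide uses Entry without showing it.
class Entry
{
    public readonly Entry next;
    public readonly object data;

    public Entry(Entry next, object data)
    {
        this.next = next;
        this.data = data;
    }
}

// Same Push/Pop logic as the slide, without the Collection base class.
class SimpleStack
{
    Entry top;

    public void Push(object data)
    {
        top = new Entry(top, data);
    }

    public object Pop()
    {
        if (top == null) throw new InvalidOperationException();
        object result = top.data;
        top = top.next;
        return result;
    }
}

class StackDemo
{
    static void Main()
    {
        SimpleStack s = new SimpleStack();
        s.Push(1);        // the int is boxed when stored as object
        s.Push("two");
        Console.WriteLine(s.Pop());   // last pushed, first popped
        Console.WriteLine(s.Pop());
    }
}
```

Popping from an empty SimpleStack throws InvalidOperationException, as in the slide.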
Predefined Types
- C# predefined types
    The “root”        object
    Logical           bool
    Signed            sbyte, short, int, long
    Unsigned          byte, ushort, uint, ulong
    Floating-point    float, double, decimal
    Textual           char, string
- Textual types use Unicode (16-bit characters)

C# Classes
- Single inheritance
- Can implement multiple interfaces
- Members
  - Constants, fields, methods, operators, constructors, destructors
  - Properties, indexers, events
  - Nested types
  - Static and instance members
- Member access
  - public, protected, internal, private

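As an illustration of several member kinds listed above (constant, field, property, indexer, event), here is a small sketch. The Temperatures class and all of its member names are invented for this example.

```csharp
using System;

class Temperatures
{
    // Constant and private field
    public const int Capacity = 3;
    private readonly double[] readings = new double[Capacity];

    // Event raised whenever a reading is written
    public event EventHandler Changed;

    // Read-only property computed from the field
    public double Max
    {
        get
        {
            double max = readings[0];
            foreach (double r in readings)
            {
                if (r > max) max = r;
            }
            return max;
        }
    }

    // Indexer: lets callers write t[0] = 18.5
    public double this[int index]
    {
        get { return readings[index]; }
        set
        {
            readings[index] = value;
            if (Changed != null) Changed(this, EventArgs.Empty);
        }
    }
}

class MembersDemo
{
    static void Main()
    {
        Temperatures t = new Temperatures();
        int notifications = 0;
        t.Changed += delegate { notifications++; };

        t[0] = 18.5;
        t[1] = 21.0;

        Console.WriteLine(t.Max);
        Console.WriteLine(notifications);
    }
}
```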
Interfaces
- Can contain method declarations; no code or data
- Define a “contract” that a class must support
- Classes have one base class, but can implement many interfaces

    interface IFormattable {
        string Format(string format);
    }

    class DateTime: IFormattable {
        public string Format(string format) {…}
    }

Statements and Expressions
- Very similar to C++, with some changes to increase robustness
  - No ‘->’ or ‘::’; all qualification uses ‘.’
  - Local variables must be initialized before use
  - if, while, do require a bool condition
  - goto can’t jump into blocks
  - switch statement: no fall-through
  - Expression statements must do something useful (assignment or call)

    void Foo() {
        i == 1;       // error
        if (i = 1)    // error
        ...
    }

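The rules above can be seen in a short sketch. StatementsDemo and Describe are names made up for this example; the commented-out lines are the ones the compiler rejects.

```csharp
using System;

class StatementsDemo
{
    public static string Describe(int n)
    {
        string text;            // declared but not yet definitely assigned
        switch (n)
        {
            case 0:
                text = "zero";
                break;          // required: no implicit fall-through
            case 1:
            case 2:             // empty case labels may share one section
                text = "small";
                break;
            default:
                text = "large";
                break;
        }
        return text;            // allowed: every path above assigns text
    }

    static void Main()
    {
        int i = 1;
        // if (i = 1) { }       // compile error: the condition is int, not bool
        // i == 1;              // compile error: expression does nothing useful
        if (i == 1)
        {
            Console.WriteLine(Describe(i));
        }
    }
}
```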
C# Design Goals
- Simple, extensible type system
- First-class component support
- Robust and versionable
- Preserve existing investments

Problem: How to Unify the Type System
- A single universal base type (“object”)
  - All types ultimately inherit from object
  - An object variable can hold any value
  - Any piece of data can be stored, transported, and manipulated with no extra work
- Unification enables:
  - Calling virtual functions on any value
  - Collection classes for any type

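A brief sketch of what unification enables: one collection holding values of unrelated types, and virtual calls (ToString, GetType) working on each of them. UnifyDemo and DescribeAll are illustrative names.

```csharp
using System;
using System.Collections;

class UnifyDemo
{
    public static string DescribeAll(IEnumerable values)
    {
        string result = "";
        foreach (object value in values)
        {
            // GetType and ToString are available on every value,
            // whether it started out as an int, a string, or a bool.
            result += value.GetType().Name + ":" + value.ToString() + ";";
        }
        return result;
    }

    static void Main()
    {
        ArrayList list = new ArrayList();   // a collection class for any type
        list.Add(42);
        list.Add("text");
        list.Add(true);
        Console.WriteLine(DescribeAll(list));   // Int32:42;String:text;Boolean:True;
    }
}
```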
Unifying the Type System
- Desired picture:
    [Diagram: type hierarchy rooted at object, with Stream (MemoryStream, FileStream), Hashtable, int, and double beneath it]
- How to deal with the primitive types without losing performance?
- How to create user-defined types that are as efficient as “int” or “double”?

How to Unify: A Traditional Approach (Smalltalk)
- Make everything a real object
  - Performance implications
    - All objects have a type descriptor or virtual function table
    - May require all objects to be heap-allocated to prevent dangling pointers
  - Behavior and expectation mismatch
    - “int” variables can be “null”

How to Unify: Don’t Do It (Eiffel, Java)
- Intrinsic types are not classes
  - Good performance
  - Can’t convert “int” to “Object”: the primitive types are in a separate world
  - Requires special wrapper classes (e.g., “Integer”) to “wrap” a primitive type so that it works in the “Object” world
  - Not extensible: the set of primitive types is fixed

How to Unify: C# Approach
- Types are divided into two kinds: reference types and value types
- Reference types are full-featured:
  - Always allocated on the heap
  - Arbitrary derivation
- Value types have restrictions:
  - Only inherit from object
  - Can’t be used as base classes
  - Allocated on the stack or inline inside other objects
  - Assignment copies the value, not a reference

Unification
- Value types don’t need type descriptors or vtables (efficient!)
- “object” does need a type descriptor, because it can contain any type
- Value types become reference types when they are converted to “object”
  - The value is copied to the heap and a type descriptor is attached
  - The process is called “boxing”
  - When cast back to the value type, “unboxing” occurs: the value is copied out of the heap

Boxing and Unboxing
- “Everything is an object”
  - Any type can be stored as an object

    int i = 123;
    object o = i;      // boxing
    int j = (int)o;    // unboxing

    [Diagram: i holds 123; boxing copies it into a heap object tagged “int” that o refers to; unboxing copies 123 out into j]

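Because boxing copies the value, the boxed object is independent of the original variable, and unboxing must name the exact value type that was boxed. A small sketch of both points follows; BoxingDemo and its method names are invented for illustration.

```csharp
using System;

class BoxingDemo
{
    public static int BoxedCopyIsIndependent()
    {
        int i = 123;
        object o = i;      // boxing: 123 is copied to the heap
        i = 456;           // changes the variable, not the boxed copy
        int j = (int)o;    // unboxing: the value is copied back out
        return j;          // 123
    }

    public static bool UnboxToWrongTypeFails()
    {
        object o = 123;    // a boxed int
        try
        {
            long l = (long)o;        // unboxing requires the exact type (int)
            Console.WriteLine(l);    // not reached
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(BoxedCopyIsIndependent());
        Console.WriteLine(UnboxToWrongTypeFails());
    }
}
```

To get a long from a boxed int, unbox first and then convert: `(long)(int)o`.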
User-Defined Types
- C# allows user-defined types to be either reference or value types
- Classes (reference)
  - Used for most objects
- Structs (value)
  - Objects that are like primitive data (Point, Complex, etc.)

    struct Point { int x, y; ... }
    Point sp = new Point(10, 20);

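The practical difference between the two kinds shows up on assignment. The sketch below defines the same shape once as a struct and once as a class; PointStruct, PointClass, and CopyDemo are hypothetical names chosen for the comparison.

```csharp
using System;

struct PointStruct
{
    public int X, Y;
    public PointStruct(int x, int y) { X = x; Y = y; }
}

class PointClass
{
    public int X, Y;
    public PointClass(int x, int y) { X = x; Y = y; }
}

class CopyDemo
{
    public static int StructCopy()
    {
        PointStruct a = new PointStruct(10, 20);
        PointStruct b = a;   // copies the whole value
        b.X = 99;            // changes only b
        return a.X;          // 10
    }

    public static int ClassCopy()
    {
        PointClass a = new PointClass(10, 20);
        PointClass b = a;    // copies the reference; a and b are one object
        b.X = 99;
        return a.X;          // 99
    }

    static void Main()
    {
        Console.WriteLine(StructCopy());
        Console.WriteLine(ClassCopy());
    }
}
```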
C# Type System
- Natural user model
- Extensible
- Performant

Problem: Additional Declarative Information
- How do you associate information with types and members?
  - XML persistence mapping for a type
  - External code interop information
  - Remoting information
  - Transaction context for a method
  - Visual designer information (how should a property be categorized?)
  - Security constraints (what permissions are required to call this method?)

Other Approaches
- Add a keyword or pragma
  - Requires updating the compiler for each new piece of information
- Use an external file
  - Information is clumsy to find and see
  - Requires duplication of names
  - Example: IDL files for remote procedures
- Use naming patterns
  - Create a new class or constant that describes the class or its members
  - Example: Java “BeanInfo” classes

C# Solution: Attributes
- Attach named attributes (with optional arguments) to language elements
- Uses simple bracketed syntax
- Arguments must be constants of string, number, enum, or type-name

Attributes - Examples

    public class OrderProcessor {
        [WebMethod]
        public void SubmitOrder(PurchaseOrder order) {...}
    }

    public class PurchaseOrder {
        [XmlElement("shipTo")] public Address ShipTo;
        [XmlElement("billTo")] public Address BillTo;
        [XmlElement("items")] public Item[] Items;
        [XmlAttribute("date")] public DateTime OrderDate;
    }

    public class Button {
        [Category(Categories.Layout)]
        public int Width { get {…} set {…} }

        [Obsolete("Use DoStuff2 instead")]
        public void DoStuff() {…}
    }

Attributes
- Attributes
  - Attached to types, members, parameters, and libraries
  - Present in the compiled metadata
  - Can be examined by the common language runtime, by compilers, by the .NET Framework, or by user code (using reflection)
- Extensible
- Type-safe
- Extensively used in the .NET Framework
  - XML, Web Services, security, serialization, component model, transactions, external code interop…

Creating an Attribute
- Attributes are simply classes
  - Derived from System.Attribute
  - Class functionality = attribute functionality
  - Attribute arguments are constructor arguments

    public class ObsoleteAttribute : System.Attribute {
        public ObsoleteAttribute() { … }
        public ObsoleteAttribute(string descrip) { … }
    }

Using the Attribute

    [Obsolete]
    void Foo() {…}

    [Obsolete("Use Baz instead")]
    void Bar(int i) {…}

- When a compiler sees an attribute it:
  1. Finds the constructor, passing in args
  2. Checks the types of the arguments against the constructor
  3. Saves a reference to the constructor and the values of the arguments in the metadata

Querying Attributes
- Use reflection to query attributes

    Type type = typeof(MyClass);

    // true: also include attributes inherited from base classes
    foreach (Attribute attr in type.GetCustomAttributes(true)) {
        if (attr is ObsoleteAttribute) {
            ObsoleteAttribute oa = (ObsoleteAttribute) attr;
            // Description is assumed to be a property of the attribute class
            Console.WriteLine("{0} is obsolete: {1}", type, oa.Description);
        }
    }

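Putting the three attribute slides together, the sketch below declares a custom attribute, applies it, and reads it back through reflection. ReviewedAttribute, Invoice, Draft, and ReviewerOf are hypothetical names; only System.Attribute, AttributeUsage, and GetCustomAttributes come from the framework.

```csharp
using System;

// A custom attribute is an ordinary class derived from System.Attribute.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
class ReviewedAttribute : Attribute
{
    private readonly string reviewer;

    // Attribute arguments are constructor arguments.
    public ReviewedAttribute(string reviewer) { this.reviewer = reviewer; }

    public string Reviewer { get { return reviewer; } }
}

[Reviewed("alice")]
class Invoice { }

class Draft { }   // no attribute applied

class AttributeDemo
{
    // Returns the reviewer recorded on the type, or null if there is none.
    public static string ReviewerOf(Type type)
    {
        object[] attrs = type.GetCustomAttributes(typeof(ReviewedAttribute), false);
        if (attrs.Length == 0) return null;
        return ((ReviewedAttribute)attrs[0]).Reviewer;
    }

    static void Main()
    {
        Console.WriteLine(ReviewerOf(typeof(Invoice)));
        Console.WriteLine(ReviewerOf(typeof(Draft)) == null);
    }
}
```

The attribute instance is only constructed when reflection asks for it; until then the metadata holds just the constructor reference and the argument values.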
Problem: Versioning
- Once a class library is released, can we add functionality without breaking users of the class library?
- Very important for system-level components!

Versioning Problems
- Versioning is overlooked in most languages
  - C++ and Java produce fragile base classes
  - Users are unable to express versioning intent
- Adding a virtual method can break a derived class
  - If the derived class already has a method of the same name, breakage can happen

Versioning: C# Solution
- C# allows intent to be expressed
  - Methods are not virtual by default
  - C# keywords “virtual”, “override” and “new” provide context
  - Adding a base class member never breaks a derived class
  - Adding or removing a private member never breaks another class
- C# can't guarantee versioning
  - Can enable (e.g., explicit override)
  - Can encourage (e.g., smart defaults)

Versioning Example

    class Base                  // version 1
    {
    }

    class Derived: Base         // version 1
    {
        public virtual void Foo() {
            Console.WriteLine("Derived.Foo");
        }
    }

    class Base                  // version 2
    {
        public virtual void Foo() {
            Console.WriteLine("Base.Foo");
        }
    }

    class Derived: Base         // version 2a
    {
        new public virtual void Foo() {
            Console.WriteLine("Derived.Foo");
        }
    }

    class Derived: Base         // version 2b
    {
        public override void Foo() {
            base.Foo();
            Console.WriteLine("Derived.Foo");
        }
    }

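The difference between versions 2a and 2b is visible when the method is called through a Base reference. In this sketch the two derived variants are given distinct, made-up names (HidesFoo and OverridesFoo) so that both can live in one program, and Foo returns a string instead of printing so the result can be inspected.

```csharp
using System;

class Base
{
    public virtual string Foo() { return "Base.Foo"; }
}

class HidesFoo : Base        // corresponds to version 2a
{
    // "new" declares an unrelated method that hides Base.Foo.
    public new virtual string Foo() { return "HidesFoo.Foo"; }
}

class OverridesFoo : Base    // corresponds to version 2b
{
    // "override" replaces Base.Foo in the virtual dispatch chain.
    public override string Foo() { return base.Foo() + "+OverridesFoo.Foo"; }
}

class VersioningDemo
{
    public static string CallThroughBase(Base b) { return b.Foo(); }

    static void Main()
    {
        Console.WriteLine(CallThroughBase(new HidesFoo()));      // Base.Foo
        Console.WriteLine(CallThroughBase(new OverridesFoo()));  // Base.Foo+OverridesFoo.Foo
        Console.WriteLine(new HidesFoo().Foo());                 // HidesFoo.Foo
    }
}
```

If the `new` modifier is omitted in the 2a case, the compiler still hides the base method but issues a warning, which is how an author of Derived learns that Base gained a conflicting member.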
Interface Implementation
- Private interface implementations
  - Resolve interface member conflicts

    interface I {
        void foo();
    }

    interface J {
        void foo();
    }

    class C: I, J {
        void I.foo() { /* do one thing */ }
        void J.foo() { /* do another thing */ }
    }

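A caller selects between the two implementations by the interface type it uses. The following sketch mirrors the slide, with Foo returning a string so the chosen implementation is observable; ExplicitDemo, ViaI, and ViaJ are illustrative names.

```csharp
using System;

interface I { string Foo(); }
interface J { string Foo(); }

class C : I, J
{
    // Explicit implementations carry no access modifier and are
    // reachable only through the corresponding interface type.
    string I.Foo() { return "I.Foo"; }
    string J.Foo() { return "J.Foo"; }
}

class ExplicitDemo
{
    public static string ViaI(C c) { return ((I)c).Foo(); }

    public static string ViaJ(C c)
    {
        J j = c;          // implicit conversion to the interface
        return j.Foo();
    }

    static void Main()
    {
        C c = new C();
        // c.Foo();       // compile error: C has no accessible member Foo
        Console.WriteLine(ViaI(c));
        Console.WriteLine(ViaJ(c));
    }
}
```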
foreach Statement
- Iteration of arrays

    public static void Main(string[] args) {
        foreach (string s in args) Console.WriteLine(s);
    }

- Iteration of user-defined collections

    foreach (Customer c in customers.OrderBy("name")) {
        if (c.Orders.Count != 0) {
            ...
        }
    }

Extending foreach
- IEnumerable

    interface IEnumerable {
        IEnumerator GetEnumerator();
    }

    interface IEnumerator {
        bool MoveNext();
        object Current { get; }
    }

Extending foreach
- IEnumerable

    foreach (int v in collection) {
        // use element v …
    }

  is expanded by the compiler into the equivalent of:

    IEnumerable ie = (IEnumerable) collection;
    IEnumerator e = ie.GetEnumerator();
    while (e.MoveNext()) {
        int v = (int) e.Current;
        …
    }

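A user-defined collection plugs into foreach by implementing these interfaces. The sketch below is one way to do it by hand; Countdown and its nested Enumerator are invented names. Note that the shipping System.Collections.IEnumerator also declares a Reset method, which the slide omits.

```csharp
using System;
using System.Collections;

// Enumerates start, start-1, ..., 1.
class Countdown : IEnumerable
{
    private readonly int start;

    public Countdown(int start) { this.start = start; }

    public IEnumerator GetEnumerator() { return new Enumerator(start); }

    private class Enumerator : IEnumerator
    {
        private readonly int start;
        private int current;

        public Enumerator(int start)
        {
            this.start = start;
            current = start + 1;   // positioned before the first element
        }

        public bool MoveNext()
        {
            if (current <= 1) return false;
            current--;
            return true;
        }

        // Typed as object, so each int is boxed on access.
        public object Current { get { return current; } }

        public void Reset() { current = start + 1; }
    }
}

class EnumerableDemo
{
    public static int Sum(IEnumerable values)
    {
        int sum = 0;
        foreach (int v in values) sum += v;   // unboxes each element
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(Sum(new Countdown(3)));   // 3 + 2 + 1
    }
}
```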
foreach
- Problems with IEnumerable
  - No compile-time type checking
  - Boxing when enumerating value types
- Solution: a pattern-based approach
  - The C# compiler looks for:
    - GetEnumerator() on the collection
    - bool MoveNext() on the enumerator type
    - Strongly typed Current on the enumerator type

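The pattern can be satisfied without implementing any interface. In this sketch, IntRange (a made-up type) exposes GetEnumerator, MoveNext, and an int-typed Current, and foreach binds to them directly, so no element is boxed.

```csharp
using System;

// Enumerates first..last inclusive. Implements no interfaces;
// foreach works because the members match the pattern.
class IntRange
{
    private readonly int first;
    private readonly int last;

    public IntRange(int first, int last)
    {
        this.first = first;
        this.last = last;
    }

    public Enumerator GetEnumerator() { return new Enumerator(first, last); }

    public struct Enumerator
    {
        private int current;
        private readonly int last;

        internal Enumerator(int first, int last)
        {
            current = first - 1;   // positioned before the first element
            this.last = last;
        }

        public bool MoveNext()
        {
            if (current >= last) return false;
            current++;
            return true;
        }

        // Strongly typed: foreach reads an int, with no boxing or cast.
        public int Current { get { return current; } }
    }
}

class PatternDemo
{
    public static int Sum(IntRange range)
    {
        int sum = 0;
        foreach (int v in range) sum += v;
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(Sum(new IntRange(1, 4)));   // 1 + 2 + 3 + 4
    }
}
```

Writing `foreach (string s in range)` here is rejected at compile time, which is the type checking the plain IEnumerable route cannot offer.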
foreach - Summary
- Some complexity
  - interface pattern
- Extensible
  - User collections can plug into foreach
- User model
  - Compile-time type checking
- Performance
  - Value type access without boxing

Future Directions
Generics - Prototype
- Implemented by MSR Cambridge
  - Don Syme
  - Andrew Kennedy
- Published paper at PLDI 2001

Generics - Prototype

    class Stack<T>
    {
        T[] data;
        void Push(T top) { … }
        T Pop() { … }
    }

    Stack<string> ss = new Stack<string>();
    ss.Push("Hello");

    Stack<int> si = new Stack<int>();
    si.Push(4);

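For reference, here is a runnable sketch of such a generic stack, written against generics as they later shipped in C# 2.0, which may differ in detail from the prototype described here. ArrayStack, its initial capacity of 4, and the doubling growth policy are assumptions made for the example.

```csharp
using System;

class ArrayStack<T>
{
    private T[] data = new T[4];   // initial capacity is an arbitrary choice
    private int count;

    public int Count { get { return count; } }

    public void Push(T item)
    {
        // Grow by doubling when the backing array is full.
        if (count == data.Length) Array.Resize(ref data, data.Length * 2);
        data[count++] = item;
    }

    public T Pop()
    {
        if (count == 0) throw new InvalidOperationException();
        T item = data[--count];
        data[count] = default(T);   // drop the reference held by the array
        return item;
    }
}

class GenericsDemo
{
    static void Main()
    {
        ArrayStack<string> ss = new ArrayStack<string>();
        ss.Push("Hello");

        ArrayStack<int> si = new ArrayStack<int>();
        si.Push(4);                 // stored as an int, not boxed
        // si.Push("text");         // compile error: string is not int

        Console.WriteLine(ss.Pop());
        Console.WriteLine(si.Pop());
    }
}
```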
Other Approaches – C++
- Templates are really typed macros
- Compile-time instantiations only
- Require source for new instantiations
- Type parameter bounds are inferred
- Good execution speed
- Bad code size

Other Approaches - Java
- Type erasure
- No VM modifications
- Compile-time type checking
- No instantiations on primitive types
- Execution speed: casts
- Good code size
- Type identity problems
  - Different instantiations of List share a single runtime type

C# .NET Generics Prototype
- Prototype .NET runtime is generics-aware
- All objects carry their exact runtime type
- Instantiations on reference and value types
- Type parameters bounded by base class and/or interfaces
- Runtime performs specialization

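Two of these points, exact runtime types and bounded type parameters, can be illustrated with generics as they later shipped in .NET 2.0 (the prototype itself may have differed). RuntimeTypeDemo, Max, and SameRuntimeType are names invented for this sketch.

```csharp
using System;
using System.Collections.Generic;

class RuntimeTypeDemo
{
    // T is bounded by an interface, so CompareTo can be called on it.
    public static T Max<T>(T a, T b) where T : IComparable<T>
    {
        return a.CompareTo(b) >= 0 ? a : b;
    }

    public static bool SameRuntimeType(object x, object y)
    {
        return x.GetType() == y.GetType();
    }

    static void Main()
    {
        object strings = new List<string>();
        object ints = new List<int>();

        // Each instantiation is a distinct type at run time.
        Console.WriteLine(SameRuntimeType(strings, ints));   // False
        Console.WriteLine(strings is List<string>);          // True

        Console.WriteLine(Max(3, 7));                        // 7
    }
}
```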
C# .NET Generics Prototype
- Compile-time experience
  - Separate compilation of generic types
  - Instantiations checked at compile time
  - Can instantiate on all types (int, string)
- Runtime experience
  - Dynamic type specialization
  - Execution speed: no extra casts
  - Code size: code sharing reduces bloat