The C# Programming Language

Anders Hejlsberg, Mads Torgersen, Scott Wiltamuth


The definitive reference to the C# programming language, updated for version 4.0, direct from its creator, Anders Hejlsberg.

* New to this edition: all code presented in full color.
* Contains insightful, valuable annotations from twelve leading C# programmers, available nowhere else.
* C# has become the most widely used language for Windows development.
* Anders Hejlsberg is the creator of C#, and a true legend among programmers.

C# is now firmly established as the most-used language for writing applications for Windows and the Microsoft platform. Written by the language's architect, Anders Hejlsberg, and design team members, and now updated for C# 4.0, The C# Programming Language, 4/e, is the definitive technical reference for C#. It provides the most complete specification of the language, along with descriptions, reference materials, and code samples from the C# design team. This edition also adds valuable notes, comments, and tips from twelve of the world's top C# programmers, including Bill Wagner, Chris Sells, Jesse Liberty, and Brad Abrams. It has been brought fully up to date with the new features of C# 4.0. New to this edition, all the code is presented in full color, so that it appears in the book exactly as it appears on screen. This book is a must-have for any developer using C# on a regular basis.


Mentioned in questions and answers.

The yield keyword is one of those keywords in C# that continues to mystify me, and I've never been confident that I'm using it correctly.

Of the following two pieces of code, which is the preferred and why?

Version 1: Using yield return

public static IEnumerable<Product> GetAllProducts()
{
    using (AdventureWorksEntities db = new AdventureWorksEntities())
    {
        var products = from product in db.Product
                       select product;

        foreach (Product product in products)
        {
            yield return product;
        }
    }
}

Version 2: Return the list

public static IEnumerable<Product> GetAllProducts()
{
    using (AdventureWorksEntities db = new AdventureWorksEntities())
    {
        var products = from product in db.Product
                       select product;

        return products.ToList<Product>();
    }
}

This is what Chris Sells says about these statements in The C# Programming Language:

I sometimes forget that yield return is not the same as return, in that the code after a yield return can be executed. For example, the code after the first return here can never be executed:

int F() {
    return 1;
    return 2; // Can never be executed
}

In contrast, the code after the first yield return here can be executed:

IEnumerable<int> F() {
    yield return 1;
    yield return 2; // Can be executed
}

This often bites me in an if statement:

IEnumerable<int> F() {
    if(...) { yield return 1; } // I mean this to be the only
                                // thing returned
    yield return 2; // Oops!
}

In these cases, remembering that yield return is not “final” like return is helpful.
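As a minimal sketch of the pitfall described above, the following compares an iterator that falls through after its first yield return with one that stops explicitly using yield break. The names YieldDemo, WithoutBreak, and WithBreak are hypothetical and not taken from the book.

```csharp
using System;
using System.Collections.Generic;

public static class YieldDemo
{
    // Execution resumes after the first yield return,
    // so 2 is produced even when flag is true.
    public static IEnumerable<int> WithoutBreak(bool flag)
    {
        if (flag) { yield return 1; }
        yield return 2;
    }

    // yield break ends the sequence, which makes the
    // first yield return behave like a "final" return.
    public static IEnumerable<int> WithBreak(bool flag)
    {
        if (flag)
        {
            yield return 1;
            yield break;
        }
        yield return 2;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", WithoutBreak(true))); // prints 1,2
        Console.WriteLine(string.Join(",", WithBreak(true)));    // prints 1
    }
}
```

If only one value should ever come back from a branch, ending that branch with yield break is one way to say so.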

In The C# Programming Language Krzysztof Cwalina states in an annotation:

we explicitly decided not to add support for multiple inheritance [...] the lack of multiple inheritance forced us to add the concept of interfaces, which in turn are responsible for problems with the evolution of the framework, deeper inheritance hierarchies, and many other problems.

Interfaces are a core concept in OO programming languages. I don't follow the meaning of "forced us to add the concept of interfaces".

Does Krzysztof mean that certain design decisions had to be made regarding the use of interfaces where multiple inheritance would otherwise be used? Or does he mean that interfaces were introduced to C# because of a lack of multiple inheritance? Can you provide an example?

An interface is simply a base class that has no data members and only defines public abstract methods. For example, this would be an interface in C++:

class IFrobbable {
public:
    virtual void Frob() = 0;
};

Therefore when MI is available as a language feature you can "implement" interfaces by simply deriving from them (again, C++):

class Widget : public IFrobbable, public IBrappable {
    // ...
};

Multiple inheritance in the general case gives rise to many questions and problems that don't necessarily have a single answer, or even a good one for your particular definition of "good" (dreaded diamond, anyone?). Multiple interface implementation sidesteps most of these problems exactly because the concept of "inheriting" an interface is a very constrained special case of inheriting a full-blown class.

And this is where "forced us to add the concept of interfaces" comes in: you cannot do much OO design when constrained to single inheritance only, for example there are serious issues with not being able to reuse code when code reuse is in fact one of the most common arguments for OO. You have to do something more, and the next step is adding multiple inheritance but only for classes that satisfy the constraints of an interface.

So, I interpret Krzysztof's quote as saying

Multiple inheritance in the general case is a very thorny problem that we could not tackle in a satisfactory manner given real-life constraints on the development of .NET. However, interface inheritance is both much simpler to tackle and of supreme importance in OOP, so we did put that in. But of course interfaces also come with their own set of problems, mainly regarding how the BCL is structured.
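For comparison with the C++ snippets above, here is a rough C# sketch of the same idea: a class has at most one base class but can implement several interfaces. IFrobbable and IBrappable echo the names used in the answer; the Brap method and the string return values are assumptions made only for illustration.

```csharp
using System;

public interface IFrobbable { string Frob(); }
public interface IBrappable { string Brap(); }

// One class, two interface "contracts". No state or
// implementation is inherited, so there is no diamond problem.
public class Widget : IFrobbable, IBrappable
{
    public string Frob() { return "frob"; }
    public string Brap() { return "brap"; }
}

public static class InterfaceDemo
{
    public static void Main()
    {
        var w = new Widget();
        IFrobbable f = w; // implicit reference conversion to each interface
        IBrappable b = w;
        Console.WriteLine(f.Frob() + " " + b.Brap()); // prints frob brap
    }
}
```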

Why will the code below not compile? What's special about the interface that causes the compiler to think it can't cast from Container<T> to T when T is an interface? I don't think it's a covariance issue, as I'm not downcasting, but perhaps it is. This is quite like "Why C# compiler doesn't call implicit cast operator?" but I don't think it's quite the same.

Product pIn =null;
Product pOut;
Container<Product> pContainer;

List<Product> pListIn = null;
List<Product> pListOut;
Container<List<Product>> pListContainer;

IList<Product> pIListIn = null;
IList<Product> pIListOut;
Container<IList<Product>> pIListContainer;

pContainer = pIn;
pOut = pContainer; // all good

pListContainer = pListIn; 
pListOut = pListContainer; // all good too

pIListContainer = pIListIn; // fails: can't do an implicit cast for some reason
pIListOut = pIListContainer; // and here too

class Container<T>
{
    private T value;

    private Container(T item) { value = item; }

    public static implicit operator Container<T>(T item)
    {
        return new Container<T>(item);
    }

    public static implicit operator T(Container<T> container)
    {
        return container.value;
    }
}

Cannot implicitly convert type 'Container<IList<Product>>' to 'IList<Product>'. An explicit conversion exists (are you missing a cast?)
Cannot implicitly convert type 'IList<Product>' to 'Container<IList<Product>>'. An explicit conversion exists (are you missing a cast?)

User-defined conversions aren't allowed on interfaces at all. It would potentially be ambiguous, because the type you're trying to convert from could implement the interface itself, at which point what would the cast mean? A reference conversion like a normal cast, or an invocation of the user-defined conversion?

From section 10.10.3 of the C# 4 spec:

For a given source type S and target type T, if S or T are nullable types, let S0 and T0 refer to their underlying types, otherwise S0 and T0 are equal to S and T respectively. A class or struct is permitted to declare a conversion from a source type S to a target type T only if all of the following are true:

  • S0 and T0 are different types.
  • Either S0 or T0 is the class or struct type in which the operator declaration takes place.
  • Neither S0 nor T0 is an interface-type.
  • Excluding user-defined conversions, a conversion does not exist from S to T or from T to S.

and then later:

However, it is possible to declare operators on generic types that, for particular type arguments, specify conversions that already exist as pre-defined conversions
...
In cases where a pre-defined conversion exists between two types, any user-defined conversions between those types are ignored. Specifically:

  • If a pre-defined implicit conversion (§6.1) exists from type S to type T, all user-defined conversions (implicit or explicit) from S to T are ignored.
  • If a pre-defined explicit conversion (§6.2) exists from type S to type T, any user-defined explicit conversions from S to T are ignored. Furthermore:
    • If T is an interface type, user-defined implicit conversions from S to T are ignored.
    • Otherwise, user-defined implicit conversions from S to T are still considered.

Note the first nested bullet here.

(I can thoroughly recommend getting hold of the spec by the way. It's available online in various versions and formats, but the hardcopy annotated edition is also a goldmine of little nuggets from the team and others. I should confess a certain bias here, as I'm one of the annotators - but ignoring my stuff, all the other annotations are well worth reading!)
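One possible workaround, sketched here under my own assumptions rather than taken from the answer, is to give the container ordinary members that do the wrapping and unwrapping, since those work for interface type arguments too. Wrap and Value are hypothetical names, and int stands in for Product to keep the sketch self-contained. An explicit cast is deliberately avoided: given the rules quoted above, I would expect it to be treated as a reference conversion and fail at run time.

```csharp
using System;
using System.Collections.Generic;

public class Container<T>
{
    private readonly T inner;

    private Container(T item) { inner = item; }

    public static implicit operator Container<T>(T item) { return new Container<T>(item); }
    public static implicit operator T(Container<T> container) { return container.inner; }

    // Hypothetical additions: plain members that also work when T is an interface.
    public static Container<T> Wrap(T item) { return new Container<T>(item); }
    public T Value { get { return inner; } }
}

public static class ConversionDemo
{
    public static void Main()
    {
        List<int> list = new List<int> { 1, 2, 3 };
        Container<List<int>> boxedList = list; // user-defined implicit conversion applies
        List<int> listOut = boxedList;

        IList<int> ilist = list;
        // Container<IList<int>> bad = ilist;  // compile-time error, as in the question
        Container<IList<int>> boxedIList = Container<IList<int>>.Wrap(ilist);
        IList<int> ilistOut = boxedIList.Value;

        Console.WriteLine(listOut.Count + " " + ilistOut.Count); // prints 3 3
    }
}
```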


class A {
string X;
}
// Proper
class A {
var X;
}
// Improper (gives error)

Why can't I declare a var variable as a class field, and what can be done to achieve this, or what is the alternative?

Inside a function or method I can declare a var variable, so why can't I do it in a class?

Thanks.

// method variable
var X;

is never valid - even inside a method; you need immediate initialization to infer the type:

// method variable
var X = "abc"; // now a string

As for why this isn't available for fields with a field-initializer: simply, the spec says so. Now why the spec says so is another debate... I could check the annotated spec, but my suspicion would be simply that they are more necessary for method variables, where the logic is more complex (re LINQ etc). Also, they are often used with anonymous types (that being the necessity for their existence); but anonymous types can't be exposed on a public api... so you could have the very confusing:

private var foo = new { x = 123, y = "abc"}; // valid
public var bar = new { x = 123, y = "abc"}; // invalid

So all in all I'm happy with the current logic.
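A small sketch of the rules described above, with hypothetical names (VarDemo, Describe): fields need an explicit type, while a local declared with var takes its type from its initializer at compile time.

```csharp
using System;

public class VarDemo
{
    // Fields must state their type; "var name = ..." is not allowed here.
    private string name = "abc";

    public string Describe()
    {
        var local = name;          // inferred as string at compile time
        var length = local.Length; // inferred as int
        // var unknown;            // compile-time error: nothing to infer from
        return local.GetType().Name + ":" + length.GetType().Name;
    }

    public static void Main()
    {
        Console.WriteLine(new VarDemo().Describe()); // prints String:Int32
    }
}
```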

This one's really an offshoot of this question, but I think it deserves its own answer.

According to section 15.13 of the ECMA-334 (on the using statement, below referred to as resource-acquisition):

Local variables declared in a resource-acquisition are read-only, and shall include an initializer. A compile-time error occurs if the embedded statement attempts to modify these local variables (via assignment or the ++ and -- operators) or pass them as ref or out parameters.

This seems to explain why the code below is illegal.

struct Mutable : IDisposable
{
    public int Field;
    public void SetField(int value) { Field = value; }
    public void Dispose() { }
}

using (var m = new Mutable())
{
    // This results in a compiler error.
    m.Field = 10;
}

But what about this?

using (var e = new Mutable())
{
    // This is doing exactly the same thing, but it compiles and runs just fine.
    e.SetField(10);
}

Is the above snippet undefined and/or illegal in C#? If it's legal, what is the relationship between this code and the excerpt from the spec above? If it's illegal, why does it work? Is there some subtle loophole that permits it, or is the fact that it works attributable only to mere luck (so that one shouldn't ever rely on the functionality of such seemingly harmless-looking code)?

This behavior is undefined. In The C# Programming Language, at the end of the C# 4.0 spec section 7.6.4 (Member Access), Peter Sestoft states:

The two bulleted points stating "if the field is readonly...then the result is a value" have a slightly surprising effect when the field has a struct type, and that struct type has a mutable field (not a recommended combination--see other annotations on this point).

He provides an example. I created my own example which displays more detail below.

Then, he goes on to say:

Somewhat strangely, if instead s were a local variable of struct type declared in a using statement, which also has the effect of making s immutable, then s.SetX() updates s.x as expected.

Here we see one of the authors acknowledge that this behavior is inconsistent. Per section 7.6.4, readonly fields are treated as values and do not change (copies change). Because section 8.13 tells us using statements treat resources as read-only:

the resource variable is read-only in the embedded statement,

resources in using statements should behave like readonly fields. Per the rules of 7.6.4 we should be dealing with a value not a variable. But surprisingly, the original value of the resource does change as demonstrated in this example:

    //Sections relate to C# 4.0 spec
    class Test
    {
        readonly S readonlyS = new S();

        static void Main()
        {
            Test test = new Test();
            test.readonlyS.SetX(); // valid: we are setting x on a copy of readonlyS, per the rules defined in 7.6.4
            Console.WriteLine(test.readonlyS.x); // outputs 0 because readonlyS is a value, not a variable
            //test.readonlyS.x = 0; // invalid

            using (S s = new S())
            {
                s.SetX(); // valid, changes the original value
                Console.WriteLine(s.x); // Surprisingly, outputs 2. Although s is supposed to be read-only like a readonly field, the behavior diverges.
                //s.x = 0; // invalid
            }
        }

    }

    struct S : IDisposable
    {
        public int x;

        public void SetX()
        {
            x = 2;
        }

        public void Dispose()
        {

        }
    }    

The situation is bizarre. Bottom line: avoid readonly fields of mutable struct types.

This is not yet another question about the difference between abstract classes and interfaces, so please think twice before voting to close it.

I am aware that interfaces are essential in those OOP languages which don't support multiple inheritance, such as C# and Java. But what about those with multiple inheritance? Would the concept of an interface (as a specific language feature) be redundant in a language with multiple inheritance? I guess that an OOP "contract" between classes can be established using abstract classes.

Or, to put it a bit more explicitly, are interfaces in C# and Java just a consequence of the fact that they do not support multiple inheritance?

... The lack of multiple inheritance forced us to add the concept of interfaces...

So yes, I believe interfaces are redundant given multiple inheritance. You could use pure abstract base classes in a language supporting multiple inheritance or mix-ins.

That said, I'm quite happy with single inheritance most of the time. Eric Lippert makes the point earlier in the same volume (p. 10) that the choice of single inheritance "... eliminates in one stroke many of the complicated corner cases..."

I have three classes, Base, Derived and Final. Derived derives from Base and Final derives from Derived. All three classes have a static constructor. Class Derived has a public static method called Setup. When I call Final.Setup, I expect all three static constructors to be executed, but only the one in Derived gets run.

Here is the sample source code:

    abstract class Base
    {
        static Base()
        {
            System.Console.WriteLine ("Base");
        }
    }

    abstract class Derived : Base
    {
        static Derived()
        {
            System.Console.WriteLine ("Derived");
        }

        public static void Setup()
        {
            System.Console.WriteLine ("Setup");
        }
    }

    sealed class Final : Derived
    {
        static Final()
        {
            System.Console.WriteLine ("Final");
        }
    }

This only partially makes sense to me. I understand that calling Final.Setup() is in fact just an alias for Derived.Setup(), so skipping the static constructor in Final seems fair enough. However, why isn't the static constructor of Base called?

I can fix this by calling into a no-operation static method of Base or by accessing some dummy static member of Base. But I was wondering: what is the reasoning behind this apparently strange behavior?

A static constructor is called when (according to TCPL):

  • An instance of the class type is created.
  • Any of the static members of the class type are referenced.

As an example, consider a class with the static Main method in which execution begins: if you have a static constructor, it will be called before the Main method is called.

Note that even before a static constructor is executed, any static fields are initialized to their default value and then the static field initializers are executed for those fields. Only then is the static constructor (cctor) executed.


To answer your question more directly: static constructors are not inherited, and they cannot be called directly, hence your Base cctor will not be called in your scenario, unless you give the abstract Base class a static method and call that first, i.e. as in Base.Initialize(), as you already suggested.

As for the reasoning, it is simple when thinking in C# terms (in Java this is different): static methods are not inherited, so static constructors should not be inherited either, as this could cause unwanted side effects (a cctor called when nothing references that class).
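The behavior discussed above can be reproduced with a short sketch. It mirrors the question's three classes but records events in a list instead of writing to the console; CtorLog, Initialize, and StaticCtorDemo are hypothetical names added for the illustration, and Initialize is the no-op static method suggested as a workaround.

```csharp
using System;
using System.Collections.Generic;

// Kept separate from Base so that logging does not itself touch Base.
public static class CtorLog
{
    public static readonly List<string> Events = new List<string>();
}

public abstract class Base
{
    static Base() { CtorLog.Events.Add("Base"); }

    // No-op whose only purpose is to trigger Base's static constructor.
    public static void Initialize() { }
}

public abstract class Derived : Base
{
    static Derived() { CtorLog.Events.Add("Derived"); }

    public static void Setup() { CtorLog.Events.Add("Setup"); }
}

public sealed class Final : Derived
{
    static Final() { CtorLog.Events.Add("Final"); }
}

public static class StaticCtorDemo
{
    public static List<string> Run()
    {
        Final.Setup();     // compiled as a call to Derived.Setup()
        Base.Initialize(); // first reference to a static member of Base
        return CtorLog.Events;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints Derived, Setup, Base
    }
}
```

Final's static constructor never appears in the log because nothing in the program references a member declared on Final or creates an instance of it.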

In The C# Programming Language, Bill Wagner says:

Many people confuse dynamic binding with type inference. Type inference is statically bound. The compiler determines the type at compile time. For example:

var i = 5;             //i is an int (Compiler performs type inference)
Console.WriteLine(i);  //Static binding to Console.WriteLine(int)

The compiler infers that i is an integer. All binding on the variable i uses static binding.

Now, given this information and my own made-up dynamic scenario:

        dynamic i = 5;       //Compiler punts
        Console.WriteLine(i);//This is now dynamically bound

We know type inference is statically bound. This implies that there is no way a dynamic variable can use type inference to determine a type. How does a dynamic type get resolved without using type inference?

Update
To try and clarify: at runtime we must somehow figure out what type i is, right? Because I assign a literal 5, the runtime can infer that i is an int. Isn't that type inference rather than dynamic binding?

What distinction is Bill making?

The distinction that Bill is making is that many people think that:

var x = Whatever();
x.Foo();

will work out at runtime what method Foo to call based on the type of object returned at runtime by Whatever. That's not true; that would be

dynamic x = Whatever();
x.Foo();

The var just means "work out the type at compile time and substitute it in", not "work it out at runtime".

So if I have

dynamic i = 5;
Console.WriteLine(i);

What happens?

The compiler generates code that is morally like this:

object i = (object)5;
DynamicCallSite callSite = new DynamicCallSite(typeof(Console), "WriteLine");
callSite.Invoke(i);

It is a bit more complicated than that; the call site is cached, for one thing. But this gives you the flavour of it.

The invocation method asks i for its type via GetType and then starts up a special version of the C# compiler that can understand reflection objects. It does overload resolution on the members of Console named WriteLine, and determines which overload of Console.WriteLine would have been called had i been typed as int in the first place.

It then generates an expression tree representing that call, compiles the expression tree into a delegate, caches it in the call site, and invokes the delegate.

The second time you do this, the cached call site looks in its cache and sees that the last time that i was int, a particular delegate was invoked. So the second time it skips creating the call site and doing overload resolution, and just invokes the delegate.
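The difference between var and dynamic can be made visible with overload resolution. In this minimal sketch (BindingDemo, Describe, ViaVar, and ViaDynamic are hypothetical names), the same boxed int selects a different overload depending on whether binding happens at compile time or at run time.

```csharp
using System;

public static class BindingDemo
{
    public static string Describe(object o) { return "object"; }
    public static string Describe(int i) { return "int"; }

    public static string ViaVar()
    {
        object boxed = 5;
        var v = boxed;      // v has the static type object
        return Describe(v); // bound at compile time to Describe(object)
    }

    public static string ViaDynamic()
    {
        object boxed = 5;
        dynamic d = boxed;  // binding is deferred to run time
        return Describe(d); // the run-time type is int, so Describe(int) is chosen
    }

    public static void Main()
    {
        Console.WriteLine(ViaVar() + " " + ViaDynamic()); // prints object int
    }
}
```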

For more information, see:

http://ericlippert.com/2012/10/22/a-method-group-of-one/

http://ericlippert.com/2012/11/05/dynamic-contagion-part-one/

http://ericlippert.com/2012/11/09/dynamic-contagion-part-two/

A historical perspective on the feature can be obtained from Chris and Sam's blogs:

http://blogs.msdn.com/b/cburrows/archive/tags/dynamic/

http://blogs.msdn.com/b/samng/archive/tags/dynamic/

They did a lot of the implementation; however, some of these articles reflect outdated design choices. We never did go with "The Phantom Method" algorithm, regrettably. (Not a great algorithm, but a great name!)

I am trying to get hold of a hard copy of the C# language spec and heard you can get in touch with Microsoft to do this. I have contacted them by phone and they barely knew what C# was let alone where to get a copy of the spec!

Can anyone shed a bit of light on this please??

Thanks!

You can get a copy of the ECMA spec free from ECMA (in hard copy) - or at least you used to be able to.

For the Microsoft spec, you can buy the annotated copy from Amazon, or of course your other favourite book supplier. I can thoroughly recommend this version - the annotations are really interesting. (Well, mine aren't all that hot, but the other annotators are really smart :)

There are soft copies for free, of course - I maintain a page with links to the various versions.


In The C# Programming Language (Covering C# 4.0) (4th Edition), section 1.3, Types and Variables, page 9, Jon Skeet says:

Hooray for byte being an unsigned type! The fact that in Java a byte is signed (and with no unsigned equivalent) makes a lot of bit-twiddling pointlessly error-prone. It’s quite possible that we should all be using uint a lot more than we do, mind you: I’m sure many developers reach for int by default when they want an integer type. The framework designers also fall into this category, of course: Why should String.Length be signed?

When I decompile String.Length:

    /// <summary>
    /// Gets the number of characters in the current <see cref="T:System.String"/> object.
    /// </summary>
    ///
    /// <returns>
    /// The number of characters in the current string.
    /// </returns>
    /// <filterpriority>1</filterpriority>
    [__DynamicallyInvokable]
    public int Length { [SecuritySafeCritical, __DynamicallyInvokable, MethodImpl(MethodImplOptions.InternalCall)] get; }

Also in MSDN:

The Length property returns the number of Char objects in this instance, not the number of Unicode characters. The reason is that a Unicode character might be represented by more than one Char.

It returns the number of character objects in the current string. Why does String.Length return Int32 when there already is a type called UInt32? What is the point of the signed-unsigned byte and String.Length?

Int32 is widely considered to be the "general-purpose" integer in the .NET framework. If you need a general number that is not fractional, Int32 is what you reach for first. You would only use one of the unsigned types if you had a specific reason to do so.

Using int for all Count properties creates a consistent API, and allows for the possibility of using -1 as a flag value (which some APIs in the framework do).
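As a small illustration of the points above, String.IndexOf is one framework API that returns -1 as a "not found" flag, and mixing in an unsigned type needs an explicit cast. LengthDemo and FindZ are hypothetical names used only for this sketch.

```csharp
using System;

public static class LengthDemo
{
    // IndexOf returns -1 when the character is absent,
    // which is only possible because the result is a signed int.
    public static int FindZ(string s) { return s.IndexOf('z'); }

    public static void Main()
    {
        string s = "abc";
        int length = s.Length;          // already an int, no cast needed
        int index = FindZ(s);           // -1 means "not found"
        uint asUnsigned = (uint)length; // an explicit cast is required to go unsigned
        Console.WriteLine(length + " " + index + " " + asUnsigned); // prints 3 -1 3
    }
}
```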

From http://blogs.msdn.com/b/brada/archive/2003/09/02/50285.aspx:

The general feeling among many of us is that the vast majority of programming is done with signed types. Whenever you switch to unsigned types you force a mental model switch (and an ugly cast). In the worst case you build up a whole parallel world of APIs that take unsigned types.

In .NET, all value types inherit from the class named System.ValueType. System.ValueType is a class, so it is a reference type.

My question is: how and why is it possible for a value type to derive from a reference type?

Eric Lippert says in The C# Programming Language 4th Edition:

This point is frequently confusing to novices. I am often asked, “But how is it possible that a value type derives from a reference type?” I think the confusion arises as a result of a misunderstanding of what “derives from” means. Derivation does not imply that the layout of the bits in memory of the base type is somewhere found in the layout of bits in the derived type. Rather, it simply implies that some mechanism exists whereby members of the base type may be accessed from the derived type.
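Reflection makes the relationship described in this quote easy to check. The sketch below uses hypothetical helper names (ValueTypeDemo and its methods) and only standard System.Type members.

```csharp
using System;

public static class ValueTypeDemo
{
    // int (System.Int32) lists System.ValueType as its direct base type.
    public static bool IntDerivesFromValueType()
    {
        return typeof(int).BaseType == typeof(ValueType);
    }

    // System.ValueType itself is a class, not a value type.
    public static bool ValueTypeIsClass()
    {
        return typeof(ValueType).IsClass && !typeof(ValueType).IsValueType;
    }

    public static void Main()
    {
        int n = 42;
        Console.WriteLine(IntDerivesFromValueType()); // prints True
        Console.WriteLine(ValueTypeIsClass());        // prints True
        Console.WriteLine(n.ToString());              // a member reached through the base chain, overridden by Int32
    }
}
```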

I want to know whether it is possible to use the C# language syntax on my own platform. I know that C# is an ECMA-standardized language. So how can it be implemented?

I know there are examples such as Mono and Unity3D, which implemented C#.

For example: a single common class library (my own, written in C) with C# as the programming language.

The problem is that I have never done this before, so I would like to know what I should read and where to start. Any other articles about implementing the syntax would be welcome.

I hope you got my idea.

Best Regards,

George.

If you want to write a compiler for C#, the place to start is the Dragon Book, alongside a copy of the C# 4 spec. It's an awful lot of work though, and not for the faint of heart; you generally need years of experience to write a compiler for something as complicated as C#.

I recommend starting with a smaller language, maybe a trivial language like brainfuck, or looking at existing toy compilers.

I heard from a friend of mine that, to master the C# language, one must go through the ECMA C# spec. Is this true?

The ECMA spec is sadly out of date. Reading the annotated MS spec is useful, but not essential. Really, just use C#... lots. And read books like C# in Depth.

In The C# Programming Language, Chris Sells states:

I begin to wonder about any language where the following string of characters is both valid and meaningful:

class Foo
{
    public static dynamic DoFoo()
    {
        //...
    }
}

Of course this means that the DoFoo method is a type method (as opposed to an instance method) and that the type of the return value is unknown until runtime, but it's hard not to read DoFoo as both static and dynamic at the same time and worry about an occurrence of a singularity.

I believe Chris means something like..."we don't want to produce a language that implies that a method belongs to a type and the type could be anything"...but I can't find any evidence of this nor of the impact of such a design.

I found Technological singularity, but that appears unrelated. What does Chris mean by a "singularity" in this case? Why are singularities troublesome?

I think that in this case, Chris is assuming an alternative meaning of the word static that would imply that it is the diametric opposite of dynamic.

This oxymoronic method signature might encourage the appearance of a black hole or other unexpected physical phenomenon.