CLR Via C#

Jeffrey Richter


Your essential guide to developing applications with the common language runtime (CLR) and Microsoft .NET Framework 4.0, with examples in Microsoft Visual C# 2010.


Mentioned in questions and answers.

I frequently hear/read the following advice:

Always make a copy of an event before you check it for null and fire it. This will eliminate a potential problem with threading where the event becomes null at the location right between where you check for null and where you fire the event:

// Copy the event delegate before checking/calling
EventHandler copy = TheEvent;

if (copy != null)
    copy(this, EventArgs.Empty); // Call any handlers on the copied list

Updated: I thought from reading about optimizations that this might also require the event member to be volatile, but Jon Skeet states in his answer that the CLR doesn't optimize away the copy.

But meanwhile, in order for this issue to even occur, another thread must have done something like this:

// Better delist from event - don't want our handler called from now on:
otherObject.TheEvent -= OnTheEvent;
// Good, now we can be certain that OnTheEvent will not run...

The actual sequence might be this mixture:

// Copy the event delegate before checking/calling
EventHandler copy = TheEvent;

// Better delist from event - don't want our handler called from now on:
otherObject.TheEvent -= OnTheEvent;    
// Good, now we can be certain that OnTheEvent will not run...

if (copy != null)
    copy(this, EventArgs.Empty); // Call any handlers on the copied list

The point being that OnTheEvent runs after the author has unsubscribed, and yet they just unsubscribed specifically to avoid that happening. Surely what is really needed is a custom event implementation with appropriate synchronisation in the add and remove accessors. And in addition there is the problem of possible deadlocks if a lock is held while an event is fired.
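For illustration, here is a minimal sketch of what such a custom implementation might look like (hypothetical names, one possible locking scheme); note that even this does not stop a handler from being invoked after it has unsubscribed, which is exactly the point:

private readonly object _eventLock = new object();
private EventHandler _theEvent;

public event EventHandler TheEvent
{
    add    { lock (_eventLock) { _theEvent += value; } }
    remove { lock (_eventLock) { _theEvent -= value; } }
}

protected void RaiseTheEvent()
{
    EventHandler handler;
    lock (_eventLock) { handler = _theEvent; } // copy under the lock...
    if (handler != null)
        handler(this, EventArgs.Empty);        // ...but invoke outside it, to avoid holding the lock while handlers run
}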

So is this Cargo Cult Programming? It seems that way - a lot of people must be taking this step to protect their code from multiple threads, when in reality it seems to me that events require much more care than this before they can be used as part of a multi-threaded design. Consequently, people who are not taking that additional care might as well ignore this advice - it simply isn't an issue for single-threaded programs, and in fact, given the absence of volatile in most online example code, the advice may be having no effect at all.

(And isn't it a lot simpler to just assign the empty delegate { } on the member declaration so that you never need to check for null in the first place?)

Updated: In case it wasn't clear, I did grasp the intention of the advice - to avoid a null reference exception under all circumstances. My point is that this particular null reference exception can only occur if another thread is delisting from the event, and the only reason for doing that is to ensure that no further calls will be received via that event, which clearly is NOT achieved by this technique. You'd be concealing a race condition - it would be better to reveal it! That null exception helps to detect an abuse of your component. If you want your component to be protected from abuse, you could follow the example of WPF - store the thread ID in your constructor and then throw an exception if another thread tries to interact directly with your component. Or else implement a truly thread-safe component (not an easy task).

So I contend that merely doing this copy/check idiom is cargo cult programming, adding mess and noise to your code. To actually protect against other threads requires a lot more work.

Update in response to Eric Lippert's blog posts:

So there's a major thing I'd missed about event handlers: "event handlers are required to be robust in the face of being called even after the event has been unsubscribed", and obviously therefore we only need to care about the possibility of the event delegate being null. Is that requirement on event handlers documented anywhere?

And so: "There are other ways to solve this problem; for example, initializing the handler to have an empty action that is never removed. But doing a null check is the standard pattern."

So the one remaining fragment of my question is, why is explicit-null-check the "standard pattern"? The alternative, assigning the empty delegate, requires only = delegate {} to be added to the event declaration, and this eliminates those little piles of stinky ceremony from every place where the event is raised. It would be easy to make sure that the empty delegate is cheap to instantiate. Or am I still missing something?
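For reference, a minimal sketch of that alternative (hypothetical publisher class):

public class Publisher
{
    // Initialized with an empty delegate, so TheEvent is never null and the
    // raise site needs neither a null check nor a local copy:
    public event EventHandler TheEvent = delegate { };

    protected void OnSomething()
    {
        TheEvent(this, EventArgs.Empty);
    }
}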

Surely it must be that (as Jon Skeet suggested) this is just .NET 1.x advice that hasn't died out, as it should have done in 2005?

According to Jeffrey Richter in the book CLR via C#, the correct method is:

// Copy a reference to the delegate field now into a temporary field for thread safety
EventHandler<EventArgs> temp =
    Interlocked.CompareExchange(ref NewMail, null, null);
// If any methods registered interest with our event, notify them
if (temp != null) temp(this, e);

Because it forces a reference copy. For more information, see his Event section in the book.

I'm new to .NET C# programming. I'm following a few books. They say that instead of compiling high-level code directly to binary (native) code, it is converted into an intermediate language (called MSIL, aka CIL). But when I compile, I get an exe/DLL file.

  1. Is this MSIL/CIL contained in these exe/dll files?
  2. I want to see that intermediate language code, just to get a feel for its existence. How do I view it?
  3. They are calling this exe/dll file an assembly. Are they using this "fancy word" just to differentiate these from the exe/dll files that contain binary code (native code)?
  1. Yes, the MSIL/CIL is contained in the assembly.
  2. You need .NET Reflector or ILDASM (see the example below).
  3. For more details on assemblies, check HERE.
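For example (a hedged sketch; it assumes csc.exe and ildasm.exe, which ships with the Windows SDK / Visual Studio, are on the path):

// Program.cs - any small program will do
class Program
{
    static void Main()
    {
        System.Console.WriteLine("Hello, IL!");
    }
}

Compile it with csc Program.cs, then run ildasm Program.exe to browse the IL in the GUI, or ildasm Program.exe /out=Program.il to dump the IL to a text file.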

P.S. As you are following some books, I highly recommend CLR via C#.

I am trying to do operator overloads for +=, but I can't. I can only make an operator overload for +.

How come?

Edit

The reason this is not working is that I have a Vector class (with an X and Y field). Consider the following example.

vector1 += vector2;

If my operator overload is set to:

public static Vector operator +(Vector left, Vector right)
{
    return new Vector(right.x + left.x, right.y + left.y);
}

Then the result won't be added into vector1; instead, vector1 will be replaced by a reference to a brand new Vector.

Overloadable Operators, from MSDN:

Assignment operators cannot be overloaded, but +=, for example, is evaluated using +, which can be overloaded.

Moreover, none of the assignment operators can be overloaded. I think this is because allowing it would affect garbage collection and memory management, which would be a potential security hole in the CLR's strongly typed world.

Nevertheless, let's see what exactly an operator is. According to Jeffrey Richter's famous book, each programming language has its own list of operators, which are compiled into special method calls; the CLR itself doesn't know anything about operators. So let's see what exactly stands behind the + and += operators.

See this simple code:

Decimal d = 10M;
d = d + 10M;
Console.WriteLine(d);

Let's view the IL code for these instructions:

  IL_0000:  nop
  IL_0001:  ldc.i4.s   10
  IL_0003:  newobj     instance void [mscorlib]System.Decimal::.ctor(int32)
  IL_0008:  stloc.0
  IL_0009:  ldloc.0
  IL_000a:  ldc.i4.s   10
  IL_000c:  newobj     instance void [mscorlib]System.Decimal::.ctor(int32)
  IL_0011:  call       valuetype [mscorlib]System.Decimal [mscorlib]System.Decimal::op_Addition(valuetype [mscorlib]System.Decimal,
                                                                                                valuetype [mscorlib]System.Decimal)
  IL_0016:  stloc.0

Now let's see this code:

Decimal d1 = 10M;
d1 += 10M;
Console.WriteLine(d1);

And the IL code for it:

  IL_0000:  nop
  IL_0001:  ldc.i4.s   10
  IL_0003:  newobj     instance void [mscorlib]System.Decimal::.ctor(int32)
  IL_0008:  stloc.0
  IL_0009:  ldloc.0
  IL_000a:  ldc.i4.s   10
  IL_000c:  newobj     instance void [mscorlib]System.Decimal::.ctor(int32)
  IL_0011:  call       valuetype [mscorlib]System.Decimal [mscorlib]System.Decimal::op_Addition(valuetype [mscorlib]System.Decimal,
                                                                                                valuetype [mscorlib]System.Decimal)
  IL_0016:  stloc.0

They are equal! So the += operator is just syntactic sugar in C#, and you can simply overload the + operator.

For example:

class Foo
{
    private int c1;

    public Foo(int c11)
    {
        c1 = c11;
    }

    public static Foo operator +(Foo c1, Foo x)
    {
        return new Foo(c1.c1 + x.c1);
    }
}

static void Main(string[] args)
{
    Foo d1 =  new Foo (10);
    Foo d2 = new Foo(11);
    d2 += d1;
}

This code compiles and runs successfully as:

  IL_0000:  nop
  IL_0001:  ldc.i4.s   10
  IL_0003:  newobj     instance void ConsoleApplication2.Program/Foo::.ctor(int32)
  IL_0008:  stloc.0
  IL_0009:  ldc.i4.s   11
  IL_000b:  newobj     instance void ConsoleApplication2.Program/Foo::.ctor(int32)
  IL_0010:  stloc.1
  IL_0011:  ldloc.1
  IL_0012:  ldloc.0
  IL_0013:  call       class ConsoleApplication2.Program/Foo ConsoleApplication2.Program/Foo::op_Addition(class ConsoleApplication2.Program/Foo,
                                                                                                          class ConsoleApplication2.Program/Foo)
  IL_0018:  stloc.1

Update:

According to your update - as @EricLippert says, you really should treat vectors as immutable objects. The result of adding two vectors is a new vector, not the first one with different dimensions.

If, for some reason, you need to change the first vector, you can use this overload (but to me, this is very strange behaviour):

public static Vector operator +(Vector left, Vector right)
{
    left.x += right.x;
    left.y += right.y;
    return left;
}

Ignoring unsafe code, .NET cannot have memory leaks. I've read this endlessly from many experts and I believe it. However, I do not understand why this is so.

It is my understanding that the framework itself is written in C++ and C++ is susceptible to memory leaks.

  • Is the underlying framework so well-written, that it absolutely does not have any possibility of internal memory leaks?
  • Is there something within the framework's code that self-manages and even cures its own would-be memory leaks?
  • Is the answer something else that I haven't considered?

.NET can have memory leaks but it does a lot to help you avoid them. All reference type objects are allocated from a managed heap which tracks what objects are currently being used (value types are usually allocated on the stack). Whenever a new reference type object is created in .NET, it is allocated from this managed heap. The garbage collector is responsible for periodically running and freeing up any object that is no longer used (no longer being referenced by anything else in the application).
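As a hypothetical illustration (not from the original answer) of how managed code can still effectively leak: an object that subscribes to a long-lived static event remains reachable through the event's invocation list, so the GC never considers it garbage until it unsubscribes.

using System;

static class Publisher
{
    // Static event: its invocation list lives for the lifetime of the AppDomain.
    public static event EventHandler SomethingHappened;

    public static void Raise()
    {
        EventHandler handler = SomethingHappened;
        if (handler != null) handler(null, EventArgs.Empty);
    }
}

class Subscriber
{
    public Subscriber()
    {
        // This roots the Subscriber instance via the static event;
        // until the handler is removed, the instance can never be collected.
        Publisher.SomethingHappened += OnSomethingHappened;
    }

    void OnSomethingHappened(object sender, EventArgs e) { }
}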

Jeffrey Richter's book CLR via C# has a good chapter on how memory is managed in .NET.

I've become a bit confused about the details of how the JIT compiler works. I know that C# compiles down to IL. The first time it is run, it is JIT'd. Does this involve it getting translated into native code? Does the .NET runtime (as a virtual machine?) interact with the JIT'd code? I know this is naive, but I've really confused myself. My impression has always been that the assemblies are not interpreted by the .NET runtime, but I don't understand the details of the interaction.

If you're interested in a book on the topic of IL and the lower-level stuff in .NET, I would suggest looking at CLR via C#.

A .NET program is first compiled into MSIL code. When it is executed, the JIT compiler will compile it into native machine code.

I am wondering:

Where is this JIT-compiled machine code stored? Is it only stored in the address space of the process? But since the second startup of the program is much faster than the first time, I think this native code must have been stored on disk somewhere even after the execution has finished. But where?

In short, the IL is JIT-compiled for each invocation of the program and is maintained in code pages of the process address space. See Chapter 1 of Richter for great coverage of the .NET execution model.

What is the difference between the JIT compiler and CLR? If you compile your code to il and CLR runs that code then what is the JIT doing? How has JIT compilation changed with the addition of generics to the CLR?

I know the thread is pretty old, but I thought I might put in the picture that made me understand JIT. It's from the excellent book CLR via C# by Jeffrey Richter. In the picture, the metadata he is talking about is the metadata emitted in the assembly header, where all information about types in the assembly is stored:

JIT image from CLR via C#

I know this question could be similar to others but really I'm looking for reasons why VB6 developers should switch to C#.

My company recently approved project to be written in C#, so we have a lot of VB.Net programmers, however, we have some legacy app developers as well that are in VB6. We have a time frame to re-write those apps into .Net web apps. So no matter what they will have to learn new stuff.

One of the developers today specifically asked "why should we switch to C#?"

I responded that the community largely has decided that C# is the way to go with about 80% of the examples in C#. I am a VB.Net programmer and I am excited to finally cut my teeth on C#, however, being that I'm so new I'm not sure I can answer the "why?" question. My reasons are more because I want to learn it.

So without descending into a VB verses C# I really am curious if there are any resources that I can send to these developers to calm their nerves.

Looking forward to your input!

As far as the migration over to .NET goes, better late than never! As far as my advice goes, your mileage may vary, it's worth every penny you're paying for it!

I personally believe you are making the correct choice. The first instinct for VB developers is to switch to VB.NET. That sounds entirely reasonable, but in my opinion, it's the wrong choice. You really have to break down the reasons for the switch into two categories: Why switch to .NET, and why switch to C#?

Why switch to .NET over VB6:

  • Multithreading in VB6 is technically possible from a programming perspective, but just about impossible if you want to use the IDE.

  • I do not believe you can create a 64-bit native application in VB6. That rules out a lot.

  • No new enhancements are being made to VB6.

  • OK, there are so many reasons I can think of, I'll probably just stop there.

Why switch to C# instead of VB.NET

  • Developers may be lulled into a false sense of familiarity with VB.NET - treating resources like they did in VB6 without understanding the full concepts. An example: you often see new converts to VB.NET setting objects to Nothing, believing that it's a magical way to release resources. It is not.

  • It's true that most examples are now in C#. More importantly, Jeff Richter's book is only in C# now. If you want to understand how .NET really works, IMO his book is pretty much mandatory.

  • In .NET, you'll find that you will use lambda expressions all of the time, especially when operating with Linq. IMO VB's verbosity really becomes a barrier to comprehension and readability here, in ways where it simply wasn't before: foo.Select(x => x > 50) is, by just about any standard, much more fluent and readable than foo.Select(Function(x) x > 50). It gets worse as the expressions get more complex.

  • Some of the worst practices with VB6 are impossible or at least much less accessible in C# (such as ReDim Preserve and On Error Resume Next).

  • VB is saddled with some syntax which makes it pretty cumbersome and confusing to use when creating general-purpose CLR libraries. For example, in C#, you use indexers with brackets[]. In VB, you use parens. That makes it pretty difficult for the user of a subroutine to tell if it's an indexer or a function. If someone tried to use your library outside of VB, the difference would be important, but a VB developer might be inclined to create subroutines which should be indexers as functions, since they look similar.

  • I don't have any data on this, but if you are trying to hire a good set of programmers, the best ones will generally be less inclined to work in a shop which writes VB.NET over C#. They usually fear that the code their colleagues will be generating is likely to be substandard .NET code, and let's be frank here -- there's a stigma against VB.NET developers and the quality of their code in the community. There. I said it. Let the flames begin...

As a footnote, from my perspective, VB.NET was a real missed opportunity for MS. What it should have been was a way to seamlessly convert your old VB6 code to the .NET world - with dynamic invocation and high-quality COM interop from the start. What it ended up being was a near-clone of C#'s feature set with a more verbose syntax and little to no backward compatibility. Sad, really. It locked a lot of organizations out of .NET for a long time. Then again, maybe it forced a "cold-turkey" clean break from the past...

I'm interested in how the CLR implements calls like this:

abstract class A {
    public abstract void Foo<T, U, V>();
}

A a = ...
a.Foo<int, string, decimal>(); // <=== ?

Does this call cause some kind of hash-map lookup, with the type parameter tokens as the keys and the compiled generic method specializations (one for all reference types, and separate code for each value type) as the values?

Yes. The code for a specific type is generated at run time by the CLR, which keeps a hashtable (or similar) of implementations.

Page 372 of CLR via C#:

When a method that uses generic type parameters is JIT-compiled, the CLR takes the method's IL, substitutes the specified type arguments, and then creates native code that is specific to that method operating on the specified data types. This is exactly what you want and is one of the main features of generics. However, there is a downside to this: the CLR keeps generating native code for every method/type combination. This is referred to as code explosion. This can end up increasing the application's working set substantially, thereby hurting performance. Fortunately, the CLR has some optimizations built into it to reduce code explosion. First, if a method is called for a particular type argument, and later, the method is called again using the same type argument, the CLR will compile the code for this method/type combination just once. So if one assembly uses List, and a completely different assembly (loaded in the same AppDomain) also uses List, the CLR will compile the methods for List just once. This reduces code explosion substantially.
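A small illustration of that sharing behaviour (hypothetical snippet; the comments describe the commonly documented behaviour rather than anything this code can observe directly):

using System;
using System.Collections.Generic;

class CodeSharingDemo
{
    static void Main()
    {
        var ints    = new List<int>();    // native code specialized for int
        var doubles = new List<double>(); // separate native code for double
        var strings = new List<string>(); // shared code for reference types...
        var uris    = new List<Uri>();    // ...reused here, since Uri is also a reference type
    }
}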

According to the CLI standard (Partition IIA, chapter 18) and the MSDN reference page for the System.Reflection.ExceptionHandlingClauseOptions enum, there are four different kinds of exception handler blocks:

  • catch clauses: "Catch all objects of the specified type."
  • filter clauses: "Enter handler only if filter succeeds."
  • finally clauses: "Handle all exceptions and normal exit."
  • fault clauses: "Handle all exceptions but not normal exit."

Given these brief explanations (cited from the CLI Standard, btw.), these should map to C# as follows:

  • catchcatch (FooException) { … }
  • filter — not available in C# (but in VB.NET as Catch FooException When booleanExpression)
  • finallyfinally { … }
  • faultcatch { … }

Experiment:

A simple experiment shows that this mapping is not what .NET's C# compiler really does:

// using System.Linq;
// using System.Reflection;

static bool IsCatchWithoutTypeSpecificationEmittedAsFaultClause()
{
    try
    {
        return MethodBase
               .GetCurrentMethod()
               .GetMethodBody()
               .ExceptionHandlingClauses
               .Any(clause => clause.Flags == ExceptionHandlingClauseOptions.Fault);
    }
    catch // <-- this is what the above code is inspecting
    {
        throw;
    }
}

This method returns false. That is, catch { … } has not been emitted as a fault clause.

A similar experiment shows that in fact, a catch clause was emitted (clause.Flags == ExceptionHandlingClauseOptions.Clause), even though no exception type has been specified.

Questions:

  1. If catch { … } really is a catch clause, then how are fault clauses different from catch clauses?
  2. Does the C# compiler ever output fault clauses at all?

1. If catch { … } really is a catch clause, then how are fault clauses different from catch clauses?

The C# compiler (at least the one that ships with .NET) actually appears to compile catch { … } as if it were really catch (object) { … }. This can be shown with the code below.

// using System;
// using System.Linq;
// using System.Reflection;

static Type GetCaughtTypeOfCatchClauseWithoutTypeSpecification()
{
    try
    {
        return MethodBase
               .GetCurrentMethod()
               .GetMethodBody()
               .ExceptionHandlingClauses
               .Where(clause => clause.Flags == ExceptionHandlingClauseOptions.Clause)
               .Select(clause => clause.CatchType)
               .Single();
    }
    catch // <-- this is what the above code is inspecting
    {
        throw;
    }
}

That method returns typeof(object).

So conceptually, a fault handler is similar to a catch { … }; however, the C# compiler never generates code for that exact construct but pretends that it is a catch (object) { … }, which is conceptually a catch clause. Thus a catch clause gets emitted.

Side note: Jeffrey Richter's book "CLR via C#" has some related information (on pp. 472–474): Namely that the CLR allows any value to be thrown, not just Exception objects. However, starting with CLR version 2, non-Exception values are automatically wrapped in a RuntimeWrappedException object. So it seems somewhat surprising that C# would transform catch into catch (object) instead of catch (Exception). There is however a reason for this: The CLR can be told not to wrap non-Exception values by applying a [assembly: RuntimeCompatibility(WrapNonExceptionThrows = false)] attribute.

By the way, the VB.NET compiler, unlike the C# compiler, translates Catch to Catch anonymousVariable As Exception.


2. Does the C# compiler ever output fault clauses at all?

It obviously doesn't emit fault clauses for catch { … }. However, Bart de Smet's blog post "Reader challenge – fault handlers in C#" suggests that the C# compiler does produce fault clauses in certain circumstances.

Is there a way to give branch-prediction hints to the compiler or the JIT in C#? For example, I know such hints are defined for gcc and used in the Linux kernel as:

#define likely(x)       __builtin_expect((x),1)
#define unlikely(x)     __builtin_expect((x),0)

If nothing like this is possible in C#, is the best alternative to manually reorder if-statements, putting the most likely case first? Are there any other ways to optimize based on this type of external knowledge?

On a related note, the CLR knows how to identify guard clauses and assumes that the alternate branch will be taken, making this optimization inappropriate to use on guard clauses, correct?

(Note that I realize this may be a micro-optimization; I'm only interested for academic purposes.)

Short answer: No.

Longer Answer: You don't really need to in most cases. You can give hints by changing the logic in your statements. This is easier to do with a performance tool, like the one built into the higher (and more expensive) versions of Visual Studio, since you can capture the mispredicted branches counter. I realize this is for academic purposes, but it's good to know that the JITer is very good at optimizing your code for you. As an example (taken pretty much verbatim from CLR via C#)

This code:

public static void Main() {
    Int32[] a = new Int32[5];
    for(Int32 index = 0; index < a.Length; index++) {
        // Do something with a[index]
    }
}

may seem to be inefficient, since a.Length is a property and as we know in C#, a property is actually a set of two methods (get_Length and set_Length in this case). However, the JIT knows that it's a property and either stores the length in a local variable for you, or inlines the method, to prevent the overhead.
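Conceptually, the JIT is free to treat the loop as if it had been written like this (illustrative only - this is not actual JIT output):

public static void Main() {
    Int32[] a = new Int32[5];
    Int32 length = a.Length;  // the property is read once and kept in a local
    for(Int32 index = 0; index < length; index++) {
        // Do something with a[index]
    }
}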

...some developers have underestimated the abilities of the JIT compiler and have tried to write “clever code” in an attempt to help the JIT compiler. However, any clever attempts that you come up with will almost certainly impact performance negatively and make your code harder to read, reducing its maintainability.

Among other things, it actually goes further and does the bounds checking once outside of the loop instead of inside the loop, which would degrade performance.

I realize it has little to do directly with your question, but I guess the point that I'm trying to make is that micro-optimizations like this don't really help you much in C#, because the JIT generally does it better, as it was designed exactly for this. (Fun fact, the x86 JIT compiler performs more aggressive optimizations than the x64 counterpart)

This article explains some of the optimizations that were added in .NET 3.5 SP1, among them being improvements to straightening branches to improve prediction and cache locality.

All of that being said, if you want to read a great book that goes into what the compiler generates and performance of the CLR, I recommend the book that I quoted from above, CLR via C#.

EDIT: I should mention that if this were currently possible in .NET, you could find the information in either the ECMA-335 standard or a working draft. There is no standard that supports this, and viewing the metadata in something like ILDasm or CFF Explorer shows no signs of any special metadata that can hint at branch predictions.

I am trying to get something clarified.

  1. When a .NET console application is run, does mscorlib.dll/mscoree.dll get loaded in the process's virtual address space?

  2. mscorlib.dll and mscoree.dll (CLR) are not managed dlls. Is that correct?

Also, what is a good resource to understand more about how a .NET program is executed?

I would recommend reading Jeffrey Richter's book CLR via C#. It provides a very clear explanation of what is going on under the hood :)

Also, you may find this question helpful: Why is an assembly .exe file?

I have a class that should delete some file when disposed or finalized. Inside finalizers I can't use other objects because they could have been garbage-collected already.

Am I missing some point regarding finalizers - can strings actually be used there?

UPD: Something like that:

public class TempFileStream : FileStream
{
    private string _filename;

    public TempFileStream(string filename)
        :base(filename, FileMode.Open, FileAccess.Read, FileShare.Read)
    {
        _filename = filename;
    }

    protected override void Dispose(bool disposing)
    {
        base.Dispose(disposing);
        if (_filename == null) return;

        try
        {
            File.Delete(_filename); // <-- oops! _filename could be gc-ed already
            _filename = null;
        }
        catch (Exception e)
        {
            ...
        }
    }
}

Yes, you can most certainly use strings from within a finalizer, and many other object types.

For the definitive source of all this, I would go pick up the book CLR via C#, 3rd edition, written by Jeffrey Richter. In chapter 21 this is all described in detail.

Anyway, here's what is really happening...

During garbage collection, any objects that have a finalizer that still wants to be called are placed on a special list, called the freachable list.

This list is considered a root, just as static variables and live local variables are. Therefore, any objects those objects refer to, and so on recursively, are removed from the garbage collection cycle this time. They will survive the current garbage collection cycle as though they weren't eligible for collection to begin with.

Note that this includes strings, which was your question, but it also involves all other object types.

Then, at some later point in time, the finalizer thread picks up the object from that list, and runs the finalizer on those objects, and then takes those objects off that list.

Then, the next time garbage collection runs, it finds the same objects once more, but this time the finalizer no longer wants to run, it has already been executed, and so the objects are collected as normal.

Let me illustrate with an example before I tell you what doesn't work.

Let's say you have objects A through Z, and each object references the next one, so you have object A referencing object B, B references C, C references D, and so on until Z.

Some of these objects implement finalizers, and they all implement IDisposable. Let's assume that A does not implement a finalizer but B does, and some of the rest do as well; it's not important for this example which ones do beyond A and B.

Your program holds onto a reference to A, and only A.

In an ordinary, and correct, usage pattern you would dispose of A, which would dispose of B, which would dispose of C, etc. but you have a bug, so this doesn't happen. At some point, all of these objects are eligible for collection.

At this point GC will find all of these objects, but then notice that B has a finalizer, and it has not yet run. GC will therefore put B on the freachable list, and recursively take C, D, E, etc. up to Z off of the GC list, because since B suddenly became ineligible for collection, so did the rest. Note that some of these objects are also placed on the freachable list themselves, because they have finalizers of their own, but all the objects they refer to will survive GC.

A, however, is collected.

Let me make the above paragraph clear. At this point, A has been collected, but B, C, D, etc. up to Z are still alive as though nothing has happened. Though your code no longer has a reference to any of them, the freachable list has.

Then, the finalizer thread runs, and finalizes all of the objects in the freachable list, and takes the objects off of the list.

The next time GC is run, those objects are now collected.

So that certainly works, so what is the big brouhaha about?

The problem is with the finalizer thread. This thread makes no assumptions about the order in which it should finalize those objects. It doesn't do this because in many cases it would be impossible for it to do so.

As I said above, in an ordinary world you would call dispose on A, which disposes B, which disposes C, etc. If one of these objects is a stream, the object referencing the stream might, in its call to Dispose, say "I'll just go ahead and flush my buffers before disposing the stream." This is perfectly legal and lots of existing code do this.

However, in the finalization thread, this order is no longer used, and thus if the stream was placed on the list before the objects that referenced it, the stream is finalized, and thus closed, before the object referencing it.

In other words, what you cannot do is summarized as follows:

You cannot access any objects your object refers to that have finalizers, as you have no guarantee that those objects will be in a usable state when your finalizer runs. The objects will still be there, in memory, and not collected, but they may be closed, terminated, finalized, etc. already.

So, back to your question:

Q. Can I use strings in finalizer method?
A. Yes, because strings do not implement a finalizer and do not rely on other objects that have finalizers, so they will be alive and kicking at the time your finalizer runs.

The assumption that made you take the wrong path is the second sentence of the question:

Inside finalizers I can't use other objects because they could have been garbage-collected already.

The correct sentence would be:

Inside finalizer I can't use other objects that have finalizers, because they could have been finalized already.
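To make that concrete, here is a minimal hypothetical sketch (simplified from the question's TempFileStream) of a finalizer that safely uses a string field; the string has no finalizer of its own and depends on no finalizable objects, so it is still fully usable here:

using System;
using System.IO;

public class TempFile
{
    private readonly string _path;

    public TempFile(string path)
    {
        _path = path;
    }

    ~TempFile()
    {
        // _path is just a string: it is kept alive along with this object and
        // has nothing to finalize, so reading it here is safe.
        try { File.Delete(_path); }
        catch (IOException) { /* best effort - ignore */ }
    }
}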


For an example of something the finalizer would have no way of knowing the order in which to correctly finalize two objects, consider two objects that refer to each other and that both have finalizers. The finalizer thread would have to analyze the code to determine in which order they would normally be disposed, which might be a "dance" between the two objects. The finalizer thread does not do this, it just finalizes one before the other, and you have no guarantee which is first.


So, is there any time it is safe to access objects that also have a finalizer, from my own finalizer?

The only guaranteed safe scenario is when your program/class library/source code owns both objects, so that you know that it is safe.

Before I explain this, this is not really good programming practices, so you probably shouldn't do it.

Example:

You have an object, Cache, that writes data to a file, this file is never kept open, and is thus only open when the object needs to write data to it.

You have another object, CacheManager, that uses the first one, and calls into the first object to give it data to write to the file.

CacheManager has a finalizer. The semantics here is that if the manager class is collected, but not disposed, it should delete the caches as it cannot guarantee their state.

However, the filename of the cache object is retrievable from a property of the cache object.

So the question is, do I need to make a copy of that filename into the manager object, to avoid problems during finalization?

Nope, you don't. When the manager is finalized, the cache object is still in memory, as is the filename string it refers to. What you cannot guarantee, however, is that any finalizer on the cache object hasn't already run.

However, in this case, if you know that the finalizer of the cache object either doesn't exist, or doesn't touch the file, your manager can read the filename property of the cache object, and delete the file.

However, since you now have a pretty strange dependency going on here, I would certainly advise against it.

I would really appreciate if someone could tell me whether I understand it well:

class X
{
    void M()
    {
        A a1 = new A(); // reference on the stack, object value on the heap
        a1.VarA = 5;    // on the stack - value type
        A a2 = a1;      // reference on the stack, object value on the heap
        a2.VarA = 10;   // on the stack - value type
    }
}

Also, both a1 and a2 references are on the stack, while their "object" values are on the heap. But what about the VarA variable - is it still a pure value type?

class A
{
    public int VarA;
}

Read Jeff Richter's CLR via C# for a complete understanding of this topic.

As an ASP.NET developer with 5+ years' experience, I'd like to measure my competency level in ASP.NET & SQL Server. Basically, my goal is to raise my competency level and skill set in ASP.NET; before that, I need to know where my level stands with current ASP.NET and related technologies...

So, please provide some pointers...

  • Are there any skill-set-measuring quizzes or exams which account for experience and technology?
  • How do you measure your own or your junior developers' skills or competency?

I guess I could rattle off some exams, like the MCP exams, or BrainBench, but you have to pay lots of money for those.

If you were really sold on taking an exam to gauge your competency, you could get one of the MCP exam prep guides for ASP.NET, C#, and SQL Server and see how well you comprehend and take in that material. I'm not sure that it's the most accurate way of measuring competency though.

You can get a good qualitative evaluation of your SQL Server skills by simply reading Itzik's or Kalen's books and seeing how you comprehend them. For .NET, read Richter and critically evaluate yourself against the concepts you find in that book. Do those concepts make sense?

Probably the most valuable way to get feedback is to ask your senior developers for a frank evaluation of your skills.

If you're asking how I evaluate my junior developers, it's pretty easy once I see their code and they get a track record for a few months, but I don't believe quantitative analysis is the best way. Instead, I ask questions like:

  • Can they deliver?
  • Are they writing good code?
  • Are they taking the initiative to learn more?
  • What have they brought to the table?
  • Do they understand the software development lifecycle?
  • Do they break builds?
  • Are they good team players, or do they code in solitude?
  • Do they make suggestions?
  • Are they open to others' suggestions?
  • Do their design decisions make sense for the projects they've been on?

Ask yourself how your leaders would answer these questions about you. If you are seriously confident that they will respond positively, you will have an easier time "grading yourself".

I am trying to sign an assembly with a strong name by following the guide from here: http://msdn.microsoft.com/en-us/library/xc31ft41.aspx

The key instruction is:

al /out:<assembly name> <module name> /keyfile:<file name>

And it says

module name is the name of the code module used to create the assembly

I don't understand what this means. In the literal sense I would interpret the above as some component of csc.exe (i.e., it created the assembly) but obviously this is nonsensical in this context.

So firstly what does this refer to, and secondly (in order to aid my meta-learning) how would one go about reasoning what it is? I get the impression given the terseness of the documentation that it should be obvious or intuitive to me, but it currently is not.

I tried specifying some random names (e.g. blah.blah) but get this error:

ALINK: error AL1047: Error importing file 'c:\path\to\proj\bin\Debug\blah.blah' -- The system cannot find the file specified.

Edit: Upon further reading I get the impression the module name is the name of the code, but I have not had any luck specifying the .cs files either - I am told Database file is corrupt and may not be usable.

An assembly is made up of modules (.netmodule files), which are produced by compiling sources (.cs files). The assembly linker is responsible for packaging modules into assemblies. So if you have two source files class1.cs and class2.cs:

csc /t:module class1.cs
csc /t:module class2.cs
al /out:assembly.dll /t:library class1.netmodule class2.netmodule
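To sign the resulting assembly with a strong name - the original goal - the same al command takes the /keyfile option (a sketch; it assumes a key pair generated with sn -k):

sn -k mykey.snk
csc /t:module class1.cs
csc /t:module class2.cs
al /out:assembly.dll /t:library /keyfile:mykey.snk class1.netmodule class2.netmodule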

For the best treatment of how the CLR deals with modules, manifests and assemblies, see Richter.

I need to capture video from a webcam. Are there any classes in C#/.NET that can help me with this. I am only interested in real time data.

And are there any good C#/.NET books that I can study to gain deep knowledge on the language and the platform?

I would recommend using a 3rd-party library. That would be a better solution than reinventing the wheel. Here, I used AForge.Net. It has some problems concerning performance, but I tweaked the library myself when performance became a critical issue for me. The AForge.Net code is open source and you can tweak it to your needs.

As for books, you should definitely look at Jeffrey Richter's "CLR via C#" and Jon Skeet's "C# in Depth".

What is the difference between these two variable definitions?

object oVar;
dynamic dVar;

Performance? Memory allocation ? Benefits?

The dynamic keyword also adds some overhead to your execution time, due to all the extra logic used - so if you don't need the dynamic runtime or interop and can get away with using object your code will be more efficient.
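A minimal sketch of the practical difference (hypothetical example):

object oVar = "hello";
dynamic dVar = "hello";

// int a = oVar.Length;          // compile-time error: System.Object has no Length member
int b = ((string)oVar).Length;   // fine: cast first, bound at compile time
int c = dVar.Length;             // compiles; bound against string at run time
// dVar.NoSuchMember();          // also compiles, but throws RuntimeBinderException at run time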

More information on the dynamic keyword can be found in Jeff Richter's book: CLR via C#, 3rd Edition

Sam Gentile did a couple of posts about the details too.

Why should I use IEnumerable<T> when I can make do with...say List<T>? What's the advantage of the former over the latter?

On this point, Jeffrey Richter writes:

When declaring a method’s parameter types, you should specify the weakest type possible, preferring interfaces over base classes. For example, if you are writing a method that manipulates a collection of items, it would be best to declare the method’s parameter by using an interface such as IEnumerable<T> rather than using a strong data type such as List<T> or even a stronger interface type such as ICollection<T> or IList<T>:

// Desired: This method uses a weak parameter type   
public void ManipulateItems<T>(IEnumerable<T> collection) { ... }  

// Undesired: This method uses a strong parameter type   
public void ManipulateItems<T>(List<T> collection) { ... }

The reason, of course, is that someone can call the first method passing in an array object, a List<T> object, a String object, and so on — any object whose type implements IEnumerable<T>. The second method allows only List<T> objects to be passed in; it will not accept an array or a String object. Obviously, the first method is better because it is much more flexible and can be used in a much wider range of scenarios.

Naturally, if you are writing a method that requires a list (not just any enumerable object), then you should declare the parameter type as an IList<T>. You should still avoid declaring the parameter type as List<T>. Using IList<T> allows the caller to pass arrays and any other objects whose type implements IList<T>.

On the flip side, it is usually best to declare a method’s return type by using the strongest type possible (trying not to commit yourself to a specific type).

Everything is ultimately JITed into native machine code, so ultimately, we have a native stack in .NET which the GC needs to scan for object pointers whenever it does a garbage collection.

Now, the question is: How does the .NET garbage collector figure out if a pointer to an object inside the GC heap is actually a managed pointer or a random integer that happens to have a value that corresponds to a valid address?

Obviously, if it can't distinguish the two, then there can be memory leaks, so I'm wondering how it works. Or -- dare I say it -- does .NET have the potential to leak memory? :O

Remember that all managed memory is managed by the CLR. Any actual managed reference was created by the CLR. It knows what it created and what it didn't.

If you really feel you must know the details of the implementation, then you should read CLR via C# by Jeffrey Richter. The answer is not simple - it's quite a bit more than can be answered on SO.

I am wondering what's the way to call a C# class method from C++ (native, not C++/CLI) code? I need a simple and elegant way.

You can embed any CLR assembly (C#, VB.NET, F#, ...) in a native C++ program using what's called "CLR Hosting". This is how native programs (such as SQL Server) support .NET code extensions. E.g. SQL CLR in SQL Server.

You load the CLR into a native process using CorBindToRuntimeEx() for .NET 2.0 and CLRCreateInstance() in .NET 4.

Details can be found on MSDN, or Jeff Richter's book CLR via C#.

I've been programming in C and C++ in Linux for around 3 years, and recently have been interested in developing commercial software for businesses. Let's say I've found a niche where I think I could be successful, but that they only use Windows. I have no experience whatsoever with the Windows API, however. I have a few questions:

Should I learn .NET?

Do I need to learn C# in order to use .NET, or can I stick with C++?

What is the sentiment about compiling using GCC under Cygwin with the --no-cygwin option? I'm interested in portability, and I'm worried that if I start writing for VC++, I might get bound to Windows.

Where is a good place to do my essential reading?

What else can you tell me about transitioning from Linux to Windows programming?

No, don't learn .NET - learn C#. It's like an anything-goes playground. Once you get the hang of it, you will be able to finish projects in 1/10th the time.

But with C#/.NET you learn bad habits. I am now back in C++. I had C++ for 12 years, then C# for 5 years, now C++ for 6 months.

Although it does take around 6 times longer to complete projects (6 months vs. 1 month), I feel that the C++ code has an artistic feel, while the C# code is generic - like BestBuy.

I am totally anti C++/CLI, or whatever it's called. If you need to get down to the CLR level, run, don't walk, back to C++, or you'll end up spending all your time working around some arbitrary C# design "feature" like event synchronization.

My next destination may be .NET PowerShell to manage my C++ server applications.

I did the Unix-to-Windows move around 10 years ago. I tried going back to FreeBSD or Linux - I used to love vi - but VS is by far the best IDE. Just get VS2010 Pro + ReSharper and read these two books. The first one is core C#, but it's .NET 2.0, which is good, because it's easiest to start writing C++-style code anyway. The next book will last you years.

http://www.amazon.com/2-0-Practical-Guide-Programmers-Guides/dp/0121674517

http://www.amazon.com/gp/reader/0735627045/ref=sib_dp_pt#reader-link

Hope this helps.

Also, there's no reason NOT to write portable C++ code in the year 2012: CMake + VS2010 + Boost + Crossroads I/O.

I faced exactly the same questions and I am so happy I tried .NET. I hope this info can help you:

Should I learn .NET?

I would highly recommend it.

Do I need to learn C# in order to use .NET, or can I stick with C++?

You can stick with C++, but I am sure you will love learning C# - please try it. You will be able to mix them, too. The main challenge with .NET is learning all the libraries that are out there to help you out (so you do not need to reinvent the wheel). Use MSDN often and try to get a map of the fundamental classes and assemblies.

It is a fun experience if you come from C++ you should not have major problems.

Where is a good place to do my essential reading?

I would start with something light, check the free Visual Studio tools, software and examples here, go to the MSDN documentation and compile some of the examples in MSDN (how to access files,...). As you can see you will find C# and C++ examples side by side.

Then, of course, books like CLR via C# will eventually need to be read.

Portability

Be sure to run your code in Mono and on multiple platforms.

Future investment

Investing on learning the .NET framework will pay back. Today is costly to learn all the new tools out there, with .NET you know you can evolve. New features and languages appear but the fundamental classes remain so your time/effort investment is more under control.

What book would you recommend to improve one's C# style of writing? I know Code Complete has a few tips on style and organizing code, but it's not specific to C#.

I would also recommend Clean Code by Robert Martin. Yes, it's not C#-specific, and yes, it will improve one's C# style of writing. It might be a good idea to continue with Agile Software Development, Principles, Patterns, and Practices book by the same author.

And here is 1 hour video from uncle Bob at Øredev conference Clean Code III: Functions

PS: Shameless plug. I developed a site which answers exactly this question: "Which book is of higher importance in given area?". I get the data from Amazon, and draw a network of books. The more links one book has the higher its importance. Thanks to this site I also found "Agile Principles, Patterns, and Practices in C#", again by Robert Martin, but I prefer the original book.

CLR Via C# by Jeffrey Richter contains all the 2.0 patterns you need to follow in order to produce good code. Helped me immensely.

Effective C# by Bill Wagner, as well as the sequel, More Effective C#.

Elements of C# Style is a good primer.

While it may not go into as much detail as other books that are available, I've definitely got my money's worth from it - highly recommended.

C# Concisely is very thorough.

Yes, I am using a profiler (ANTS). But at the micro level it cannot tell you how to fix your problem. And I'm at a micro-optimization stage right now. For example, I was profiling this:

for (int x = 0; x < Width; x++)
{
    for (int y = 0; y < Height; y++)
    {
        packedCells.Add(Data[x, y].