The C Sharp Language and Environment

Revision as of 19:58, 30 January 2008 by Neil (Talk | contribs) (A Brief History of Computer Programming Languages)

PreviousTable of ContentsNext
About C# EssentialsA Simple C# Console Application


C# is the latest progression in a never-ending quest to make it as easy and efficient as possible for humans to program computers. Whilst it would be easy to describe C# as just another object-oriented programming language developed by Microsoft (and ratified by ECMA and ISO), C# is in fact an integral part of an entire development and execution infrastructure. The primary objective of this chapter of C# Essentials is to provide an overview of both the C# language and the infrastructure on which it relies. By the end of this chapter the reader should have a clear understanding of what acronyms such as CLI, CLR, VES, JIT and .NET mean.

A Brief History of Computer Programming Languages

The problem with programming is that computers work exclusively with numbers (sequences of 0s and 1s, to be precise), known as machine code, while humans communicate using words. In the very early days programmers entered machine code directly into computers to program them. This, as you can imagine, was a laborious and error-prone process. The next evolution was to associate brief human-readable commands with the corresponding machine code. For example, a programmer could enter the command MOV to transfer a value from one microprocessor register to another. These commands would then be translated into machine code by a piece of software called an assembler, thereby giving this command syntax the name Assembly Language.

Next came a series of high-level languages designed to make it easier for humans to write programs. These programs are written using a human-readable syntax and then either compiled to machine code by a compiler or interpreted on behalf of the processor by an interpreter. Such languages include BASIC, COBOL, Pascal and Fortran. Another such language is C, which was created at AT&T Bell Labs in the late 1960s and early 1970s. In the late 1970s and early 1980s work started on an object-oriented approach to C programming, culminating in a new, object-oriented variant of C known as C++.

The story, however, does not end there. The problem with C++ was that it was an incredibly easy language in which to make programming mistakes. C++ would quite happily allow a programmer to make coding mistakes that caused buffers to overflow, memory locations to be arbitrarily overwritten, and memory to leak until an application had bloated to the point of using up the entire physical memory and swap space on a system. Another problem encountered with C, C++ and other languages that compile directly to machine code is that the source code has to be recompiled for each different processor type, making it difficult to port an application from one hardware platform to another.

In order to address the shortcomings of C and C++, Sun Microsystems started work on a new programming language and execution environment in the 1990s. The end result was called Java. Java consists of a programming language with many of the pitfalls of C++ removed, a portable intermediate byte code format, a runtime environment (called the virtual machine) that executes the byte code and handles issues such as memory management, and a vast suite of libraries providing the functionality required to develop enterprise class applications (such as networking, file handling, database access and graphics).

Java gained rapid acceptance, and for a time Microsoft pursued what became known as its Java "embrace and extend" campaign. Sun were happy for Microsoft to embrace Java but reached for their lawyers when they realized that the extend part was a plan for Microsoft to introduce its own proprietary version of the language. Politics ensued and Microsoft eventually walked away from Java. Not long after, Microsoft started talking about something called .NET, followed by something else called C#.

What exactly is C#?

"What does all this history have to do with C#?" I hear you ask. Well, the origins of the C# programming syntax can be traced right back to C and C++. If you are already familiar with C or C++ then you have a big head start in terms of learning C#. The same is true if you know Java, since Java, C, C++ and C# all share a similar syntax. In addition, C# inherits many of the benefits of Java: automatic memory handling (better known as garbage collection) and an intermediate byte code that removes the need to recompile an application for each target hardware platform. C# is also accompanied by a vast framework of libraries designed to provide the programmer with ready-made solutions to just about every imaginable scenario.

Despite these similarities there are differences in how the Java and C# infrastructures work. The remainder of this chapter will be dedicated to providing an overview of the C# infrastructure.


The Common Language Infrastructure (CLI)

C# is an object-oriented programming language. It is essentially a standard defining what constitutes valid syntax. On its own C# is of little use because it depends upon something called the Common Language Infrastructure (CLI) both for compilation and execution of applications. The CLI, in turn, is a standard which defines specifications for the following components:

  • Virtual Execution System (VES)
  • Common Intermediate Language (CIL)
  • Common Type System (CTS)
  • Common Language Specification (CLS)
  • Framework

In the remainder of this chapter we will look at each of these CLI components in order to build up a picture of how the CLI environment fits together.

Common Intermediate Language (CIL)

Unlike the C and C++ compilers, which compile source code down to the machine code understood by the target microprocessor, the C# compiler compiles to an intermediate byte code format known as the Common Intermediate Language (CIL). This code can, in theory, be taken to any system where there is a CLI compliant Virtual Execution System (VES) and executed. There is, therefore, no need to compile an application for each and every target platform.

The word Common in Common Intermediate Language is used because this format is common to more than just the C# programming language. In fact, any programming language for which a suitable compiler exists may target CIL, allowing libraries and code modules written in different languages to execute together in the same application. Typical languages for which CIL compilation is available include Visual Basic, COBOL, Smalltalk and C++.
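As a rough illustration of the point that the compiler emits CIL rather than machine code, the following sketch uses reflection to print the raw CIL bytes stored for a trivial method. The names CilDemo, Add and GetAddIl are invented for this example. The exact bytes printed depend on the compiler and build settings; in an optimized build one would expect to see the add opcode (0x58) followed by ret (0x2A), but treat that as an expectation rather than a guarantee.

```csharp
using System;
using System.Reflection;

// Illustrative sketch only: CilDemo, Add and GetAddIl are hypothetical names.
public static class CilDemo
{
    // A trivial method. The C# compiler translates it to CIL, not machine code.
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Returns the raw CIL bytes that the compiler stored for Add.
    public static byte[] GetAddIl()
    {
        MethodInfo method = typeof(CilDemo).GetMethod("Add");
        MethodBody body = method.GetMethodBody();
        return body.GetILAsByteArray();
    }

    public static void Main()
    {
        byte[] il = GetAddIl();
        Console.WriteLine("Add is stored as {0} bytes of CIL:", il.Length);
        // Prints the bytes in hexadecimal, separated by dashes.
        Console.WriteLine(BitConverter.ToString(il));
    }
}
```

The same bytes would be produced for an equivalent method written in any other language that targets CIL, which is what allows modules from different languages to be combined.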

Virtual Execution System (VES)

The VES (usually referred to as the runtime) is the environment in which the CIL byte code is executed. The VES reads the byte code generated by the C# compiler and uses a Just in Time (JIT) compiler to compile it down to the native machine code of the processor on which it is running. While this code is executing it does so in conjunction with a runtime agent which manages the execution process. As a result, the executing code is known as managed code. The runtime handles issues such as garbage collection (to deal with memory allocation and deallocation), memory access checking and type safety to ensure that the code does not do anything it is not supposed to do.
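To make the idea of managed code a little more concrete, here is a minimal sketch of two mistakes that the runtime intercepts. The class and method names (ManagedDemo, WritePastEnd, CastToWrongType) are made up for illustration. In C or C++ the first mistake could silently overwrite neighbouring memory; under the VES each one is stopped with an exception.

```csharp
using System;

// Illustrative sketch only: all names in this class are hypothetical.
public static class ManagedDemo
{
    // Attempts to write beyond the end of an array.
    // Returns true if the write went through, false if the runtime blocked it.
    public static bool WritePastEnd()
    {
        int[] buffer = new int[4];
        try
        {
            int index = 10;          // well outside the valid range 0..3
            buffer[index] = 42;      // the runtime checks the index here
            return true;
        }
        catch (IndexOutOfRangeException)
        {
            return false;
        }
    }

    // Attempts to treat a string as if it were an integer.
    // Returns true if the cast went through, false if the runtime blocked it.
    public static bool CastToWrongType()
    {
        object boxed = "text";
        try
        {
            int n = (int)boxed;      // the runtime checks the actual type here
            Console.WriteLine(n);
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Out-of-range write succeeded: " + WritePastEnd());
        Console.WriteLine("Invalid cast succeeded: " + CastToWrongType());
    }
}
```

Note that neither method frees the memory it allocates. The array and the string are simply abandoned, and the garbage collector reclaims them at some later point of its own choosing.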

A term that is often used in connection with the VES is the Common Language Runtime (CLR). The CLR is officially the name given to Microsoft's implementation of the VES component of the CLI specification.

It is worth noting that the JIT process can introduce a startup delay on execution of an application. One option available with .NET to avoid this problem is to pre-compile CIL byte code down to native machine code using the NGEN (Native Image Generator) tool. Because the NGEN compilation must take place on the target processor architecture, this step is often performed at the point that the application in question is installed by the user.

Common Type System (CTS) & Common Language Specification (CLS)

As mentioned previously, a number of different programming languages target the CLI, allowing, for example, code from C# sources to interact with code from Visual Basic. In order to achieve this feat, each language must have the same concept of how data types are stored in memory. The CTS, therefore, defines how a CLI compatible language must view the bit patterns of values and the layout and behavior of objects to ensure interoperability.

The CLS is essentially a subset of the CTS aimed at creating interoperable libraries.
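The following sketch shows one visible consequence of the CTS and the CLS from the C# side. The class and method names are invented for this example. The C# keyword int is simply an alias for the CTS type System.Int32, so every CLI language sees the same 32-bit value. Unsigned integers such as uint are part of the CTS but fall outside the CLS, so a library that declares itself CLS compliant has to mark any public member exposing them as non-compliant.

```csharp
using System;

// Ask the compiler to check this assembly's public surface against the CLS.
[assembly: CLSCompliant(true)]

// Illustrative sketch only: all names in this class are hypothetical.
public static class TypeSystemDemo
{
    // True when the C# keyword and the CTS type are one and the same.
    public static bool KeywordIsAlias()
    {
        return typeof(int) == typeof(System.Int32);
    }

    // The name under which other CLI languages see the C# type "int".
    public static string CtsNameOfInt()
    {
        return typeof(int).FullName;
    }

    // uint is valid in the CTS but is not part of the CLS, so this public
    // member is explicitly excluded from the compliance check.
    [CLSCompliant(false)]
    public static uint UnsignedValue()
    {
        return 42u;
    }

    public static void Main()
    {
        Console.WriteLine("int is an alias for a CTS type: " + KeywordIsAlias());
        Console.WriteLine("CTS name of int: " + CtsNameOfInt());
        Console.WriteLine("Size of int in bytes: " + sizeof(int));
    }
}
```

Removing the CLSCompliant(false) attribute from UnsignedValue would not stop the code from compiling, but the compiler would be expected to warn that the return type is not CLS compliant.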

The Framework (Base Class and Framework Class Libraries)

The CLI specifies a set of base classes, known as the Base Class Library (BCL), that must be available to executing CLI code. The BCL contains APIs that enable executing CIL code to interact with the runtime environment and the underlying operating system.
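As a small sketch of what interacting with the runtime environment and the operating system looks like in practice, the example below uses a few classes from the System and System.IO namespaces to report on the environment and to write and read back a temporary file. The names BclDemo and RoundTrip are invented for this example, and the selection of classes is illustrative rather than a statement of exactly which APIs the CLI standard mandates.

```csharp
using System;
using System.IO;

// Illustrative sketch only: BclDemo and RoundTrip are hypothetical names.
public static class BclDemo
{
    // Writes text to a temporary file, reads it back, then removes the file.
    public static string RoundTrip(string text)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, text);
            return File.ReadAllText(path);
        }
        finally
        {
            // Clean up whether or not the read succeeded.
            File.Delete(path);
        }
    }

    public static void Main()
    {
        // Information about the operating system and the runtime.
        Console.WriteLine("Operating system: " + Environment.OSVersion);
        Console.WriteLine("Runtime version: " + Environment.Version);

        // File handling without any platform-specific code.
        Console.WriteLine("Read back: " + RoundTrip("Hello from the BCL"));
    }
}
```

The program contains no operating system specific calls. The library implementation supplied with each runtime is responsible for mapping these requests onto the underlying platform.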

Beyond the basics there is also the Framework Class Library. This is a Microsoft library which contains APIs for the creation of graphical user interfaces, database applications, web access and much, much more.

Non Microsoft Implementations of the CLI

Microsoft's implementation of the CLI stack is called .NET. .NET is not, however, the only implementation available. Another implementation provided by Microsoft is called Rotor. Rotor is distributed in source form and runs on Windows, Mac OS and FreeBSD. It is, however, primarily a learning tool and as such is licensed under terms which prohibit its use as the basis of commercial applications.

Other significant open source implementations are the Mono and DotGNU projects targeted at Windows, Linux and Unix platforms.


