Working with Strings in C Sharp

Revision as of 16:44, 25 January 2008 by Neil (Talk | contribs)



Strings are collections of characters that are grouped together to form words or sentences. If it wasn't for humans, computers would probably never have anything to do with strings. The fact is, however, that one of the primary jobs of a computer is to accept data from and present data to humans. For this reason it is highly likely that any C# program is going to involve a considerable amount of code specifically designed to work with data in the form of strings. The purpose of this chapter is to cover the key aspects of string creation, comparison and manipulation in C#.

Creating Strings in C#

Strings consist of sequences of characters contained in a string object. A string object may be created using a number of different mechanisms.

A string may be declared but not initialized as follows:

string myString;

A literal value may be assigned to a string in C# using the assignment operator:

string myString = "Hello World";

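Alternatively a new string may be created using the new keyword and passing an array of characters through to the constructor (the String class does not provide a constructor that takes another string as its argument):

string myString = new String(new char[] { 'H', 'e', 'l', 'l', 'o' });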

String literals are placed within double quotes (as shown above). If the string itself contains double quotes the escape character (\) should precede the double quote characters:

System.Console.WriteLine ("He shouted \"Can you hear me?\"");

C# can be instructed to treat all the characters in a string verbatim using the @ notation. When using the @ notation everything between the double quotes is taken literally, regardless of whether new lines, carriage returns, backslashes etc. are present in the text. The one exception is the double quote character itself, which must be written twice ("") to appear in a verbatim string.

For example:

		System.Console.WriteLine (@"You can put a backslash \ here
and a new line
and tabs			work too. 
You can also put in sequences that would normally be seen as escape sequences \n \t");

The above jumble of text will be faithfully reproduced character for character as follows:

You can put a backslash \ here
and a new line
and tabs                        work too.
You can also put in sequences that would normally be seen as escape sequences \n \t

Programmers familiar with the heredoc feature of other programming languages will quickly notice that this is essentially the C# equivalent.
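As a brief sketch of the difference between regular and verbatim literals, the following compares the two forms. The file path is purely illustrative and does not come from this chapter, and the code is written as top-level statements (C# 9 or later); with an older compiler the statements would need to be placed inside a Main() method:

```csharp
using System;

// A regular literal needs an escape sequence for each backslash;
// the verbatim form does not. (The path is a made-up example.)
string escaped = "C:\\Temp\\notes.txt";
string verbatim = @"C:\Temp\notes.txt";

Console.WriteLine(escaped == verbatim); // True: both contain the same characters

// Inside a verbatim string a double quote is written by doubling it.
string quoted = @"He shouted ""Can you hear me?""";
Console.WriteLine(quoted); // He shouted "Can you hear me?"
```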

Obtaining the Length of a C# String

The length of a C# string may be obtained by accessing the Length property of the string object:

	String myString = "Hello World";

	System.Console.WriteLine ("myString length = " + myString.Length);

When executed, the output will read "myString length = 11".


Treating Strings as Arrays

It is possible to access individual characters in a string by treating the string as an array (for details on C# arrays read the chapter entitled Introducing C# Arrays).

Individual characters may be accessed by specifying the index of the required character in square brackets after the string name. It is important to note that strings are immutable (in other words, the characters of a string object cannot be changed once it has been created; the only way to change the value of a string variable is to assign an entirely new string to it). This means that while it is possible to read the value of a character in a string it is not possible to change the value:

       	string myString = "Hello World";

        System.Console.WriteLine(myString[1]); //Displays second character (e)

        myString[0] = 'h';   // Illegal - string cannot be modified.

If the final line were removed, the above code would display the letter e, which is the second character of the string (remember that indexes in C# begin at 0). As written, however, the code will fail to compile because of the illegal attempt to change the value of a character.
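When a changed character is genuinely needed, the usual approach is to build a new string rather than modify the existing one. The following is a minimal sketch of two ways this might be done, using the ToCharArray() method and the StringBuilder class from System.Text, neither of which is covered in this chapter. The variable names are illustrative, and the code is written as top-level statements (C# 9 or later):

```csharp
using System;
using System.Text;

string myString = "Hello World";

// Option 1: copy the characters into an array, edit the copy,
// then build a new string from it.
char[] chars = myString.ToCharArray();
chars[0] = 'h';
string lowered = new string(chars);

// Option 2: StringBuilder is a mutable buffer that allows
// characters to be assigned by index.
StringBuilder builder = new StringBuilder(myString);
builder[0] = 'J';
string changed = builder.ToString();

Console.WriteLine(lowered);   // hello World
Console.WriteLine(changed);   // Jello World
Console.WriteLine(myString);  // Hello World (the original is untouched)
```

In both cases the original string is left as it was, which is exactly what immutability guarantees.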

Concatenating Strings in C#

Strings may be concatenated (i.e. joined together) simply by adding them together using the addition operator (+).

We can, therefore, combine two strings:

      string myString = "Hello World.";

      System.Console.WriteLine (myString + " How are you?");

Resulting in output which reads "Hello World. How are you?".

Strings may also be concatenated using the Concat() method. In its simplest form this method takes the two strings to be joined as arguments and returns a third string containing the first string followed by the second:

	string myString1 = "Hello World.";
	string myString2 = " How are you?";
	string myString3;

	myString3 = String.Concat ( myString1, myString2 );		
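To confirm that the two approaches produce the same result, a short sketch such as the following could be used. It also shows that String.Concat() has overloads accepting more than two strings. The variable names are illustrative, and the code is written as top-level statements (C# 9 or later):

```csharp
using System;

string greeting = "Hello World.";
string question = " How are you?";

// The + operator and Concat() produce the same text.
string viaOperator = greeting + question;
string viaConcat = String.Concat(greeting, question);
Console.WriteLine(viaOperator == viaConcat); // True

// Concat() can also be passed three (or more) strings.
string three = String.Concat("Red", ", ", "Green");
Console.WriteLine(three); // Red, Green
```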

Comparing Strings in C#

Strings can be compared for equality using the equality operator (==). Programmers arriving from languages such as Java sometimes expect this operator to compare object references, but in C# it is overloaded for the string type and compares the text of the two strings. For example:

		String myString1 = "Hello World";
		String myString2 = "Hello World";

		if (myString1 == myString2)
		{
			System.Console.WriteLine ("They match");
		}
		else
		{
			System.Console.WriteLine ("They do not match");
		}

The above example will display the "They match" message because both strings contain the same text, regardless of whether they are stored as the same object or as separate objects in memory. The comparison is case sensitive, and it only reports whether or not the strings are equal. To find out which of two strings sorts before the other, or to compare strings without regard to case, use the Compare() method.

The C# String Compare() method returns a value greater than zero if the left hand string is greater than the right hand string, 0 if the strings match and a value less than zero if the left hand string is less than the right hand string:

	String myString1 = "Hello World";
	String myString2 = "Hello World";

	if (String.Compare (myString1, myString2) == 0)
	{
		System.Console.WriteLine ("They match");
	}
	else
	{
		System.Console.WriteLine ("They do not match");
	}

By default, Compare() performs a case sensitive comparison. If Compare() is called with a third boolean argument it is possible to control whether the comparison is case sensitive or not. A true argument tells Compare() to ignore case when performing the comparison:

	String myString1 = "Hello World";
	String myString2 = "HELLO WORLD";

	if (String.Compare (myString1, myString2, true) == 0)
	{
		System.Console.WriteLine ("They match");
	}
	else
	{
		System.Console.WriteLine ("They do not match");
	}
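Compare() is most useful when the order of two strings matters. The sketch below tests only the sign of the returned value, since the exact number is not something to rely on. It uses the overloads that take a StringComparison value, which are not otherwise discussed in this chapter, so that the comparison rule is stated explicitly. The sample words are illustrative, and the code is written as top-level statements (C# 9 or later):

```csharp
using System;

// Only the sign of the result is meaningful, so compare it against zero.
int order = String.Compare("apple", "banana", StringComparison.Ordinal);
Console.WriteLine(order < 0);  // True: "apple" sorts before "banana"

order = String.Compare("banana", "apple", StringComparison.Ordinal);
Console.WriteLine(order > 0);  // True: "banana" sorts after "apple"

// Case can be ignored with the boolean overload shown above...
bool sameIgnoringCase = String.Compare("Hello World", "HELLO WORLD", true) == 0;
Console.WriteLine(sameIgnoringCase); // True

// ...or by naming the rule explicitly with String.Equals().
bool sameOrdinal = String.Equals("Hello World", "HELLO WORLD",
                                 StringComparison.OrdinalIgnoreCase);
Console.WriteLine(sameOrdinal); // True
```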

Changing String Case

The case of the characters in a string may be changed using the ToUpper and ToLower methods. Both of these methods return a modified string rather than changing the actual string. For example:

	string myString = "Hello World";
	string newString;

	newString = myString.ToUpper();

	System.Console.WriteLine (newString);  // Displays HELLO WORLD

	newString = myString.ToLower();

	System.Console.WriteLine (newString);  // Displays hello world

Splitting a C# String into Multiple Parts

A string may be separated into multiple parts using the Split() method. Split() takes as an argument the character to use as the delimiter to identify the points at which the string is to be split. Returned from the method call is an array containing the individual parts of the string. For example, the following code splits a string up using the comma character as the delimiter. The results are placed in an array called myColors and a foreach loop then reads each item from the array and displays it:

	string myString = "Red, Green, Blue, Yellow, Pink, Purple";

	string[] myColors = myString.Split(',');

	foreach (string color in myColors)
	{
		System.Console.WriteLine (color);
	}

The resulting output will read:

Red
 Green
 Blue
 Yellow
 Pink
 Purple

As we can see, the Split() method broke the string up as requested, but we have a problem in that the spaces are still present. Fortunately C# provides a method to handle this.

Trimming and Padding C# Strings

Unwanted leading and trailing spaces can be removed from a string using the Trim() method. When called, this method returns a modified version of the string with both leading and trailing spaces removed:

	string myString = "    hello      ";

	System.Console.WriteLine ("[" + myString + "]");
	System.Console.WriteLine ("[" + myString.Trim() + "]");

The above code will result in the following output:

[    hello      ]
[hello]

To remove just the leading or trailing spaces use the TrimStart() or TrimEnd() method respectively.
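Returning to the color list from the previous section, one possible way to tidy up the results of Split() is to trim each element in turn. The following sketch does this, and also shows TrimStart() and TrimEnd() applied to the padded string used above. It is written as top-level statements (C# 9 or later):

```csharp
using System;

string myString = "Red, Green, Blue, Yellow, Pink, Purple";
string[] myColors = myString.Split(',');

for (int i = 0; i < myColors.Length; i++)
{
    // Trim() returns a new string, so store the result back in the array.
    myColors[i] = myColors[i].Trim();
}

foreach (string color in myColors)
{
    Console.WriteLine("[" + color + "]"); // [Red], [Green], [Blue] and so on
}

string padded = "    hello      ";
Console.WriteLine("[" + padded.TrimStart() + "]"); // [hello      ]
Console.WriteLine("[" + padded.TrimEnd() + "]");   // [    hello]
```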

The inverse of the Trim() method are the PadLeft() and PadRight() methods. These methods allow leading or trailing characters to be added to a string. The methods take as arguments the total number of characters to which the string is to be padded and the padding character:

		string myString = "hello";
		string newString;

		newString = myString.PadLeft(10, ' ');

		newString = newString.PadRight(20, '*');

		System.Console.WriteLine ("[" + newString + "]"); // Outputs [     hello**********]
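A common use for padding is lining up columns of text. The sketch below is one way this might look. The names and counts are made-up sample data, and the code is written as top-level statements (C# 9 or later). Note that when the padding character is omitted a space is used, and that a string already longer than the requested width is returned unchanged:

```csharp
using System;

// Hypothetical sample data, used only for illustration.
string[] names = { "Red", "Green", "Purple" };
int[] counts = { 7, 42, 113 };

for (int i = 0; i < names.Length; i++)
{
    // Names are padded on the right to 8 characters with dots,
    // counts are padded on the left to 5 characters with spaces.
    string line = names[i].PadRight(8, '.') + counts[i].ToString().PadLeft(5);
    Console.WriteLine(line);
}

// Padding never truncates: the width (3) is shorter than the string.
Console.WriteLine("[" + "hello".PadLeft(3) + "]"); // [hello]
```

When executed, the loop displays:

```
Red.....    7
Green...   42
Purple..  113
```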

C# String Replacement

Parts of a string may be replaced using the Replace() method. This method takes the part of the string to be replaced and the string with which it is to be replaced as arguments and returns a new string reflecting the change. The Replace() method will replace all instances of the string:

	string myString = "Hello World";
	string newString;

	newString = myString.Replace("Hello", "Goodbye");

	System.Console.WriteLine (newString);
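The example above displays "Goodbye World". Because it contains only one match, the following sketch uses a sample sentence (chosen purely for illustration) to show that every occurrence is replaced and that matching is case sensitive. It is written as top-level statements (C# 9 or later):

```csharp
using System;

string myString = "one fish, two fish, red fish";

// Every occurrence is replaced, not just the first.
string birds = myString.Replace("fish", "bird");
Console.WriteLine(birds); // one bird, two bird, red bird

// Matching is case sensitive, so "FISH" is not found and nothing changes.
string unchanged = myString.Replace("FISH", "bird");
Console.WriteLine(unchanged == myString); // True

// Replacing with an empty string removes the matched text.
string shortened = "Hello World".Replace(" World", "");
Console.WriteLine(shortened); // Hello
```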

Summary

In this chapter of C# Essentials we have looked at a variety of different mechanisms for creating and working with strings in C#. In the next chapter we will look at using the String.Format() method to format strings in C#.


