34,333
edits
Changes
→Handling Special Characters
== Handling Special Characters ==
As you have seen, regular expressions assign special meaning to particular characters. For example the dot (.) and square brackets ([]) all have special meaning. Those who studying critical thinking at college will already be questioning what to do if the character sequence that is the subject of a search contains one or more of these characters. Obviously if you are are looking for the following text that looks like a regular expression, the text for which you want to search is, itself, going to be viewed as regular expression syntax.
To address this issue, a concept known as ''escaping'' is used. In SQL, escaping involves preceding any characters that may be mis-interpreted as a regular expression special character with double back slashes (\\). For example, suppose we have a row in our product table which reads as follows:
<pre>
+-----------+--------------------+-----------+
| prod_code | prod_name | prod_desc |
+-----------+--------------------+-----------+
| 17 | CD-RW Model [7543] | CD Writer |
+-----------+--------------------+-----------+
</pre>
If we were to search for this without regard to the fact that the prod_name value contains regular expression special characters we will not get what we wanted:
<pre>
mysql> SELECT * FROM product WHERE prod_name REGEXP 'CD-RW Model [7543]';
+-----------+------------------+-----------+
| prod_code | prod_name | prod_desc |
+-----------+------------------+-----------+
| 1 | CD-RW Model 4543 | CD Writer |
| 14 | CD-RW Model 5543 | CD Writer |
+-----------+------------------+-----------+
2 rows in set (0.00 sec)
</pre>
The cause of the problem is that the regular expression has been interpreted as a search for any entries that read 'CD-RW Model' followed by either a 7, 5, 4 or 3. Clearly, what we wanted was the actual text [7543]. If instead we escape the brackets with the \\ escape sequence we get the result we want:
<pre>
mysql> SELECT * FROM product WHERE prod_name REGEXP 'CD-RW Model \\[7543\\]';
+-----------+--------------------+-----------+
| prod_code | prod_name | prod_desc |
+-----------+--------------------+-----------+
| 17 | CD-RW Model [7543] | CD Writer |
+-----------+--------------------+-----------+
1 row in set (0.00 sec)
</pre>
== Regular Expressions and Whitespace Characters ==
Regular expression syntax also provides a mechanism to reference whitespace characters such as tabs, carriage returns and line feeds. These are referenced in a regular expression using ''metacharacters''. These metacharacters are outlined in the following table:
<table border="1" cellspacing="0">
<tr style="background:#efefef;">
<th>Metacharacter<th>Description</th>
<tr>
<td>\\n<td>New line (line feed)</td>
<tr>
<td>\\f<td>Form feed</td>
<tr>
<td>\\t<td>Tab</td>
<tr>
<td>\\r<td>Carriage return<td>
<tr>
<td>\\v<td>Vertical tab</td>
</table>
==