Documentation Home
MySQL 5.5 Reference Manual
Related Documentation Download this Manual Excerpts from this Manual

MySQL 5.5 Reference Manual  /  ...  /  mysql_real_escape_string()

23.8.7.53 mysql_real_escape_string()

unsigned long mysql_real_escape_string(MYSQL *mysql, char *to, const char *from, unsigned long length)

Description

This function creates a legal SQL string for use in an SQL statement. See Section 9.1.1, “String Literals”.

The mysql argument must be a valid, open connection because character escaping depends on the character set in use by the server.

The string in the from argument is encoded to produce an escaped SQL string, taking into account the current character set of the connection. The result is placed in the to argument, followed by a terminating null byte.

Characters encoded are \, ', ", NUL (ASCII 0), \n, \r, and Control+Z. Strictly speaking, MySQL requires only that backslash and the quote character used to quote the string in the query be escaped. mysql_real_escape_string() quotes the other characters to make them easier to read in log files. For comparison, see the quoting rules for literal strings and the QUOTE() SQL function in Section 9.1.1, “String Literals”, and Section 12.5, “String Functions”.

The string pointed to by from must be length bytes long. You must allocate the to buffer to be at least length*2+1 bytes long. (In the worst case, each character may need to be encoded as using two bytes, and there must be room for the terminating null byte.) When mysql_real_escape_string() returns, the contents of to is a null-terminated string. The return value is the length of the encoded string, not including the terminating null byte.

If you must change the character set of the connection, use the mysql_set_character_set() function rather than executing a SET NAMES (or SET CHARACTER SET) statement. mysql_set_character_set() works like SET NAMES but also affects the character set used by mysql_real_escape_string(), which SET NAMES does not.

Example

The following example inserts two escaped strings into an INSERT statement, each within single quote characters:

char query[1000],*end;

end = strmov(query,"INSERT INTO test_table VALUES(");
*end++ = '\'';
end += mysql_real_escape_string(&mysql,end,"What is this",12);
*end++ = '\'';
*end++ = ',';
*end++ = '\'';
end += mysql_real_escape_string(&mysql,end,"binary data: \0\r\n",16);
*end++ = '\'';
*end++ = ')';

if (mysql_real_query(&mysql,query,(unsigned int) (end - query)))
{
   fprintf(stderr, "Failed to insert row, Error: %s\n",
           mysql_error(&mysql));
}

The strmov() function used in the example is included in the libmysqlclient library and works like strcpy() but returns a pointer to the terminating null of the first parameter.

Return Values

The length of the encoded string that is placed into the to argument, not including the terminating null character.

Errors

None.


User Comments
  Posted by Michael Kofler on January 19, 2003
Documentation is unclear (at least to me):

> taking into account the current character
> set of the connection

What does this mean? How is mysql_real_escape_string affected by the character set?

To test the feature, I run the mysql server with default-character-set=usa7

Then I used mysql_real_escape_string on a string containing german special characters (äöüß), which should be illegal for the usa7 charset; nothing happened, i.e. mysql_real_escape_string neither removed nor changed these characters.

Thus, I experienced no change compared to mysql_escape_string.
  Posted by Gerard Boor on March 5, 2003
This doesn't work when using the VC++ APIs for Win32 (and maybe also not in the BCB APIs), you get an unresolved external, even with all libs included.
Here is a solution that does the same. Buffer1 is your binary data, Buffer 2 is the data you put into the query;

char Buffer1[100]
char Buffer2[201]

for(int x = 0; x < 100; x++)
{
switch(Buffer[x])
{
case '\0':
Picture += "\\0";
break;
case '\n':
Picture += "\\n";
break;
case '\r':
Picture += "\\r";
break;
case '\'':
Picture += "\\'";
break;
case '"':
Picture += "\\\"";
break;
case '\\':
Picture += "\\\\";
break;
default:
Picture += Buffer[y];
break;
}

  Posted by Gerard Boor on March 5, 2003
Errata: of course the 'Picture' should be 'Buffer2', sorry for the inconvenience.
  Posted by on May 1, 2003
Please add a note on how to scape the % character in strings, so that it can be later used in LIKE expressions.
  Posted by Justin Shumaker on June 28, 2003
It would be most useful to add a function that provides the programmer with the length returned by mysql_real_escape_string() because the *to buffer usually needs to be allocated a particular size before this function is called. If that size is unknown before the function is called a paradox exists. Simply allocating twice the memory for the *to buffer with respect to the *from buffer is unsatisfactory.
  Posted by on October 10, 2003
Watch out! If you insert data into a binary char using this, when you try to retrieve it MySQL will strip off the trailing spaces! Very confusing! Be sure to read section 6.2, Column Types!
  Posted by on October 9, 2003
I use a method like this for escaping strings. Pass in 0 for bufSize and returns size without having allocate double mem before hand. There are optimizations for this. But you can get the idea.

int MySqlConn::escapeString(char *szOrig, char *szEscaped, int bufSize)
{
int newLen = 0;
int origLen = strlen(szOrig);

for(int i = 0; i < origLen; i++)
{
switch(szOrig[i])
{
case '\0':
if (bufSize > newLen + 2)
memcpy(&szEscaped[newLen], "\\0", 2);
newLen += 2;
break;
case '\b':
if (bufSize > newLen + 2)
memcpy(&szEscaped[newLen], "\\b", 2);
newLen += 2;
break;
case '\n':
if (bufSize > newLen + 2)
memcpy(&szEscaped[newLen], "\\n", 2);
newLen += 2;
break;
case '\r':
if (bufSize > newLen + 2)
memcpy(&szEscaped[newLen], "\\r", 2);
newLen += 2;
break;
case '\t':
if (bufSize > newLen + 2)
memcpy(&szEscaped[newLen], "\\t", 2);
newLen += 2;
break;
case '\z':
if (bufSize > newLen + 2)
memcpy(&szEscaped[newLen], "\\z", 2);
newLen += 2;
break;
case '\%':
if (bufSize > newLen + 2)
memcpy(&szEscaped[newLen], "\\%", 2);
newLen += 2;
break;
case '\_':
if (bufSize > newLen + 2)
memcpy(&szEscaped[newLen], "\\_", 2);
newLen += 2;
break;
case '\'':
if (bufSize > newLen + 2)
memcpy(&szEscaped[newLen], "\\'", 2);
newLen += 2;
break;
case '\"':
if (bufSize > newLen + 2)
memcpy(&szEscaped[newLen], "\\\"", 2);
newLen += 2;
break;
case '\\':
if (bufSize > newLen + 2)
memcpy(&szEscaped[newLen], "\\\\", 2);
newLen += 2;
break;
default:
if (bufSize > newLen + 1)
szEscaped[newLen] = szOrig[i];
newLen++;
break;
}
}

return newLen;
}

  Posted by Daniel Eloff on March 11, 2004
mysql_real_escape_string generates an unresolved external on the VC++ compiler (tested with VC++ 7.1) even with all libs included.

Use mysql_escape_string instead (or if you use an incompatible charset, roll your own).

Hopefully this will be fixed soon.

  Posted by Dario E. on July 9, 2004
Warning, sample code in french translation is wrong!
Attention, l'exemple en français est erroné !

Le 4ème paramètre dans les deux appels à la fonction
mysql_real_escape_string() est incorrecte. La taille indiquée ne correspond pas à la chaîne à coder.

La chaîne de caractères "C'est quoi ça" ne fait pas 11 caractères mais 13, et l'autre chaîne "donnée binaire : \0\r\n" fait 20 caractères et non 16.

Je vous laisse deviner l'origine de ces deux erreurs :-)...

  Posted by Ben Johnson on December 31, 2004
using mysql_real_escape_string() seems pretty clumsy. I thought... why not make it more like snprintf()? so I made a wrapper that's more like snprintf(). So far (after VERY little testing), it seems to work pretty well. (use at your own risk. please consider this GPL'd code.) It would be great if this were part of the mysql client lib.

basics:
int mysql_escapestring_sprintf(MYSQL * mysql, char * buf, size_t size, const char * fmt, ...)

- mysql is a database handle
- buf is preallocated buffer size characters long.
- fmt is a format string similar to those used in printf() functions
- ... is a variable length list of strings, integers, floats and characters

fmt can contain only %s, %d, %f, and %c format specifiers. The %s string format specifier is the only one that can be modifed with a length parameter. example: %8s means the associated string is exactly 8 bytes long. the length specifier is used in mysql_real_escape_string's len parameter, and the associated string is passed as the from parameter. this makes it possible to pass binary data.

The function returns the length of the created string. if the created string is truncated because of insufficient space in the buff parameter then the returned value is the length the string would have been. Test for that condition by testing if the returned value is greater than the size parameter. Under some conditions the function will call malloc(). If malloc fails this function will return -1 and errno should be set appropriately. Invalid format strings are more or less ignored and may result in a badly formed query string in buff.

the function is likely very platform dependent. it was developed on an up-to-date Debian 3.1 system.

example:
<pre>
int newlen = 0;
char escapedbuf[1024];
newlen = mysql_escapestring_sprintf(mysql, escapedbuf, sizeof(escapedbuf),
"select n from non where m = %d and name = '%s'", 4, "Penny" );
if(newlen < 0)
printf("error: %m\n");
else if(newlen > sizeof(escapedbuf))
printf("query too long\n");
else
printf("new query string: %s\n", escapedbuf);

</pre>

<pre>
/* this is licensed under the GPL. Ben Johnson is the author who borrowed heavily from man pages. */

/* because some limit is required, string
* must be less than 9999 characters long. */
#define MAX_STRMODIFIER_LEN 4
static int
snprintf_mysql_escaped_sql_statement(MYSQL * mysql, char * buf, size_t size, const char * fmt, ...)
{
int added_length = 0;
int offset_from_buf = 0;
int oversized = 0;
va_list ap;
int d;
double f;
char c, *s;
char current_char;
int expecting = 0;
char strmodifier[MAX_STRMODIFIER_LEN+1];
int strmodifier_offset = 0;
int keep_expecting = 0;
int isoverflowed = 0;

strmodifier[0] = '\0';

va_start(ap, fmt);
while ((current_char = *fmt++))
{
added_length = 0; /* this might not change */

if(! expecting)
{
switch(current_char)
{
case '%': /* start expecting */
expecting = 1;
break;
default:
if( offset_from_buf < size-2 )
{
buf[offset_from_buf] = current_char;
buf[offset_from_buf+1] = '\0'; /* null terminate every change */
}
else
isoverflowed = 1;
added_length = 1;
break;
}
}
else
{
keep_expecting = 0;

switch(current_char)
{
case '%': /* literal '%' character */
if( offset_from_buf < size-2 )
{
buf[offset_from_buf] = current_char;
buf[offset_from_buf+1] = '\0'; /* null terminate every change */
}
else
isoverflowed = 1;
added_length = 1;
break;
/* string length modifiers */
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
if( strmodifier_offset < MAX_STRMODIFIER_LEN )
{
strmodifier[strmodifier_offset] = current_char;
strmodifier[strmodifier_offset+1] = '\0'; /* null term every change */
strmodifier_offset++;
}
keep_expecting = 1;
break;
case 's': /* string */
{
int len = 0;
char * p = NULL; /* for quick malloc */
s = va_arg(ap, char *);

/* need string length, one way or another */
if( *strmodifier )
len = atoi(strmodifier);
else
len = strlen(s);

/* if remaining space is sufficient, use buf
* as mysql_real_escape_string()'s "to" arg */
if( size - offset_from_buf > len*2+1 )
added_length = mysql_real_escape_string(mysql, &buf[offset_from_buf], s, len);
else
{
p = malloc(len*2+1);
if( p )
{
added_length = mysql_real_escape_string(mysql, p, s, len);
if( offset_from_buf+added_length < size )
memcpy(&buf[offset_from_buf], p, added_length+1); /* +1 is null */
else
isoverflowed = 1;
free(p);
}
else
{
oversized = -1;
goto error_out;
}
}
}
break;
case 'd': /* int */
d = va_arg(ap, int);
added_length = snprintf(&buf[offset_from_buf], (size-offset_from_buf)+1, "%d", d);
if( added_length > (size-offset_from_buf)+1 )
isoverflowed = 1;
break;
case 'c': /* char */
/* need a cast here since va_arg only
takes fully promoted types */
c = (char) va_arg(ap, int);
added_length = snprintf(&buf[offset_from_buf], (size-offset_from_buf)+1, "%c", c);
if( added_length > (size-offset_from_buf)+1 )
isoverflowed = 1;
break;
case 'f': /* float/double */
f = va_arg(ap, double); /* promote floats */
added_length = snprintf(&buf[offset_from_buf], (size-offset_from_buf)+1, "%f", f);
if( added_length > (size-offset_from_buf)+1 )
isoverflowed = 1;
break;
default:
/* invalid format. maybe do an error here.
printf("%s[%d]: default? not adding: %c\n", __FUNCTION__, offset_from_buf, current_char);
*/
break;
}

if( ! keep_expecting )
{
strmodifier[0] = '\0';
strmodifier_offset = 0;
expecting = 0;
}
}

oversized += added_length;
if( added_length <= size - offset_from_buf )
offset_from_buf += added_length;
}

#if 0
if( isoverflowed )
memset(buf, 0, size);
#endif

error_out:
va_end(ap);

return oversized;
}

</pre>

  Posted by Saif Ullah Baig on May 15, 2007
I couldn't find a method for unescaping a string i.e. converting '\'', '\"' etc in a string back to ''', '"' etc. I've written the following function that can do this. It does the job by performing a single pass over the data to be unescaped.

enum STATE { NORMAL, ESCAPE } state;

// unescape: in-memory unescapes the string passed in as argument
// return: length of unescaped string
int unescape (char *str, int strLength) {
int sIndex = 0, dIndex = 0;

if (str == 0 || strLength <= 0) // validate string
return 0;

state = NORMAL;

while (sIndex < strLength) {
switch (state) {
// we are in the NORMAL state until we find a '\\' character
case NORMAL:
if (str [sIndex] == '\\') {
state = ESCAPE;
}
else {
// simply copy the non '\\' characters from
// the source to the destination
// in the NORMAL state
str [dIndex] = str [sIndex];
dIndex ++;
}

break;

case ESCAPE:
// in the ESCAPE state, copy the sequence
// like '\\\n', '\\\r', etc. as '\n', '\r', etc.
if (
str [sIndex] == '\0' ||
str [sIndex] == '\n' ||
str [sIndex] == '\r' ||
str [sIndex] == '\\' ||
str [sIndex] == '\'' ||
str [sIndex] == '\"' ||
str [sIndex] == (char) 0x1A /* Ctrl-Z */
) {
str [dIndex] = str [sIndex];
dIndex ++;
}
else {
// we have (mistakenly) entered the ESCAPE
// state, so output the suppressed '\\' character
// and the current character
str [dIndex] = '\\';
dIndex ++;
str [dIndex] = str [sIndex];
dIndex ++;
}

state = NORMAL;

break;

default:
// error !
break;
} // end switch

sIndex ++;
} // end while

// in case the last character is a '\\' indicated by the ESCAPE state at loop termination
// we have to copy the '\\' character
if (state == ESCAPE) {
str [dIndex] = '\\';
dIndex ++;
}

return dIndex;
}

// example usage
void ExampleUsage ()
{
char str [] = "\\\'Hello World\\\'";
unescape (str, strlen (str) + 1 /* last NULL byte */);
// OR: unescape (str, sizeof (str));

// after function call, str = 'Hello World'

}

  Posted by Evan Jones on October 26, 2010
The text here contracts the documentation in (8.1.1. Strings). The Strings documentation states that NULL ('\0'), backslash ('\\'), single quote ('\'') and double quote ('"') must all be escaped for binary data. For mysql_real_escape_string, the documentation states "Strictly speaking, MySQL requires only that backslash and the quote character [...] be escaped". I *think* that mysql_real_escape_string's documentation is the right one.
Sign Up Login You must be logged in to post a comment.