MySQL Internals Manual  /  A Guided Tour Of The MySQL Source Code  /  The Skeleton of the Server Code

1.9 The Skeleton of the Server Code

And now we're going to walk through something harder, namely the server.

WARNING WARNING WARNING: code changes constantly, so names and parameters may have changed by the time you read this.

Important files we'll be walking through:


This is not as simple as what we've just done. In fact we'll need multiple pages to walk through this one, and that's despite our use of truncation and condensation again. But the server is important, and if you can grasp what we're doing with it, you'll have grasped the essence of what the MySQL source code is all about.

We'll mostly be looking at programs in the sql directory, which is where mysqld and most of the programs for the SQL engine code are stored.

Our objective is to follow the server from the time it starts up, through a single INSERT statement that it receives from a client, to the point where it finally performs the low level write in the MyISAM file.

Walking Through The Server Code: /sql/

  int main(int argc, char **argv)
    (void) thr_setconcurrency(concurrency);
    server_init();                             // 'bind' + 'listen'
    acl_init((THD *)0, opt_noacl);
    handle_connections_sockets(0);             // !
    DBUG_PRINT("quit",("Exiting main thread"));

Here is where it all starts, in the main function of

Notice that we show a directory name and program name just above this snippet. We will do the same for all the snippets in this series.

By glancing at this snippet for a few seconds, you will probably see that the main function is doing some initial checks on startup, is initializing some components, is calling a function named handle_connections_sockets, and then is exiting. It's possible that acl stands for "access control" and it's interesting that DBUG_PRINT is something from Fred Fish's debug library, which we've mentioned before. But we must not digress.

In fact there are 150 code lines in the main function, and we're only showing 13 code lines. That will give you an idea of how much we are shaving and pruning. We threw away the error checks, the side paths, the optional code, and the variables. But we did not change what was left. You will be able to find these lines if you take an editor to the program, and the same applies for all the other routines in the snippets in this series.

The one thing you won't see in the actual source code is the little marker "// !". This marker will always be on the line of the function that will be the subject of the next snippet. In this case, it means that the next snippet will show the handle_connection_sockets function. To prove that, let's go to the next snippet.

Walking Through The Server Code: /sql/

  handle_connections_sockets (arg __attribute__((unused))
     if (ip_sock != INVALID_SOCKET)
       DBUG_PRINT("general",("Waiting for connections."));
       while (!abort_loop)
         new_sock = accept(sock, my_reinterpret_cast(struct sockaddr*)
           (&cAddr),             &length);
         thd= new THD;
         if (sock == unix_sock)
         thd->host=(char*) localhost;
         create_new_thread(thd);            // !

Inside handle_connections_sockets you'll see the hallmarks of a classic client/server architecture. In a classic client/server, the server has a main thread which is always listening for incoming requests from new clients. Once it receives such a request, it assigns resources which will be exclusive to that client. In particular, the main thread will spawn a new thread just to handle the connection. Then the main server will loop and listen for new connections — but we will leave it and follow the new thread.

As well as the sockets code that we chose to display here, there are several variants of this thread loop, because clients can choose to connect in other ways, for example with named pipes or with shared memory. But the important item to note from this section is that the server is spawning new threads.

Walking Through The Server Code: /sql/

  create_new_thread(THD *thd)
        handle_one_connection,                        // !
        (void*) thd));

Here is a close look at the routine that spawns the new thread. The noticeable detail is that, as you can see, it uses a mutex or mutual exclusion object. MySQL has a great variety of mutexes that it uses to keep actions of all the threads from conflicting with each other.

Walking Through The Server Code: /sql/

handle_one_connection(THD *thd)
    init_sql_alloc(&thd->mem_root, MEM_ROOT_BLOCK_SIZE, MEM_ROOT_PREALLOC);
    while (!net->error && net->vio != 0 && !thd->killed)
      if (do_command(thd))            // !
    packet=(char*) net->read_pos; 

With this snippet, we've wandered out of Now, we're in the sql_parse file, still in the sql directory. This is where the session's big loop is.

The loop repeatedly gets and does commands. When it ends, the connection closes. At that point, the thread will end and the resources for it will be deallocated.

But we're more interested in what happens inside the loop, when we call the do_command function.


   client           <===== MESSAGE ====> server                     <======PACKETS ====>


To put it graphically, at this point there is a long-lasting connection between the client and one server thread. Message packets will go back and forth between them through this connection. For today's tour, let's assume that the client passes the INSERT statement shown on the Graphic, for the server to process.

Walking Through The Server Code: /sql/

bool do_command(THD *thd)
  packet=(char*) net->read_pos;
  command = (enum enum_server_command) (uchar) packet[0];
  dispatch_command(command,thd, packet+1, (uint) packet_length);
// !

You've probably noticed by now that whenever we call a lower-level routine, we pass an argument named thd, which is an abbreviation for the word thread (we think). This is the essential context which we must never lose.

The my_net_read function is in another file called The function gets a packet from the client, uncompresses it, and strips the header.

Once that's done, we've got a multi-byte variable named packet which contains what the client has sent. The first byte is important because it contains a code identifying the type of message.

We'll pass that and the rest of the packet on to the dispatch_command function.

Walking Through The Server Code: /sql/

bool dispatch_command(enum enum_server_command command, THD *thd,
       char* packet, uint packet_length)
  switch (command) {
    case COM_INIT_DB:          ...
    case COM_REGISTER_SLAVE:   ...
    case COM_TABLE_DUMP:       ...
    case COM_CHANGE_USER:      ...
    case COM_EXECUTE:
    case COM_LONG_DATA:        ...
    case COM_PREPARE:
         mysql_stmt_prepare(thd, packet, packet_length);   // !
    /* and so on for 18 other cases */
     send_error(thd, ER_UNKNOWN_COM_ERROR);

And here's just part of a very large switch statement in The snippet doesn't have room to show the rest, but you'll see when you look at the dispatch_command function that there are more case statements after the ones that you see here.

There will be — we're going into list mode now and just reciting the rest of the items in the switch statement — code for prepare, close statement, query, quit, create database, drop database, dump binary log, refresh, statistics, get process info, kill process, sleep, connect, and several minor commands. This is the big junction.

We have cut out the code for all of the cases except for two, COM_EXECUTE and COM_PREPARE.

Walking Through The Server Code: /sql/

We are not going to follow what happens with COM_PREPARE. Instead, we are going to follow the code after COM_EXECUTE. But we'll have to digress from our main line for a moment and explain what the prepare does.

Parse the query
Allocate a new statement, keep it in 'thd->prepared statements' pool
Return to client the total number of parameters and result-set
metadata information (if any)"

The prepare is the step that must happen before execute happens. It consists of checking for syntax errors, looking up any tables and columns referenced in the statement, and setting up tables for the execute to use. Once a prepare is done, an execute can be done multiple times without having to go through the syntax checking and table lookups again.

Since we're not going to walk through the COM_PREPARE code, we decided not to show its code at this point. Instead, we have cut and pasted some code comments that describe prepare. All we're illustrating here is that there are comments in the code, so you will have aid when you look harder at the prepare code.

Walking Through The Server Code: /sql/

  bool dispatch_command(enum enum_server_command command, THD *thd,
       char* packet, uint packet_length)
  switch (command) {
    case COM_INIT_DB:          ...
    case COM_REGISTER_SLAVE:   ...
    case COM_TABLE_DUMP:       ...
    case COM_CHANGE_USER:      ...
    case COM_EXECUTE:
         mysql_stmt_execute(thd,packet);                   // !
    case COM_LONG_DATA:        ...
    case COM_PREPARE:
         mysql_stmt_prepare(thd, packet, packet_length);
    /* and so on for 18 other cases */
     send_error(thd, ER_UNKNOWN_COM_ERROR);

Let's return to the grand central junction again in for a moment. The thing to note on this snippet is that the line which we're really going to follow is what happens for COM_EXECUTE.

Walking Through The Server Code: /sql/

  void mysql_stmt_execute(THD *thd, char *packet)
    if (!(stmt=find_prepared_statement(thd, stmt_id, "execute")))
    mysql_execute_command(thd);           // !

In this case, the line that we're following is the line that executes a statement.

Notice how we keep carrying the THD thread and the packet along with us, and notice that we expect to find a prepared statement waiting for us, since this is the execute phase. Notice as well that we continue to sprinkle error-related functions that begin with the letters DBUG, for use by the debug library. Finally, notice that the identifier "stmt" is the same name that ODBC uses for the equivalent object. We try to use standard names when they fit.

Walking Through The Server Code: /sql/

  void mysql_execute_command(THD *thd)
       switch (lex->sql_command) {
       case SQLCOM_SELECT: ...
       case SQLCOM_SHOW_ERRORS: ...
       case SQLCOM_CREATE_TABLE: ...
       case SQLCOM_UPDATE: ...
       case SQLCOM_INSERT: ...                   // !
       case SQLCOM_DELETE: ...
       case SQLCOM_DROP_TABLE: ...

In the mysql_execute_command function. we encounter another junction. One of the items in the switch statement is named SQLCOM_INSERT.

Walking Through The Server Code: /sql/

  my_bool update=(lex->value_list.elements ? UPDATE_ACL : 0);
  ulong privilege= (lex->duplicates == DUP_REPLACE ?
                    INSERT_ACL | DELETE_ACL : INSERT_ACL | update);
  if (check_access(thd,privilege,tables->db,&tables->grant.privilege))
    goto error;
  if (grant_option && check_grant(thd,privilege,tables))
    goto error;
  if (select_lex->item_list.elements != lex->value_list.elements)
  res = mysql_insert(thd,tables,lex->field_list,lex->many_values,
                     select_lex->item_list, lex->value_list,
                     (update ? DUP_UPDATE : lex->duplicates));
// !
  if (thd->net.report_error)
    res= -1;

For this snippet, we've blown up the code around the SQLCOM_INSERT case in the mysql_execute_command function. The first thing to do is check whether the user has the appropriate privileges for doing an INSERT into the table, and this is the place where the server checks for that, by calling the check_access and check_grant functions. It would be tempting to follow those functions, but those are side paths. Instead, we'll follow the path where the work is going on.

Walking Through The Server Code: Navigation Aid

Some program names in the /sql directory:

Program Name          SQL statement type
------------          ------------------         DELETE             DO        HANDLER           HELP         INSERT            // !           LOAD         RENAME         SELECT           SHOW         UPDATE

Question: Where will mysql_insert() be?

The line that we're following will take us next to a routine named mysql_insert. Sometimes it's difficult to guess what program a routine will be in, because MySQL has no consistent naming convention. However, here is one aid to navigation that works for some statement types. In the sql directory, the names of some programs correspond to statement types. This happens to be the case for INSERT, for instance. So the mysql_insert program will be in the program But there's no reliable rule.

(Let's add here a few sentences about the tags 'ctags' program. When an editor supports ctags (and the list is long, but vi and emacs of course are there), the function definition is one key press away - no guessing involved. In the above case, a vim user could press ^] on mysql_insert name and vim would open and position the curson on the first line of the mysql_insert() function. The tags help can be indispensable in everyday work.)

Walking Through The Server Code: /sql/

 int mysql_insert(THD *thd,TABLE_LIST *table_list, List<Item> &fields,
        List<List_item> &values_list,enum_duplicates duplic)
    table = open_ltable(thd,table_list,lock_type);
    if (check_insert_fields(thd,table,fields,*values,1) ||
      setup_tables(table_list) ||
      goto abort;
    error=write_record(table,&info);                 // !
    query_cache_invalidate3(thd, table_list, 1);
    if (transactional_table)
    query_cache_invalidate3(thd, table_list, 1);
    mysql_unlock_tables(thd, thd->lock);

For the mysql_insert routine, we're just going to read what's in the snippet. What we're trying to do here is highlight the fact that the function names and variable names are nearly English.

Okay, we start by opening a table. Then, if a check of the fields in the INSERT fails, or if an attempt to set up the tables fails, or if an attempt to set up the fields fails, we'll abort.

Next, we'll fill the record buffer with values. Then we'll write the record. Then we'll invalidate the query cache. Remember, by the way, that MySQL stores frequently-used select statements and result sets in memory as an optimization, but once the insert succeeds the stored sets are invalid. Finally, we'll unlock the tables.

Walking Through The Server Code: /sql/

  int write_record(TABLE *table,COPY_INFO *info)
    table->file->write_row(table->record[0];           // !

You can see from our marker that we're going to follow the line that contains the words 'write row'. But this is not an ordinary function call, so people who are just reading the code without the aid of a debugger can easily miss what the next point is in the line of execution here. The fact is, 'write_row' can take us to one of several different places.

Walking Through The Server Code: /sql/handler.h

  /* The handler for a table type.
     Will be included in the TABLE structure */

  handler(TABLE *table_arg) :
    data_file_length(0), max_data_file_length(0),
    delete_length(0), auto_increment_value(0), raid_type(0),
    create_time(0), check_time(0), update_time(0), mean_rec_length(0),
  virtual int write_row(byte * buf)=0;

To see what the write_row statement is doing, we'll have to look at one of the include files. In handler.h on the sql directory, we find that write_row is associated with a handler for a table. This definition is telling us that the address in write_row will vary it gets filled in at run time. In fact, there are several possible addresses.

There is one address for each handler. In our case, since we're using the default values, the value at this point will be the address of write_row in the MyISAM handler program.

Walking Through The Server Code: /sql/

int ha_myisam::write_row(byte * buf)
   /* If we have a timestamp column, update it to the current time */
   if (table->time_stamp)
  If we have an auto_increment column and we are writing a changed row
    or a new row, then update the auto_increment value in the record.
  if (table->next_number_field && buf == table->record[0])
  return mi_write(file,buf);     // !

And that brings us to write_row in the program. Remember we told you that these programs beginning with the letters ha are interfaces to handlers, and this one is the interface to the myisam handler. We have at last reached the point where we're ready to call something in the handler package.

Walking Through The Server Code: /myisam/mi_write.c

int mi_write(MI_INFO *info, byte *record)
  /* Calculate and check all unique constraints */
  for (i=0 ; i < share->state.header.uniques ; i++)

  ... to be continued in next snippet

Notice that at this point there is no more referencing of tables, the comments are about files and index keys. We have reached the bottom level at last. Notice as well that we are now in a C program, not a C++ program.

In this first half of the mi_write function, we see a call which is clearly commented. This is where checking happens for uniqueness (not the UNIQUE constraint, but an internal matter).

Walking Through The Server Code: /myisam/mi_write.c

 ... continued from previous snippet

  /* Write all keys to indextree */
  for (i=0 ; i < share->base.keys ; i++)
  if (share->base.auto_key)

In this second half of the mi_write function, we see another clear comment, to the effect that this is where the new keys are made for any indexed columns. Then we see the culmination of all that the last 20 snippets have been preparing, the moment we've all been waiting for, the writing of the record.

And, since the object of the INSERT statement is ultimately to cause a write to a record in a file, that's that. The server has done the job.

Walking Through The Server Code: Stack Trace

main in /sql/
handle_connections_sockets in /sql/
create_new_thread in /sql/
handle_one_connection in /sql/
do_command in /sql/
dispatch_command in /sql/
mysql_stmt_execute in /sql/
mysql_execute_command in /sql/
mysql_insert in /sql/
write_record in /sql/
ha_myisam::write_row in /sql/
mi_write in /myisam/mi_write.c

And now here's a look at what's above us on the stack, or at least an idea of how we got here. We started with the main program in We proceeded through the creation of a thread for the client, the several junction processes that determined where we're heading, the parsing and initial execution of an SQL statement, the decision to invoke the MyISAM handler, and the writing of the row. We ended in a low level place, where we're calling the routines that write to the file. That's about as low as we should go today.

The server program would, of course, continue by returning several times in a row, sending a packet to the client saying "Okay", and ending up back in the loop inside the handle_one_connection function.

We, instead, will pause for a moment in awe at the amount of code we've just flitted past. And that will end our walk through the server code.

Graphic: A Chunk of MyISAM File

   column1 CHAR(1),
   column2 CHAR(1),
   column3 CHAR(1));

INSERT INTO Table1 VALUES ('a', 'b', 'c');

INSERT INTO Table1 VALUES ('d', NULL, 'e');

F1 61 62 63 00 F5 64 00 66 00 ... .abc..d e.

Continuing with our worm's-eye view, let's glance at the structure of a record in a MyISAM file.

The SQL statements on this graphic show a table definition and some insert statements that we used to populate the table.

The final line on the graphic is a hexadecimal dump display of the two records that we ended up with, as taken from the MyISAM file for Table1.

The thing to notice here is that the records are stored compactly. There is one byte at the start of each record F1 for the first record and F5 for the second record which contains a bit list.

When a bit is on, that means its corresponding field is NULL. That's why the second row, which has a NULL in the second column, or field, has a different header byte from the first row.

Complications are possible, but a simple record really does look this simple.

Graphic: A Chunk of InnoDB File

19 17 15 13 0C 06 Field Start Offsets /* First Row */
00 00 78 0D 02 BF Extra Bytes
00 00 00 00 04 21 System Column #1
00 00 00 00 09 2A System Column #2
80 00 00 00 2D 00 84 System Column #3
50 50 Field1 'PP'
50 50 Field2 'PP'
50 50 Field3 'PP'

If, on the other hand, you look at an InnoDB file, you'll find that it's got more complexities in the storage. The details are elsewhere in this document. But here's an introductory look.

The header here begins with offsets unlike MyISAM, which has no offsets. So you'd have to go through column 1 before getting to column 2.

Then there is a fixed header the extra bytes.

Then comes the record proper. The first fields of a typical record contain information that the user won't see, such as a row ID, a transaction ID, and a rollback pointer. This part would look different if the user had defined a primary key during the CREATE TABLE statement.

And finally there are the column contents the string of Ps at the end of the snippet here. You can see that InnoDB does more administrating.

There's been a recent change for InnoDB; what you see above is from a database made before version 5.0.

Graphic: A Packet

Number Of Rows
Message Content

Our final worm's-eye look at a physical structure will be a look at packets.

By packet, we mean: what's the format of a message that the client sends over the tcp/ip line to the server and what does the server send back?

Here we're not displaying a dump. If you want to see hexadecimal dumps of the contents of packets, this document is full of them. We're just going to note that a typical message will have a header, an identifier, and a length, followed by the message contents.

Admittedly this isn't following a standard like ISO's RDA or IBM's DRDA, but it's documented so if you want to go out and write your own type 4 JDBC driver, you've got what you need here. (Subject to license restrictions, of course.) But a word of advice on that last point: it's already been done. Mark Matthews wrote it originally, it's all in "MySQL Connector/J".

Download this Manual
EPUB - 0.8Mb
User Comments
Sign Up Login You must be logged in to post a comment.