WL#3288: Multiple cursors per TABLE

Affects: Server-6.1   —   Status: Assigned   —   Priority: Low

MySQL will extend the storage engine API with calls required to support
consistent foreign keys in READ COMMITTED isolation mode.

Sorrento architecture meeting decision #2 No phantoms in non-SERIALIZABLE mode.

In order to avoid phantoms in foreign key constraints in modes different
from SERIALIZABLE, the tables that participate in a foreign key constraint
as referenced tables will be locked in a special lock mode. A new handler API
will be developed in order to lock the specific rows participating in the
foreign key check. Support for this API in the handler will be made a
requirement for proper work of foreign keys with this handler.

Note
----
Originally this task was part of the foreign keys plan. Since it involves
a lot of refactoring in the server, and is not strictly necessary (although is
very much desirable) to deliver foreign keys, it was removed from the plan
and renamed. The replacement task with a more narrow scope is WL#4603.
This following API proposal was agreed on by the Architecture Board in
Orlando January 2008:


/**
  Iterate over table records.

  Represents an abstract API for different table access methods
  (sequential scan, index scan) and selection criteria (all,
  primary key, range). Encapsulates forward and backward
  scrolling and updatability options.

  Forward scrolling is the minimal subset of functionality
  that is expected to be supported by all implementations in
  all storage engines. If a particular implementation
  does not support more advanced features, HA_ERR_WRONG_COMMAND
  is returned.

  The memory management of Handler_cursor is Sql_alloc based.
  The user must supply a memory root whenever a cursor is
  created.
*/

class Handler_cursor: public Sql_alloc
{
protected:
  /**
    Open the cursor and position it on the first record, if any.
    Supported by all implementations.

    @retval  0 in case of success.
    @retval  HA_ERR_END_OF_FILE There is no first row, the result set
             is empty
  */
  virtual int cursor_open()= 0;
  /**
    Similar to open, but applicable only to a cursor that has been
    opened already. Supported by all implementations.

    @return  0 if case of success.
    @return  HA_ERR_END_OF_FILE There is no first row, the result
             set is empty.
  */
  virtual int cursor_reset()= 0;

  /**
    Advance cursor position to the next record.
    Read the record under the current position into a buffer.

    @param[out]  buff  a buffer to read the record into.
                       Must be of the appropriate size,
                       defined by record format in the
                       table share.

    @retval  0  success
    @retval  HA_ERR_END_OF_FILE This is not an error.
             Returned when the cursor position has been
             advanced beyond the last row or
             on attempt to advance the position further
             when it's already beyond the last row.
             'buff' has been left intact.
    @retval  HA_ERR_* A storage engine error.
  */
  virtual int cursor_read_next(void *buff)= 0;
  /**
    Advance cursor position to the previous record.
    Read the record under the current position into a buffer.

    @param[out]  buff  a buffer to read the record into.

    @retval  0  success
    @retval  HA_ERR_WRONG_COMMAND The operation is not supported.
    @retval  HA_ERR_END_OF_FILE This is not an error.
             Returned when the cursor position has been
             advanced before the first row or
             on attempt to advance the position further
             when it's already before the first row.
             'buff' has been left intact.
    @retval  HA_ERR_* A storage engine error.
  */
  virtual int cursor_read_prev(void *buff)= 0;
  /**
    Update the record under the current position.

    @param[in]  buff New record value.

    @retval  0  success
    @retval  HA_ERR_WRONG_COMMAND The operation is not supported.
    @retval  HA_ERR_END_OF_FILE The cursor does not
             point to a record.
    @retval  HA_ERR_RECORD_IS_THE_SAME No update happened
             since the old and new records are the same.
    @retval  HA_ERR_* A storage engine error.
  */
  virtual int positioned_update(const void *buff)= 0;
  /**
    Delete the record under the current position.

    @retval  0  success
    @retval  HA_ERR_WRONG_COMMAND The operation is not supported.
    @retval  HA_ERR_END_OF_FILE The cursor does not
             point to a record.
    @retval  HA_ERR_* A storage engine error.
  */
  virtual int positioned_delete()= 0;

  virtual ~Handler_cursor();
  /**
    Capabilities flags. Must be initialized by the implementation.
  */
  uint32 m_cursor_flags;
public:
  enum {
    CAN_READ_PREV= 1, /* cursor_read_prev() is supported */
    CAN_MODIFY= 2 /* positioned_{update,delete}() are supported */
  };
  /** Redirect to protected counterparts */
  virtual int ha_cursor_open();
  virtual int ha_cursor_reset();
  virtual int ha_cursor_read_next(void *buff);
  virtual int ha_cursor_read_prev(void *buff);
  virtual int ha_positioned_update(const void *buff);
  virtual int ha_positioned_delete();
};

This API over time will replace the current cursor API present in the PSEA API:
handler::rnd_read, handler::rnd_read, handler::update_row(), handler::delete_row().
For backward compatibility with the old handlers, handler::create_cursor()
will provide a default implementation which will work using handler::this and
handler::clone() - this way existing handlers will continue to work. And
default Handler_cursor will work via handler::index_next/etc.


In order to completely decouple reading and writing functionality from st_table
+ handler pair, a new class will be introduced for reading and writing data
data in table->record format:

/**
  Exchange data with the storage engine.
  
  This class represents functionality necessary to encrypt and
   decrypt data from PSEA record format (original MyISAM record
   format).

   It consists of:
   - record buffer for the record
   - Field array to read and write data from the record.
   - auxiliary members to support MySQL semantics of auto_increment,
   auto-updatable timestamp and other when writing records to a table.

   How to use:
   - use in conjunction with Handler_cursor
   - create a Handler_cursor and Field_record
   - pass the buffer for ::cursor_read_prev() method to read data
     and use Fields to encrypt data
   - fill the record buffer using the Field_record functionality
      and pass the buffer to ::positioned_update method of Handler_cursor 
*/
class Field_record: public Sql_alloc
{
   Field *field[];
   uchar *record;
   Field *next_number_field;
   Field *timestamp_field;
   uchar *default_values;
   /* The table for this record */
   struct st_table *table;
};

Todo:
 - address replication issues:
   * should positioned_update() automatically advance the cursor
   * read_set shold be a part of the spec - one should be able to scan
     using one read set, and read using another
   * old image must be available for the binary logging in update and delete
     methods