WL#7387: Unreliable Failure Detector support in Connector/Python
Affects: Connector/Python-1.2
—
Status: Complete
GOAL
====
The goal is to make Connector/Python report errors to Fabric while accessing
a MySQL instance. The reported data will then be used to update the backing
store and trigger a failover operation, provided the faulty MySQL instance is
a primary and Fabric has received enough complaints from different connectors.

REMARKS
========
. See WL#7455 for information on how we are planning to handle security issues.

User Documentation
==================
http://dev.mysql.com/doc/relnotes/connector-python/en/news-1-2-1.html
http://dev.mysql.com/doc/mysql-utilities/1.4/en/connector-python-fabric-connect.html
Requirements
============
1) It shall be possible to configure whether a connection will report errors
   back to Fabric or not.

2) It shall be possible to dynamically extend the set of errors that will
   trigger a notification to Fabric.

3) By default the set in item 2 must contain the following errors:
   CR_SERVER_LOST, CR_SERVER_GONE_ERROR, etc.

4) There will be a distinction between errors that are reported back to
   Fabric and errors that invalidate the connector's cache.

5) If the report function fails, an error is reported but no exception shall
   be raised.
Avoid thundering herds
======================
The ability to report errors back to Fabric must be used wisely, otherwise
Fabric may suffer from the thundering herd effect. If all connections attempt
to report an error after a server failure, Fabric will be swamped with many
requests at around the same time.

To avoid this problem, we advise users to designate key connection(s) in the
application to report errors, or to devise a routine that periodically checks
the servers and reports errors. This routine would work as a distributed
failure detector and might be spawned in a different thread within the
application's context or as a separate process.

Handling Errors
===============
Any error has the Error class as its base class. In the context of this work,
there are two important errors that deserve attention:

InterfaceError - This exception is raised whenever the connector is not able
to establish a connection to a server. For example, this may be raised because
Fabric is not accessible and there is no valid cache entry.

MySQLFabricError - This exception is raised whenever there is an error while
processing a request (i.e. statement) and the error triggers a cache
invalidation. The connector catches the original exception, invalidates the
cache and raises the MySQLFabricError. This makes it easy to develop
fault-tolerant applications, as the developer knows that after getting such an
error an issue was reported back to Fabric, the cache was invalidated and the
faulty server might have been replaced or at least tagged as faulty.

Security issues
===============
Security issues are handled as described in WL#7455.
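The fault-tolerance pattern described under "Handling Errors" can be sketched
as a small retry loop in the application. This is a minimal illustration, not
the connector's own code: the MySQLFabricError class below is a local stand-in
so that the sketch runs without the connector installed, and run_with_retry,
flaky_transaction, attempts and delay are hypothetical names.

```python
import time


class MySQLFabricError(Exception):
    """Local stand-in for the connector's MySQLFabricError (assumption)."""


def run_with_retry(transaction, attempts=3, delay=0.0):
    """Call transaction() until it succeeds or attempts are exhausted.

    Only MySQLFabricError is retried: it signals that the cache was
    invalidated, so the next attempt can look the server up again.
    Any other exception propagates immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except MySQLFabricError:
            if attempt == attempts:
                raise  # give up and surface the last temporary error
            if delay > 0:
                time.sleep(delay)


# Usage: a hypothetical transaction that fails twice, then succeeds.
calls = {"count": 0}


def flaky_transaction():
    calls["count"] += 1
    if calls["count"] < 3:
        raise MySQLFabricError("Temporary error; retry transaction")
    return "committed"


result = run_with_retry(flaky_transaction, attempts=5)
```

In a real application the transaction callable would obtain a fresh connection
on every attempt, since the connector disconnects before raising the error.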
User Interface
==============

Making a connection report errors
---------------------------------
The option to report errors is part of the Fabric configuration and can be
set as follows:

    fabric_config = {
        'host': ..,
        'report_errors': True,
    }
    cnx = mysql.connector.connect(fabric=fabric_config)

Defining which errors to report
-------------------------------
The set of errors that may be reported can be dynamically updated as follows:

    from mysql.connector.fabric import extra_failure_report
    extra_failure_report([error_code_0, error_code_1, ...])

Defining which errors trigger a cache invalidation
--------------------------------------------------
There is no function to change the set of errors that trigger a cache
invalidation. However, the RESET_CACHE_ON_ERROR global variable, which stores
this information, can be updated as follows:

    from mysql.connector.fabric import RESET_CACHE_ON_ERROR
    RESET_CACHE_ON_ERROR.append(error_code_0)

Inside the Connector Python
===========================

Defining which errors to report
-------------------------------
Two global variables are used to store the set of errors that are reported
back to Fabric: REPORT_ERRORS and REPORT_ERRORS_EXTRA.
The extra_failure_report function, previously described, is implemented as
follows:

    def extra_failure_report(error_codes):
        global REPORT_ERRORS_EXTRA
        # An empty or None argument resets the extra set.
        if not error_codes:
            REPORT_ERRORS_EXTRA = []
            return
        if not isinstance(error_codes, (list, tuple)):
            error_codes = [error_codes]
        for code in error_codes:
            if not isinstance(code, int) or not (code >= 1000 and code < 3000):
                raise AttributeError("Unknown or invalid error code.")
            REPORT_ERRORS_EXTRA.append(code)

REPORT_ERRORS, in contrast, holds a predefined set of errors and cannot be
changed:

    REPORT_ERRORS = (
        errorcode.CR_SERVER_LOST,
        errorcode.CR_SERVER_GONE_ERROR,
        errorcode.CR_CONN_HOST_ERROR,
        errorcode.CR_CONNECTION_ERROR,
        errorcode.CR_IPSOCK_ERROR,
        errorcode.ER_OPTION_PREVENTS_STATEMENT,
    )

Handling errors
---------------
The following function handles the error and the cache invalidation:

    class MySQLFabricConnection(object):
        ...
        def handle_mysql_error(self, exc):
            if exc.errno in RESET_CACHE_ON_ERROR:
                self.disconnect()
                self._fabric.report_error(mysqlserver.uuid, exc.errno)
                self.reset_cache()
                raise MySQLFabricError(
                    "Temporary error ({error}); "
                    "retry transaction".format(error=str(exc)))
            self._fabric.report_error(mysqlserver.uuid, exc.errno)
            raise exc

It is called whenever there is an error while processing a statement. However,
errors while trying to get a connection are handled as follows:

    class MySQLFabricConnection(object):
        ...
        while True:
            counter += 1
            ...
            dbconfig['host'] = mysqlserver.host
            dbconfig['port'] = mysqlserver.port
            try:
                self._mysql_cnx = mysql.connector.connect(**dbconfig)
            except Error as exc:
                # Report and give up only after the last attempt.
                if counter == attempts:
                    self._fabric.report_error(mysqlserver.uuid, exc.errno)
                    self.reset_cache(mysqlserver.group)
                    raise InterfaceError(
                        "Reported faulty server to Fabric ({0})".format(exc))
                if attempt_delay > 0:
                    time.sleep(attempt_delay)
                continue
            else:
                self._fabric_mysql_server = mysqlserver
                break

Report error function
---------------------
The report error function is implemented as follows:

    class Fabric(object):
        ...
        def report_error(self, server_uuid, errno):
            if not self._report_errors:
                return
            errno = int(errno)
            current_host = socket.getfqdn()
            if errno in REPORT_ERRORS or errno in REPORT_ERRORS_EXTRA:
                inst = self.get_instance()
                try:
                    inst.proxy.threat.report_error(server_uuid, current_host,
                                                   errno)
                except (Fault, socket.error):
                    # A failed report must not raise (see requirement 5).
                    pass
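The filtering and best-effort behaviour of the report function can be
exercised in isolation with the sketch below. It is an illustration under
stated assumptions, not the connector's implementation: FailingProxy, the
standalone report_error signature, the reporter argument and the sample UUID
are hypothetical, and the numeric codes are the client error numbers for
CR_SERVER_LOST (2013) and CR_SERVER_GONE_ERROR (2006).

```python
import socket
from xmlrpc.client import Fault

# Reduced sets for illustration only (assumption).
REPORT_ERRORS = (2013, 2006)
REPORT_ERRORS_EXTRA = []


class FailingProxy(object):
    """Hypothetical stand-in for the Fabric proxy; every call fails."""

    def __init__(self):
        self.calls = 0

    def report_error(self, server_uuid, reporter, errno):
        self.calls += 1
        raise socket.error("Fabric is not reachable")


def report_error(proxy, server_uuid, errno, enabled=True,
                 reporter="app-host.example"):
    """Return True if the report was delivered, False otherwise.

    Errors outside the configured sets are never sent, and a delivery
    failure is swallowed instead of being raised to the caller.
    """
    if not enabled:
        return False
    errno = int(errno)
    if errno not in REPORT_ERRORS and errno not in REPORT_ERRORS_EXTRA:
        return False
    try:
        proxy.report_error(server_uuid, reporter, errno)
    except (Fault, socket.error):
        return False
    return True


proxy = FailingProxy()
# A reportable error: the proxy is called, fails, and nothing is raised.
delivered = report_error(proxy, "hypothetical-uuid", 2013)
# A non-reportable error (1064, a syntax error): the proxy is not called.
ignored = report_error(proxy, "hypothetical-uuid", 1064)
```

Returning a boolean is a choice made for this sketch so that the outcome is
observable; the function shown above returns nothing in either case.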