MySQL 5.6 Reference Manual

17.4.4 Troubleshooting Replication

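If you have followed the instructions but your replication setup is not working, the first thing to do is check the error log for messages. Many users have lost time by not doing this soon enough after encountering problems.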

If you cannot tell from the error log what the problem was, try the following techniques:

  • Verify that the master has binary logging enabled by issuing a SHOW MASTER STATUS statement. If logging is enabled, Position is nonzero. If binary logging is not enabled, verify that you are running the master with the --log-bin option.

  • Verify that the master and slave were both started with the --server-id option and that the ID value is unique on each server.

  • Verify that the slave is running. Use SHOW SLAVE STATUS to check whether the Slave_IO_Running and Slave_SQL_Running values are both Yes. If not, verify the options that were used when starting the slave server. For example, --skip-slave-start prevents the slave threads from starting until you issue a START SLAVE statement.

  • If the slave is running, check whether it established a connection to the master. Use SHOW PROCESSLIST, find the I/O and SQL threads, and check their State column to see what they display. See Section 17.2.1, "Replication Implementation Details". If the I/O thread state says Connecting to master, check the following:

    • Verify the privileges for the user being used for replication on the master.

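    • Check that the host name of the master is correct and that you are using the correct port to connect to the master. The port used for replication is the same as the one used for client network communication (the default is 3306). For the host name, ensure that the name resolves to the correct IP address.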

    • Check that networking has not been disabled on the master or slave. Look for the skip-networking option in the configuration file. If present, comment it out or remove it.

    • If the master has a firewall or IP filtering configuration, ensure that the network port being used for MySQL is not being filtered.

    • Check that you can reach the master by using ping or traceroute/tracert to reach the host.

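  • If the slave was running previously but has stopped, the reason usually is that some statement that succeeded on the master failed on the slave. This should never happen if you have taken a proper snapshot of the master and never modified the data on the slave outside of the slave thread. If the slave stops unexpectedly, it is a bug or you have encountered one of the known replication limitations described in Section 17.4.1, "Replication Features and Issues". If it is a bug, see Section 17.4.5, "How to Report Replication Bugs or Problems", for instructions on how to report it.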

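  • If a statement that succeeded on the master refuses to run on the slave, and it is not feasible to do a full database resynchronization by deleting the slave's databases and copying a new snapshot from the master, try the following procedure: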

    1. Determine whether the affected table on the slave is different from the master table. Try to understand how this happened. Then make the slave's table identical to the master's and run START SLAVE.

    2. If the preceding step does not work or does not apply, try to understand whether it would be safe to make the update manually (if needed) and then ignore the next statement from the master.

    3. If you decide that the slave can skip the next statement from the master, issue the following statements:

      mysql> SET GLOBAL sql_slave_skip_counter = N;
      mysql> START SLAVE;
      

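      The value of N should be 1 if the next statement from the master does not use AUTO_INCREMENT or LAST_INSERT_ID(). Otherwise, the value should be 2. The reason for using a value of 2 for statements that use AUTO_INCREMENT or LAST_INSERT_ID() is that they take two events in the binary log of the master.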

      See also Section 13.4.2.4, "SET GLOBAL sql_slave_skip_counter Syntax".

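    4. If you are sure that the slave started out perfectly synchronized with the master, and that no one has updated the tables involved outside of the slave thread, then presumably the discrepancy is the result of a bug. If you are running the most recent version of MySQL, please report the problem. If you are running an older version, try upgrading to the latest production release to determine whether the problem persists.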

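The slave status checks in the list above can be scripted. The following is a minimal Python sketch, not something shipped with MySQL: it assumes you have already fetched one SHOW SLAVE STATUS row as a dictionary through whatever client library you use. The function name diagnose_slave_status and the wording of the hints are illustrative only.

```python
def diagnose_slave_status(row):
    """Return troubleshooting hints for one SHOW SLAVE STATUS row (a dict).

    Hypothetical helper: the keys are SHOW SLAVE STATUS column names,
    the hint texts paraphrase the checklist in this section.
    """
    hints = []
    io_state = row.get("Slave_IO_State") or ""
    if row.get("Slave_IO_Running") != "Yes":
        if io_state.startswith("Connecting to master"):
            hints.append(
                "I/O thread is still connecting: check the replication "
                "user's privileges, the master host name and port, "
                "skip-networking, and firewalls"
            )
        else:
            hints.append(
                "I/O thread is not running: check startup options such as "
                "--skip-slave-start, then issue START SLAVE"
            )
    if row.get("Slave_SQL_Running") != "Yes":
        hints.append(
            "SQL thread is not running: check the error log and Last_Error"
        )
    return hints
```

An empty list means both threads report Yes; anything else points back to the matching item in the checklist.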

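The host name, port, and reachability checks listed for the Connecting to master state can be approximated from the slave host. This is a sketch using only the Python standard library; check_master_reachable is a hypothetical helper, and the 3-second timeout is an arbitrary assumption. A successful TCP connection shows only that the port is open, not that the replication account can log in.

```python
import socket


def check_master_reachable(host, port=3306, timeout=3.0):
    """Resolve host and try a plain TCP connection to host:port.

    Returns a dict with the resolved addresses, whether the TCP
    connection succeeded, and the error text if it did not.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # The name does not resolve at all.
        return {"resolved": [], "connected": False, "error": str(exc)}
    addresses = sorted({info[4][0] for info in infos})
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return {"resolved": addresses, "connected": True, "error": None}
    except OSError as exc:
        # Refused, filtered, or timed out.
        return {"resolved": addresses, "connected": False, "error": str(exc)}
```

Compare the resolved addresses with the IP address you expect for the master; a refused or timed-out connection suggests skip-networking, a wrong port, or a firewall.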
User Comments
  Posted by on May 17, 2002
Note that if your client does not do a "USE
dbname", binlog-do-db=dbname will not binlog a
query like: "update dbname.foobar set foo=1"

You explicitly have to do a USE before a query in
order to have your query binlogged, it looks
like. Replication on the slave side can do
wildcard matches .. but the master cannot (a la
binlog-wild-do-table=dbname.%). So make sure your
clients do a use, if you plan to replicate those
tables it updates.
  Posted by on May 17, 2002
for MySQL v3.23.28:
When you attempt to use a certain
master-user/master-password combo to connect to
the mysql master, and you later change my.cnf to
attempt to connect with a new user, you must
update master.info to reflect the changes.

Since my master.info file only had one entry in
it (the slave only has one master), I simply
deleted the file. Upon restarting the slave
daemon, a new master.info was automatically
written.

--Curby
  Posted by Ed McGuigan on May 17, 2002
If you need to roll your own log file rotation
script as I did, and you are familiar with Perl,
look at the Log::Rotate module on CPAN rather
than reinventing the wheel.
  Posted by Renato Golin on May 17, 2002
If you intend to use "load table from master", the "rep" user needs access to that table, contrary to what this section says about only having to set "File_priv" for the "rep" user on all databases.
  Posted by Jesse Thompson on May 17, 2002
Above on this page where it talks about "properly
modularized and abstracted code" and refers to
safe_reader_query() and safe_writer_query(), I'd
like to put forth the proposition that different
abstracted functions for reading and writing need
not be necessary for compliance with replication.

We currently have all of our queries running
through One safe-sql envelope, and we intend to
keep this architecture as we move to replication
by telling our envelope to send any queries that
begin with /\s+select/i to the slave, and
anything else to the master. We are running under
the assumption that all "read queries" are
selects. I can't think of any that aren't. If
there were they would probably be nominal in
performance draw (we use many complex select
statements :) and wouldn't do any harm being
handled by the master anyway.

As far as multi-statement queries go, our code
doesn't, and won't, mix selects in the same query
with non-selects, or probably even use more than
one select per query, since we're not certain
what results could be returned in such a
circumstance. Thus reading and writing should
never get mixed up in the same query, and all
reads should start with the word "Select".

If you feel that my theory holds water then give
it a go, if you see a flaw in my logic before I
do mail me and lemme know, eh? 10x :)

- - Jess
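The routing rule described in this comment can be sketched in a few lines of Python. This is an illustration, not the commenter's code: route_query and the "slave"/"master" labels are made-up names. Note that the pattern as written in the comment, /\s+select/i, requires at least one whitespace character before SELECT and is not anchored to the start of the query; the sketch assumes the intent was "the query begins with SELECT, optionally after whitespace".

```python
import re

# Anchored at the start, optional leading whitespace, whole word SELECT.
_READ_QUERY = re.compile(r"^\s*select\b", re.IGNORECASE)


def route_query(sql):
    """Return "slave" for queries that begin with SELECT, else "master"."""
    return "slave" if _READ_QUERY.match(sql) else "master"
```

With this rule, "INSERT INTO t SELECT ..." still goes to the master because it does not begin with SELECT. The rule shares the comment's assumption that every query beginning with SELECT is a pure read.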
  Posted by Jeff Allen on May 17, 2002
When I started up my slave the first time, I had
been using binary logging for some time on the
master. I had already removed $hostname-bin.001
long ago. The slave complained about not being
able to find the first log and would not start
replaying transactions. To fix this, I stopped
the master, made a new snapshot, moved "*-bin.*"
to another directory, and started the master
again. Then when I put the snapshot on the slave
and started it, everything worked correctly.
  Posted by Jonathon Padfield on May 17, 2002

If you want your slaves to connect to the
replication server with a unique username &
password and minimal privileges, you need to grant
just the FILE privilege to your replication user.

Similarly, if your slaves have no local updates
made on them, just lots of selects, it's a very
good idea to connect with a user that cannot
update the data. This stops dead any chance of
mistakenly connecting to the wrong DB and losing
updates.

  Posted by joshua paul on May 17, 2002
With version 3.23.41 - I couldn't get replication
to work following the instructions above.
Specifically I got errors when I created
the "repl" user before copying data to the slave.
I could only get replication to work if I created
the "repl" user after copying data to the slave -
and obviously starting both servers...
  Posted by on May 17, 2002
Note: If you are running WinMySQLAdmin, you will
have to make the changes in the my.ini file as
well. It took me a while to realize this.
  Posted by Chad Kouse on May 17, 2002
incredibly helpful page. I do have one issue to
raise - I have two servers doing 2-way
replication. One is on Linux, and one is on
Windows. There is an issue of case-sensitivity
in that if case is not taken into consideration
on the windows machine, the slave on the linux
machine stops. I'd love to hear any fixes to
this: chad@toohome.com
  Posted by on May 17, 2002
I also had the problem Allen pointed out above,
in my case I had not actually deleted any of the
binary log files which the slave still required
to update from (I knew this courtesy of "show
slave status" command). But I had deleted several
log files it had finished with. I was able to fix
this without making a brand new snapshot (which
in my case would have meant a longer outage than I
would like). What I did was edit the file
$hostname-bin.index (in the master's logs directory)
so that it lists exactly the filenames of my
remaining binary log files (i.e., removing the
entries matching ones I had deleted manually from
the file system). I quickly did a mysql stop and
mysql start after that and I performed the
command "slave stop; slave start;" on the slave
and it started replicating again. I am running
version 3.23.41
  Posted by David Dombek on January 30, 2003
Mysql 3.23.49
When a slave is running you have a master.info file. If you change the slave to become the master you must delete the master.info file or you may get errors such as this:

030130 10:57:03 Slave thread: error connecting to master: Unknown MySQL Server Host '' (4) (107), retry in 60 sec

It will keep retrying every 60 sec. Just deleting the master.info file worked for me.
  Posted by on February 5, 2003
I wouldn't suggest running replication on machines with different operating systems. I had Win2000 running MySQL server as well as RedHat Linux 7.3 running MySQL server. When the client code on Linux tried to connect to the Win2000 MySQL server, the server died, with Win2000 reporting that it had terminated.
I still haven't figured out what happened.
  Posted by Christopher Everett on November 2, 2003
Do NOT allow your replication master to run out of space in
the partition where you put your binlogs. One of two things
will happen:

1) the binlog will get truncated in the middle of an event, and the IO thread on the replication slave(s) will halt.
2) A series of transactions will go unrecorded on the master
binlog.

Either way, you'll have to reload the data from the replication master to the slave.
  Posted by Vittal Aithal on March 8, 2004
Having pulled my hair out with this error when getting a slave to replicate:

040308 13:38:01 Slave: reconnected to master 'repl@master:3306',replication resumed in log 'master-bin.011' at position 83792258
040308 13:38:01 Error reading packet from server: Access denied for user: 'repl@slave' (Using password: YES) (server_errno=1045)

it turns out that I'd installed Mysql, then changed the hostname of the slave server - *however*, the slave server's mysql.user table still contained the old slave server's hostname.

Changing the host for the root user in the slave's mysql.user table to the correct hostname, then restarting the slave process fixed it all.

vittal
  Posted by James Day on October 3, 2004
If you get a "Could not parse relay log event entry" error in a working replication setup and other slaves are fine, use STOP SLAVE; CHANGE MASTER TO master_log_file='(insert the value from Relay_Master_Log_File)', master_log_pos=(insert the value from Exec_master_log_pos); START SLAVE; . The slave will re-fetch from the master with no risk of losing a transaction. You should also look at the error log, in case it wasn't just a damaged network packet the first time.
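The recovery sequence in this comment takes two values from SHOW SLAVE STATUS and substitutes them into CHANGE MASTER TO. A minimal Python sketch of that substitution follows; refetch_statements is a hypothetical helper, and it assumes the status row is available as a dictionary keyed by the SHOW SLAVE STATUS column names Relay_Master_Log_File and Exec_Master_Log_Pos. It only builds the statement text; review it before running anything against a server.

```python
def refetch_statements(row):
    """Build the STOP SLAVE / CHANGE MASTER TO / START SLAVE sequence.

    row is one SHOW SLAVE STATUS result as a dict. The position is
    forced through int() so that only a number can end up in the SQL.
    """
    log_file = str(row["Relay_Master_Log_File"])
    log_pos = int(row["Exec_Master_Log_Pos"])
    # Escape backslashes and single quotes for a SQL string literal.
    escaped = log_file.replace("\\", "\\\\").replace("'", "''")
    return [
        "STOP SLAVE",
        "CHANGE MASTER TO MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d"
        % (escaped, log_pos),
        "START SLAVE",
    ]
```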
  Posted by Don Hejna on October 11, 2005
SELinux can interfere with replication, and the cause can be difficult to track down. This can sometimes occur after you have extracted files using tar in the mysql data directory, as is often instructed in replication examples, because doing so can modify file properties tracked by SELinux. (Encountered using FC-4 SELinux and MySQL 4.1.11)

If you encounter the following error on the replication slave:

[ERROR] Slave I/O thread: error connecting to master 'repl@192.168.7.11:3306': Error: 'Can't connect to MySQL server on '192.168.7.11' (13)' errno: 2003 retry-time: 60 retries: 86400

check that you can login using:
mysql -u "replication_username" -h "replication_hostname" -p

if you can login using the above but replication fails to connect it may be SELinux policies restricting mysqld's access to ports, sockets, and files.

Check or grep /var/log/audit/audit.log and /var/log/messages for messages pertaining to mysqld like the following:

type=AVC msg=audit(1126806889.640:14244693): avc: denied { name_connect } for pid=2561 comm="mysqld" dest=3306 scontext=system_u:system_r:mysqld_t tcontext=system_u:object_r:mysqld_port_t tclass=tcp_socket
type=SYSCALL msg=audit(1126806889.640:14244693): arch=40000003 syscall=102 success=no exit=-13 a0=3 a1=b139a130 a2=2 a3=0 items=0 pid=2561 auid=4294967295 uid=27 gid=27 euid=27 suid=27 fsuid=27 egid=27 sgid=27 fsgid=27 comm="mysqld" exe="/usr/libexec/mysqld"

To confirm SELinux is the cause, try turning off SELinux enforcement from the root account using the following commands:

# setenforce 0
# /etc/init.d/mysqld restart

If replication then starts working you know SELinux is the cause. On FC-4, running the following can shed light on this:

# audit2allow -i /var/log/audit/audit.log -l
allow mysqld_t mysqld_port_t:tcp_socket name_connect;
allow mysqld_t self:tcp_socket connect;
allow mysqld_t var_lib_t:dir { add_name read remove_name write };
allow mysqld_t var_lib_t:file { append create getattr lock read unlink write };
allow mysqld_t var_lib_t:sock_file { create getattr };

and running
# restorecon -R -v /var/lib/mysql
can fix problems associated with changing files in the MySQL directory.

Since this post is about identifying the MySQL problems and is not a tutorial about SELinux, for details see:
http://fedora.redhat.com/docs/selinux-apache-fc3/sn-debugging-and-customizing.html (which describes how to update the SELinux policies)

  Posted by Daniel Chen on June 16, 2006
The master and slave must have the same value for the global variable "collation_server"; if not, the slave I/O thread cannot be started.
  Posted by on January 12, 2007
If the slave was running previously but has stopped, the
reason usually is that some statement that succeeded on the
master failed on the slave. This should never happen if you
have taken a proper snapshot of the master, and never
modified the data on the slave outside of the slave thread.

AND you have never issued a statement on the MASTER which references a file or database that doesn't exist on the SLAVE.

For example:

INSERT INTO replicated_table SELECT stuff FROM master_only_db.table
  Posted by Chris Wiegand on May 17, 2007
Do not use BIT types in sprocs if you're using replication. I pulled out a good 2" (5cm) of hair because I couldn't figure out why the binlog showed a ^@ where it should be a 1 (or some other way of indicating a bit). Finally figured out that most places I'm using TINYINT(1), but I was using a BIT in that sproc. Replaced, manually replayed those statements that were affected, and didn't have the problem again.
  Posted by James Green on July 25, 2007
We just went from 5.0.26 to 5.0.45 and replication broke. Somehow the HOSTNAME-bin.index file stopped recording binary log file names as ./HOSTNAME-bin.0000001, etc., and started recording them with absolute paths like /usr/lib/mysql/data/HOSTNAME-bin.0000036

This resulted in the slave error log having a line saying:
[ERROR] Got fatal error 1236: 'Could not find first log file name in binary log index file' from master when reading data from binary log

Don't panic. You need to manually edit the master's HOSTNAME-bin.index file and change all ./ relative paths to the absolute path (matching the ones created since the upgrade). Don't do this with the master running, for obvious reasons. Once you've restarted the master, restart the slave and the slave should catch up. Well, it did for me.

Assumes you haven't explicitly changed where binary logs are going during the upgrade of course.
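The index file edit described in this comment is mechanical, so it can be sketched as a small Python function. This is an illustration under assumptions: absolutize_index is a made-up name, the index is assumed to hold one log file path per line, and log_dir is whatever absolute directory the newer entries already use. Work on a copy of the index file, with the master stopped.

```python
import posixpath


def absolutize_index(lines, log_dir):
    """Rewrite './name' index entries as absolute paths under log_dir.

    Entries that are already absolute are kept as they are; blank
    lines are dropped.
    """
    fixed = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        if entry.startswith("./"):
            entry = posixpath.join(log_dir, entry[2:])
        fixed.append(entry)
    return fixed
```

For example, with log_dir set to /usr/lib/mysql/data, the entry ./HOSTNAME-bin.0000001 becomes /usr/lib/mysql/data/HOSTNAME-bin.0000001.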

  Posted by Richard Armstrong on August 1, 2007
If the binlog becomes corrupt, due to lack of space or a network glitch, re-copy the binary log file from the master to the slave, changing the name as necessary.

Shutdown mysqld
Start mysqld again.
> start slave;

For some reason, the daemon held on to the old version of the binary log file, despite me telling it to refresh logs via mysqladmin.
Stop/start of the mysqld daemon followed by a "slave start" worked wonders.

  Posted by Matija Nalis on December 16, 2007
if you keep getting "Access denied; you need the REPLICATION SLAVE privilege for this operation" on your slave despite the fact that you HAVE done "GRANT REPLICATION SLAVE ON *.*" the problem might be that you haven't got the newest mysql privilege tables (for example if master was upgraded before 4.1 or similar).

You have this problem if the command:
show grants for "replicate"@"your.slave.host";

does NOT have "REPLICATION SLAVE" in its output.

The fix is to run mysql_fix_privilege_tables script which comes with mysql.
  Posted by Titi Ala'ilima on July 14, 2011
Under the section "If the I/O thread state says Connecting to master, check the following:" there should be a note regarding SSL-based replication that you should make sure the certificates have not expired. The "openssl x509" command can be used to find out the period of validity for the certificate files.

  Posted by Marcel Losekoot on August 13, 2014
Do not use a password that's more than 32 characters!

I could not get the replication slave to connect to the replication master, yet I could connect using the mysql client. The problem was that the password I had set was over 32 characters in length, which apparently gets truncated by the replication slave but not in the mysql server or in the mysql client, causing the connection to be refused to the replication slave but not to the mysql client. The clue I got was that inside the master.info file the password was truncated. Using a shorter password fixed it. This issue is described in the documentation for the change master command.