[MySQL - Online DDL] use. Gh-ost

Gh-ost란,
MySQL에서 online alter 기능이 있지만 아직 완벽하지는 않습니다.
이를 보완하기 위해 percona 의 pt-online-schema-change 스크립트를 사용하기도 하는데요,
원본과 복사본 테이블의 sync 를 trigger 기반으로 맞춘다는 단점이 있습니다. 특수 케이스에서는 일관성 문제도 야기될 수 있고, trigger 가 유발할 수 있는 Lock 으로 인해서 마지막에 swap 할 때 쿼리들이 실패하기도 합니다.
 
이러한 컨셉은 2009년부터 이어져왔고 큰 변화가 없었습니다. gh-ost 가 이러한 단점을 보완해서 pt-online-schema-change 를 대체할 수 있을지 조심스레 살펴보고 있습니다. 원본과 복사본(ghost table) 간의 sync 를 맞추는데 binlog(RBR) 를 사용합니다. trigger 를 없앰으로써 기대되는 효과가 있습니다.

 

 

Docs URLhttps://github.com/github/gh-ost
 

GitHub - github/gh-ost: GitHub's Online Schema-migration Tool for MySQL

GitHub's Online Schema-migration Tool for MySQL. Contribute to github/gh-ost development by creating an account on GitHub.

github.com

 

 

다른 online-ddl 툴인 pt-online-change-schema 사용하는 방법입니다.

https://jhdatabase.tistory.com/108

 

[MySQL - Online DDL] part 3 use. pt-online-change-schema

pt-online-change-schema 이란, Mysql에서 몇억건 이상의 대량의 데이터를 갖고 있는 테이블을 수정 하는것은 어렵습니다. 단순히 alter table을 해버리면 4시간, 5시간 이상 수행되기 떄문에 이를 해결 하기

jhdatabase.tistory.com

 

 

pt-osv VS gh-ost 차이점
 
pt-osc
gh-ost
작업 중 DML 반영 방식
trigger
binlog
binlog_format
상관없음
ROW(Mixed안됨)
log_slave_updates
상관없음
master에서 binlog 가져오면 상관없음, slave에서 가져오면 필수
FK
지원
미지원
trigger
지원
미지원
임의 중단 기능
미지원
throttle, panic 옵션으로 지원
작업 도중 설정 변경
미지원
작업 수행 서버
Master
Master / Slave (–migrate-on-replica)

 

 

Architecure

 

 

 

 

■  Sample Data Insert

mysql> create table test_tbl ( a int primary key auto_increment, b varchar(100), c varchar(100));
Query OK, 0 rows affected (0.02 sec)

mysql> insert into test_tbl select null, uuid(),uuid() from information_schema.columns as a, information_schema.columns as b limit 1000000;
Query OK, 1000000 rows affected (16.44 sec)
Records: 1000000  Duplicates: 0  Warnings: 0



mysql> desc test_tbl;
+-------+--------------+------+-----+---------+----------------+
| Field | Type         | Null | Key | Default | Extra          |
+-------+--------------+------+-----+---------+----------------+
| a     | int          | NO   | PRI | NULL    | auto_increment |
| b     | varchar(100) | YES  |     | NULL    |                |
| c     | varchar(100) | YES  |     | NULL    |                |
+-------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

 

 

 

■ ghost 실행 유저 생성

mysql> create user ghost@'%' identified by 'ghost';
Query OK, 0 rows affected (0.01 sec)

mysql> grant all on *.* to ghost@'%';
Query OK, 0 rows affected (0.01 sec)

 

 

■ gh-ost Install

[root@master ~]# wget https://github.com/github/gh-ost/releases/download/v1.0.49/gh-ost-1.0.49-1.x86_64.rpm
--2023-02-03 08:03:23--  https://github.com/github/gh-ost/releases/download/v1.0.49/gh-ost-1.0.49-1.x86_64.rpm
Resolving github.com (github.com)... 20.200.245.247
Connecting to github.com (github.com)|20.200.245.247|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/54378638/db25d780-4b3d-11ea-905d-1dd3a75a166a?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230202%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230202T230323Z&X-Amz-Expires=300&X-Amz-Signature=34ea3061892e1c08d6525f5898718052a6f80d6e333b2154becd161f7dfa4009&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=54378638&response-content-disposition=attachment%3B%20filename%3Dgh-ost-1.0.49-1.x86_64.rpm&response-content-type=application%2Foctet-stream [following]
--2023-02-03 08:03:23--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/54378638/db25d780-4b3d-11ea-905d-1dd3a75a166a?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230202%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230202T230323Z&X-Amz-Expires=300&X-Amz-Signature=34ea3061892e1c08d6525f5898718052a6f80d6e333b2154becd161f7dfa4009&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=54378638&response-content-disposition=attachment%3B%20filename%3Dgh-ost-1.0.49-1.x86_64.rpm&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.109.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4957238 (4.7M) [application/octet-stream]
Saving to: ‘gh-ost-1.0.49-1.x86_64.rpm’

100%[===================================================================================================================================================>] 4,957,238   12.3MB/s   in 0.4s   

2023-02-03 08:03:24 (12.3 MB/s) - ‘gh-ost-1.0.49-1.x86_64.rpm’ saved [4957238/4957238]




[root@master ~]# rpm -ivh gh-ost-1.0.49-1.x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:gh-ost-1:1.0.49-1                ################################# [100%]

 

 

 

■ binlog format 확인  => ROW

mysql> show variables like 'binlog_format';;
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+
1 row in set (0.01 sec)

 

 

 

■ ghost 실행 (add column)

gh-ost \
--max-load=Threads_running=100 \
--critical-load=Threads_running=200 \
--chunk-size=1000 \
--throttle-control-replicas="192.168.100.36:3306" \
--max-lag-millis=500 \
--user="ghost" \
--password='ghost' \
--host="192.168.100.35" \
--port=3306 \
--database="kim" \
--table="test_tbl" \
--verbose \
--alter="add column ghost5 int not null default 0" \
--switch-to-rbr \
--allow-on-master \
--cut-over=default \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--throttle-flag-file=/tmp/throttle.flag \
--execute



host : binary log 를 가져올 서버 설정

throttle-control-replicas : –max-lag-millis 를 체크할 리플리카 서버 지정하는 옵션

switch-to-rbr : gh-ost 는 binlog_format=ROW 일 때만 동작하기 때문에 작업 대상 서버의 binlog_format을 ROW로 변경하는 옵션. 주로 Slave의 binlog_format 변경과 log_slave_updates 옵션 설정 후 재기동 수행 (원복 X)

allow-master-master : MMM 사용할 때 처럼 master-master (active-passive) 구성일 때 설정

allow-on-master : master에서 작업할 때 설정하는 옵션

migrate-on-replica : slave 에서 작업할 때 설정하는 옵션 ( 양방향 복제가 아닌 이상 slave에만 작업 적용되므로 참고! )

assume-master-host : allow-master-master 설정 시 작업할 master 서버 지정하는 옵션

cut-over : org-table과 gh-ost table 간 swapping하는 migration의 마지막 단계로 atomic 하기 때문에 swapping 도중 실패하는 쿼리는 없음

panic-flag-file : flag-file 생성 시 작업 취소

postpone-cut-over-flag-file : flag-file 존재하면 cut-over 단계를 수행하지 않고 대기함 이 옵션 설정 시 작업 시작 할때 이 파일을 생성하기 때문에 마지막 rename 단계 전에 수동으로 삭제해줘야함, 대기하더라도 DML은 계속 copy 진행됨

throttle-flag-file : throttle-flag-file 존재하면 작업 일시정지

 

 

 

■ 분석

## master / slave 파악 및 binary log 가져올 서버(여기선 slave) 와 binlog position 확인
2023-02-03 13:33:17 INFO starting gh-ost 1.0.49
2023-02-03 13:33:17 INFO Migrating `kim`.`test_tbl`
2023-02-03 13:33:17 INFO connection validated on 192.168.100.35:3306
2023-02-03 13:33:17 INFO User has SUPER, REPLICATION SLAVE privileges, and has ALL privileges on `kim`.*
2023-02-03 13:33:17 INFO binary logs validated on 192.168.100.35:3306
2023-02-03 13:33:17 INFO Restarting replication on 192.168.100.35:3306 to make sure binlog settings apply to replication thread
2023-02-03 13:33:17 INFO Inspector initiated on master:3306, version 8.0.28
2023-02-03 13:33:17 INFO Table found. Engine=InnoDB
2023-02-03 13:33:17 INFO Estimated number of rows via EXPLAIN: 993705
2023-02-03 13:33:17 INFO Master forced to be 192.168.100.35:3306
2023-02-03 13:33:17 INFO log_slave_updates validated on 192.168.100.35:3306
2023-02-03 13:33:17 INFO connection validated on 192.168.100.35:3306
2023-02-03 13:33:17 INFO Connecting binlog streamer at mysql.000036:175418044
[2023/02/03 13:33:17] [info] binlogsyncer.go:133 create BinlogSyncer with config {99999 mysql 192.168.100.35 3306 ghost    false false <nil> false UTC true 0 0s 0s 0 false}
[2023/02/03 13:33:17] [info] binlogsyncer.go:354 begin to sync binlog from position (mysql.000036, 175418044)
[2023/02/03 13:33:17] [info] binlogsyncer.go:203 register slave for master server 192.168.100.35:3306
[2023/02/03 13:33:17] [info] binlogsyncer.go:723 rotate to (mysql.000036, 175418044)
2023-02-03 13:33:17 INFO rotate to next log from mysql.000036:0 to mysql.000036
2023-02-03 13:33:17 INFO connection validated on 192.168.100.35:3306
2023-02-03 13:33:17 INFO connection validated on 192.168.100.35:3306
2023-02-03 13:33:17 INFO will use time_zone='SYSTEM' on applier
2023-02-03 13:33:17 INFO Examining table structure on applier
2023-02-03 13:33:17 INFO Applier initiated on master:3306, version 8.0.28
...



## migration 과정을 기록할 _tb_test_ghc 테이블 , 데이터 copy 및 rename 할 _tb_test_gho 임시테이블 생성
2023-02-03 13:33:17 INFO Dropping table `kim`.`_test_tbl_ghc`
2023-02-03 13:33:17 INFO Table dropped
2023-02-03 13:33:17 INFO Creating changelog table `kim`.`_test_tbl_ghc`
2023-02-03 13:33:17 INFO Changelog table created
2023-02-03 13:33:17 INFO Creating ghost table `kim`.`_test_tbl_gho`
2023-02-03 13:33:17 INFO Ghost table created
2023-02-03 13:33:17 INFO Altering ghost table `kim`.`_test_tbl_gho`
2023-02-03 13:33:17 INFO Ghost table altered


## 임시 테이블 생김
mysql> show tables;
+---------------+
| Tables_in_kim |
+---------------+
| _test_tbl_ghc |
| _test_tbl_gho |
| kim           |
| test_tbl      |
+---------------+
4 rows in set (0.00 sec)




## 설정 재확인 및 postpone 파일 생성 (마지막 rename 단계에서 수동으로 지워줘야합니다.)
2023-02-03 13:33:17 INFO Intercepted changelog state GhostTableMigrated
2023-02-03 13:33:17 INFO Waiting for ghost table to be migrated. Current lag is 0s
2023-02-03 13:33:17 INFO Handled changelog state GhostTableMigrated
2023-02-03 13:33:17 INFO Chosen shared unique key is PRIMARY
2023-02-03 13:33:17 INFO Shared columns are a,b,c
2023-02-03 13:33:17 INFO Listening on unix socket file: /tmp/gh-ost.kim.test_tbl.sock
2023-02-03 13:33:17 INFO Migration min values: [1]
2023-02-03 13:33:17 INFO Migration max values: [1000000]
2023-02-03 13:33:17 INFO Waiting for first throttle metrics to be collected
2023-02-03 13:33:17 INFO First throttle metrics collected
# Migrating `kim`.`test_tbl`; Ghost table is `kim`.`_test_tbl_gho`
# Migrating master:3306; inspecting master:3306; executing on master
# Migration started at Fri Feb 03 13:33:17 +0900 2023
# chunk-size: 1000; max-lag-millis: 500ms; dml-batch-size: 10; max-load: Threads_running=100; critical-load: Threads_running=200; nice-ratio: 0.000000
# throttle-flag-file: /tmp/throttle.flag
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# throttle-control-replicas count: 1
# postpone-cut-over-flag-file: /tmp/ghost.postpone.flag [set]
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/gh-ost.kim.test_tbl.sock




## org_table에서 tmp table로 data copy
Copy: 0/993705 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysql.000036:195467254; Lag: 0.01s, State: migrating; ETA: N/A
Copy: 0/993705 0.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: mysql.000036:195471905; Lag: 0.01s, State: migrating; ETA: N/A
Copy: 62000/993705 6.2%; Applied: 0; Backlog: 0/1000; Time: 2s(total), 2s(copy); streamer: mysql.000036:200786238; Lag: 0.01s, State: migrating; ETA: 30s
Copy: 130000/993705 13.1%; Applied: 0; Backlog: 0/1000; Time: 3s(total), 3s(copy); streamer: mysql.000036:206614414; Lag: 0.01s, State: migrating; ETA: 19s
Copy: 196000/993705 19.7%; Applied: 0; Backlog: 0/1000; Time: 4s(total), 4s(copy); streamer: mysql.000036:212271313; Lag: 0.01s, State: migrating; ETA: 16s
Copy: 261000/993705 26.3%; Applied: 0; Backlog: 0/1000; Time: 5s(total), 5s(copy); streamer: mysql.000036:217843014; Lag: 0.01s, State: migrating; ETA: 14s
Copy: 334000/993705 33.6%; Applied: 0; Backlog: 0/1000; Time: 6s(total), 6s(copy); streamer: mysql.000036:224099391; Lag: 0.01s, State: migrating; ETA: 11s
Copy: 406000/993705 40.9%; Applied: 0; Backlog: 0/1000; Time: 7s(total), 7s(copy); streamer: mysql.000036:230270128; Lag: 0.10s, State: migrating; ETA: 10s
Copy: 477000/993705 48.0%; Applied: 0; Backlog: 0/1000; Time: 8s(total), 8s(copy); streamer: mysql.000036:236355229; Lag: 0.01s, State: migrating; ETA: 8s
Copy: 545000/993705 54.8%; Applied: 0; Backlog: 0/1000; Time: 9s(total), 9s(copy); streamer: mysql.000036:242183405; Lag: 0.01s, State: migrating; ETA: 7s
Copy: 613000/993705 61.7%; Applied: 0; Backlog: 0/1000; Time: 10s(total), 10s(copy); streamer: mysql.000036:247992165; Lag: 0.01s, State: migrating; ETA: 6s
Copy: 685000/993705 68.9%; Applied: 0; Backlog: 0/1000; Time: 11s(total), 11s(copy); streamer: mysql.000036:254182759; Lag: 0.02s, State: migrating; ETA: 4s
Copy: 759000/993705 76.4%; Applied: 0; Backlog: 0/1000; Time: 12s(total), 12s(copy); streamer: mysql.000036:260524364; Lag: 0.11s, State: migrating; ETA: 3s
Copy: 834000/993705 83.9%; Applied: 0; Backlog: 0/1000; Time: 13s(total), 13s(copy); streamer: mysql.000036:266952443; Lag: 0.01s, State: migrating; ETA: 2s
Copy: 905000/993705 91.1%; Applied: 0; Backlog: 0/1000; Time: 14s(total), 14s(copy); streamer: mysql.000036:273037539; Lag: 0.01s, State: migrating; ETA: 1s
Copy: 975000/993705 98.1%; Applied: 0; Backlog: 0/1000; Time: 15s(total), 15s(copy); streamer: mysql.000036:279037441; Lag: 0.01s, State: migrating; ETA: 0s
2023-02-03 14:55:44 INFO Row copy complete

 

 

 

■ Copy 완료 후 정상 확인

mysql> show tables;
+---------------+
| Tables_in_kim |
+---------------+
| _test_tbl_ghc |
| _test_tbl_gho |
| kim           |
| test_tbl      |
+---------------+
4 rows in set (0.00 sec)



## copy후 해당 컬럼 추가되었는지 확인
mysql> desc _test_tbl_gho;
+--------+--------------+------+-----+---------+----------------+
| Field  | Type         | Null | Key | Default | Extra          |
+--------+--------------+------+-----+---------+----------------+
| a      | int          | NO   | PRI | NULL    | auto_increment |
| b      | varchar(100) | YES  |     | NULL    |                |
| c      | varchar(100) | YES  |     | NULL    |                |
| ghost5 | int          | NO   |     | 0       |                |
+--------+--------------+------+-----+---------+----------------+
4 rows in set (0.00 sec)




mysql> select count(*) from  test_tbl;
+----------+
| count(*) |
+----------+
|  1000142 |
+----------+
1 row in set (0.20 sec)

mysql> select count(*) from _test_tbl_gho;
+----------+
| count(*) |
+----------+
|  1000142 |
+----------+
1 row in set (0.14 sec)

 

 

■ 수동 Rename 필요 

=> 찾아보기론 rename까지 지원해주지 않는 것으로 보입니다. 하지만 아래와 같이 커뮤니티에서는 그냥 rename하면 된다고 합니다..
테이블 이름 바꾸기는 빠른 작업입니다. MySQL 메타데이터만 변경합니다. 따라서 gh-ost가 필요하지 않습니다. DDL을 직접 실행할 수 있습니다.
이전에는 RENAME TABLE을 실행하려면 LOCK TABLES로 잠긴 테이블이 없어야 했습니다. 이제 WRITE 잠금으로 잠겨 있거나 다중 테이블 이름 바꾸기 작업의 이전 단계에서 WRITE 잠금 테이블의 이름을 바꾼 제품인 테이블의 이름을 바꿀 수 있습니다.
## rename 하는 당시에 AS-IS에 들어온 데이터를 손수 발라서 TO-BE에 넣어줄 수 없기때문에 원본테이블과 임시 테이블 모두 Write lock 잡고는 것이 좋아보입니다.
master> lock tables test_tbl write, _test_tbl_gho write;

mysql> rename /* gh-ost */ table `kim`.`test_tbl` to `kim`.`test_tbl_old`, `kim`.`_test_tbl_gho` to `kim`.`test_tbl`;

 

 

작업중 일시 정지

■ /tmp에 throttle.flag 파일을 생성

[root@master ~]# touch /tmp/throttle.flag

 

 

■ 작업 일시 정지

...
Copy: 1000000/1000000 100.0%; Applied: 0; Backlog: 0/1000; Time: 3m0s(total), 15s(copy); streamer: mysql.000036:174604910; Lag: 0.01s, State: postponing cut-over; ETA: due
Copy: 1000000/1000000 100.0%; Applied: 0; Backlog: 0/1000; Time: 3m30s(total), 15s(copy); streamer: mysql.000036:174733742; Lag: 0.01s, State: postponing cut-over; ETA: due
Copy: 1000000/1000000 100.0%; Applied: 0; Backlog: 0/1000; Time: 4m0s(total), 15s(copy); streamer: mysql.000036:174861916; Lag: 0.01s, State: postponing cut-over; ETA: due

 

 

throttle.flag 파일 삭제

[root@master tmp]# rm -rf throttle.flag

 

 

■ 작업 재개

...
Copy: 1000000/1000000 100.0%; Applied: 0; Backlog: 0/1000; Time: 3m0s(total), 15s(copy); streamer: mysql.000036:174604910; Lag: 0.01s, State: postponing cut-over; ETA: due
Copy: 1000000/1000000 100.0%; Applied: 0; Backlog: 0/1000; Time: 3m30s(total), 15s(copy); streamer: mysql.000036:174733742; Lag: 0.01s, State: postponing cut-over; ETA: due
Copy: 1000000/1000000 100.0%; Applied: 0; Backlog: 0/1000; Time: 4m0s(total), 15s(copy); streamer: mysql.000036:174861916; Lag: 0.01s, State: postponing cut-over; ETA: due
Copy: 1000000/1000000 100.0%; Applied: 0; Backlog: 0/1000; Time: 4m30s(total), 15s(copy); streamer: mysql.000036:174990818; Lag: 0.01s, State: postponing cut-over; ETA: due

 

 

 

 

panic으로 인한 작업 중단

panic 파일 생성

[root@master ~]# touch /tmp/ghost.panic.flag

 

 

■ 작업 정지됨

...
Copy: 1000000/1000000 100.0%; Applied: 0; Backlog: 0/1000; Time: 5m0s(total), 15s(copy); streamer: mysql.000036:175118958; Lag: 0.01s, State: postponing cut-over; ETA: due
Copy: 1000000/1000000 100.0%; Applied: 0; Backlog: 0/1000; Time: 5m30s(total), 15s(copy); streamer: mysql.000036:175247130; Lag: 0.01s, State: postponing cut-over; ETA: due
2023-02-03 13:20:53 FATAL Found panic-file /tmp/ghost.panic.flag. Aborting without cleanup

 

 

■ 작업 테이블이 남아 있으므로 수동 삭제 필요

mysql> show tables;
+---------------+
| Tables_in_kim |
+---------------+
| _test_tbl_ghc |
| _test_tbl_gho |
| kim           |
| test_tbl      |
+---------------+
4 rows in set (0.00 sec)



mysql> drop table _test_tbl_ghc;
Query OK, 0 rows affected (0.01 sec)

mysql> drop table _test_tbl_gho;
Query OK, 0 rows affected (0.05 sec)

 

 

 

 

참고