tongsiying

BlockStorage-ETCD

etcd

001-Installation

References:
https://dreambo8563.github.io/2018/09/14/Etcd-%E5%88%86%E5%B8%83%E5%BC%8F%E9%85%8D%E7%BD%AE%E4%B8%AD%E5%BF%83/

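Description:
An etcd cluster normally runs three nodes; adding more nodes lowers performance. etcd also limits concurrency to 10000.
When the number of healthy members falls below quorum (a majority: 2 of 3, or 3 of 5), clients get errors and the whole cluster becomes unavailable.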
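
The availability rule follows from majority arithmetic: etcd's Raft consensus needs more than half of the members to be healthy. A minimal sketch of that arithmetic, where the function names `quorum` and `fault_tolerance` are illustrative and not part of etcd:

```shell
# quorum N: smallest majority of an N-member cluster.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# fault_tolerance N: members that can fail while a majority survives.
fault_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 3 5; do
    echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

An even member count adds no extra tolerance (4 members still tolerate only 1 failure), which is one reason clusters are usually sized at 3 or 5.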

Previous version: 3.3.13 (supports v2 commands)
Current version: 3.5.0-pre

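The configuration below is for a single replica (multi-replica setup to be confirmed).
1. Extract etcd-v3.2.11-linux-amd64.tar and copy etcd and etcdctl to /usr/local/bin.

2. Start the process (or modify run-etcd.sh):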
/usr/local/bin/etcd --log-outputs=stdout --name=etcd01 --data-dir=/mnt/etcd/default.etcd/ --election-timeout=5000 --auto-compaction-retention=1 --quota-backend-bytes=134217728000 --backend-bbolt-freelist-type=map --listen-client-urls=http://10.238.161.1:2379,http://127.0.0.1:2379 --listen-peer-urls=http://10.238.161.1:2380 --advertise-client-urls=http://10.238.161.1:2379 --initial-advertise-peer-urls=http://10.238.161.1:2380 --initial-cluster-token=etcd-cluster --initial-cluster=etcd01=http://10.238.161.1:2380,etcd02=http://10.238.161.2:2380,etcd03=http://10.238.161.3:2380 --initial-cluster-state=existing --enable-v2=true --enable-pprof

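3. Parameter explanations
# Set the backend space quota in bytes, here sized to the machine's memory (125 GiB)
--quota-backend-bytes=134217728000

# --auto-compaction-retention takes a value in hours and enables automatic keyspace compaction; 1 keeps one hour of history
--auto-compaction-retention=1

# etcd log output target
--log-outputs=stdout
Meaning: specify 'stdout' or 'stderr' to skip journald logging even when running under systemd, or give a comma-separated list of output targets.
Default: default
Environment variable: ETCD_LOG_OUTPUTS

# Note:
--initial-cluster-state=new           # use new for the first start; use existing for later restarts

# Make etcd print debug logs (etcd 3.5 removed --debug; use --log-level=debug there)
--debug=true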
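
The quota value is easier to audit when it is derived from a size in GiB instead of typed as a raw byte count. A small sketch, where the helper name `gib_to_bytes` is an assumption for illustration:

```shell
# gib_to_bytes N: convert N GiB to bytes for --quota-backend-bytes.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# 125 GiB reproduces the value used in the command line above.
echo "--quota-backend-bytes=$(gib_to_bytes 125)"
```

The same helper gives 214748364800 for 200 GiB, the value that appears in the later test setup.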

4. etcd health
# Query the health of the etcd nodes (v2 command)
etcdctl  --endpoints="http://10.243.0.129:2379,http://10.243.0.130:2379,http://10.243.0.131:2379" cluster-health

5. Set this environment variable before running etcdctl v3 commands
export ETCDCTL_API=3

6. Find the etcd leader and etcd's space usage
[root@host102442549 ChunkServer]# ETCDCTL_API=3 etcdctl --endpoints="http://10.238.161.1:2379,http://10.238.161.2:2379,http://10.238.161.3:2379" endpoint status --write-out=table
+--------------------------+------------------+-----------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+--------------------------+------------------+-----------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://10.238.161.1:2379 | 26e7149d86c3a7bc | 3.5.0-pre | 86 MB | false | false | 2 | 53489 | 53489 | |
| http://10.238.161.2:2379 | 693e6c16a77e623 | 3.5.0-pre | 86 MB | true | false | 2 | 53489 | 53489 | |
| http://10.238.161.3:2379 | e88e2e5b8dc90da5 | 3.5.0-pre | 86 MB | false | false | 2 | 53489 | 53489 | |
+--------------------------+------------------+-----------+---------+-----------+------------+-----------+------------+--------------------+--------+

7. This approach is more convenient; see: https://www.cnblogs.com/opama/p/5836674.html
ETCDCTL_API=3 etcdctl get repair --prefix

8. Check the space used by a single etcd node
[root@host102442549 ChunkServer]# ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 endpoint status
http://127.0.0.1:2379, 26e7149d86c3a7bc, 3.5.0-pre, 86 MB, false, false, 2, 53498, 53498,

9. Compact the etcd keyspace
# Read the current revision from the endpoint status
rev=$(ETCDCTL_API=3 etcdctl --endpoints=http://127.0.0.1:2379 endpoint status --write-out="json" | egrep -o '"revision":[0-9]*' | egrep -o '[0-9].*')
# Compact up to that revision, then defragment to release the space
etcdctl --endpoints=http://127.0.0.1:2379 compact $rev
time etcdctl --endpoints=http://127.0.0.1:2379 --dial-timeout=10s --command-timeout=10000s defrag
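
The revision extraction can be tried without a running cluster. The sketch below pipes a hypothetical, heavily trimmed sample of the JSON through the same kind of grep pipeline; the sample string and the function name `extract_revision` are assumptions, not captured output:

```shell
# extract_revision: read `endpoint status --write-out=json` output on stdin
# and print the first "revision" value found.
extract_revision() {
    grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+' | head -n 1
}

# Hypothetical, trimmed sample (not captured from a real cluster).
sample='[{"Endpoint":"http://127.0.0.1:2379","Status":{"header":{"revision":53489,"raft_term":2}}}]'

rev=$(printf '%s\n' "$sample" | extract_revision)
echo "would run: etcdctl compact $rev"
```

`grep -E` replaces the older `egrep` spelling, and `head -n 1` keeps a single value if several endpoints are queried at once.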

10. etcd startup script (modify as needed):
[root@host102442549 oss]# cat /home/etcd/run-etcd.sh
#!/bin/bash
# Install
# ./run-etcd.sh etcd01 etcd01=http://10.37.2.18:2380,etcd02=http://10.37.2.19:2380,etcd03=http://10.37.2.20:2380 10.37.2.18 /mnt/data/etcd/

# Abort early if the etcd binary is not on PATH
if etcd --version; then
    echo 'etcd exists!'
else
    echo 'etcd does not exist!'
    exit 1
fi

#export HOST=$(ifconfig | grep 'inet addr'| grep -v '127.0.0.1' | awk '{ print $2}'| cut -d: -f2)
export HOST=$3
echo $HOST

mkdir -p /etc/etcd/

ETCD_NAME="$1"
ETCD_DATA_DIR="$4"
ETCD_LISTEN_PEER_URLS="http://$HOST:2380"
ETCD_LISTEN_CLIENT_URLS="http://$HOST:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://$HOST:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://$HOST:2379"
#ETCD_INITIAL_CLUSTER_STATE="new"            # use new for the first start
ETCD_INITIAL_CLUSTER_STATE="existing"        # use existing for later starts
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster1"
ETCD_INITIAL_CLUSTER="$2"

tee /etc/etcd/etcd.conf <<-'EOF'
ETCD_NAME="$1"
ETCD_DATA_DIR="$4"
ETCD_LISTEN_PEER_URLS="http://$HOST:2380"
ETCD_LISTEN_CLIENT_URLS="http://$HOST:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://$HOST:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://$HOST:2379"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster1"
ETCD_INITIAL_CLUSTER="$2"
EOF

mkdir -p "$4"
# etcd01=http://192.168.2.44:2380,etcd02=http://192.168.2.45:2380,etcd03=http://192.168.2.46:2380
# Replace the literal $HOST, $1, $2 and $4 placeholders written by the quoted here-document
sed -i "s%\$HOST%$HOST%g"  /etc/etcd/etcd.conf
sed -i "s%\$1%$1%g"  /etc/etcd/etcd.conf
sed -i "s%\$2%$2%g"  /etc/etcd/etcd.conf
sed -i "s%\$4%$4%g"  /etc/etcd/etcd.conf

mkdir -p /usr/lib/systemd/system/

tee  /usr/lib/systemd/system/etcd.service <<-'EOF'
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
WorkingDirectory=$4
EnvironmentFile=/etc/etcd/etcd.conf
User=etcd
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd --name=\"${ETCD_NAME}\" --data-dir=\"${ETCD_DATA_DIR}\" --listen-client-urls=\"${ETCD_LISTEN_CLIENT_URLS}\" --listen-peer-urls=\"${ETCD_LISTEN_PEER_URLS}\" --advertise-client-urls=\"${ETCD_ADVERTISE_CLIENT_URLS}\" --initial-cluster-token=\"${ETCD_INITIAL_CLUSTER_TOKEN}\" --initial-cluster=\"${ETCD_INITIAL_CLUSTER}\" --initial-cluster-state=\"${ETCD_INITIAL_CLUSTER_STATE}\" "
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
# WorkingDirectory=$4 is written literally by the quoted here-document; substitute the data directory
sed -i "s%\$4%$4%g" /usr/lib/systemd/system/etcd.service

#service daemon-reload
# --log-outputs=stdout is needed so the log reaches the redirected file below
nohup etcd --log-outputs=stdout \
           --name="${ETCD_NAME}" \
           --data-dir="${ETCD_DATA_DIR}" \
           --auto-compaction-retention=1 \
           --quota-backend-bytes=$((125*1024*1024*1024)) \
           --listen-client-urls="${ETCD_LISTEN_CLIENT_URLS}" \
           --listen-peer-urls="${ETCD_LISTEN_PEER_URLS}" \
           --advertise-client-urls="${ETCD_ADVERTISE_CLIENT_URLS}" \
           --initial-advertise-peer-urls="${ETCD_INITIAL_ADVERTISE_PEER_URLS}" \
           --initial-cluster-token="${ETCD_INITIAL_CLUSTER_TOKEN}" \
           --initial-cluster="${ETCD_INITIAL_CLUSTER}" \
           --initial-cluster-state="${ETCD_INITIAL_CLUSTER_STATE}" \
           --enable-pprof \
           >/mnt/snbslog/snbs/Master/Etcd.log 2>&1 &
#service enable etcd

#etcdctl --endpoints="http://10.37.2.18:2379,http://10.37.2.19:2379,http://10.37.2.20:2379" member list
#etcdctl --endpoints="http://10.37.2.18:2379,http://10.37.2.19:2379,http://10.37.2.20:2379" cluster-health
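
The script writes etcd.conf with a quoted here-document (`<<-'EOF'`), which keeps `$HOST`, `$1` and the other placeholders literal, and then patches them with sed. A possible alternative, sketched here under the assumption that the same nine variables are wanted, is an unquoted here-document that lets the shell expand the values directly. The function name `render_etcd_conf` and its optional fifth argument are hypothetical:

```shell
# render_etcd_conf NAME INITIAL_CLUSTER HOST DATA_DIR [STATE]
# Prints an etcd environment file to stdout. The here-document delimiter
# is unquoted, so the shell expands the variables as it writes.
render_etcd_conf() {
    local name=$1 cluster=$2 host=$3 data_dir=$4 state=${5:-new}
    cat <<EOF
ETCD_NAME="$name"
ETCD_DATA_DIR="$data_dir"
ETCD_LISTEN_PEER_URLS="http://$host:2380"
ETCD_LISTEN_CLIENT_URLS="http://$host:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://$host:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://$host:2379"
ETCD_INITIAL_CLUSTER_STATE="$state"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster1"
ETCD_INITIAL_CLUSTER="$cluster"
EOF
}

# Example: print the file for etcd01 instead of writing /etc/etcd/etcd.conf.
render_etcd_conf etcd01 "etcd01=http://10.37.2.18:2380,etcd02=http://10.37.2.19:2380,etcd03=http://10.37.2.20:2380" 10.37.2.18 /mnt/data/etcd/
```

Redirecting the output to /etc/etcd/etcd.conf would replace both the tee call and the sed lines; printing to stdout first makes the result easy to inspect.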

002-Common etcd Interfaces

* View etcd contents
metadata
*ETCDCTL_API=3 etcdctl  --endpoints="http://10.242.180.207:2379,http://10.242.180.208:2379,http://10.242.180.209:2379"  get  metadata --prefix
Returns:
metadata/00000000000186a5/00000000                                  ---key
10.242.180.208:9595,10.242.180.212:9595,10.242.180.210:9595         ---value

*ETCDCTL_API=3 etcdctl  --endpoints="http://10.242.180.207:2379,http://10.242.180.208:2379,http://10.242.180.209:2379"  get metadata/00000000000186a5/00000000
Returns:
metadata/00000000000186a5/00000000
10.242.180.208:9595,10.242.180.212:9595,10.242.180.210:9595

volumestate
*ETCDCTL_API=3 etcdctl  --endpoints="http://10.242.180.207:2379,http://10.242.180.208:2379,http://10.242.180.209:2379"  get volumestate --prefix
volumestate/4fc2e14c-dab8-46e9-a9ca-84fe1cc4f198
{"FileName":"4fc2e14c-dab8-46e9-a9ca-84fe1cc4f198","Prefix":"000000000010c8e5","Size":107374182400,"Provisioning":"thin","Creation_Time":"2019-07-16 17:00:42","Copies":3,"Type":"origin","Parent":"","Pool":"pool-openstack","Owner":"10.242.180.211:8686","Ref":0,"Status":0,"Index":0,"ReadOnly":false}

*ETCDCTL_API=3 etcdctl  --endpoints="http://10.242.180.207:2379,http://10.242.180.208:2379,http://10.242.180.209:2379"  get volumestate/lCentOS-7.11.raw
Returns:
volumestate/lCentOS-7.11.raw
{"FileName":"lCentOS-7.11.raw","Prefix":"000000000010c8ee","Size":2147483648,"Provisioning":"thin","Creation_Time":"2019-07-16 19:31:53","Copies":3,"Type":"origin","Parent":"","Pool":"pool-openstack","Owner":"","Ref":0,"Status":0,"Index":0,"ReadOnly":false}

pooldata
*ETCDCTL_API=3 etcdctl  --endpoints="http://10.242.180.207:2379,http://10.242.180.208:2379,http://10.242.180.209:2379" get pooldata --prefix
Returns:
pooldata/pool-openstack/region-openstack/null


pooldata/pool-openstack/region-openstack/zone1


pooldata/pool-openstack/region-openstack/zone2


pooldata/pool-openstack/region-openstack/zone3

*ETCDCTL_API=3 etcdctl  --endpoints="http://10.242.180.207:2379,http://10.242.180.208:2379,http://10.242.180.209:2379" get pooldata/pool-openstack/region-openstack/zone1 --prefix

poolstate
*ETCDCTL_API=3 etcdctl  --endpoints="http://10.242.180.207:2379,http://10.242.180.208:2379,http://10.242.180.209:2379" get poolstate --prefix
poolstate/pool-openstack
{"Copies":3,"PoolName":"pool-openstack","CreateTime":"2019-07-16 09:47:03","Regionid":"region-openstack"}

region
ETCDCTL_API=3 etcdctl  --endpoints="http://10.242.180.207:2379,http://10.242.180.208:2379,http://10.242.180.209:2379" get region --prefix
region/region-openstack/null/null


region/region-openstack/zone1/10.242.180.207:9595


region/region-openstack/zone1/10.242.180.208:9595


region/region-openstack/zone1/null


region/region-openstack/zone2/10.242.180.209:9595


region/region-openstack/zone2/10.242.180.210:9595


region/region-openstack/zone2/null


region/region-openstack/zone3/10.242.180.211:9595


region/region-openstack/zone3/10.242.180.212:9595


region/region-openstack/zone3/null

master: view the primary master
*ETCDCTL_API=3 etcdctl  --endpoints="http://10.242.180.207:2379,http://10.242.180.208:2379,http://10.242.180.209:2379" get master --prefix
master/master
10.242.180.207:9393

recycle:
ETCDCTL_API=3 etcdctl  --endpoints="http://10.243.0.129:2379,http://10.243.0.130:2379,http://10.243.0.131:2379" get recycle --prefix
The recycle directory is not always visible; it exists only when there is something to recycle.

* View migration tasks: migrate, repair
ETCDCTL_API=3 etcdctl  --endpoints="http://10.242.180.207:2379,http://10.242.180.208:2379,http://10.242.180.209:2379" get migrate --prefix
ETCDCTL_API=3 etcdctl  --endpoints="http://10.242.180.207:2379,http://10.242.180.208:2379,http://10.242.180.209:2379" get repair --prefix

* Other: --prefix, get, put
--prefix
ETCDCTL_API=3 etcdctl  --endpoints="http://10.243.0.129:2379,http://10.243.0.130:2379,http://10.243.0.131:2379"  get  volumestate/project001 --prefix
ETCDCTL_API=3 etcdctl  --endpoints="http://10.243.0.129:2379,http://10.243.0.130:2379,http://10.243.0.131:2379"  put volumestate/project001  '{"FileName":"project001","Prefix":"000000000001875a","Size":107374182400,"Provisioning":"thin","Creation_Time":"2019-06-19  15:18:20","Copies":3,"Type":"origin","Parent":"","Pool":"SNPOOL001","Owner":"10.243.0.132:8585","Ref":0,"Status":0,"Index":0}'

* Volume-related data is first moved into recycle; the real metadata is deleted only after everything has been moved into recycle.
ETCDCTL_API=3  etcdctl --endpoints="http://10.37.2.18:2379,http://10.37.2.19:2379,http://10.37.2.20:2379" get recycle/ --prefix

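--- Find the etcd leader (the v2 member list output includes isLeader; with the v3 API, use endpoint status as in section 001)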
etcdctl  --endpoints="http://10.243.0.129:2379,http://10.243.0.130:2379,http://10.243.0.131:2379" member list
etcdctl member list
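
Every command in this section repeats the same long --endpoints string. etcdctl v3 also reads its flags from `ETCDCTL_`-prefixed environment variables, so one option is to build the list once and export it. A minimal sketch, where the helper name `build_endpoints` is illustrative and port 2379 is assumed for every node:

```shell
# build_endpoints IP...: join IPs into a comma-separated client URL list.
build_endpoints() {
    local out="" ip
    for ip in "$@"; do
        out="${out:+$out,}http://$ip:2379"
    done
    echo "$out"
}

export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS="$(build_endpoints 10.242.180.207 10.242.180.208 10.242.180.209)"
echo "$ETCDCTL_ENDPOINTS"
# With both variables exported, a query shortens to:
#   etcdctl get metadata --prefix
```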

etcd backup and restore plan

003-Jin Xin's etcd Summary

etcd Testing and Analysis
I. Startup parameters
/usr/local/bin/etcd \
  --log-outputs=stdout \
  --name=etcd01 \
  --data-dir=/mnt/etcd/default.etcd/ \
  --election-timeout=5000 \
  --auto-compaction-retention=1 \
  --quota-backend-bytes=214748364800 \
  --backend-bbolt-freelist-type=map \
  --listen-client-urls=http://10.238.161.1:2379,http://127.0.0.1:2379 \
  --listen-peer-urls=http://10.238.161.1:2380 \
  --advertise-client-urls=http://10.238.161.1:2379 \
  --initial-advertise-peer-urls=http://10.238.161.1:2380 \
  --initial-cluster-token=etcd-cluster \
  --initial-cluster=etcd01=http://10.238.161.1:2380,etcd02=http://10.238.161.2:2380,etcd03=http://10.238.161.3:2380 \
  --initial-cluster-state=existing \
  --enable-v2=true \
  --enable-pprof

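II. Test data
1) Tests on etcd 3.3.12, before the master optimization:
Volumes (75G each)  Chunks (x10,000)  etcd memory  etcd disk  Compact time
70000               2100              18G          11G        about 5 min
30000               900               6G           2.6G       about 5~10 s
20000               600               3.3G         2G         < 5 s
15000               450               2.8G         1.5G       < 5 s
Problems found:
1. etcd: the compact operation interferes with normal put/get storage operations.
2. master: chunk creation used a pessimistic distributed lock, so every lock was written to disk as a record and was not deleted after it expired; the space was released only after disk defragmentation was triggered manually.
Optimization strategy:
1. etcd: switch to a version with an optimized compact.
2. master: chunk creation now uses an optimistic distributed lock, which creates no per-lock records and takes no disk space.
2) Tests on etcd 3.5.0-pre, after the optimization:
Volumes (75G each)  Chunks      etcd memory  Disk usage  Compact time
133333              40 million  16G          6.4G        none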

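III. etcd data storage analysis
1. Data storage format:
The main data etcd persists to disk is stored as key-value pairs: the serialized revision (revision.main + revision.sub) is the key, and the value is the serialized KeyValue struct, as follows: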
Key:
type revision struct {
    main int64
    sub int64
}

Value:
type KeyValue struct {
    // key is the key in bytes. An empty key is not allowed.
    Key []byte `protobuf:"bytes,1,opt,name=key,proto3" json:"key,omitempty"`
    // create_revision is the revision of last creation on this key.
    CreateRevision int64 `protobuf:"varint,2,opt,name=create_revision,json=createRevision,proto3" json:"create_revision,omitempty"`
    // mod_revision is the revision of last modification on this key.
    ModRevision int64 `protobuf:"varint,3,opt,name=mod_revision,json=modRevision,proto3" json:"mod_revision,omitempty"`
    // version is the version of the key. A deletion resets
    // the version to zero and any modification of the key
    // increases its version.
    Version int64 `protobuf:"varint,4,opt,name=version,proto3" json:"version,omitempty"`
    // value is the value held by the key, in bytes.
    Value []byte `protobuf:"bytes,5,opt,name=value,proto3" json:"value,omitempty"`
    // lease is the ID of the lease that attached to key.
    // When the attached lease expires, the key will be deleted.
    // If lease is 0, then no lease is attached to the key.
    Lease int64 `protobuf:"varint,6,opt,name=lease,proto3" json:"lease,omitempty"`
}
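As shown, besides the Key and Value we actually store, each KV record carries extra data: the revision (key) takes 16 bytes, and CreateRevision, ModRevision, Version and Lease take 32 bytes, for at least 48 extra bytes in total.
In an actual test with 5.4 million records whose key plus value totals 29 bytes, the data should in theory take 149M, but it took about 473M on disk. That works out to roughly 60 extra bytes per record, close to the analysis.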
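
The per-record overhead can be re-derived from the three measured numbers. The sketch below assumes that "M" means MiB (1048576 bytes), which is what makes the theoretical figure come out at 149; the function names are illustrative:

```shell
# theoretical_mib RECORDS PAYLOAD_BYTES: raw key+value payload in MiB.
theoretical_mib() {
    awk -v n="$1" -v b="$2" 'BEGIN { printf "%.0f\n", n * b / 1048576 }'
}

# overhead_per_record RECORDS PAYLOAD_BYTES ACTUAL_MIB:
# extra bytes stored per record beyond the raw payload.
overhead_per_record() {
    awk -v n="$1" -v b="$2" -v m="$3" \
        'BEGIN { printf "%.1f\n", (m * 1048576 - n * b) / n }'
}

echo "theoretical: $(theoretical_mib 5400000 29) MiB"
echo "overhead:    $(overhead_per_record 5400000 29 473) bytes/record"
```

The result is a little above the 48-byte minimum from the struct analysis, which fits the "at least" wording: the storage engine's own page and index bookkeeping is not part of that estimate.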

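2. Data cleanup
Expired revisions that etcd keeps on disk also take up space; the space held by old revisions is reclaimed only by running the reclaim operation manually, with this command:
etcdctl --endpoints=http://127.0.0.1:2379 defrag
While this operation runs, the etcd cluster blocks and all external requests time out. In current tests the operation took nearly 2 minutes with 40 million records. (Running it on a non-leader etcd member causes the same behavior.)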

etcd Data Storage Research

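1. Data storage capacity:
The main data etcd persists to disk is stored as key-value pairs: the serialized revision (revision.main + revision.sub) is the key, and the value is the serialized KeyValue struct, as follows: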

Key:
type revision struct {
    main int64
    sub int64
}

Value:
type KeyValue struct {
    // key is the key in bytes. An empty key is not allowed.
    Key []byte `protobuf:"bytes,1,opt,name=key,proto3" json:"key,omitempty"`
    // create_revision is the revision of last creation on this key.
    CreateRevision int64 `protobuf:"varint,2,opt,name=create_revision,json=createRevision,proto3" json:"create_revision,omitempty"`
    // mod_revision is the revision of last modification on this key.
    ModRevision int64 `protobuf:"varint,3,opt,name=mod_revision,json=modRevision,proto3" json:"mod_revision,omitempty"`
    // version is the version of the key. A deletion resets
    // the version to zero and any modification of the key
    // increases its version.
    Version int64 `protobuf:"varint,4,opt,name=version,proto3" json:"version,omitempty"`
    // value is the value held by the key, in bytes.
    Value []byte `protobuf:"bytes,5,opt,name=value,proto3" json:"value,omitempty"`
    // lease is the ID of the lease that attached to key.
    // When the attached lease expires, the key will be deleted.
    // If lease is 0, then no lease is attached to the key.
    Lease int64 `protobuf:"varint,6,opt,name=lease,proto3" json:"lease,omitempty"`
}
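As shown, besides the Key and Value we actually store, each KV record carries extra data: the revision (key) takes 16 bytes, and CreateRevision, ModRevision, Version and Lease take 32 bytes, for at least 48 extra bytes in total.

In an actual test with 5.4 million records whose key plus value totals 29 bytes, the data should in theory take 149M, but it took about 473M on disk. That works out to roughly 60 extra bytes per record, close to the analysis.
Measured results for 133333 volumes of 75G each:
Storage scheme    Chunks      etcd memory  Disk usage  Load time
Current scheme    40 million  16G          6.4G        about 5 min
Optimized scheme  40 million  12G          3.6G        about 4 min

Note: the earlier test figures were inflated because a distributed lock was taken when recording each chunk, so every lock was written to disk and was not deleted after it expired; the space is released after data cleanup. The chunk-creation logic therefore needs to be optimized.
2. Data cleanup
Expired revisions that etcd keeps on disk also take up space; the space held by old revisions is reclaimed only by running the reclaim operation manually, with this command:
etcdctl --endpoints=http://127.0.0.1:2379 defrag
While this operation runs, the etcd cluster blocks and all external requests time out. In current tests the operation took nearly 2 minutes with 40 million records. (Running it on a non-leader etcd member causes the same behavior.)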

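004-etcd Connection Storm Optimization

Problem

Tests showed that when a large number of clients connect to the etcd server at the same time, many connection errors occur and recovery takes a long time.

Connection performance test results:

  1. Single client: each connection takes 200ms, and the server-side etcd process CPU rises by 55%.
  2. Multi-client stress test: connection qps holds steady at 9, and the server-side etcd process CPU usage is around 110%.

Based on these numbers, with 40,000 agent clients, if the whole etcd cluster (5 nodes) loses its network or goes down, reconnecting takes about 15 minutes, during which the etcd service is very unstable.

40000/(9*5)=888s≈15min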
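
The estimate generalizes to other fleet sizes and connection rates. A minimal sketch of the same division, assuming connections spread evenly across nodes; the function name `reconnect_seconds` is illustrative:

```shell
# reconnect_seconds AGENTS QPS_PER_NODE NODES:
# whole seconds needed to reconnect every agent (integer division).
reconnect_seconds() {
    echo $(( $1 / ($2 * $3) ))
}

secs=$(reconnect_seconds 40000 9 5)
echo "${secs} s, about $(( (secs + 30) / 60 )) min"
```

Shell arithmetic truncates, so this reproduces the 888 s figure above; it is a lower bound, since it ignores retries and errors during the storm.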

Cause

Profiling during the connection storm showed:

  1. CPU is mainly spent in the blowfish.encryptBlock function

    [root@sndspstdb52 ~]# perf top
    67.01% etcd [.] etcd-3.3.13/cmd/vendor/golang.org/x/crypto/blowfish.encryptBlock
    4.06% [kernel] [k] _spin_unlock_irqrestore
    3.68% [kernel] [k] finish_task_switch
    3.06% etcd [.] etcd-3.3.13/cmd/vendor/golang.org/x/crypto/blowfish.ExpandKey
    2.34% [kernel] [k] find_busiest_group
    1.74% [kernel] [k] iowrite16
    1.33% [kernel] [k] __do_softirq
    0.58% etcd [.] runtime.mallocgc
    0.50% [kernel] [k] __rcu_process_callbacks
    0.49% [kernel] [k] _spin_lock
    0.45% [kernel] [k] pvclock_clocksource_read
    0.43% etcd [.] runtime.findrunnable
    0.42% [kernel] [k] rcu_process_gp_end
    0.40% etcd [.] runtime.selectgo
    0.37% [kernel] [k] rcu_process_callbacks
    0.34% [kernel] [k] system_call_after_swapgs
    0.33% etcd [.] runtime.lock
    0.33% etcd [.] runtime.heapBitsSetType
    0.31% [kernel] [k] rebalance_domains
    0.31% [kernel] [k] tick_nohz_stop_sched_tick
    0.28% etcd [.] runtime.deferreturn

  2. etcd uses the bcrypt (blowfish) algorithm for password verification

    [root@sndspstdb52 ~]# pstack $etcd_pid
    Thread 9 (Thread 0x7feb8abfd700 (LWP 2339)):
    #0 0x000000000097b9da in etcd-3.3.13/cmd/vendor/golang.org/x/crypto/blowfish.encryptBlock ()
    #1 0x000000000097b120 in etcd-3.3.13/cmd/vendor/golang.org/x/crypto/blowfish.ExpandKey ()
    #2 0x000000000097dc34 in etcd-3.3.13/cmd/vendor/golang.org/x/crypto/bcrypt.expensiveBlowfishSetup ()
    #3 0x000000000097d935 in etcd-3.3.13/cmd/vendor/golang.org/x/crypto/bcrypt.bcrypt ()
    #4 0x000000000097d076 in etcd-3.3.13/cmd/vendor/golang.org/x/crypto/bcrypt.CompareHashAndPassword ()
    #5 0x00000000009828d7 in etcd-3.3.13/cmd/vendor/github.com/coreos/etcd/auth.(*authStore).CheckPassword ()
    #6 0x0000000000ad6cda in etcd-3.3.13/cmd/vendor/github.com/coreos/etcd/etcdserver.(*EtcdServer).Authenticate ()
    #7 0x0000000000b75395 in etcd-3.3.13/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*AuthServer).Authenticate ()
    #8 0x00000000008d9196 in etcd-3.3.13/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb._Auth_Authenticate_Handler.func1 ()
    #9 0x0000000000b53151 in etcd-3.3.13/cmd/vendor/github.com/coreos/etcd/vendor/github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1 ()
    #10 0x0000000000b5721e in etcd-3.3.13/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus.(*ServerMetrics).UnaryServerInterceptor.func1 ()
    #11 0x0000000000b530d9 in etcd-3.3.13/cmd/vendor/github.com/coreos/etcd/vendor/github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1 ()
    #12 0x0000000000b80968 in etcd-3.3.13/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.newUnaryInterceptor.func1 ()
    #13 0x0000000000b530d9 in etcd-3.3.13/cmd/vendor/github.com/coreos/etcd/vendor/github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1 ()
    #14 0x0000000000b80b5e in etcd-3.3.13/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.newLogUnaryInterceptor.func1 ()
    #15 0x0000000000b532f3 in etcd-3.3.13/cmd/vendor/github.com/coreos/etcd/vendor/github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1 ()
    #16 0x000000000088beb8 in etcd-3.3.13/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb._Auth_Authenticate_Handler ()
    #17 0x00000000008568dc in etcd-3.3.13/cmd/vendor/google.golang.org/grpc.(*Server).processUnaryRPC ()
    #18 0x00000000008598f5 in etcd-3.3.13/cmd/vendor/google.golang.org/grpc.(*Server).handleStream ()
    #19 0x000000000085f66f in etcd-3.3.13/cmd/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1 ()
    #20 0x000000000045ea31 in runtime.goexit ()
    #21 0x000000c024d42e60 in ?? ()
    #22 0x000000c0000f8780 in ?? ()
    #23 0x0000000001018580 in ?? ()
    #24 0x000000c0139b54a0 in ?? ()
    #25 0x000000c01e07a100 in ?? ()
    #26 0x0000000000000000 in ?? ()

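  3. Further investigation showed that bcrypt's high CPU cost is deliberate, for security: it makes brute-forcing passwords harder, and its cost can be tuned with a parameter.

    https://segmentfault.com/q/1010000003054250
    MD5 is fast: when passwords contain only lowercase letters and digits, a fairly good PC can enumerate every password within 40s.
    bcrypt is slow, but not too slow for verifying a user's password, while being very slow for brute-force enumeration.

    bcrypt is built on a variant of the Blowfish cipher and introduces a work factor that lets you decide how expensive the algorithm is. Because of this, the algorithm does not get cheaper as CPUs get faster: you can raise the work factor to slow it back down.

Note: with password authentication disabled, the connection performance problem does not occur.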

Optimization

The latest etcd release, 3.4.0, provides a bcrypt-cost parameter that tunes bcrypt's performance.

[root@sndspstdb51 etcd-v3.4.0-linux-amd64]#./etcd --help
Auth:
  ...
  --bcrypt-cost 10
    Specify the cost / strength of the bcrypt algorithm for hashing auth passwords. Valid values are between 4 and 31.

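Test results:

bcrypt-cost   Single-client connect time  Single-client connect CPU  Multi-client connect qps
10 (default)  100ms                       99%                        10
4             6ms                         25%                        600

With bcrypt-cost=4, starting 4000 agent connections simultaneously produced no connection errors or blocking.

The optimization plan is therefore:

  1. Use etcd 3.4.0
  2. Start etcd with the parameter --bcrypt-cost=4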
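
The size of the improvement is roughly what bcrypt's design predicts: the key-expansion loop runs 2^cost times, so each step down in cost halves the work. A small sketch comparing the two settings, using the 40,000 agents and 5 nodes from the problem statement; the function names are illustrative:

```shell
# bcrypt_rounds COST: key-expansion iterations, 2^COST.
bcrypt_rounds() {
    echo $(( 1 << $1 ))
}

# cost_ratio FROM TO: how many times less hashing work TO needs than FROM.
cost_ratio() {
    echo $(( $(bcrypt_rounds "$1") / $(bcrypt_rounds "$2") ))
}

echo "cost 10 -> 4 cuts hashing work by a factor of $(cost_ratio 10 4)"
# Reconnect estimate at the measured 600 qps per node:
echo "40000 agents, 5 nodes: about $(( 40000 / (600 * 5) )) s"
```

The measured qps gain (10 to 600, a factor of 60) is close to the predicted factor. The trade-off is that stored password hashes also become proportionally cheaper to brute-force.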

005-etcd Client

Start the etcd grpc-proxy

etcd grpc-proxy start --endpoints=http://10.244.208.3:2379,http://10.244.208.4:2379,http://10.244.208.5:2379 --listen-addr=10.242.4.92:2370

Start the etcd watch

./watch -etcd=10.242.4.92:2370,10.242.4.93:2370