Installing a Sharded, Highly Available ClickHouse Cluster


Introduction

Environment

AWS EC2 instances are used for this walkthrough:

IP Address      Hostname       CPU   Memory   Data Disk
172.20.0.201    zookeeper01    4     8G       100G
172.20.0.202    zookeeper02    4     8G       100G
172.20.0.203    zookeeper03    4     8G       100G
172.20.0.101    clickhouse01   8     16G      300G
172.20.0.102    clickhouse02   8     16G      300G
172.20.0.103    clickhouse03   8     16G      300G
172.20.0.104    clickhouse04   8     16G      300G
172.20.0.105    clickhouse05   8     16G      300G
172.20.0.106    clickhouse06   8     16G      300G

Procedure

1. Host configuration [all nodes]

These steps target AWS EC2 hosts running Amazon Linux 2; adapt them to your own environment as needed.

Base configuration
$ sudo yum update -y && sudo timedatectl set-timezone Asia/Shanghai && sudo yum install nc iotop dstat -y
Configure hostname resolution
$ sudo bash -c "cat >> /etc/hosts" << EOF
172.20.0.201 zookeeper01
172.20.0.202 zookeeper02
172.20.0.203 zookeeper03
172.20.0.101 clickhouse01
172.20.0.102 clickhouse02
172.20.0.103 clickhouse03
172.20.0.104 clickhouse04
172.20.0.105 clickhouse05
172.20.0.106 clickhouse06
EOF
Partition and mount the data disk
$ sudo cfdisk /dev/nvme1n1
$ sudo mkfs -t xfs /dev/nvme1n1p1
$ sudo cp /etc/fstab /etc/fstab.orig
$ sudo lsblk -lf
Use the UUID that lsblk reports for the new partition in the fstab entry below (the UUID shown here is only an example):
$ sudo bash -c "cat >> /etc/fstab" << EOF
UUID=1e7c2c75-a6b8-4060-94f1-15e5e1de8fe8  /data  xfs  defaults,nofail  0  2
EOF
$ sudo mkdir /data && sudo mount -a && sudo chown -R ec2-user. /data
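The new filesystem should now be visible at /data:
$ df -h /data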

2. Configure Docker <optional>

This deployment is managed with docker-compose. A bare-metal install works just as well; if you go that route, skip this step.

$ sudo amazon-linux-extras install docker -y && sudo usermod -a -G docker ec2-user && sudo systemctl start docker
$ sudo bash -c "cat >> /etc/docker/daemon.json" << EOF
{
  "builder": {
    "gc": {
      "defaultKeepStorage": "20GB",
      "enabled": true
    }
  },
  "experimental": false,
  "features": {
    "buildkit": false
  },
  "registry-mirrors": [
    "https://docker.mirrors.ustc.edu.cn"
  ],
  "exec-opts": [
    "native.cgroupdriver=systemd"
  ],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "10"
  },
  "max-concurrent-downloads": 20,
  "debug": false,
  "default-address-pools" : [
    {
      "base" : "192.168.224.0/20",
      "size" : 24
    }
  ]
}
EOF
$ sudo systemctl restart docker && sudo chmod 666 /var/run/docker.sock
$ sudo curl -L https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m) -o /usr/local/bin/docker-compose && sudo chmod +x /usr/local/bin/docker-compose && sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
$ sudo chown -R ec2-user. /opt/
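A quick sanity check that the Docker daemon and docker-compose are usable:
$ docker info > /dev/null && docker-compose --version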

3. Configure ZooKeeper [zookeeper01, zookeeper02, zookeeper03]

$ mkdir -p /data/zookeeper/{logs,conf,data} && mkdir -p /data/zookeeper/data/{log,data}
Create the docker-compose file
$ cat > /data/zookeeper/docker-compose.yml << EOF
version: "3"
services:
  zookeeper:
    image: zookeeper:3.7.1-temurin
    container_name: zookeeper
    restart: unless-stopped
    network_mode: "host"
    # host networking already exposes 2181, 2888/3888 and 9070; port mappings are not needed
    extra_hosts:
      - "zookeeper01:172.20.0.201"
      - "zookeeper02:172.20.0.202"
      - "zookeeper03:172.20.0.203"
    environment:
      # ZOO_MY_ID: 1 on zookeeper01, 2 on zookeeper02, 3 on zookeeper03
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=zookeeper01:2888:3888;2181 server.2=zookeeper02:2888:3888;2181 server.3=zookeeper03:2888:3888;2181
      ZOO_LOG4J_PROP: INFO,ROLLINGFILE
      JVMFLAGS: -Xms128M -Xmx5G
      ZOO_CFG_EXTRA: metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider metricsProvider.httpPort=9070
      TZ: Asia/Shanghai
    volumes:
      - /etc/localtime:/etc/localtime
      - /data/zookeeper/data/log:/datalog
      - /data/zookeeper/data/data:/data
      - /data/zookeeper/logs:/logs
      - /data/zookeeper/conf:/conf
    deploy:
      replicas: 1
      resources:
        limits:
          cpus: '3'
          memory: 7G
        reservations:
          cpus: '2'
          memory: 5G
    ulimits:
      nproc: 65535
      nofile:
        soft: 262144
        hard: 262144
EOF
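The file above is written for zookeeper01; on the other two hosts only ZOO_MY_ID changes. For example, on zookeeper02:
$ sed -i 's/ZOO_MY_ID: 1/ZOO_MY_ID: 2/' /data/zookeeper/docker-compose.yml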
Create the log4j properties file
$ cat > /data/zookeeper/conf/log4j.properties << 'EOF'
# Define some default values that can be overridden by system properties
zookeeper.root.logger=INFO, CONSOLE, ROLLINGFILE
zookeeper.console.threshold=INFO
zookeeper.log.dir=/logs
zookeeper.log.file=zookeeper.log
zookeeper.log.threshold=INFO
zookeeper.tracelog.dir=/logs
zookeeper.tracelog.file=zookeeper_trace.log
#
# ZooKeeper Logging Configuration
#
# Format is "<default threshold> (, <appender>)+
# DEFAULT: console appender only
log4j.rootLogger=${zookeeper.root.logger}
# Example with rolling log file
#log4j.rootLogger=DEBUG, CONSOLE, ROLLINGFILE
# Example with rolling log file and tracing
#log4j.rootLogger=TRACE, CONSOLE, ROLLINGFILE, TRACEFILE
#
# Log INFO level and above messages to the console
#
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.Threshold=${zookeeper.console.threshold}
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n
#
# Add ROLLINGFILE to rootLogger to get log file output
# Log DEBUG level and above messages to a log file
log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.Threshold=${zookeeper.log.threshold}
log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/${zookeeper.log.file}
# Max log file size of 10MB
log4j.appender.ROLLINGFILE.MaxFileSize=10MB
# uncomment the next line to limit number of backup files
log4j.appender.ROLLINGFILE.MaxBackupIndex=5
log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n
#
# Add TRACEFILE to rootLogger to get log file output
# Log DEBUG level and above messages to a log file
log4j.appender.TRACEFILE=org.apache.log4j.FileAppender
log4j.appender.TRACEFILE.Threshold=TRACE
log4j.appender.TRACEFILE.File=${zookeeper.tracelog.dir}/${zookeeper.tracelog.file}
log4j.appender.TRACEFILE.layout=org.apache.log4j.PatternLayout
### Notice we are including log4j's NDC here (%x)
log4j.appender.TRACEFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L][%x] - %m%n
EOF
Create the ZooKeeper configuration file. Because zoo.cfg is bind-mounted, the image ignores ZOO_CFG_EXTRA from the compose file, so the Prometheus metrics provider is configured here as well
$ cat > /data/zookeeper/conf/zoo.cfg << EOF
dataDir=/data
dataLogDir=/datalog
tickTime=2000
initLimit=30000
syncLimit=10
autopurge.snapRetainCount=10
autopurge.purgeInterval=1
maxClientCnxns=2000
maxSessionTimeout=60000000
preAllocSize=131072
snapCount=3000000
admin.enableServer=true
leaderServes=yes
standaloneEnabled=false
dynamicConfigFile=/conf/dynamic/dynamic.cfg
metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
metricsProvider.httpPort=9070
EOF
$ mkdir -p /data/zookeeper/conf/dynamic/ && cat > /data/zookeeper/conf/dynamic/dynamic.cfg << EOF
server.1=zookeeper01:2888:3888;2181
server.2=zookeeper02:2888:3888;2181
server.3=zookeeper03:2888:3888;2181
EOF

4. Configure ClickHouse [clickhouse01, clickhouse02, clickhouse03, clickhouse04, clickhouse05, clickhouse06]

$ mkdir -p /data/clickhouse/{logs,config,data}
$ cat > /data/clickhouse/docker-compose.yml << EOF
version: '3'
services:
  clickhouse:
    image: clickhouse/clickhouse-server:22.6.8.35
    container_name: clickhouse
    network_mode: "host"
    restart: unless-stopped
    volumes:
      - /etc/localtime:/etc/localtime
      - /data/clickhouse/config:/etc/clickhouse-server
      - /data/clickhouse/data:/var/lib/clickhouse:rw
      - /data/clickhouse/logs:/var/log/clickhouse-server/
    # host networking already exposes 8123 and 9000; port mappings are not needed
    extra_hosts:
      - "clickhouse01:172.20.0.101"
      - "clickhouse02:172.20.0.102"
      - "clickhouse03:172.20.0.103"
      - "clickhouse04:172.20.0.104"
      - "clickhouse05:172.20.0.105"
      - "clickhouse06:172.20.0.106"
    environment:
      TZ: Asia/Shanghai
    deploy:
      replicas: 1
      resources:
        limits:
          cpus: '3'
          memory: 7G
        reservations:
          cpus: '2'
          memory: 5G
    ulimits:
      nproc: 65535
      nofile:
        soft: 262144
        hard: 262144
EOF
Extract the default ClickHouse configuration: temporarily comment out the config/data/logs bind mounts (lines 10-12 of the compose file), start the container with its built-in config, copy that config out, then restore the mounts
$ sed -i '10,12s/^/#/g' /data/clickhouse/docker-compose.yml && cd /data/clickhouse/ && docker-compose up -d && docker cp clickhouse:/etc/clickhouse-server /data/clickhouse/ && cp -rf /data/clickhouse/clickhouse-server/* /data/clickhouse/config/ && sed -i '10,12s/#//g' /data/clickhouse/docker-compose.yml && mkdir -p /data/clickhouse/data/access

Generate a random password and its double-SHA1 hash (the format ClickHouse expects in password_double_sha1_hex)
$ PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha1sum | tr -d '-' | xxd -r -p | sha1sum | tr -d '-'
    cDT5wFrx
    f6a392ac568149ec022d3660906492d520023e15

Disable the sample cluster definitions shipped in config.xml by wrapping its remote_servers block in CDATA, so that only the cluster defined under config.d takes effect:
$ sed -i '/<remote_servers>/i\    <![CDATA[' /data/clickhouse/config/config.xml
$ sed -i '/<\/remote_servers>/a\    ]]>' /data/clickhouse/config/config.xml
Set the default user's password to the hash generated above:
$ sed -i 's@<password></password>@<password_double_sha1_hex>f6a392ac568149ec022d3660906492d520023e15</password_double_sha1_hex>@g' /data/clickhouse/config/users.xml
Set the server timezone:
$ sed -i '/<timezone>UTC<\/timezone>/i\    <timezone>Asia\/Shanghai<\/timezone>' /data/clickhouse/config/config.xml
Drop the extra listen_host entry added by the Docker image (the remaining <listen_host>::</listen_host> already accepts IPv4 connections):
$ sed -i '/<listen_host>0.0.0.0<\/listen_host>/d' /data/clickhouse/config/config.d/docker_related_config.xml

Raise the server-wide concurrent query cap:
$ sed -i 's@<max_concurrent_queries>100</max_concurrent_queries>@<max_concurrent_queries>150</max_concurrent_queries>@g' /data/clickhouse/config/config.xml

Log level; adjust as needed [none, fatal, critical, error, warning, notice, information, debug, trace, test]
$ sed -i 's@<level>trace</level>@<level>notice</level>@g' /data/clickhouse/config/config.xml
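If xmllint happens to be available (package libxml2), it is a cheap way to confirm the sed edits left the XML well-formed:
$ xmllint --noout /data/clickhouse/config/config.xml && echo OK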

Configure the ZooKeeper connection
$ cat > /data/clickhouse/config/config.d/zookeeper.xml << EOF
<yandex>
  <!-- ZooKeeper ensemble used for replication and distributed DDL -->
  <zookeeper>
    <node index="1">
      <host>172.20.0.201</host>
      <port>2181</port>
    </node>
    <node index="2">
      <host>172.20.0.202</host>
      <port>2181</port>
    </node>
    <node index="3">
      <host>172.20.0.203</host>
      <port>2181</port>
    </node>
    <session_timeout_ms>30000</session_timeout_ms>
    <operation_timeout_ms>10000</operation_timeout_ms>
  </zookeeper>
</yandex>
EOF
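Once ClickHouse is running (step 7), this connection can be verified from inside the server via the system.zookeeper table:
$ docker exec clickhouse clickhouse-client --port 9000 -u default --password cDT5wFrx --query "select name from system.zookeeper where path = '/';"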

Cluster topology; this file is identical on all six ClickHouse nodes
$ cat > /data/clickhouse/config/config.d/cluster01.xml << EOF
<yandex>
  <!-- Overrides the remote_servers block in /etc/clickhouse-server/config.xml -->
  <remote_servers>
      <!-- Cluster name; change as desired -->
      <cluster01>
          <!-- Three shards, two replicas each -->
          <shard>
              <internal_replication>true</internal_replication>
              <replica>
                  <host>clickhouse01</host>
                  <port>9000</port>
                  <user>default</user>
                  <password>cDT5wFrx</password>
              </replica>
              <replica>
                  <host>clickhouse05</host>
                  <port>9000</port>
                  <user>default</user>
                  <password>cDT5wFrx</password>
              </replica>
          </shard>
          <shard>
              <internal_replication>true</internal_replication>
              <replica>
                  <host>clickhouse02</host>
                  <port>9000</port>
                  <user>default</user>
                  <password>cDT5wFrx</password>
              </replica>
              <replica>
                  <host>clickhouse06</host>
                  <port>9000</port>
                  <user>default</user>
                  <password>cDT5wFrx</password>
              </replica>
          </shard>
          <shard>
              <internal_replication>true</internal_replication>
              <replica>
                  <host>clickhouse03</host>
                  <port>9000</port>
                  <user>default</user>
                  <password>cDT5wFrx</password>
              </replica>
              <replica>
                  <host>clickhouse04</host>
                  <port>9000</port>
                  <user>default</user>
                  <password>cDT5wFrx</password>
              </replica>
          </shard>
      </cluster01>
  </remote_servers>
</yandex>
EOF

Add a dedicated ec2 user; generate another password hash the same way as before
$ PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha1sum | tr -d '-' | xxd -r -p | sha1sum | tr -d '-'
    Wtk5nhfI
    3fc1db3fa96102d407b21270d7a21bdad6cec744
$ cat > /data/clickhouse/config/users.d/ec2.xml << EOF
<?xml version="1.0"?>
<clickhouse>
  <profiles>
    <ec2>
      <!-- Maximum memory usage for processing a single query, in bytes -->
      <max_memory_usage>10000000000</max_memory_usage>
      <!-- Spill GROUP BY state to disk above this size; about half of max_memory_usage is a reasonable value -->
      <max_bytes_before_external_group_by>5000000000</max_bytes_before_external_group_by>
      <!-- Spill ORDER BY state to disk above this size; about half of max_memory_usage is a reasonable value -->
      <max_bytes_before_external_sort>5000000000</max_bytes_before_external_sort>
      <!-- Maximum memory per user (0 = unlimited) -->
      <max_memory_usage_for_user>0</max_memory_usage_for_user>
      <!-- Maximum memory across all queries (0 = unlimited) -->
      <max_memory_usage_for_all_queries>0</max_memory_usage_for_all_queries>
      <connections_with_failover_max_tries>5</connections_with_failover_max_tries>
      <connect_timeout_with_failover_ms>1000</connect_timeout_with_failover_ms>
      <max_concurrent_queries>300</max_concurrent_queries>
    </ec2>
  </profiles>
  <quotas>
    <ec2>
      <interval>
        <duration>3600</duration>
        <queries>0</queries>
        <errors>0</errors>
        <result_rows>0</result_rows>
        <read_rows>0</read_rows>
        <execution_time>0</execution_time>
      </interval>
    </ec2>
  </quotas>
  <users>
    <ec2>
      <profile>ec2</profile>
      <networks incl="networks" replace="replace">
        <ip>::/0</ip>
      </networks>
      <quota>ec2</quota>
      <allow_databases>
        <database>ec2</database>
      </allow_databases>
      <password_double_sha1_hex>3fc1db3fa96102d407b21270d7a21bdad6cec744</password_double_sha1_hex>
      <access_management>1</access_management>
    </ec2>
  </users>
</clickhouse>
EOF
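After the cluster is started (step 7), the new account can be checked with a trivial query:
$ docker exec clickhouse clickhouse-client --port 9000 -u ec2 --password Wtk5nhfI --query "select currentUser();"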

Expose Prometheus metrics
$ cat > /data/clickhouse/config/config.d/prometheus.xml << EOF
<yandex>
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>9363</port>

        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
        <status_info>true</status_info>
    </prometheus>
</yandex>
EOF
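Once the server is up, the exporter can be spot-checked with curl:
$ curl -s http://clickhouse01:9363/metrics | head -n 5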

5. Per-host configuration [clickhouse01, clickhouse02, clickhouse03, clickhouse04, clickhouse05, clickhouse06]

clickhouse01 configuration

The sample <interserver_http_host> entry in config.xml ships inside a comment block, so rewriting it in place with sed never takes effect; declare it in config.d/node.xml instead (same pattern on every host below):
$ cat > /data/clickhouse/config/config.d/node.xml << EOF
<yandex>
  <interserver_http_host>clickhouse01</interserver_http_host>
  <macros>
      <layer>01</layer>
      <shard>01</shard>
      <replica>cluster01-01-01</replica>
  </macros>
  <networks>
     <ip>::/0</ip>
  </networks>
  <clickhouse_compression>
    <case>
      <min_part_size>10000000000</min_part_size>
      <min_part_size_ratio>0.01</min_part_size_ratio>
      <method>lz4</method>
    </case>
  </clickhouse_compression>
</yandex>
EOF

clickhouse02 configuration

$ cat > /data/clickhouse/config/config.d/node.xml << EOF
<yandex>
  <interserver_http_host>clickhouse02</interserver_http_host>
  <macros>
      <layer>01</layer>
      <shard>02</shard>
      <replica>cluster01-02-01</replica>
  </macros>
  <networks>
     <ip>::/0</ip>
  </networks>

  <clickhouse_compression>
    <case>
      <min_part_size>10000000000</min_part_size>
      <min_part_size_ratio>0.01</min_part_size_ratio>
      <method>lz4</method>
    </case>
  </clickhouse_compression>
</yandex>
EOF

clickhouse03 configuration

$ cat > /data/clickhouse/config/config.d/node.xml << EOF
<yandex>
  <interserver_http_host>clickhouse03</interserver_http_host>
  <macros>
      <layer>01</layer>
      <shard>03</shard>
      <replica>cluster01-03-01</replica>
  </macros>
  <networks>
     <ip>::/0</ip>
  </networks>

  <clickhouse_compression>
    <case>
      <min_part_size>10000000000</min_part_size>
      <min_part_size_ratio>0.01</min_part_size_ratio>
      <method>lz4</method>
    </case>
  </clickhouse_compression>
</yandex>
EOF

clickhouse04 configuration

$ cat > /data/clickhouse/config/config.d/node.xml << EOF
<yandex>
  <interserver_http_host>clickhouse04</interserver_http_host>
  <macros>
      <layer>01</layer>
      <shard>03</shard>
      <replica>cluster01-03-02</replica>
  </macros>
  <networks>
     <ip>::/0</ip>
  </networks>

  <clickhouse_compression>
    <case>
      <min_part_size>10000000000</min_part_size>
      <min_part_size_ratio>0.01</min_part_size_ratio>
      <method>lz4</method>
    </case>
  </clickhouse_compression>
</yandex>
EOF

clickhouse05 configuration

$ cat > /data/clickhouse/config/config.d/node.xml << EOF
<yandex>
  <interserver_http_host>clickhouse05</interserver_http_host>
  <macros>
      <layer>01</layer>
      <shard>01</shard>
      <replica>cluster01-01-02</replica>
  </macros>
  <networks>
     <ip>::/0</ip>
  </networks>

  <clickhouse_compression>
    <case>
      <min_part_size>10000000000</min_part_size>
      <min_part_size_ratio>0.01</min_part_size_ratio>
      <method>lz4</method>
    </case>
  </clickhouse_compression>
</yandex>
EOF

clickhouse06 configuration

$ cat > /data/clickhouse/config/config.d/node.xml << EOF
<yandex>
  <interserver_http_host>clickhouse06</interserver_http_host>
  <macros>
      <layer>01</layer>
      <shard>02</shard>
      <replica>cluster01-02-02</replica>
  </macros>
  <networks>
     <ip>::/0</ip>
  </networks>

  <clickhouse_compression>
    <case>
      <min_part_size>10000000000</min_part_size>
      <min_part_size_ratio>0.01</min_part_size_ratio>
      <method>lz4</method>
    </case>
  </clickhouse_compression>
</yandex>
EOF

6. Start ZooKeeper [zookeeper01, zookeeper02, zookeeper03]

$ cd /data/zookeeper && docker-compose up -d
Check each node's role
$ docker exec zookeeper zkServer.sh status
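The srvr four-letter command (whitelisted by default since ZooKeeper 3.5) reports the same role information without entering the container:
$ echo srvr | nc zookeeper01 2181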

7. Start ClickHouse [clickhouse01, clickhouse02, clickhouse03, clickhouse04, clickhouse05, clickhouse06]

$ cd /data/clickhouse && docker-compose up -d
Verify the cluster topology
$ docker exec clickhouse clickhouse-client -m --port 9000 -u default --password cDT5wFrx --query "select * from system.clusters where cluster = 'cluster01';"
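Each node should also report the macros set in its node.xml:
$ docker exec clickhouse clickhouse-client -m --port 9000 -u default --password cDT5wFrx --query "select * from system.macros;"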
Create a database on the cluster
$ docker exec clickhouse clickhouse-client -m --port 9000 -u default --password cDT5wFrx --receive_timeout=100 --query "create database allposs on cluster cluster01;"
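As a final smoke test, the sketch below (the events table and its columns are illustrative, not part of the original setup) creates a ReplicatedMergeTree table on every shard, a Distributed table over it, and routes one insert through the distributed layer; the ZooKeeper path uses the {layer}, {shard} and {replica} macros defined per node:
$ docker exec clickhouse clickhouse-client -m --port 9000 -u default --password cDT5wFrx --query "create table allposs.events_local on cluster cluster01 (ts DateTime, msg String) engine = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/events_local', '{replica}') order by ts;"
$ docker exec clickhouse clickhouse-client -m --port 9000 -u default --password cDT5wFrx --query "create table allposs.events on cluster cluster01 as allposs.events_local engine = Distributed(cluster01, allposs, events_local, rand());"
$ docker exec clickhouse clickhouse-client -m --port 9000 -u default --password cDT5wFrx --query "insert into allposs.events values (now(), 'hello');"
$ docker exec clickhouse clickhouse-client -m --port 9000 -u default --password cDT5wFrx --query "select count() from allposs.events;"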

Done