Process: Resource Pool Online/Offline Procedures¶
Resource Pool Node Offline Procedure¶
Prerequisites¶
1. Ensure that no VMs are still running on the node to be taken offline
① If the node is healthy and is being taken offline for business or other reasons, migrate the VMs on this node to other machines before moving on to the next step.
Migration commands:
Live migration:
[root@hb02-other-172e28e8e132 ~]# openstack server migrate --live hb02-other-172e28e8e137 ea24caf2-5e99-43f5-838f-164bb16891c0
Cold migration:
[root@hb02-other-172e28e8e132 ~]# nova migrate --host hb02-other-172e28e8e138 ea24caf2-5e99-43f5-838f-164bb16891c0
[root@hb02-other-172e28e8e132 ~]# nova resize-confirm ea24caf2-5e99-43f5-838f-164bb16891c0
② If the node is faulty, make sure the VMs, disks, and other resources are running normally on other machines before taking the node offline. A verification sketch follows.
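Before moving on, it helps to confirm that the node is really empty and that each migrated VM has landed elsewhere. A minimal sketch using standard openstack CLI commands (<current_host> is a placeholder; the VM ID is taken from the examples above):
# List all VMs still scheduled on the node; the output should be empty
openstack server list --all-projects --host <current_host>
# Confirm a migrated VM's current host and status
openstack server show ea24caf2-5e99-43f5-838f-164bb16891c0 -c OS-EXT-SRV-ATTR:host -c status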
2. Migrate the volumes managed by cinder-volume
1) Find the volumes managed by cinder-volume on the node being taken offline
cinder --os-volume-api-version 3.33 list --all --filters host=<current_host>
2) Migrate the cinder-managed volumes off the compute node
cinder-manage --config-file /etc/cinder/cinder.conf volume update_host --currenthost <current_host@storage_backend> --newhost <new_host@storage_backend>
The host values can also be looked up with "cinder service-list"
usage: cinder-manage [-h] [--config-dir DIR] [--config-file PATH] [--debug]
[--log-config-append PATH]
[--log-date-format DATE_FORMAT] [--log-dir LOG_DIR]
[--log-file PATH] [--nodebug] [--nouse-journal]
[--nouse-json] [--nouse-syslog] [--nowatch-log-file]
[--state_path STATE_PATH]
[--syslog-log-facility SYSLOG_LOG_FACILITY]
[--use-journal] [--use-json] [--use-syslog] [--version]
[--watch-log-file]
{backup,cg,cluster,config,db,host,logs,service,shell,version,volume}
...
usage: cinder-manage volume [-h] {delete,update_host} ...
usage: cinder-manage volume update_host [-h] --currenthost CURRENTHOST
--newhost NEWHOST
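A hypothetical invocation, filled in with the ceph backend used elsewhere in this document (sh03-compute-10e114e8e94 is an illustrative target host, not one taken from this document):
cinder-manage --config-file /etc/cinder/cinder.conf volume update_host --currenthost sh03-compute-10e114e8e93@ceph --newhost sh03-compute-10e114e8e94@ceph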
Procedure¶
1. Stop the OpenStack services first¶
Log in to the physical node.
Run systemctl stop openstack-nova-compute openstack-cinder-backup openstack-cinder-volume neutron-dhcp-agent neutron-l3-agent neutron-metadata-agent neutron-openvswitch-agent to make sure all of the nova, cinder, and neutron services are stopped.
Run systemctl disable openstack-nova-compute openstack-cinder-backup openstack-cinder-volume neutron-dhcp-agent neutron-l3-agent neutron-metadata-agent neutron-openvswitch-agent to keep them from starting at boot.
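As a quick check before continuing, a sketch that loops over the same service list and prints each unit's state (systemctl is-active prints inactive for stopped units):
for svc in openstack-nova-compute openstack-cinder-backup openstack-cinder-volume neutron-dhcp-agent neutron-l3-agent neutron-metadata-agent neutron-openvswitch-agent; do systemctl is-active $svc; done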
2. Clean up the nova service¶
[root@sh03-control-10e114e8e17 ~]# nova service-list | grep sh03-compute-10e114e8e93
| 1f9c32ad-3543-425c-81d5-4ecf34384d87 | nova-compute | sh03-compute-10e114e8e93 | public | disable | down | 2019-01-11T08:36:35.000000 | - | False |
[root@sh03-control-10e114e8e17 ~]# nova service-delete 1f9c32ad-3543-425c-81d5-4ecf34384d87
3. Clean up the neutron agents (delete all 4)¶
[root@sh03-control-10e114e8e17 ~]# neutron agent-list | grep sh03-compute-10e114e8e93
| 0e92b0e4-d838-4f68-a888-6994843c1093 | L3 agent | sh03-compute-10e114e8e93 | nova | XXX | True | neutron-l3-agent |
| 30644de4-3f07-4429-a7be-d5fdfe32ff52 | DHCP agent | sh03-compute-10e114e8e93 | nova | XXX | True | neutron-dhcp-agent |
| 45e23090-fde6-478a-8609-a92b6dd09c97 | Metadata agent | sh03-compute-10e114e8e93 | | XXX | True | neutron-metadata-agent |
| b310ff58-56e7-481f-928e-259bc80b6532 | Open vSwitch agent | sh03-compute-10e114e8e93 | | XXX | True | neutron-openvswitch-agent |
[root@sh03-control-10e114e8e17 ~]# neutron agent-delete $id
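To delete all four agents in one pass, a sketch that extracts the ID column from the table above (verify the ID list before running it):
[root@sh03-control-10e114e8e17 ~]# for id in $(neutron agent-list | grep sh03-compute-10e114e8e93 | awk '{print $2}'); do neutron agent-delete $id; done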
4. Clean up the cinder services¶
[root@sh03-control-10e114e8e17 ~]# cinder service-list |grep sh03-compute-10e114e8e93
| cinder-backup | sh03-compute-10e114e8e93 | nova | disable | down | 2019-01-11T09:02:38.000000 | - |
| cinder-volume | sh03-compute-10e114e8e93@ceph | nova | disable | down | 2019-01-11T09:02:41.000000 | - |
[root@sh03-control-10e114e8e17 ~]# cinder-manage service remove <binary> <host_name>
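Filled in for the two services listed above (note that the cinder-volume host carries the @ceph backend suffix):
[root@sh03-control-10e114e8e17 ~]# cinder-manage service remove cinder-backup sh03-compute-10e114e8e93
[root@sh03-control-10e114e8e17 ~]# cinder-manage service remove cinder-volume sh03-compute-10e114e8e93@ceph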
5. Clean up in Zabbix¶
Disable the host in Zabbix first; delete it later once the decommission is confirmed.
6. The infrastructure team takes the machine offline¶
Handled by the infrastructure team: reinstallation, repair tickets, and so on.
7. Delete the offlined machine from JumpServer¶
蔡杭狄 deletes the node in the JumpServer UI using an administrator account.
8. Clean up the hosts file of the deploy code¶
Either leave the entry in place with an explanatory note, or remove the offlined machine's entry outright. A sketch of the notation follows.
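For example, in the Ansible inventory used by the deploy code (the group name follows the ansible compute usage elsewhere in this document; the exact file layout is an assumption):
[compute]
# sh03-compute-10e114e8e93    # offlined 2019-01-11, entry kept as a record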
Resource Pool Online Procedure¶
Network Node Onboarding¶
1. Check the machine handed over by the infrastructure team¶
1) Check that the NTP service is running and the time is synchronized; run against the node
[root@hb02-other-172e28e8e143 deploy-v2]# ansible network -mshell -a'date'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible network -mshell -a'systemctl status ntpd'
2) Check that the hostname is correct
[root@hb02-other-172e28e8e143 deploy-v2]# ansible network -mshell -a'hostname'
When the time, the NTP service, and the hostname all check out, comment out the NTP-related tasks in the deployment playbook at this step, as shown below. Note that for future new resource pool deployments we likewise no longer manage NTP ourselves; we only verify that the time is in sync and the NTP service is healthy.
- hosts: network
tasks:
- name: Include system_init
import_role:
name: system_init
#- name: Include ntp
# import_role:
# name: ntp
- name: Include neutron_compute
import_role:
name: neutron_compute
vars:
vm_interface_tenant: "{{ network_openstack.tenant.interface.network }}"
vm_interface_public: "{{ network_openstack.public.interface.network }}"
2. Deploy with Ansible¶
1) vim the hosts file: comment out the other network nodes and note the scale-out date
2) Check the deployment code for anything that was commented out but should not be
[root@hb02-other-172e28e8e143 roles]# grep -rE '#' neutron_compute/
[root@hb02-other-172e28e8e143 deploy-v2]# grep -r '#' playbooks/network_extension.yml
3) Run the playbook
[root@hb02-other-172e28e8e143 deploy-v2]# ansible-playbook playbooks/network_extension.yml
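Optionally, before the real run, standard ansible-playbook flags can confirm that the playbook parses and that only the new node is targeted (a sketch, not part of the original procedure):
[root@hb02-other-172e28e8e143 deploy-v2]# ansible-playbook --syntax-check playbooks/network_extension.yml
[root@hb02-other-172e28e8e143 deploy-v2]# ansible-playbook --list-hosts playbooks/network_extension.yml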
3. Deployment succeeded; verify the services¶
Run the following command on the deploy machine; everything showing active (running) is normal
ansible network -mshell -a"systemctl status neutron-openvswitch-agent.service neutron-dhcp-agent.service neutron-metadata-agent.service neutron-l3-agent.service"
On a control node, run neutron agent-list | grep <hostname> to make sure all the neutron agents have registered
[root@hb02-other-172e28e8e132 ~]# neutron agent-list |grep hb02-other-172e28e8e136
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
| 08df9830-4b2d-4c8e-8539-6d14e6dcdea9 | DHCP agent | hb02-other-172e28e8e136 | nova | :-) | True | neutron-dhcp-agent |
| 289c7f10-bca8-45d7-aeb2-82fea06deeda | Metadata agent | hb02-other-172e28e8e136 | | :-) | True | neutron-metadata-agent |
| 4abaa367-6758-4a90-8688-fcf13493c24d | L3 agent | hb02-other-172e28e8e136 | nova | :-) | True | neutron-l3-agent |
| 5d6bc44d-1850-48b1-868b-b60821f2c39d | Open vSwitch agent | hb02-other-172e28e8e136 | | :-) | True | neutron-openvswitch-agent |
4. Notify the monitoring team to install zabbix-agent¶
5. The monitoring team checks the zabbix-server service and verifies that the dashboard has discovered this machine and that its status is normal¶
Run on the deploy machine
[root@hb02-other-172e28e8e143 deploy-v2]# ansible network -mshell -a'systemctl status zabbix-agent'
6. Make sure monitoring is normal in the Zabbix UI¶
In the web UI, check that the host has inherited the correct network template and that all monitoring items are collecting data and available.
7. Notify QA to run tests¶
Resource Pool Scale-Out Procedure¶
Compute Node Onboarding¶
1. Check the machine handed over by the infrastructure team¶
1) Check that the NTP service is running and the time is synchronized; run against the node
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'date'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status ntpd'
2) Check that the hostname is correct
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'hostname'
When the time, the NTP service, and the hostname all check out, comment out the NTP-related tasks in the deployment playbook at this step, as shown below. Note that for future new resource pool deployments we likewise no longer manage NTP ourselves; we only verify that the time is in sync and the NTP service is healthy.
---
- hosts: compute
tasks:
- name: Include system_init
import_role:
name: system_init
# No longer run from now on!!!
#- name: Include ntp
# import_role:
# name: ntp
- name: Include cinder_compute
import_role:
name: cinder_compute
- name: Include nova_compute
import_role:
name: nova_compute
# Not run for SDN scale-out
- name: Include neutron_compute
import_role:
name: neutron_compute
vars:
vm_interface_tenant: "{{ network_openstack.tenant.interface.compute }}"
vm_interface_public: "{{ network_openstack.public.interface.compute }}"
# SDN needs openvswitch installed
- name: Install openvswitch package
yum:
name: openvswitch
state: present
tags: openvswitch
- import_tasks: common-tasks/binding_manage_hosts.yml
2. Deploy with Ansible¶
1) vim the hosts file: comment out the other compute nodes and note the scale-out date
2) Check the deployment code for anything that was commented out but should not be
[root@hb02-other-172e28e8e143 roles]# grep -rE '#' nova_compute/
[root@hb02-other-172e28e8e143 deploy-v2]# grep -r '#' playbooks/compute_extension.yml
3) Run the playbook
[root@hb02-other-172e28e8e143 deploy-v2]# ansible-playbook playbooks/compute_extension.yml
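The same pre-flight check as in the network procedure applies here (a sketch using standard ansible-playbook flags):
[root@hb02-other-172e28e8e143 deploy-v2]# ansible-playbook --list-hosts playbooks/compute_extension.yml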
3. Deployment succeeded; verify the services¶
Run the following commands on the deploy machine; everything showing active (running) is normal
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status openstack-nova-compute.service'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status openstack-cinder-volume.service'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status openstack-cinder-backup.service'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status neutron-openvswitch-agent.service'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status neutron-dhcp-agent.service'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status neutron-metadata-agent.service'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status neutron-l3-agent.service'
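These can also be checked in a single pass, combining the commands above as the network procedure does:
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status openstack-nova-compute.service openstack-cinder-volume.service openstack-cinder-backup.service neutron-openvswitch-agent.service neutron-dhcp-agent.service neutron-metadata-agent.service neutron-l3-agent.service'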
On a control node, run nova service-list | grep <hostname> to make sure all the nova services have registered
[root@hb02-other-172e28e8e132 ~]# nova service-list|grep hb02-other-172e28e8e138
| 5fb75733-a421-492a-af08-6ba3c6793d8a | nova-compute | hb02-other-172e28e8e138 | public | enabled | up | 2019-10-29T09:01:43.000000 | - | False |
4. Notify the monitoring team to install zabbix-agent¶
5. Check the zabbix-server service, and verify that the dashboard has discovered this machine and that its status is normal¶
Run on the deploy machine
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status zabbix-agent'
6. Make sure monitoring is normal in the Zabbix UI¶
In the web UI, check that the host has inherited the correct network template and that all monitoring items are collecting data and available.
7. Add the hosts to the test-zone availability zone¶
1) Check whether test-zone already exists; create it if it does not
[root@sh03-control-10e114e8e24 ~]# openstack aggregate list
+----+------+-------------------+
| ID | Name | Availability Zone |
+----+------+-------------------+
| 4 | test | test-zone |
+----+------+-------------------+
# If it does not exist, create it
[root@hb02-other-172e28e8e132 ~]# openstack aggregate create --zone TEST-ZONE TEST
+-------------------+----------------------------+
| Field | Value |
+-------------------+----------------------------+
| availability_zone | test-zone |
| created_at | 2019-10-29T06:21:32.985989 |
| deleted | False |
| deleted_at | None |
| id | 5 |
| name | test |
| updated_at | None |
+-------------------+----------------------------+
2) Add the newly scaled-out hosts to the test-zone availability zone
# List the newly scaled-out hosts
[root@hb02-other-172e28e8e132 ~]# openstack host list |grep compute
+-------------------------+-------------+----------+
| Host Name | Service | Zone |
+-------------------------+-------------+----------+
| hb02-other-172e28e8e139 | compute | public |
| hb02-other-172e28e8e138 | compute | public |
+-------------------------+-------------+----------+
# Copy the new hosts' names into a new_compute file
[root@hb02-other-172e28e8e132 ~]# vim new_compute
hb02-other-172e28e8e139
hb02-other-172e28e8e138
# Use a for loop to add the hosts to the aggregate
[root@hb02-other-172e28e8e132 ~]# for i in `cat new_compute`;do openstack aggregate add host TEST $i;done
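Optionally verify the result; openstack aggregate show lists the aggregate's member hosts:
[root@hb02-other-172e28e8e132 ~]# openstack aggregate show TEST -c hosts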
8. Notify QA to run tests¶
9. After tests pass, move the hosts to the public zone¶
Removing the hosts from the TEST aggregate returns them to the default public zone:
[root@hb02-other-172e28e8e132 ~]# for i in `cat new_compute`;do openstack aggregate remove host TEST $i;done
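A quick check with the same command used in step 7 confirms the hosts are back in the public zone:
[root@hb02-other-172e28e8e132 ~]# openstack host list | grep compute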