Process: Resource Pool Online/Offline Procedures¶
Resource Pool Node Offline Procedure¶
Prerequisites¶
1. Ensure that no VMs are still running on the node to be taken offline
① If the node is healthy and is being taken offline for business or other reasons, migrate the VMs on this node to other machines before moving on to the next step.
Migration commands:
Live migration:
[root@hb02-other-172e28e8e132 ~]# openstack server migrate --live hb02-other-172e28e8e137 ea24caf2-5e99-43f5-838f-164bb16891c0
Cold migration:
[root@hb02-other-172e28e8e132 ~]# nova migrate --host hb02-other-172e28e8e138 ea24caf2-5e99-43f5-838f-164bb16891c0
[root@hb02-other-172e28e8e132 ~]# nova resize-confirm ea24caf2-5e99-43f5-838f-164bb16891c0
② If the node is faulty, make sure the VMs, disks, and other resources are running normally on other machines before taking the node offline. A verification sketch follows.
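Before moving on, it helps to confirm that the node is really empty and that each migrated VM has landed elsewhere. A minimal sketch using standard openstack CLI commands (<current_host> is a placeholder; the VM ID is taken from the examples above):
# List all VMs still scheduled on the node; the output should be empty
openstack server list --all-projects --host <current_host>
# Confirm a migrated VM's current host and status
openstack server show ea24caf2-5e99-43f5-838f-164bb16891c0 -c OS-EXT-SRV-ATTR:host -c status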
2. Migrate the volumes managed by cinder-volume
1) Find the volumes managed by cinder-volume on the node being taken offline
cinder --os-volume-api-version 3.33 list --all --filters host=<current_host>
2) Migrate the cinder-managed volumes off the compute node
cinder-manage --config-file /etc/cinder/cinder.conf volume update_host --currenthost <current_host@storage_backend> --newhost <new_host@storage_backend>
The host values can also be looked up with "cinder service-list"
usage: cinder-manage [-h] [--config-dir DIR] [--config-file PATH] [--debug]
[--log-config-append PATH]
[--log-date-format DATE_FORMAT] [--log-dir LOG_DIR]
[--log-file PATH] [--nodebug] [--nouse-journal]
[--nouse-json] [--nouse-syslog] [--nowatch-log-file]
[--state_path STATE_PATH]
[--syslog-log-facility SYSLOG_LOG_FACILITY]
[--use-journal] [--use-json] [--use-syslog] [--version]
[--watch-log-file]
{backup,cg,cluster,config,db,host,logs,service,shell,version,volume}
...
usage: cinder-manage volume [-h] {delete,update_host} ...
usage: cinder-manage volume update_host [-h] --currenthost CURRENTHOST
--newhost NEWHOST
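A hypothetical invocation, filled in with the ceph backend used elsewhere in this document (sh03-compute-10e114e8e94 is an illustrative target host, not one taken from this document):
cinder-manage --config-file /etc/cinder/cinder.conf volume update_host --currenthost sh03-compute-10e114e8e93@ceph --newhost sh03-compute-10e114e8e94@ceph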
Procedure¶
1. Stop the OpenStack services first¶
Log in to the physical node.
Run systemctl stop openstack-nova-compute openstack-cinder-backup openstack-cinder-volume neutron-dhcp-agent neutron-l3-agent neutron-metadata-agent neutron-openvswitch-agent to make sure all of the nova, cinder, and neutron services are stopped.
Run systemctl disable openstack-nova-compute openstack-cinder-backup openstack-cinder-volume neutron-dhcp-agent neutron-l3-agent neutron-metadata-agent neutron-openvswitch-agent to keep them from starting at boot.
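As a quick check before continuing, a sketch that loops over the same service list and prints each unit's state (systemctl is-active prints inactive for stopped units):
for svc in openstack-nova-compute openstack-cinder-backup openstack-cinder-volume neutron-dhcp-agent neutron-l3-agent neutron-metadata-agent neutron-openvswitch-agent; do systemctl is-active $svc; done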
2. Clean up the nova service¶
[root@sh03-control-10e114e8e17 ~]# nova service-list | grep sh03-compute-10e114e8e93
| 1f9c32ad-3543-425c-81d5-4ecf34384d87 | nova-compute | sh03-compute-10e114e8e93 | public | disable | down | 2019-01-11T08:36:35.000000 | - | False |
[root@sh03-control-10e114e8e17 ~]# nova service-delete 1f9c32ad-3543-425c-81d5-4ecf34384d87
3. Clean up the neutron agents (delete all 4)¶
[root@sh03-control-10e114e8e17 ~]# neutron agent-list | grep sh03-compute-10e114e8e93
| 0e92b0e4-d838-4f68-a888-6994843c1093 | L3 agent | sh03-compute-10e114e8e93 | nova | XXX | True | neutron-l3-agent |
| 30644de4-3f07-4429-a7be-d5fdfe32ff52 | DHCP agent | sh03-compute-10e114e8e93 | nova | XXX | True | neutron-dhcp-agent |
| 45e23090-fde6-478a-8609-a92b6dd09c97 | Metadata agent | sh03-compute-10e114e8e93 | | XXX | True | neutron-metadata-agent |
| b310ff58-56e7-481f-928e-259bc80b6532 | Open vSwitch agent | sh03-compute-10e114e8e93 | | XXX | True | neutron-openvswitch-agent |
[root@sh03-control-10e114e8e17 ~]# neutron agent-delete $id
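To delete all four agents in one pass, a sketch that extracts the ID column from the table above (verify the ID list before running it):
[root@sh03-control-10e114e8e17 ~]# for id in $(neutron agent-list | grep sh03-compute-10e114e8e93 | awk '{print $2}'); do neutron agent-delete $id; done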
4. Clean up the cinder services¶
[root@sh03-control-10e114e8e17 ~]# cinder service-list |grep sh03-compute-10e114e8e93
| cinder-backup | sh03-compute-10e114e8e93 | nova | disable | down | 2019-01-11T09:02:38.000000 | - |
| cinder-volume | sh03-compute-10e114e8e93@ceph | nova | disable | down | 2019-01-11T09:02:41.000000 | - |
[root@sh03-control-10e114e8e17 ~]# cinder-manage service remove <binary> <host_name>
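Filled in for the two services listed above (note that the cinder-volume host carries the @ceph backend suffix):
[root@sh03-control-10e114e8e17 ~]# cinder-manage service remove cinder-backup sh03-compute-10e114e8e93
[root@sh03-control-10e114e8e17 ~]# cinder-manage service remove cinder-volume sh03-compute-10e114e8e93@ceph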
5. Clean up in Zabbix¶
Disable the host in Zabbix first; delete it later once the decommission is confirmed.
6. The infrastructure team takes the machine offline¶
Handled by the infrastructure team: reinstallation, repair tickets, and so on.
7. Delete the offlined machine from JumpServer¶
蔡杭狄 deletes the node in the JumpServer UI using an administrator account.
8. Clean up the hosts file of the deploy code¶
Either leave the entry in place with an explanatory note, or remove the offlined machine's entry outright. A sketch of the notation follows.
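For example, in the Ansible inventory used by the deploy code (the group name follows the ansible compute usage elsewhere in this document; the exact file layout is an assumption):
[compute]
# sh03-compute-10e114e8e93    # offlined 2019-01-11, entry kept as a record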
Resource Pool Online Procedure¶
Network Node Onboarding¶
1. Check the machine handed over by the infrastructure team¶
1) Check that the NTP service is running and the time is synchronized; run against the node
[root@hb02-other-172e28e8e143 deploy-v2]# ansible network -mshell -a'date'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible network -mshell -a'systemctl status ntpd'
2) Check that the hostname is correct
[root@hb02-other-172e28e8e143 deploy-v2]# ansible network -mshell -a'hostname'
When the time, the NTP service, and the hostname all check out, comment out the NTP-related tasks in the deployment playbook at this step, as shown below. Note that for future new resource pool deployments we likewise no longer manage NTP ourselves; we only verify that the time is in sync and the NTP service is healthy.
- hosts: network
tasks:
- name: Include system_init
import_role:
name: system_init
#- name: Include ntp
# import_role:
# name: ntp
- name: Include neutron_compute
import_role:
name: neutron_compute
vars:
vm_interface_tenant: "{{ network_openstack.tenant.interface.network }}"
vm_interface_public: "{{ network_openstack.public.interface.network }}"
2. Deploy with Ansible¶
1) vim the hosts file: comment out the other network nodes and note the scale-out date
2) Check the deployment code for anything that was commented out but should not be
[root@hb02-other-172e28e8e143 roles]# grep -rE '#' neutron_compute/
[root@hb02-other-172e28e8e143 deploy-v2]# grep -r '#' playbooks/network_extension.yml
3) Run the playbook
[root@hb02-other-172e28e8e143 deploy-v2]# ansible-playbook playbooks/network_extension.yml
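Optionally, before the real run, standard ansible-playbook flags can confirm that the playbook parses and that only the new node is targeted (a sketch, not part of the original procedure):
[root@hb02-other-172e28e8e143 deploy-v2]# ansible-playbook --syntax-check playbooks/network_extension.yml
[root@hb02-other-172e28e8e143 deploy-v2]# ansible-playbook --list-hosts playbooks/network_extension.yml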
3. Deployment succeeded; verify the services¶
Run the following command on the deploy machine; everything showing active (running) is normal
ansible network -mshell -a"systemctl status neutron-openvswitch-agent.service neutron-dhcp-agent.service neutron-metadata-agent.service neutron-l3-agent.service"
On a control node, run neutron agent-list | grep <hostname> to make sure all the neutron agents have registered
[root@hb02-other-172e28e8e132 ~]# neutron agent-list |grep hb02-other-172e28e8e136
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
| 08df9830-4b2d-4c8e-8539-6d14e6dcdea9 | DHCP agent | hb02-other-172e28e8e136 | nova | :-) | True | neutron-dhcp-agent |
| 289c7f10-bca8-45d7-aeb2-82fea06deeda | Metadata agent | hb02-other-172e28e8e136 | | :-) | True | neutron-metadata-agent |
| 4abaa367-6758-4a90-8688-fcf13493c24d | L3 agent | hb02-other-172e28e8e136 | nova | :-) | True | neutron-l3-agent |
| 5d6bc44d-1850-48b1-868b-b60821f2c39d | Open vSwitch agent | hb02-other-172e28e8e136 | | :-) | True | neutron-openvswitch-agent |
4. Notify the monitoring team to install zabbix-agent¶
5. The monitoring team checks the zabbix-server service and verifies that the dashboard has discovered this machine and that its status is normal¶
Run on the deploy machine
[root@hb02-other-172e28e8e143 deploy-v2]# ansible network -mshell -a'systemctl status zabbix-agent'
6. Make sure monitoring is normal in the Zabbix UI¶
In the web UI, check that the host has inherited the correct network template and that all monitoring items are collecting data and available.
7. Notify QA to run tests¶
Resource Pool Scale-Out Procedure¶
Compute Node Onboarding¶
1. Check the machine handed over by the infrastructure team¶
1) Check that the NTP service is running and the time is synchronized; run against the node
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'date'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status ntpd'
2) Check that the hostname is correct
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'hostname'
When the time, the NTP service, and the hostname all check out, comment out the NTP-related tasks in the deployment playbook at this step, as shown below. Note that for future new resource pool deployments we likewise no longer manage NTP ourselves; we only verify that the time is in sync and the NTP service is healthy.
---
- hosts: compute
tasks:
- name: Include system_init
import_role:
name: system_init
# No longer run from now on!!!
#- name: Include ntp
# import_role:
# name: ntp
- name: Include cinder_compute
import_role:
name: cinder_compute
- name: Include nova_compute
import_role:
name: nova_compute
# Not run for SDN scale-out
- name: Include neutron_compute
import_role:
name: neutron_compute
vars:
vm_interface_tenant: "{{ network_openstack.tenant.interface.compute }}"
vm_interface_public: "{{ network_openstack.public.interface.compute }}"
# SDN needs openvswitch installed
- name: Install openvswitch package
yum:
name: openvswitch
state: present
tags: openvswitch
- import_tasks: common-tasks/binding_manage_hosts.yml
2. Deploy with Ansible¶
1) vim the hosts file: comment out the other compute nodes and note the scale-out date
2) Check the deployment code for anything that was commented out but should not be
[root@hb02-other-172e28e8e143 roles]# grep -rE '#' nova_compute/
[root@hb02-other-172e28e8e143 deploy-v2]# grep -r '#' playbooks/compute_extension.yml
3) Run the playbook
[root@hb02-other-172e28e8e143 deploy-v2]# ansible-playbook playbooks/compute_extension.yml
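The same pre-flight check as in the network procedure applies here (a sketch using standard ansible-playbook flags):
[root@hb02-other-172e28e8e143 deploy-v2]# ansible-playbook --list-hosts playbooks/compute_extension.yml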
3. Deployment succeeded; verify the services¶
Run the following commands on the deploy machine; everything showing active (running) is normal
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status openstack-nova-compute.service'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status openstack-cinder-volume.service'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status openstack-cinder-backup.service'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status neutron-openvswitch-agent.service'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status neutron-dhcp-agent.service'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status neutron-metadata-agent.service'
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status neutron-l3-agent.service'
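These can also be checked in a single pass, combining the commands above as the network procedure does:
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status openstack-nova-compute.service openstack-cinder-volume.service openstack-cinder-backup.service neutron-openvswitch-agent.service neutron-dhcp-agent.service neutron-metadata-agent.service neutron-l3-agent.service'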
On a control node, run nova service-list | grep <hostname> to make sure all the nova services have registered
[root@hb02-other-172e28e8e132 ~]# nova service-list|grep hb02-other-172e28e8e138
| 5fb75733-a421-492a-af08-6ba3c6793d8a | nova-compute | hb02-other-172e28e8e138 | public | enabled | up | 2019-10-29T09:01:43.000000 | - | False |
4. Notify the monitoring team to install zabbix-agent¶
5. Check the zabbix-server service, and verify that the dashboard has discovered this machine and that its status is normal¶
Run on the deploy machine
[root@hb02-other-172e28e8e143 deploy-v2]# ansible compute -mshell -a'systemctl status zabbix-agent'
6. Make sure monitoring is normal in the Zabbix UI¶
In the web UI, check that the host has inherited the correct network template and that all monitoring items are collecting data and available.
7. Add the hosts to the test-zone availability zone¶
1) Check whether test-zone already exists; create it if it does not
[root@sh03-control-10e114e8e24 ~]# openstack aggregate list
+----+------+-------------------+
| ID | Name | Availability Zone |
+----+------+-------------------+
| 4 | test | test-zone |
+----+------+-------------------+
# If it does not exist, create it
[root@hb02-other-172e28e8e132 ~]# openstack aggregate create --zone TEST-ZONE TEST
+-------------------+----------------------------+
| Field | Value |
+-------------------+----------------------------+
| availability_zone | test-zone |
| created_at | 2019-10-29T06:21:32.985989 |
| deleted | False |
| deleted_at | None |
| id | 5 |
| name | test |
| updated_at | None |
+-------------------+----------------------------+
2) Add the newly scaled-out hosts to the test-zone availability zone
# List the newly scaled-out hosts
[root@hb02-other-172e28e8e132 ~]# openstack host list |grep compute
+-------------------------+-------------+----------+
| Host Name | Service | Zone |
+-------------------------+-------------+----------+
| hb02-other-172e28e8e139 | compute | public |
| hb02-other-172e28e8e138 | compute | public |
+-------------------------+-------------+----------+
# Copy the new hosts' names into a new_compute file
[root@hb02-other-172e28e8e132 ~]# vim new_compute
hb02-other-172e28e8e139
hb02-other-172e28e8e138
# Use a for loop to add the hosts to the aggregate
[root@hb02-other-172e28e8e132 ~]# for i in `cat new_compute`;do openstack aggregate add host TEST $i;done
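Optionally verify the result; openstack aggregate show lists the aggregate's member hosts:
[root@hb02-other-172e28e8e132 ~]# openstack aggregate show TEST -c hosts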
8. Notify QA to run tests¶
9. After tests pass, move the hosts to the public zone¶
Removing the hosts from the TEST aggregate returns them to the default public zone:
[root@hb02-other-172e28e8e132 ~]# for i in `cat new_compute`;do openstack aggregate remove host TEST $i;done
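A quick check with the same command used in step 7 confirms the hosts are back in the public zone:
[root@hb02-other-172e28e8e132 ~]# openstack host list | grep compute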