
Procedure: Virtual Machine Evacuation

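Note: Evacuation should be used only on newer releases, because older releases contain bugs. Before using it, confirm that the resource pool version is no lower than 6.4.3.0.0.0. Known defects are listed on the linked page "5. Version Defect Records".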

When to evacuate

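When a compute node goes down because of hardware maintenance or another fault, its virtual machines can be evacuated to other compute nodes so that they are running again.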

Run nova service-list | grep sh03-compute-10e114e56e89 to check whether the nova-compute service on the node is in the down state. If it is, the evacuation commands can be used.
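The check above can be scripted so that the host name is matched exactly (a plain grep would also match any longer host name that begins with the same string). The following is a minimal sketch, not a tested tool: it assumes that the `nova service-list` table has columns named Binary, Host and State, which is the usual layout but is not shown on this page. The function name `compute_state` is made up for illustration.

```shell
#!/usr/bin/env bash
# Print the State value ("up" or "down") of nova-compute on one host.
# Reads the table printed by `nova service-list` from stdin.
# Column positions are taken from the header row (assumed column names:
# Binary, Host, State) instead of being hard-coded.
compute_state() {
    local host="$1"
    awk -F'|' -v host="$host" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Header row: remember which fields hold Binary, Host and State.
        !seen && /Binary/ {
            for (i = 1; i <= NF; i++) {
                name = trim($i)
                if (name == "Binary") bcol = i
                if (name == "Host")   hcol = i
                if (name == "State")  scol = i
            }
            seen = 1
            next
        }
        # Data rows: exact match on binary name and host name.
        seen && trim($bcol) == "nova-compute" && trim($hcol) == host {
            print trim($scol)
        }
    '
}

# Intended use on a control node:
#   nova service-list | compute_state sh03-compute-10e114e56e89
```

If the function prints down, the node qualifies for evacuation. If it prints nothing, the host name did not match any nova-compute row.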

How to evacuate

1) Before evacuating, migrate away the volumes managed by the failed host

If a volume attached to an instance being evacuated is managed by the failed host, the evacuation fails with an error.

Run the following command on a control node:

cinder --os-volume-api-version 3.33 list --filters host=sh03-compute-10e114e56e89 --all

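This lists the volumes managed by the failed host. If there are any, move them with the following commands, using a healthy node as the target.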

cinder-manage --config-file /etc/cinder/cinder.conf volume update_host --currenthost sh03-compute-10e114e56e89@ceph --newhost sh03-compute-10e114e56e141@ceph

cinder-manage --config-file /etc/cinder/cinder.conf backup update_backup_host --currenthost sh03-compute-10e114e56e89 --newhost sh03-compute-10e114e56e141

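Note: Double-check every argument in these commands, the host names in particular, because the commands do not report an error when an argument is wrong.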
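Because a mistyped host name is accepted silently, it can help to validate both names against a trusted list before running the command. The sketch below is one possible guard, under stated assumptions: the inventory file (one host name per line, in the same host@backend form used above) is hypothetical and has to be maintained by the operator, and the function name `safe_update_host_cmd` is invented for this example. It only prints the command and never runs it.

```shell
#!/usr/bin/env bash
# Print the `cinder-manage volume update_host` command only when both host
# names appear, as whole lines, in an inventory file.
# Args: <inventory-file> <current-host> <new-host>
# The inventory file is a hypothetical, operator-maintained list.
safe_update_host_cmd() {
    local inventory="$1" current="$2" new="$3" h
    for h in "$current" "$new"; do
        # -F: fixed string, -x: whole-line match, -q: quiet
        if ! grep -Fxq -- "$h" "$inventory"; then
            echo "unknown host: $h" >&2
            return 1
        fi
    done
    echo "cinder-manage --config-file /etc/cinder/cinder.conf volume update_host --currenthost $current --newhost $new"
}

# Intended use (review the printed command, then run it by hand):
#   safe_update_host_cmd hosts.txt sh03-compute-10e114e56e89@ceph sh03-compute-10e114e56e141@ceph
```

Printing rather than executing keeps a human review step in place, which matters here because a wrong host name cannot be detected afterwards from the command's output.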

2) Verify again that the failed host no longer manages any volumes

cinder --os-volume-api-version 3.33 list --filters host=sh03-compute-10e114e56e89 --all

Empty output means the volumes were moved successfully.
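The "empty output" check can also be expressed as a count, which is easier to use in a script. This is a rough sketch: it assumes that each volume appears as a table row whose first column is a UUID, and the function name `count_volume_rows` is illustrative only.

```shell
#!/usr/bin/env bash
# Count the table rows on stdin whose first column is a UUID.
# Header rows, border rows and empty input all count as zero.
count_volume_rows() {
    grep -Ec '^\| *[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12} *\|' || true
}

# Intended use on a control node:
#   cinder --os-volume-api-version 3.33 list --filters host=sh03-compute-10e114e56e89 --all | count_volume_rows
```

A result of 0 corresponds to the empty output described above. Any other number means some volumes are still managed by the failed host.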

3) Two ways to evacuate

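One is to evacuate a single instance with {{nova evacuate}}. The other is to evacuate every instance on a compute node with {{nova host-evacuate}}, which first queries the list of instances on that node and then runs the evacuate operation on each of them in a loop.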

[root@gd02-control-11e115e64e13 ~]# nova help evacuate
usage: nova evacuate [--password <password>] [--force] <server> [<host>]

Evacuate server from failed host.

Positional arguments:
  <server>               Name or ID of server.
  <host>                 Name or ID of the target host. If no host is
                         specified, the scheduler will choose one.

Optional arguments:
  --password <password>  Set the provided admin password on the evacuated
                         server. Not applicable if the server is on shared
                         storage.
  --force                Force to not verify the scheduler if a host is
                         provided. (Supported by API versions '2.29' -
                         '2.latest')
[root@gd02-control-11e115e64e13 ~]# nova help host-evacuate
usage: nova host-evacuate [--target_host <target_host>] [--force] <host>

Evacuate all instances from failed host.

Positional arguments:
  <host>                       The hypervisor hostname (or pattern) to search
                               for. WARNING: Use a fully qualified domain name
                               if you only want to evacuate from a specific
                               host.

Optional arguments:
  --target_host <target_host>  Name of target host. If no host is specified
                               the scheduler will select a target.
  --force                      Force to not verify the scheduler if a host is
                               provided. (Supported by API versions '2.29' -
                               '2.latest')
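To make the relationship between the two commands concrete, the loop that host-evacuate performs can be approximated by hand. The sketch below is not the client's actual implementation. It extracts server UUIDs from `nova list` table output and prints one `nova evacuate` command per server without running anything. The function names `server_ids` and `print_evacuate_cmds` are made up for this example.

```shell
#!/usr/bin/env bash
# Extract server UUIDs from the first column of a `nova list` table on stdin.
server_ids() {
    awk -F'|' '{
        id = $2
        gsub(/ /, "", id)
        # A UUID is 36 characters of hex digits and hyphens.
        if (length(id) == 36 && id ~ /^[0-9a-f]+(-[0-9a-f]+)+$/) print id
    }'
}

# Read one UUID per line from stdin and print (not run) an evacuate command
# for each. $1 is an optional target host; without it the scheduler chooses.
print_evacuate_cmds() {
    local target="${1:-}" id
    while read -r id; do
        echo "nova evacuate $id${target:+ $target}"
    done
}

# Intended use on a control node (dry run, prints the commands only):
#   nova list --all --host <failed-host> | server_ids | print_evacuate_cmds <target-host>
```

Working from the printed list makes it possible to spread instances over several target hosts, which a single host-evacuate call with --target_host does not do.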

Examples:

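a. Evacuating a single instance

In the transcript below, the first attempt is rejected because the compute service on the source host is still considered in use. Once nova hypervisor-list shows the source host as down, the same command succeeds.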

[root@hb02-other-172e28e8e132 ~]# nova evacuate ea24caf2-5e99-43f5-838f-164bb16891c0 hb02-other-172e28e8e139
ERROR (BadRequest): Compute service of hb02-other-172e28e8e137 is still in use. (HTTP 400) (Request-ID: req-683aee56-af5a-40eb-88c6-c18ebe4890df)
[root@hb02-other-172e28e8e132 ~]# nova hypervisor-list
+--------------------------------------+-------------------------+-------+---------+
| ID                                   | Hypervisor hostname     | State | Status  |
+--------------------------------------+-------------------------+-------+---------+
| 29571b62-cd25-42bf-8bee-79a60db54032 | hb02-other-172e28e8e137 | down  | enabled |
| 4f62f8d2-9fc5-4555-b362-8e99dc421181 | hb02-other-172e28e8e138 | up    | enabled |
| a1a2e14b-ffeb-4a56-895b-56969a1e4de6 | hb02-other-172e28e8e139 | up    | enabled |
+--------------------------------------+-------------------------+-------+---------+
[root@hb02-other-172e28e8e132 ~]# nova list --all --host hb02-other-172e28e8e137
+--------------------------------------+---------+----------------------------------+--------+------------+-------------+----------------------+
| ID                                   | Name    | Tenant ID                        | Status | Task State | Power State | Networks             |
+--------------------------------------+---------+----------------------------------+--------+------------+-------------+----------------------+
| ea24caf2-5e99-43f5-838f-164bb16891c0 | dawei_2 | 9e5b5032812940d0830fe674517d5f66 | ACTIVE | -          | Running     | test1=192.168.101.15 |
+--------------------------------------+---------+----------------------------------+--------+------------+-------------+----------------------+
[root@hb02-other-172e28e8e132 ~]# nova evacuate ea24caf2-5e99-43f5-838f-164bb16891c0 hb02-other-172e28e8e139
[root@hb02-other-172e28e8e132 ~]# nova list --all --host hb02-other-172e28e8e139
+--------------------------------------+---------+----------------------------------+--------+------------+-------------+----------------------+
| ID                                   | Name    | Tenant ID                        | Status | Task State | Power State | Networks             |
+--------------------------------------+---------+----------------------------------+--------+------------+-------------+----------------------+
| ea24caf2-5e99-43f5-838f-164bb16891c0 | dawei_2 | 9e5b5032812940d0830fe674517d5f66 | ACTIVE | -          | Running     | test1=192.168.101.15 |
+--------------------------------------+---------+----------------------------------+--------+------------+-------------+----------------------+

b. Batch evacuation

[root@sh03-control-10e114e56e42 ~]# nova  host-evacuate --target_host sh03-compute-10e114e56e141  sh03-compute-10e114e56e89 --force
+--------------------------------------+-------------------+---------------+
| Server UUID                          | Evacuate Accepted | Error Message |
+--------------------------------------+-------------------+---------------+
| 4df2296a-536e-44a3-afcd-3750408856ee | True              |               |
| b4244c8d-00fe-4811-8238-c194dbd55376 | True              |               |
| 84136335-f480-4208-97cb-6e8d8542210d | True              |               |
| d98e1e61-4046-4a6e-b9f8-f13c90500026 | True              |               |
| 99f81672-0e37-4e06-949b-16b3412a7388 | True              |               |
| 5d7c6262-ab95-4bc7-8405-774f982b183a | True              |               |
| 06523349-ce43-4304-b6b1-948d7c639f7b | True              |               |
| 45868bf9-0440-477f-ba42-8029a5f824fe | True              |               |
| 571a77a0-653d-4dca-8a97-c7e675c10fa5 | True              |               |
| 54215bcc-2ec6-4e5a-ba31-2b01dff2e0d2 | True              |               |
| 4dfcd1d8-1e7f-46f7-8dbb-e571ba54268e | True              |               |
| ff0d4f74-1c89-4bd9-ace8-f1986abb96a8 | True              |               |
| fa804181-1073-4c5b-a774-2e49188556c6 | True              |               |
| 44c4b9b9-65e3-4b12-9a7c-105ec950e835 | True              |               |
| a455c316-786c-485a-87e8-d481736d747a | True              |               |
| 30bedfc4-dd75-4288-b9f7-3e9488370d08 | True              |               |
| 78a83bfc-3038-41b6-9ec0-7883960db0bc | True              |               |
| 69aafa11-4cab-4b1e-bf62-b98fdfe059f3 | True              |               |
| 8e8fb657-9e1c-492e-bbd2-ce3bf940164a | True              |               |
| 6a66fad7-87c6-47c7-8cf2-1fff99ce43b2 | True              |               |
| 1a726189-a5f7-474f-9d9f-5c5587fade74 | True              |               |
+--------------------------------------+-------------------+---------------+
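With many instances, it is easy to overlook a row that was not accepted. The following sketch filters the result table for such rows. It assumes the three-column layout shown above (Server UUID, Evacuate Accepted, Error Message), and the function name `rejected_servers` is illustrative. Note that, as far as the column name suggests, True only means the request was accepted, so each instance should still be checked afterwards with nova list.

```shell
#!/usr/bin/env bash
# Print the UUID and error message of every row in a `nova host-evacuate`
# result table (on stdin) whose "Evacuate Accepted" value is not True.
rejected_servers() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Data and header rows have at least 4 fields; border rows have 1.
        NF >= 4 {
            id = trim($2)
            accepted = trim($3)
            if (id != "Server UUID" && accepted != "True") print id, trim($4)
        }
    '
}

# Intended use on a control node:
#   nova host-evacuate --target_host <target-host> <failed-host> | rejected_servers
```

No output means every request in the table was accepted.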

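Note: The command above evacuates every instance on the failed host to the single target node, so first confirm that the target has enough resources, in particular vCPUs and memory. If it does not, evacuate some instances manually to other nodes first and then evacuate the rest of the host.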
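The capacity assessment can be reduced to simple arithmetic once the numbers are known. The sketch below is a simplification and makes several assumptions: the totals for the failed host's instances are summed by the operator beforehand, the target's figures come from something like nova hypervisor-show (fields such as vcpus, vcpus_used and free_ram_mb), and the CPU overcommit ratio is given as a whole number. The function name `fits_on_target` is hypothetical, and the real scheduler applies further filters that this check ignores.

```shell
#!/usr/bin/env bash
# Return 0 if the target host can take the extra load, 1 otherwise.
# Args: needed_vcpus needed_ram_mb total_vcpus used_vcpus free_ram_mb [cpu_ratio]
# cpu_ratio is an integer CPU overcommit ratio (default 1, no overcommit).
fits_on_target() {
    local need_cpu="$1" need_ram="$2" total_cpu="$3" used_cpu="$4" free_ram="$5"
    local ratio="${6:-1}"
    # vCPUs still available on the target after applying the overcommit ratio.
    local free_cpu=$(( total_cpu * ratio - used_cpu ))
    [ "$need_cpu" -le "$free_cpu" ] && [ "$need_ram" -le "$free_ram" ]
}

# Example: instances needing 8 vCPUs and 16384 MB in total, target with
# 32 vCPUs (20 used) and 32768 MB of free RAM:
#   fits_on_target 8 16384 32 20 32768 && echo "target has room"
```

When the check fails, part of the load has to go to other nodes first, as described in the note above.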

Summary

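  • evacuate in effect uses the instance information in the database to create an identical instance on another host. Note that evacuation is only meaningful with shared storage: with local disks, the disk data cannot be moved once the physical host is down.
  • The nova client provides two commands, {{nova evacuate}} and {{nova host-evacuate}}; the latter runs evacuate on each server in a loop.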