Orphaned resource allocations

Problem

There can be orphaned resource allocations in the placement service which can cause resource providers to:

  • Appear to the scheduler to be more utilized than they really are

  • Prevent deletion of compute services

One scenario in which this can happen is when a compute service host has problems, so the administrator forces it down and evacuates servers from it. Note that in this case "evacuates" refers to the server evacuate action, not live migrating all servers from the running compute service. Assume the compute host is down and fenced.
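
The force-down and evacuate steps can be sketched as a small shell helper (a hedged sketch; the function name is hypothetical, and the legacy nova CLI is assumed to be installed and configured):

```shell
# Hypothetical helper sketching the force-down-and-evacuate flow.
# Assumes the legacy nova CLI is available and configured.
force_down_and_evacuate() {
    local host="$1" server="$2"
    # Mark the compute service on the failed host as forced down
    nova service-force-down "$host" nova-compute
    # Rebuild the server on another host (the server evacuate action)
    nova evacuate "$server"
}
```

For example, `force_down_and_evacuate devstack1 vm1`.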

In this case, the servers have allocations tracked in placement against both the down source compute node and their current destination compute host. For example, here is a server vm1 which has been evacuated from node devstack1 to node devstack2:

$ openstack --os-compute-api-version 2.53 compute service list --service nova-compute
+--------------------------------------+--------------+-----------+------+---------+-------+----------------------------+
| ID | Binary | Host | Zone | Status | State | Updated At |
+--------------------------------------+--------------+-----------+------+---------+-------+----------------------------+
| e3c18c2d-9488-4863-b728-f3f292ec5da8 | nova-compute | devstack1 | nova | enabled | down | 2019-10-25T20:13:51.000000 |
| 50a20add-cc49-46bd-af96-9bb4e9247398 | nova-compute | devstack2 | nova | enabled | up | 2019-10-25T20:13:52.000000 |
| b92afb2e-cd00-4074-803e-fff9aa379c2f | nova-compute | devstack3 | nova | enabled | up | 2019-10-25T20:13:53.000000 |
+--------------------------------------+--------------+-----------+------+---------+-------+----------------------------+
$ vm1=$(openstack server show vm1 -f value -c id)
$ openstack server show $vm1 -f value -c OS-EXT-SRV-ATTR:host
devstack2

The server now has allocations against both devstack1 and devstack2 resource providers in the placement service:

$ devstack1=$(openstack resource provider list --name devstack1 -f value -c uuid)
$ devstack2=$(openstack resource provider list --name devstack2 -f value -c uuid)
$ openstack resource provider show --allocations $devstack1
+-------------+-----------------------------------------------------------------------------------------------------------+
| Field | Value |
+-------------+-----------------------------------------------------------------------------------------------------------+
| uuid | 9546fce4-9fb5-4b35-b277-72ff125ad787 |
| name | devstack1 |
| generation | 6 |
| allocations | {u'a1e6e0b2-9028-4166-b79b-c177ff70fbb7': {u'resources': {u'VCPU': 1, u'MEMORY_MB': 512, u'DISK_GB': 1}}} |
+-------------+-----------------------------------------------------------------------------------------------------------+
$ openstack resource provider show --allocations $devstack2
+-------------+-----------------------------------------------------------------------------------------------------------+
| Field | Value |
+-------------+-----------------------------------------------------------------------------------------------------------+
| uuid | 52d0182d-d466-4210-8f0d-29466bb54feb |
| name | devstack2 |
| generation | 3 |
| allocations | {u'a1e6e0b2-9028-4166-b79b-c177ff70fbb7': {u'resources': {u'VCPU': 1, u'MEMORY_MB': 512, u'DISK_GB': 1}}} |
+-------------+-----------------------------------------------------------------------------------------------------------+
$ openstack --os-placement-api-version 1.12 resource provider allocation show $vm1
+--------------------------------------+------------+------------------------------------------------+----------------------------------+----------------------------------+
| resource_provider | generation | resources | project_id | user_id |
+--------------------------------------+------------+------------------------------------------------+----------------------------------+----------------------------------+
| 9546fce4-9fb5-4b35-b277-72ff125ad787 | 6 | {u'VCPU': 1, u'MEMORY_MB': 512, u'DISK_GB': 1} | 2f3bffc5db2b47deb40808a4ed2d7c7a | 2206168427c54d92ae2b2572bb0da9af |
| 52d0182d-d466-4210-8f0d-29466bb54feb | 3 | {u'VCPU': 1, u'MEMORY_MB': 512, u'DISK_GB': 1} | 2f3bffc5db2b47deb40808a4ed2d7c7a | 2206168427c54d92ae2b2572bb0da9af |
+--------------------------------------+------------+------------------------------------------------+----------------------------------+----------------------------------+

One way to find all servers that were evacuated from devstack1 is:

$ nova migration-list --source-compute devstack1 --migration-type evacuation
+----+--------------------------------------+-------------+-----------+----------------+--------------+-------------+--------+--------------------------------------+------------+------------+----------------------------+----------------------------+------------+
| Id | UUID | Source Node | Dest Node | Source Compute | Dest Compute | Dest Host | Status | Instance UUID | Old Flavor | New Flavor | Created At | Updated At | Type |
+----+--------------------------------------+-------------+-----------+----------------+--------------+-------------+--------+--------------------------------------+------------+------------+----------------------------+----------------------------+------------+
| 1 | 8a823ba3-e2e9-4f17-bac5-88ceea496b99 | devstack1 | devstack2 | devstack1 | devstack2 | 192.168.0.1 | done | a1e6e0b2-9028-4166-b79b-c177ff70fbb7 | None | None | 2019-10-25T17:46:35.000000 | 2019-10-25T17:46:37.000000 | evacuation |
+----+--------------------------------------+-------------+-----------+----------------+--------------+-------------+--------+--------------------------------------+------------+------------+----------------------------+----------------------------+------------+
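
These pieces can be combined into a per-host cleanup loop (a sketch, assuming osc-placement 1.8.0 or newer so that the allocation unset command described below is available; the function name and the awk-based table parsing are assumptions):

```shell
# Hypothetical helper: remove source-host allocations for every server
# with a completed evacuation away from the given compute host.
# Assumes osc-placement >= 1.8.0 ("resource provider allocation unset").
cleanup_evacuated_allocations() {
    local source_host="$1"
    local provider server
    provider=$(openstack resource provider list --name "$source_host" -f value -c uuid)
    # Instance UUID is awk field $10 because field $1 (before the first "|") is empty.
    for server in $(nova migration-list --source-compute "$source_host" \
            --migration-type evacuation |
        awk -F'|' '$9 ~ /done/ {gsub(/ /, "", $10); print $10}'); do
        openstack --os-placement-api-version 1.12 resource provider allocation \
            unset --provider "$provider" "$server"
    done
}
```

For example, `cleanup_evacuated_allocations devstack1` would unset the devstack1 allocations of each evacuated server in turn.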

Trying to delete the resource provider for devstack1 will fail while there are allocations against it:

$ openstack resource provider delete $devstack1
Unable to delete resource provider 9546fce4-9fb5-4b35-b277-72ff125ad787: Resource provider has allocations. (HTTP 409)

Solution

Using the example resources above, remove the allocation for server vm1 from the devstack1 resource provider. If you have osc-placement 1.8.0 or newer, you can use the openstack resource provider allocation unset command to remove the allocations for consumer vm1 from resource provider devstack1:

$ openstack --os-placement-api-version 1.12 resource provider allocation \
    unset --provider $devstack1 $vm1
+--------------------------------------+------------+------------------------------------------------+----------------------------------+----------------------------------+
| resource_provider | generation | resources | project_id | user_id |
+--------------------------------------+------------+------------------------------------------------+----------------------------------+----------------------------------+
| 52d0182d-d466-4210-8f0d-29466bb54feb | 4 | {u'VCPU': 1, u'MEMORY_MB': 512, u'DISK_GB': 1} | 2f3bffc5db2b47deb40808a4ed2d7c7a | 2206168427c54d92ae2b2572bb0da9af |
+--------------------------------------+------------+------------------------------------------------+----------------------------------+----------------------------------+

If you have osc-placement 1.7.x or older, the unset command is not available and you must instead overwrite the allocations. Note that we do not use openstack resource provider allocation delete here because that will remove the allocations for the server from all resource providers, including devstack2 where it is now running; instead, we use openstack resource provider allocation set to overwrite the allocations and only retain the devstack2 provider allocations. If you do remove all allocations for a given server, you can heal them later. See Using heal_allocations for details.

$ openstack --os-placement-api-version 1.12 resource provider allocation set $vm1 \
    --project-id 2f3bffc5db2b47deb40808a4ed2d7c7a \
    --user-id 2206168427c54d92ae2b2572bb0da9af \
    --allocation rp=52d0182d-d466-4210-8f0d-29466bb54feb,VCPU=1 \
    --allocation rp=52d0182d-d466-4210-8f0d-29466bb54feb,MEMORY_MB=512 \
    --allocation rp=52d0182d-d466-4210-8f0d-29466bb54feb,DISK_GB=1
+--------------------------------------+------------+------------------------------------------------+----------------------------------+----------------------------------+
| resource_provider | generation | resources | project_id | user_id |
+--------------------------------------+------------+------------------------------------------------+----------------------------------+----------------------------------+
| 52d0182d-d466-4210-8f0d-29466bb54feb | 4 | {u'VCPU': 1, u'MEMORY_MB': 512, u'DISK_GB': 1} | 2f3bffc5db2b47deb40808a4ed2d7c7a | 2206168427c54d92ae2b2572bb0da9af |
+--------------------------------------+------------+------------------------------------------------+----------------------------------+----------------------------------+

Once the devstack1 resource provider allocations have been removed using either of the approaches above, the devstack1 resource provider can be deleted:

$ openstack resource provider delete $devstack1

And the related compute service if desired:

$ openstack --os-compute-api-version 2.53 compute service delete e3c18c2d-9488-4863-b728-f3f292ec5da8

For more details on the resource provider commands used in this guide, refer to the osc-placement plugin documentation.

Using heal_allocations

If you have a particularly troublesome allocation consumer and just want to delete its allocations from all providers, you can use the openstack resource provider allocation delete command and then heal the allocations for the consumer using the heal_allocations command. For example:

$ openstack resource provider allocation delete $vm1
$ nova-manage placement heal_allocations --verbose --instance $vm1
Looking for instances in cell: 04879596-d893-401c-b2a6-3d3aa096089d(cell1)
Found 1 candidate instances.
Successfully created allocations for instance a1e6e0b2-9028-4166-b79b-c177ff70fbb7.
Processed 1 instances.
$ openstack resource provider allocation show $vm1
+--------------------------------------+------------+------------------------------------------------+
| resource_provider | generation | resources |
+--------------------------------------+------------+------------------------------------------------+
| 52d0182d-d466-4210-8f0d-29466bb54feb | 5 | {u'VCPU': 1, u'MEMORY_MB': 512, u'DISK_GB': 1} |
+--------------------------------------+------------+------------------------------------------------+

Note that deleting allocations and then relying on heal_allocations may not always be the best solution since healing allocations does not account for some things:

  • Migration-based allocations would be lost if manually deleted during a resize. These are allocations tracked by the migration resource record on the source compute service during a migration.

  • Healing allocations only partially supports nested allocations. Nested allocations due to Neutron ports having QoS policies are supported since the 20.0.0 (Train) release, but nested allocations due to vGPU or Cyborg device profile requests in the flavor are not supported. Also, if you are using provider.yaml files on compute hosts to define additional resources, and those resources are defined on child resource providers, then instances using such resources are not supported.

If you do use the heal_allocations command to clean up allocations for a specific trouble instance, it is recommended to take note of what the allocations were before you remove them, in case you need to reset them manually later. Use the openstack resource provider allocation show command to get the allocations for a consumer before deleting them, e.g.:

$ openstack --os-placement-api-version 1.12 resource provider allocation show $vm1
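
The allocations can also be captured to a file before deletion (a sketch; the function name and the choice of JSON output format are assumptions):

```shell
# Hypothetical helper: save a consumer's current allocations to a file
# before deleting them, in case they need to be restored manually later.
backup_allocations() {
    local consumer="$1" outfile="$2"
    openstack --os-placement-api-version 1.12 resource provider allocation show \
        "$consumer" -f json > "$outfile"
}
```

For example, `backup_allocations $vm1 vm1-allocations.json` before running the delete command.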

Using placement audit

If orphaned allocations exist for an instance that was deleted in the past, you may see a log message such as:

Instance <uuid> has allocations against this compute host but is not found in the database.

In this case, you can use the nova-manage placement audit tool to find and optionally delete orphaned placement allocations. This tool calls the placement API to modify allocations.

To list all allocations that are unrelated to an existing instance or migration UUID:

$ nova-manage placement audit --verbose

To delete all allocations on all resource providers that are unrelated to an existing instance or migration UUID:

$ nova-manage placement audit --verbose --delete

To delete all allocations on a specific resource provider that are unrelated to an existing instance or migration UUID:

$ nova-manage placement audit --verbose --delete --resource-provider <uuid>