d7b2dcf66f16a1295613ad5fc1abb8e6c2ad66ee
1177 Commits

Author SHA1 Message Date
Dmitry Tantsur
d7b2dcf66f Trivial: fix variable in formatting
Change-Id: I6af5e6d2c4781c24345d456cec4d77c364ae2da5
2024年09月18日 13:35:07 +02:00
Zuul
ab99f36baa Merge "Check for the existence of an IPMI device" 2024年09月09日 16:44:27 +00:00
cid
2d79eae382 Check for the existence of an IPMI device
Check for IPMI device files before the use of the `'ipmitool lan.*'`
command, avoiding unnecessary calls on non-IPMI systems.
Closes-Bug: #2076367
Change-Id: Ib800717701e6f2828df55a0da0e999fc014c12e1
2024年09月05日 20:48:07 +01:00
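
A minimal sketch of the idea in 2d79eae382, not the actual patch: look for an IPMI device node before invoking `ipmitool`. The device path list and the `run_ipmitool` callable are assumptions made for illustration.

```python
import os

# Device nodes that ipmitool commonly looks for (assumed list).
IPMI_DEVICE_PATHS = ("/dev/ipmi0", "/dev/ipmi/0", "/dev/ipmidev/0")


def ipmi_device_present(paths=IPMI_DEVICE_PATHS, exists=os.path.exists):
    """Return True if any candidate IPMI device node exists."""
    return any(exists(path) for path in paths)


def get_bmc_lan_info(run_ipmitool, exists=os.path.exists):
    """Call ipmitool only when an IPMI device is available.

    run_ipmitool is a hypothetical callable wrapping an
    `ipmitool lan ...` invocation.
    """
    if not ipmi_device_present(exists=exists):
        return None  # no device: skip the subprocess call entirely
    return run_ipmitool()
```

Passing `exists` as a parameter keeps the check easy to exercise in unit tests without touching `/dev`.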
Jay Faulkner
e303a369dc Inspect non-raw images for safety
When IPA gets a non-raw image, it performs an on-the-fly conversion
using qemu-img convert, as well as running qemu-img frequently to get
basic information about the image before validating it.
Now, we ensure that before any qemu-img calls are made, we have
inspected the image for safety and pass through the detected format.
If given a disk_format=raw image and image streaming is enabled
(default), we retain the existing behavior of not inspecting it in
any way and streaming it bit-perfect to the device. In this case, we
never use qemu-based tools on the image at all.
If given a disk_format=raw image and image streaming is disabled, this
change fixes a bug where the image may have been converted if it was not
actually raw in the first place. We now stream these bit-perfect to the
device.
Adds two config options:
- [DEFAULT]/disable_deep_image_inspection, which can be set to "True" in
 order to disable all security features. Do not do this.
- [DEFAULT]/permitted_image_formats, default raw,qcow2, for image types
 IPA should accept.
Both of these configuration options are wired up to be set by the lookup
data returned by Ironic at lookup time.
This uses an image format inspection module imported from Nova; this
inspector will eventually live in oslo.utils, at which point we'll
migrate our usage of the inspector to it.
Closes-Bug: #2071740
Change-Id: I5254b80717cb5a7f9084e3eff32a00b968f987b7
2024年09月04日 09:11:28 -07:00
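
The format gate described in e303a369dc can be pictured roughly as follows. This is a simplified sketch and not the inspector imported from Nova: it only recognises the qcow2 magic bytes and treats everything else as raw, and the function names and the mismatch check are hypothetical.

```python
QCOW2_MAGIC = b"QFI\xfb"  # first four bytes of a qcow2 image


def detect_format(header: bytes) -> str:
    """Very rough detection: qcow2 by magic bytes, otherwise 'raw'."""
    return "qcow2" if header.startswith(QCOW2_MAGIC) else "raw"


def check_image(header, expected_format, permitted=("raw", "qcow2")):
    """Reject images whose detected format is not permitted or not expected.

    `permitted` mirrors the default of [DEFAULT]/permitted_image_formats.
    """
    detected = detect_format(header)
    if detected not in permitted:
        raise ValueError(f"format {detected} is not permitted")
    if detected != expected_format:
        raise ValueError(
            f"image declared as {expected_format} but looks like {detected}")
    return detected  # safe to pass on to qemu-img as the explicit format
```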
Sharpz7
b2ec08a15e Adding support to view indiv. cpu-core info
Closes-Bug: #1639340
This commit adds the relevant changes to the get_cpu function, keeping it backwards compatible with the old method.
Change-Id: I3c3a792e88e9a041236eca7283ebfdf1026910d8
2024年07月15日 12:37:37 +00:00
Jay Faulkner
60132c96d1 Fix issues caused/found by new codespell
Fixed spelling where appropriate, added ignore where appropriate
Change-Id: I07f203d311484321e0dfcbdf02083784693f4b96
2024年05月23日 15:49:48 -07:00
Jay Faulkner
c39517b044 Call evaluate_hardware_support exactly once per hwm
Fixes an issue where we could call evaluate_hardware_support multiple
times each run. Now, instead, we cache the values and use the cache
where needed.
Adds unit test coverage for get_managers and the new method.
Fixes issue where we were caching hardware managers between unit tests.
Also includes fixes for codespell CI:
- skip build files in repo
- fix spelling issues introduced to repo
Closes-bug: 2066308
Change-Id: Iebc5b6d2440bfc9f23daa322493379bbe69e84d0
2024年05月22日 08:46:21 -07:00
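
To illustrate the caching described in c39517b044, here is a small sketch under assumed names (`get_support_levels`, `reset_cache`, and `CountingManager` are hypothetical, not IPA's API). Each manager's `evaluate_hardware_support` runs once, and later lookups reuse the stored values.

```python
_support_cache = None  # module-level cache of (manager, support) pairs


class CountingManager:
    """Stand-in hardware manager that counts evaluation calls."""

    def __init__(self, support):
        self.support = support
        self.calls = 0

    def evaluate_hardware_support(self):
        self.calls += 1
        return self.support


def get_support_levels(managers):
    """Evaluate each manager once and reuse the result afterwards."""
    global _support_cache
    if _support_cache is None:
        _support_cache = [(m, m.evaluate_hardware_support()) for m in managers]
    return _support_cache


def reset_cache():
    """Clear the cache, e.g. between unit tests."""
    global _support_cache
    _support_cache = None
```

An explicit reset hook matters for the problem the commit mentions: without it, cached managers leak from one unit test into the next.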
Julia Kreger
45a16987dc Remove eventlet workaround
Per https://review.opendev.org/c/openstack/ironic/+/918082 and
contributor recollection, we believe this has been resolved and can
thus be removed.
Change-Id: Icbf0f095cabf52a7b642cd4a6ddfbd62cc77964e
2024年05月03日 08:18:20 -07:00
Julia Kreger
6ac3f350c0 Unmount config drives
If this seems like deja vu, that is because it is. We had this
very same issue with the original CoreOS ramdisk. Since we don't
control the whole OS of the ramdisk, it only made sense to teach
the agent to umount the folder.
The folder is referenced already, and the agent does have safeguards
in place, but unfortunately this issue led to a rebuild breaking where
cloud-init, glean, and the agent were all trying to do the right thing
as they thought, and there were just multiple /mnt/config folders
present in the OS. These are separate issues we also need to try and
remedy.
What happens is when the device is locked via a mount, the partition
table is never updated to the running OS as the mount creates a lock.
So the agent ends up thinking, in the case of a rebuild, that everything
including creating a configuration drive on that device has been
successful, but when you reboot, there is no partition table entry
for the new partition as the change was not successfully written.
This state prevented the workload from rebooting properly.
This change eliminates that possibility moving forward by attempting
to ensure that the cloud configuration folder is no longer mounted.
Change-Id: I4399dd0934361003cca9ff95a7e3e3ae9bba3dab
2024年04月29日 15:41:59 -07:00
Zuul
28053644cd Merge "add mixed matching of root device hints" 2024年04月27日 17:26:25 +00:00
Zuul
2b67f277b7 Merge "Step to clean UEFI NVRAM entries" 2024年04月27日 02:10:54 +00:00
Tudor Domnescu
ceec5a7367 destroy_disk_metadata: support 4096 sector size
A sector size of 512 was assumed and hardcoded, causing dd to fail when
it tried to write in chunks smaller than the sector size for disks with
4096 bytes sectors. The size of GPT in sectors also depends on sector size.
Change-Id: Ide5318eb503d728cff3221c26bebbd1c214f6995
2024年04月24日 20:37:44 +00:00
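
The dependence of the GPT size on sector size noted in ceec5a7367 can be worked out with a short calculation. This sketch assumes the minimum partition entry array of 128 entries of 128 bytes each; the function name is hypothetical and this is not the code from the patch.

```python
GPT_ENTRY_ARRAY_BYTES = 128 * 128  # 128 entries of 128 bytes (assumed minimum)


def gpt_sectors(sector_size):
    """Return (primary, backup) GPT sizes in sectors for a sector size."""
    # Round up in case the entry array is not a whole number of sectors.
    entry_sectors = -(-GPT_ENTRY_ARRAY_BYTES // sector_size)
    primary = 1 + 1 + entry_sectors  # protective MBR + GPT header + entries
    backup = 1 + entry_sectors       # backup header + backup entries
    return primary, backup


print(gpt_sectors(512))   # (34, 33)
print(gpt_sectors(4096))  # (6, 5)
```

With 512-byte sectors the primary structures take 34 sectors, but only 6 with 4096-byte sectors, so a hardcoded count written in 512-byte units does not cover the right region on 4Kn disks.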
Adam Rozman
84a1195d5a add mixed matching of root device hints
This commit introduces the following changes:
 - New optional `all_serial_and_wwn` argument for the block device
 listing logic. The new argument makes it possible to
 collect wwn and serial number information from both
 lsblk and udevadm at the same time
 - Both the short and the long serials are collected
 from udevadm without prioritization when the new argument
 has the value True
 - The new feature is automatically enabled during block device listing
 as part of the root disk selection
 - New options are added to the lsblk command when used in the block
 device discovery process, previously lsblk was not looking
 for wwn numbers and now it does
Closes-Bug: #2061437
Change-Id: I438a686d948cd929311e2f418bb02fb771805148
Signed-off-by: Adam Rozman <adam.rozman@est.tech>
2024年04月15日 15:53:50 +03:00
Steve Baker
215fecd447 Step to clean UEFI NVRAM entries
Adds a deploy step ``clean_uefi_nvram`` to remove unrequired extra UEFI
NVRAM boot entries. By default any entry matching ``HD`` as the root
device, or with a ``shim`` or ``grub`` efi file in the path will be
deleted, ensuring that disk based boot entries are removed before the
new entry is created for the written image. The ``match_patterns``
parameter allows a list of regular expressions to be passed, where a
case insensitive search in the device path will result in that entry
being deleted.
Closes-Bug: #2041901
Change-Id: I3559dc800fcdfb0322286eba30ce47041419b0c6
2024年04月11日 01:17:23 +12:00
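
The ``match_patterns`` behaviour from 215fecd447 can be sketched as a case-insensitive regular expression search over each entry's device path. The default patterns, the entry tuples, and the function name below are assumptions for illustration, not the values used by the deploy step.

```python
import re

# Assumed defaults: HD as the root device, or a shim/grub EFI file in the path.
DEFAULT_PATTERNS = [r"^HD", r"shim.*\.efi", r"grub.*\.efi"]


def entries_to_delete(entries, match_patterns=DEFAULT_PATTERNS):
    """Return boot entry numbers whose device path matches any pattern.

    entries is a list of (boot_number, device_path) tuples.
    """
    compiled = [re.compile(p, re.IGNORECASE) for p in match_patterns]
    return [num for num, path in entries
            if any(c.search(path) for c in compiled)]


sample = [
    ("0001", r"HD(1,GPT,abc)/File(\EFI\centos\shimx64.efi)"),
    ("0002", r"PciRoot(0x0)/Pci(0x1,0x0)/MAC(525400000001,0x1)"),
    ("0003", r"PciRoot(0x0)/Pci(0x2,0x0)/File(\EFI\BOOT\GRUBX64.EFI)"),
]
print(entries_to_delete(sample))  # ['0001', '0003']
```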
Zuul
cdd0a83448 Merge "Import disk_{utils,partitioner} from ironic-lib" 2024年04月03日 01:04:10 +00:00
Zuul
c784ee7cb9 Merge "Fix mocking for TestGenericHardwareManager" 2024年04月01日 14:57:34 +00:00
Zuul
b6075156b3 Merge "USB device discovery" 2024年03月28日 21:22:53 +00:00
Daniel King
cae6b15bbc Fix mocking for TestGenericHardwareManager
This test class is testing the GenericHardwareManager, but did no
mocking for dispatch_to_managers. Therefore, if any of its methods
attempted to make a call to that method, it would break the unit tests.
This update adds mocking for get_managers to prevent the tests from
breaking if a method calls dispatch_to_managers.
Additionally, updates test_delete_configuration_skip_list.
test_delete_configuration_skip_list mocks get_skip_list_from_node.
Correcting the return_value from a list to a set to match what is
returned from the original method.
Related-Bug: 2057668
Change-Id: Ifaa800449b49f64c6ba5779bfae1c8e2c3249903
2024年03月25日 12:16:02 -04:00
Dmitry Tantsur
f824930bbd Import disk_{utils,partitioner} from ironic-lib
With the iscsi deploy long gone, these modules are only used in IPA and
in fact represent a large part of its critical logic. Having them
separately sometimes makes fixing issues tricky if an interface of
a function needs changing.
This change imports the code mostly as it is, just removing run_as_root and
a deprecated function, as well as moving configuration options to config.py.
Also migrates one relevant function from ironic_lib.utils.
Change-Id: If8fae8210d85c61abb85c388b300e40a75d0531c
2024年03月15日 18:45:04 +01:00
Zuul
e28b3e72f7 Merge "Use assert_not_called" 2024年03月15日 17:30:21 +00:00
Riccardo Pittau
95b3ed3fed Fix unit tests after ironic-lib changes
Updating tests after change [1] and [2] in ironic-lib.
[1] ae53e8e4b3
[2] 7644196e7d
Change-Id: I880b4f82beb117d8812e60c13040e19476cec32b
2024年03月12日 09:13:14 +01:00
Thomas Goirand
ca6ff4706b Use assert_not_called
IPA still has 3 occurrences of not_called() which are failing for me
when building the Ironic Debian package in Debian Unstable (ie: with
Python 3.12).
This patch uses assert_not_called() instead of not_called(), fixing
the problem.
Change-Id: I8bd27fa706b298b28ef5bef405134a2c9803d757
2024年02月26日 11:57:10 +01:00
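
For reference, a short standalone example of `assert_not_called()` from the standard `unittest.mock` module, the assertion that ca6ff4706b switches to. The mock here is a throwaway object, not one of IPA's tests.

```python
from unittest import mock

helper = mock.Mock()

# Passes: the mock has not been called yet.
helper.assert_not_called()

helper("some-argument")

# Once the mock has been called, the assertion fails.
try:
    helper.assert_not_called()
    failed = False
except AssertionError:
    failed = True

print(failed)  # True
```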
Damien Rannou
3fd68c0848 USB device discovery
The idea is to retrieve USB device information via 'lshw' and
return the list to ironic in order to be able to create introspection
rules based on USB devices.
Change-Id: I39d60cb467614fca7a7f701dbe576154213580a5
2024年02月19日 14:49:52 +01:00
Zuul
df7eccd7f1 Merge "Trivial: avoid deprecated utcnow" 2024年02月08日 14:43:41 +00:00
Zuul
6d35c1e949 Merge "Make inspection URL optional if the collectors are provided" 2024年02月07日 23:06:34 +00:00
Zuul
359ac636f0 Merge "Drop usage of run_as_root" 2024年01月31日 16:29:06 +00:00
Dmitry Tantsur
8877e1f319 Trivial: avoid deprecated utcnow
Change-Id: I5dbe3c2be36e23e749fbeebbc448d413d276b401
2024年01月31日 10:09:13 +01:00
Dmitry Tantsur
0010f5c11a Also retry inspection on HTTP CONFLICT
The new implementation can return it when unable to lock the node.
Other possible errors are 400 and 404 (should not be retried), as well as
5xx (already retried).
Change-Id: I74c2f54a624dc47e8e2d1e67ae4c6a6078e01d2f
2024年01月26日 16:21:24 +01:00
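
The retry rule stated in 0010f5c11a fits in a small predicate. This is a sketch with a hypothetical function name, not the agent's code: 409 and 5xx responses are retried, while other client errors such as 400 and 404 are not.

```python
from http import HTTPStatus


def should_retry(status_code):
    """Decide whether an inspection callback should be retried."""
    if status_code == HTTPStatus.CONFLICT:
        return True  # the node could not be locked; try again later
    return status_code >= 500  # server-side errors are retried as before


print([code for code in (400, 404, 409, 500, 503) if should_retry(code)])
# [409, 500, 503]
```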
Dmitry Tantsur
9f849472ca Drop usage of run_as_root
IPA can only be run as root and does not use rootwrap. We need to
eventually remove support for rootwrap from ironic-lib.
Change-Id: Iffd5cae5e3dc8637bc6dd10b3bcc9fe33932b8cf
2024年01月23日 14:23:23 +01:00
Zuul
1e107bd625 Merge "Add support for reporting CPU socket number" 2024年01月22日 11:52:06 +00:00
Kaifeng Wang
9cafe76225 Add support for reporting CPU socket number
IPA reports a few cpu fields including cores, arch, flags etc.
Some users need the number of physical CPU sockets in a bare-metal
node, since cores are just a logical representation of the
compute resource.
The socket number is more suitable for the quota control in some
use cases.
Change-Id: I94be86d6b12a3a7e7ca1041d948427a073412a31
2024年01月19日 21:24:37 +00:00
Dmitry Tantsur
6cd36a750f Make inspection URL optional if the collectors are provided
With the new in-band inspection, we can derive the callback URL from
the Ironic URL, there is no need to duplicate it. This change uses
the presence of collectors as a sign to run inspection.
The previous approach of setting an inspection URL, with or without
explicitly setting collectors, still works for compatibility with
ironic-inspector.
Change-Id: Ie4279ee6d2995c9686f1dcdef1d6e5dc1dd20871
2024年01月10日 08:55:42 +01:00
Dmitry Tantsur
0d4ae976c2 Support several API and Inspector URLs
Allows nodes with a single IP stack to be deployed from a dual-stack
Ironic.
Detecting the advertised address and detecting usable Ironic URLs are done
completely independently, which does open some space for misconfiguration.
I hope it's not likely in reality, especially since this feature is
targeting advanced standalone users.
Change-Id: Ifa506c58caebe00b37167d329b81c166cdb323f2
Closes-Bug: #2045548 
2024年01月09日 16:43:23 +01:00
Dmitry Tantsur
2bb74523ae Add missing headers to the inspection callback
Somehow, it has worked correctly for years, but now I've discovered that
the new inspection is (no longer?) tolerant to the missing header.
While here, copy all headers from the heartbeat code.
Change-Id: I9e5c609eb4435e520bc225dea08aedfdf169744b
2024年01月09日 16:38:46 +01:00
Zuul
d298e06b49 Merge "[codespell] Fix spelling issues in IPA" 2024年01月08日 17:22:02 +00:00
Jay Faulkner
dcaed43ef9 Update to latest pep8/code style versions
Update various linting programs to their latest version, and fix any
issues created by the update.
Change-Id: I014c846560663a76a1663b568ef48659d0ab6d4d
2023年12月28日 14:19:27 -08:00
Jay Faulkner
36e5993a04 [codespell] Fix spelling issues in IPA
This fixes several spelling issues identified by codespell. In some
cases, I may have manually modified a line to make the output more clear
or to correct grammatical issues which were obvious in the codespell
output.
Later changes in this chain will provide the codespell config used to
generate this, as well as adding this commit's SHA, once landed, to a
.git-blame-ignore-revs file to ensure it will not pollute git histories
for modern clients.
Related-Bug: 2047654
Change-Id: I240cf8484865c9b748ceb51f3c7b9fd973cb5ada
2023年12月28日 10:54:46 -08:00
Iury Gregory Melo Ferreira
03b6b0a4ab Fix inspector retries to not take a long time
Since we moved to exponential wait we increased the amount of time
to run unit tests, now we can configure the max time to wait
- before: Ran: 33 tests in 22.6581 sec.
- after: Ran: 33 tests in 4.0256 sec.
Change-Id: Ibdcfebacad0489d17183e43ceb0d603fce67e72b
2023年12月19日 14:26:59 -03:00
Dmitry Tantsur
2ab8364649 Add a jitter to heartbeat retries
Currently, if heartbeat fails, we reschedule it after 5 seconds.
This is fine for the first retry, but it can cause a thundering herd
problem when a lot of nodes fail to heartbeat at once.
This change adds jitter to the minimum wait of 5 seconds. The jitter is
not applied for forced heartbeats: they still have a minimum wait of
exactly 5 seconds from the last heartbeat.
The code is re-ordered to move the interval calculation to one place.
Bonus: correctly logging the next interval.
The unit tests have been rewritten to test the heartbeat process step by
step and not rely on the exact sequence of the calls.
Closes-Bug: #2038438
Change-Id: I4c4207b15fb3d48b55e340b7b3b54af833f92cb5
2023年12月13日 17:34:24 +01:00
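
A rough sketch of the retry interval described in 2ab8364649. The jitter fraction, the function name, and the way jitter is applied are assumptions; only the 5-second minimum and the exemption for forced heartbeats come from the commit message.

```python
import random

MIN_WAIT = 5.0  # seconds, the minimum wait named in the commit message


def next_retry_interval(forced=False, jitter=0.5, rng=random.random):
    """Return the delay before the next heartbeat attempt.

    Forced heartbeats wait exactly MIN_WAIT. Other retries add a random
    share of up to `jitter` * MIN_WAIT so that many nodes failing at
    once do not all retry at the same moment.
    """
    if forced:
        return MIN_WAIT
    return MIN_WAIT * (1 + jitter * rng())
```

Spreading retries over a window is what breaks up the thundering herd: with the assumed 0.5 fraction, retries land anywhere between 5 and 7.5 seconds instead of all at 5.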
Zuul
62041d6d9e Merge "Fix referencing to the raid_device var which is not set" 2023年12月12日 17:01:32 +00:00
Iury Gregory Melo Ferreira
801da9ec1f Retry in ProxyError during post inspector data
* ProxyError is derived from ConnectionError, but it's necessary
to check the Response object to identify.
- Added ProxyError in retry_if_exception_type
- Updated _post_to_inspector to properly handle ProxyError
- Updated the wait to use wait_exponential instead of wait_fixed.
Closes-Bug: 2045429
Change-Id: Iefe3fe581cd4e7c91a0da708e6f6d0fdaacab6fe
2023年12月06日 12:01:35 -03:00
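
The move from a fixed wait to an exponential wait in 801da9ec1f, together with the configurable maximum added later in 03b6b0a4ab, can be pictured with a plain-Python sketch. The multiplier, the cap, and the function name are illustrative assumptions, not the values or the retry library used by the agent.

```python
def exponential_waits(attempts, multiplier=1.0, max_wait=30.0):
    """Return the wait before each retry: multiplier * 2**n, capped."""
    return [min(multiplier * 2 ** n, max_wait) for n in range(attempts)]


print(exponential_waits(6))
# [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
print(exponential_waits(4, max_wait=0.1))
# [0.1, 0.1, 0.1, 0.1]
```

A small cap like the second call is the kind of setting that keeps unit tests fast while production keeps the longer backoff.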
Zuul
beccfe8c92 Merge "Revert "Fix vmedia network config drive handling"" 2023年11月30日 15:14:20 +00:00
Dmitry Tantsur
c57deb7e76 Revert "Fix vmedia network config drive handling"
This reverts commit 33f01fa3c2.
There are a few issues with the patch - see my comments there.
The most pressing and the reasons to revert are:
1) It breaks deployments when the vmedia is present but does not
 have a network_data.json (the case for Metal3).
2) It assumes the presence of Glean which may not be the case.
Neither Julia nor myself have time to thoroughly fix the issue,
leaving a revert as the only option to unblock Metal3.
Change-Id: I3f1a18a4910308699ca8f88d8e814c5efa78baee
Closes-Bug: #2045255 
2023年11月30日 10:33:29 +00:00
Maryna Savchenko
f80330839d Fix referencing to the raid_device var which is not set
Change-Id: I11180e5d61d893a78583ace555f6e90ba8845950
2023年11月29日 12:40:29 +01:00
Zuul
61d17e2225 Merge "Parse efibootmgr type and details" 2023年11月29日 01:10:27 +00:00
Zuul
eea9917023 Merge "Fix vmedia network config drive handling" 2023年11月29日 01:10:25 +00:00
Steve Baker
352df0bc54 Parse efibootmgr type and details
This change improves the regex to match an exact entry name, and to also
match with the entry type from a set of recognised types.
The boot entry details start from the recognised type onwards.
This can be used by a step which deletes all entries of type 'HW' and
UsbClass.
Related-Bug: #2041901
Change-Id: I5d879f724efc2919b541fd3fef0f931df67ff9c7
2023年11月24日 09:45:40 +13:00
Zuul
768aa17442 Merge "Add mlnx deploy_step entry to enable deploy time firmware" 2023年11月23日 00:12:13 +00:00
Zuul
7a4114512c Merge "Handle different device outputs for multipath" 2023年11月22日 21:36:40 +00:00
Zuul
9f9940efdc Merge "Test coverage for efi_utils.get_boot_record" 2023年11月22日 21:36:39 +00:00