c2c72eef975eda9afbb1fe2ee06740a5d577c187
579 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
Zuul
|
c3ef9a563d | Merge "Fix software RAID creation on different physical devices" | ||
|
Dmitry Tantsur
|
9db3cd1e4d |
Graceful way for hardware managers to ignore certain devices
My use case for this feature is to exclude network devices that use the cdc_ether driver. These USB network interfaces often cause all sorts of issues. For example, some models have the same hardcoded MAC address, which breaks inspection. Currently, to exclude a certain device, a hardware manager must override the entire listing function (in my case, list_interfaces). Not only is it tedious, but it also requires constantly updating the hardware managers to match the implementation in GenericHardware. Realistically, it will cause hardware manager authors to inherit GenericHardware, which is the opposite of how hardware managers should be written. Note that the node-level skip list only affects root device selection and cleaning for block devices. This feature affects everything that uses list_block_devices and is applied before the node-level skip list. This change adds a new hardware manager call filter_device. For each network, block or USB device, it allows a hardware manager to do one of four things: 1. Delegate the decision to a lower level hardware manager by raising IncompatibleHardwareMethodError 2. Remove the device by returning None 3. Change the device by returning a modified instance 4. Return the device unchanged to keep it in the listing. Note that I'm removing debug logging when IncompatibleHardwareMethodError is raised. Not only is the log message incorrect (the error does not necessarily mean that the method is not implemented at all), it already takes up noticeable space in the logs, and with this change will become very noisy. Change-Id: I5437343af6c6157882bcf0600dd89bd20478c948 Signed-off-by: Dmitry Tantsur <dtantsur@protonmail.com> |
||
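A minimal sketch of the filter_device flow described in 9db3cd1e4d. The classes and the `apply_filters` helper below are hypothetical stand-ins written for illustration, not the IPA implementation; only the four outcomes (delegate by raising, drop by returning None, modify, keep) come from the commit message.

```python
from dataclasses import dataclass


class IncompatibleHardwareMethodError(Exception):
    """Stand-in for the IPA error of the same name (assumption)."""


@dataclass(frozen=True)
class NetworkInterface:
    """Hypothetical, simplified device record."""
    name: str
    driver: str


class SkipCdcEther:
    """Hypothetical manager that drops cdc_ether network interfaces."""

    def filter_device(self, device):
        if not isinstance(device, NetworkInterface):
            # Not our business: let a lower-level manager decide.
            raise IncompatibleHardwareMethodError()
        if device.driver == "cdc_ether":
            return None  # remove the device from the listing
        return device  # keep it unchanged


class Generic:
    """Hypothetical lowest-level manager that keeps everything."""

    def filter_device(self, device):
        return device


def apply_filters(devices, managers):
    """Ask managers in priority order; the first one that answers decides."""
    kept = []
    for device in devices:
        for manager in managers:
            try:
                result = manager.filter_device(device)
            except IncompatibleHardwareMethodError:
                continue  # delegated to the next manager
            break
        else:
            result = device  # nobody had an opinion: keep as-is
        if result is not None:
            kept.append(result)
    return kept


interfaces = [
    NetworkInterface("eth0", "e1000e"),
    NetworkInterface("usb0", "cdc_ether"),
]
filtered = apply_filters(interfaces, [SkipCdcEther(), Generic()])
```

Here the filter is applied once per device, so a manager that only cares about one driver no longer has to reimplement the whole listing function.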
|
Dmitry Tantsur
|
9426df9ab3 |
Split hardware manager initialize out of evaluate_hardware_support
The current code in GenericHardware.evaluate_hardware_support ends up using hardware manager calls, which then use a partly initialized hardware manager list and can even cause a recursion. This change introduces a new optional call initialize() which is guaranteed to run: 1) After all hardware managers have been evaluated 2) After the hardware manager cache is populated 3) In the order of the support level of hardware managers Change-Id: I068d3d73483c161062aa3b48f3154a2d99941382 Signed-off-by: Dmitry Tantsur <dtantsur@protonmail.com> |
||
|
Dmitry Tantsur
|
521811cbcc |
Fix software RAID creation on different physical devices
When creating multiple software RAID logical disks that use different sets of physical devices, the partition indices were incorrectly shared across all devices. This caused the second RAID array creation to fail because it tried to use partition indices that didn't exist on those specific devices. This change fixes the issue by tracking partition indices separately for each physical device, ensuring that each device's partitions are numbered correctly starting from their first available index. Closes-Bug: #2115211 Change-Id: I440db4654f3d1d54274d1eee8c4b21c2b0a18d22 Signed-off-by: Mohammed Naser <mnaser@vexxhost.com> |
||
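The fix in 521811cbcc can be pictured with a small sketch that keeps one partition counter per physical device instead of one shared counter. The function name, the `physical_disks` key and the fixed starting index are assumptions made for illustration; the actual IPA code is not shown in this log.

```python
from collections import defaultdict


def assign_partition_indices(logical_disks, first_index=1):
    """Return, per logical disk, the partition index to use on each device.

    Each physical device gets its own counter, so RAID arrays built on
    disjoint sets of devices do not consume each other's indices.
    """
    next_index = defaultdict(lambda: first_index)
    plan = []
    for logical_disk in logical_disks:
        partitions = {}
        for device in logical_disk["physical_disks"]:
            partitions[device] = next_index[device]
            next_index[device] += 1
        plan.append(partitions)
    return plan


plan = assign_partition_indices([
    {"physical_disks": ["/dev/sda", "/dev/sdb"]},
    {"physical_disks": ["/dev/sdc", "/dev/sdd"]},
])
```

With a shared counter the second array would have asked for the second partition on `/dev/sdc` and `/dev/sdd`, which does not exist; with per-device counters both arrays start at each device's first index.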
|
Zuul
|
b51cc75ff3 | Merge "netutils: Use ethtool ioctl to get permanent mac address" | ||
|
Nicolas Belouin
|
48422a532f |
netutils: Use ethtool ioctl to get permanent mac address
Fetching the permanent MAC address of the interface instead of the default one makes it possible to get the right one in case it got changed during setup (likely with a bonding setup). In order to fetch the permanent MAC address of a given interface, one can either use Netlink (either rtnetlink or ethtool), or use ethtool ioctl. The use of ioctl feels simpler and requires no additional dependency. The implementation falls back to older behavior should an error occur. Closes-Bug: #2103450 Change-Id: I54151990e396ddcf775128ca24d3db08e45c256d Signed-off-by: Nicolas Belouin <nicolas.belouin@suse.com> |
||
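A rough, Linux-only sketch of the ethtool ioctl approach from 48422a532f. The helper names are hypothetical and this is not the netutils code; it only illustrates the `SIOCETHTOOL` / `ETHTOOL_GPERMADDR` request and returns None on any error so that a caller can fall back to the older behavior.

```python
import array
import fcntl
import socket
import struct

SIOCETHTOOL = 0x8946        # ioctl request number from <linux/sockios.h>
ETHTOOL_GPERMADDR = 0x20    # command number from <linux/ethtool.h>
MAX_ADDR_LEN = 32


def format_mac(raw):
    """Render raw address bytes as colon-separated hex."""
    return ":".join(f"{byte:02x}" for byte in raw)


def get_permanent_mac(ifname):
    """Return the permanent MAC of ifname, or None if it cannot be read."""
    # struct ethtool_perm_addr: u32 cmd, u32 size, then the address bytes.
    header = struct.pack("II", ETHTOOL_GPERMADDR, MAX_ADDR_LEN)
    buf = array.array("B", header + bytes(MAX_ADDR_LEN))
    address, _length = buf.buffer_info()
    # struct ifreq: 16-byte interface name followed by a data pointer.
    ifreq = struct.pack("16sP", ifname.encode(), address)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            fcntl.ioctl(sock.fileno(), SIOCETHTOOL, ifreq)
    except OSError:
        return None  # caller falls back to the regular MAC lookup
    _cmd, size = struct.unpack("II", buf[:8].tobytes())
    raw = buf[8:8 + size].tobytes()
    if not any(raw):
        return None  # treat a missing or all-zero address as unavailable
    return format_mac(raw)
```

A caller would typically try `get_permanent_mac(name)` first and use the currently configured address only when it returns None.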
|
cid
|
c03021fee2 |
Remove eventlet from Ironic Python Agent
This change removes several usages of eventlet from IPA: - Upgrades all requirements on oslo library versions to new ones that support non-eventlet use. - Removes use of the eventlet wsgi server (via oslo_service.wsgi) and replaces it with the cheroot wsgi server. - Removes explicit patching of python modules with eventlet Note that some oslo libraries still use ``eventlet`` to detect and work around its use. This means that it is still installed in environments alongside IPA, even if it's not used or patched into any modules. Depends-On: https://review.opendev.org/c/openstack/requirements/+/947727 Change-Id: I9accab2d5e9529a88ef5d3db85e76901f14114eb |
||
|
Zuul
|
53349cc7cf | Merge "Remove agent_token_required upgrade knob" | ||
| ac85195b7a |
Update master for stable/2025.1
Add file to the reno documentation build to show release notes for stable/2025.1. Use pbr instruction to increment the minor version number automatically so that master versions are higher than the versions on stable/2025.1. Sem-Ver: feature Change-Id: I259249774c39e95b214e77b2ae632c7278e78754 |
|||
|
Julia Kreger
|
94fde4b3b4 |
Remove agent_token_required upgrade knob
To help ease upgrades to Victoria, IPA had a knob added to enable operators to express if agent tokens were required in their deployment. Since then, the feature is required, however we left the logic enabling the fun upgrade case handling. At this point, this knob serves no further use, and can be removed. Change-Id: I202f06e1b6598a802c9853fb99201c55e7a40cb1 |
||
|
Julia Kreger
|
a6ca65201a |
Lockout agent command results if a token is received
This is a second attempt at securing the get command output endpoint which could have data such as logs which could potentially have sensitive details and information after the agent has completed one or more actions. Now, if a token is received, the agent locks out the command results endpoint, and requires all future calls to include it. This allows for the agent to be backwards compatible. Special thanks go to cid for his first attempt at this, which I took for the basis of some of the testing required. Closes-Bug: #2086866 Co-Authored-By: cid@gr-oss.io Change-Id: Ia39a3894ef5efaffd7e1d22cc6244059a32175ff |
||
|
Zuul
|
8ab0bfbd9b | Merge "Revert "Add token validation to GET command endpoints"" | ||
|
Dmitry Tantsur
|
3968715908 |
Revert "Add token validation to GET command endpoints"
This reverts commit
|
||
|
Zuul
|
3261052f5d | Merge "follow-up: update release note for bootable container work" | ||
|
Zuul
|
2e9964e126 | Merge "Add token validation to GET command endpoints" | ||
|
cid
|
a42980a016 |
Ensure IPA is locked down in rescue mode
Securely handle state transition by locking down IPA at the final stage of rescue operation to prevent restarts on tenant networks. Closes-Bug: #2086865 Change-Id: I8e1be8da93a8c3fdf3cff7ad386c702d970d15f1 |
||
|
cid
|
6f860995c6 |
Add token validation to GET command endpoints
Currently, we only validate authentication tokens for POST but not for GET requests, which could mean anyone can retrieve command results without authentication. This change adds that validation uniformly across all command-related endpoints. Closes-Bug: #2086866 Depends-On: https://review.opendev.org/c/openstack/ironic/+/941607 Change-Id: Ib7f58b1694273beeb25314984c6e049376244d86 |
||
|
Julia Kreger
|
c8763bba06 |
follow-up: update release note for bootable container work
Updates the release note for the bootable container work to clarify the existence of the configuration option which can be utilized to disable bootable container deployments in the ramdisk. Change-Id: I5b269947884c015db38cf98ac782472a62858455 |
||
|
Zuul
|
a6d1921056 | Merge "Bootable container support" | ||
|
Julia Kreger
|
1508cc4cd0 |
Bootable container support
Adds support for bootable containers to be deployed by the agent. Related: https://review.opendev.org/c/openstack/ironic/+/937897 Change-Id: I66cb37d117d2afc335f015fb1fc31bdbd5c3cee5 |
||
|
Kaifeng Wang
|
96bf1ef012 |
Collect bus and driver for interfaces
It's useful to have the PCI bus address and driver collected; the operator can use the information to configure portgroups in a consistent way. Change-Id: I432bca881ad881bae6d5e67c9b6fb52fe55b4e1e |
||
|
Zuul
|
0c35e7e2da | Merge "Add support for burnin-gpu" | ||
|
kubajj
|
018a5f6253 |
Fix errors in the function erase_devices_express
Prevents the UnboundLocalError in erase_devices_express clean step. Closes-Bug: #2095499 Change-Id: I01ce5005a62638ff960d2a75f225f882b2d56973 |
||
|
Zuul
|
ca07e941cf | Merge "Add a release note for 939340" | ||
|
cid
|
c222626b01 |
Treat 'No space left on device' error as fatal
Fail without retries when Errno 28 - "No space left on device" error is encountered. Closes-Bug: #2094854 Change-Id: Ie84b422916ddc02f2474164fe3da083324ef4824 |
||
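The behavior described in c222626b01 amounts to a retry loop that gives up immediately on errno 28. The sketch below uses a hypothetical `run_with_retries` helper and a generic callable; it is not the agent's download code.

```python
import errno


def run_with_retries(operation, attempts=3):
    """Call operation, retrying on OSError unless the disk is full."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except OSError as exc:
            if exc.errno == errno.ENOSPC:
                # "No space left on device": retrying cannot help.
                raise
            last_error = exc
    raise last_error
```

Other I/O errors are still retried up to `attempts` times, and the last one is re-raised if none of the attempts succeed.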
|
kubajj
|
2ece938671 |
Add a release note for 939340
Follow-up to 939340 to add a release note about the bug-fix. Change-Id: I202f22d40776ab5d3245b8e14021d1404a9f478d |
||
|
cid
|
dfcb86d738 |
Add support for burnin-gpu
Adds support for running burnin tests on GPUs using gpu-burn[1]. Also refactors stress-ng code to be a bit cleaner. Requires gpu-burn to be pre-installed within the IPA. [1] https://github.com/wilicc/gpu-burn Co-Authored-By: Scott Solkhon <scottsolkhon@gmail.com> Closes-Bug: #2069085 Change-Id: I8f8cace6ebc2b7f1c245c82a64609cdfc1c492f9 |
||
|
Zuul
|
06077cb88e | Merge "Inventoried MAC address for only ipv6 addresses" | ||
| b010580caf |
reno: Update master for unmaintained/2023.1
Update the 2023.1 release notes configuration to build from unmaintained/2023.1. Change-Id: I0d8b1773367a61b326b5a6ff86ac1f126b15099b |
|||
|
Maximilian Brandt
|
6ccd3965ff |
Inventoried MAC address for only ipv6 addresses
Extended the function that exposes the BMC MAC address in inventory data to cover IPv6-only interfaces. Previously, if no IPv4 address was configured, no MAC address was exposed. Change-Id: I93e49d308cfd63be1c09749ced4428a87a3daff9 |
||
|
Zuul
|
01639aab20 | Merge "Add a command to lock down the agent" | ||
|
Zuul
|
4f9f461ce9 | Merge "A hardware manager call for a full sync before shutdown" | ||
|
Dmitry Tantsur
|
aa98250066 |
Add a command to lock down the agent
To support a safer take-over from the provisioning to the tenant network for hardware that cannot be powered off, this change introduces a new command system.lockdown. When invoked, it stops the API, the heartbeater and disables all network interfaces (if possible). Partial-Bug: #2077432 Change-Id: I211fc64a46226127b0d82ab458029b3c702b3f74 |
||
|
Zuul
|
5746ac1222 | Merge "Vendor metrics library from Ironic-Lib & deprecate" | ||
|
Dmitry Tantsur
|
5aa0c1a2bb |
A hardware manager call for a full sync before shutdown
This is largely required for the future lockdown command but can also be used before the normal shutdown, especially in the sync command which is currently used before an out-of-band shutdown command is issued. In addition to a plain sync, the new command also tells the kernel to drop its caches and issues a low-level sync command to each block device. Partial-Bug: #2077432 Change-Id: I3fc87b20bc5387a466b24ebc19b9982e4e368d20 |
||
|
Jay Faulkner
|
75abdb4148 |
Vendor metrics library from Ironic-Lib & deprecate
We are phasing out use of ironic-lib, and as such are removing the metrics module from it. However, due to its requirement of having a statsd instance on the same subnet as the agent and there being no support for prometheus exporting of metrics from IPA, these metrics are no longer valuable (in the agent). We are vendoring the module for the deprecation in order to facilitate its removal from ironic-lib. Change-Id: Ie50e078bc3f78d65cfa53680dc4116d1119ce155 |
||
|
Zuul
|
b851ae1bc8 | Merge "Remove Python 3.8 support" | ||
|
Takashi Kajinami
|
b0ef2c0483 |
Remove Python 3.8 support
Python 3.8 was removed from the tested runtimes for 2024.2[1] and has not been tested since then. Also add Python 3.12 which is part of the tested runtimes for 2025.1. The unit test job with Python 3.12 is now voting. [1] https://governance.openstack.org/tc/reference/runtimes/2024.2.html Change-Id: Id314b4453d81dcab806768e3c7ab5dc050a35136 |
||
|
Steve Baker
|
1a939105ba |
Capture and log sector sizes
``logical_sectors`` and ``physical_sectors`` sizes are now captured for each hardware info ``disks`` entry, and also logged for ``lsblk`` calls. This will be increasingly useful as storage devices with 4096 byte sector sizes become more common. Change-Id: I80b6b137f6e3071d9b8a4c1abe14416249aed9ac |
||
| e4d07fd1ba |
Update master for stable/2024.2
Add file to the reno documentation build to show release notes for stable/2024.2. Use pbr instruction to increment the minor version number automatically so that master versions are higher than the versions on stable/2024.2. Sem-Ver: feature Change-Id: Iffa68c4207e97d92382fbff637a661a879c1909d |
|||
|
Zuul
|
ab99f36baa | Merge "Check for the existence of an IPMI device" | ||
|
cid
|
2d79eae382 |
Check for the existence of an IPMI device
Check for IPMI device files before the use of the `'ipmitool lan.*'` command, avoiding unnecessary calls on non-IPMI systems. Closes-Bug: #2076367 Change-Id: Ib800717701e6f2828df55a0da0e999fc014c12e1 |
||
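The check described in 2d79eae382 can be sketched as a guard in front of the ipmitool call. The device paths listed here are an assumption (they are the locations commonly used by the Linux IPMI drivers), and `run_ipmitool` is a hypothetical callable standing in for the real command invocation.

```python
import os

# Assumed candidate paths for the IPMI character device.
IPMI_DEVICE_CANDIDATES = ("/dev/ipmi0", "/dev/ipmi/0", "/dev/ipmidev/0")


def ipmi_device_present(candidates=IPMI_DEVICE_CANDIDATES):
    """Return True if any known IPMI device file exists."""
    return any(os.path.exists(path) for path in candidates)


def get_bmc_info(run_ipmitool, candidates=IPMI_DEVICE_CANDIDATES):
    """Call ipmitool only when an IPMI device is available."""
    if not ipmi_device_present(candidates):
        return None  # non-IPMI system: skip the command entirely
    return run_ipmitool()
```

On systems without a BMC this avoids spawning a command that is bound to fail.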
|
Jay Faulkner
|
e303a369dc |
Inspect non-raw images for safety
When IPA gets a non-raw image, it performs an on-the-fly conversion using qemu-img convert, as well as running qemu-img frequently to get basic information about the image before validating it. Now, before any qemu-img calls are made, we ensure that we have inspected the image for safety, and we pass through the detected format. If given a disk_format=raw image and image streaming is enabled (default), we retain the existing behavior of not inspecting it in any way and streaming it bit-perfect to the device. In this case, we never use qemu-based tools on the image at all. If given a disk_format=raw image and image streaming is disabled, this change fixes a bug where the image may have been converted if it was not actually raw in the first place. We now stream these bit-perfect to the device. Adds two config options: - [DEFAULT]/disable_deep_image_inspection, which can be set to "True" in order to disable all security features. Do not do this. - [DEFAULT]/permitted_image_formats, default raw,qcow2, for image types IPA should accept. Both of these configuration options are wired up to be set by the lookup data returned by Ironic at lookup time. This uses an image format inspection module imported from Nova; this inspector will eventually live in oslo.utils, at which point we'll migrate our usage of the inspector to it. Closes-Bug: #2071740 Change-Id: I5254b80717cb5a7f9084e3eff32a00b968f987b7 |
||
|
Riccardo Pittau
|
bd3b596ced |
Fix series in release notes
Change-Id: I6844ce33274afdb64e78b79930c8aa32776e7665 |
||
|
Riccardo Pittau
|
599a825554 |
Fix versions in release notes
Change-Id: Ief6299e4b1bbef5fdb33a28b90b078f420cf8508 |
||
|
Jay Faulkner
|
c39517b044 |
Call evaluate_hardware_support exactly once per hwm
Fixes an issue where we could call evaluate_hardware_support multiple times each run. Now, instead, we cache the values and use the cache where needed. Adds unit test coverage for get_managers and the new method. Fixes issue where we were caching hardware managers between unit tests. Also includes fixes for codespell CI: - skip build files in repo - fix spelling issues introduced to repo Closes-bug: 2066308 Change-Id: Iebc5b6d2440bfc9f23daa322493379bbe69e84d0 |
||
| c303bd971b |
reno: Update master for unmaintained/zed
Update the zed release notes configuration to build from unmaintained/zed. Change-Id: I673a729e1598d2100631262d61c91690f500306b |
|||
|
Julia Kreger
|
6ac3f350c0 |
Unmount config drives
If this seems like deja vu, that is because it is. We had this very same issue with the original CoreOS ramdisk. Since we don't control the whole OS of the ramdisk, it only made sense to teach the agent to umount the folder. The folder is referenced already, and the agent does have safeguards in place, but unfortunately this issue led to a rebuild breaking where cloud-init, glean, and the agent were all trying to do the right thing as they thought, and there were just multiple /mnt/config folders present in the OS. These are separate issues we also need to try and remedy. What happens is when the device is locked via a mount, the partition table is never updated to the running OS as the mount creates a lock. So the agent ends up thinking, in the case of a rebuild, that everything including creating a configuration drive on that device has been successful, but when you reboot, there is no partition table entry for the new partition as the change was not successfully written. This state prevented the workload from rebooting properly. This change eliminates that possibility moving forward by attempting to ensure that the cloud configuration folder is no longer mounted. Change-Id: I4399dd0934361003cca9ff95a7e3e3ae9bba3dab |
||
|
Zuul
|
28053644cd | Merge "add mixed matching of root device hints" | ||
|
Zuul
|
2b67f277b7 | Merge "Step to clean UEFI NVRAM entries" |