601 Commits

Author SHA1 Message Date
Zuul
8f4a5a3f68 Merge "Skip BMC detection when using out-of-band management" 2025-11-22 16:02:02 +00:00
Zuul
033a237e96 Merge "Fix API URL reachability test to use full URL with port" 2025-11-19 12:36:00 +00:00
Zuul
a1739af940 Merge "Implement functionality for the is_root_volume RAID config" 2025-11-18 19:28:06 +00:00
Riccardo Pittau
ca6f4fb988 Skip BMC detection when using out-of-band management
When Ironic uses out-of-band management interfaces like Redfish,
iDRAC, iLO, or iRMC, the BMC address is already known and configured
in Ironic. This change allows the agent to skip BMC address detection
via ipmitool when instructed by Ironic through the lookup response.
This reduces deployment time by avoiding unnecessary ipmitool calls
during hardware inventory collection.
The agent now checks for the 'agent_skip_bmc_detect' flag in the
config section of the lookup response and skips BMC detection
accordingly. This flag is stored in the cached node data for use
during hardware inventory collection.
Depends-On: I6a432db3eb238894e0ed2676243ce69ec300a9eb
Assisted-By: Claude Sonnet 4.5
Change-Id: Id5470136defb981d1855e3c57cd16c03a6eb916e
Signed-off-by: Riccardo Pittau <elfosardo@gmail.com>
2025-11-18 16:46:21 +01:00
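The flag handling described in ca6f4fb988 can be sketched roughly as follows. This is a minimal illustration, not the agent's actual code: only the `agent_skip_bmc_detect` key and the `config` section come from the commit message, while the function names, the `skip_bmc_detect` cache key, and the `detect` callable are hypothetical.

```python
def should_skip_bmc_detection(lookup_response):
    """Return True when the lookup response asks to skip BMC detection."""
    # The flag lives in the 'config' section of the lookup response.
    config = lookup_response.get('config') or {}
    return bool(config.get('agent_skip_bmc_detect', False))


def cache_node(lookup_response):
    """Build cached node data that remembers the flag (hypothetical layout)."""
    return {
        'node': lookup_response.get('node'),
        'skip_bmc_detect': should_skip_bmc_detection(lookup_response),
    }


def get_bmc_address(cached_node, detect):
    """Return the BMC address, or None when detection is skipped.

    `detect` stands in for the ipmitool-based lookup; returning None on
    skip is an assumption made for this sketch.
    """
    if cached_node.get('skip_bmc_detect'):
        return None
    return detect()
```

With the flag set, the `detect` callable is never invoked, which is where the time saving during inventory collection would come from.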
Riccardo Pittau
7d7735a216 Fix API URL reachability test to use full URL with port
The _test_ip_reachability method was only using the hostname/IP
address when testing reachability, ignoring the port number from
the API URL. This caused LookupAgentIPError when the Ironic API
was running on a non-standard port (e.g., 6385).
This change modifies _test_ip_reachability to:
- Accept the full API URL instead of just an IP address
- Use the complete URL (including protocol and port) when testing
The _find_routable_addr method now passes the full api_url to
_test_ip_reachability instead of just the hostname, ensuring the
port is included in reachability tests.
Assisted-By: Claude Sonnet 4.5
Change-Id: Ibb407255cfcd5cf9617f040338561fd494e8b41f
Signed-off-by: Riccardo Pittau <elfosardo@gmail.com>
2025-11-18 15:49:46 +01:00
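To illustrate why the port matters in 7d7735a216, here is a small sketch of a reachability probe that keeps the port from the API URL. It uses a plain TCP connection from the standard library, which is an assumption made for this example and may differ from how `_test_ip_reachability` actually probes; `endpoint_from_url` and `is_reachable` are hypothetical names.

```python
import socket
from urllib.parse import urlsplit

# Fallback ports when the URL does not carry an explicit one.
DEFAULT_PORTS = {'http': 80, 'https': 443}


def endpoint_from_url(api_url):
    """Split an API URL into (host, port), keeping a non-standard port."""
    parts = urlsplit(api_url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.hostname, port


def is_reachable(api_url, timeout=3.0):
    """Try a TCP connection to the host and port named by the full URL."""
    host, port = endpoint_from_url(api_url)
    if host is None or port is None:
        return False
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For `http://192.0.2.10:6385`, the probe targets port 6385 rather than a default port, which is the case the commit message describes.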
Riccardo Pittau
bae591a808 Fix RuntimeError when stopping heartbeater in rescue mode
In rescue mode, the agent attempts to stop the heartbeater thread
even though it was never started, causing a RuntimeError. This fix
adds checks to ensure the heartbeater thread is alive before
attempting to stop it.
Assisted-By: Claude Sonnet 4.5
Change-Id: I3e97b10f2c7f3c454f0db2a3c3c8efb61ffeda5a
Signed-off-by: Riccardo Pittau <elfosardo@gmail.com>
2025-11-13 13:26:08 +01:00
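The guard described in bae591a808 can be illustrated with a plain `threading.Thread`. The `Heartbeater` class below is a simplified stand-in written for this sketch, not the agent's class: joining a thread that was never started raises `RuntimeError`, so the stop path checks `is_alive()` first.

```python
import threading


class Heartbeater(threading.Thread):
    """Simplified stand-in for a heartbeat thread."""

    def __init__(self):
        super().__init__(daemon=True)
        self.stop_event = threading.Event()

    def run(self):
        # A real heartbeater would send heartbeats until asked to stop.
        self.stop_event.wait()

    def stop(self):
        self.stop_event.set()
        # join() raises RuntimeError if the thread was never started.
        self.join()


def stop_heartbeater(heartbeater):
    """Stop the heartbeater only if it is actually running."""
    if heartbeater.is_alive():
        heartbeater.stop()
        return True
    return False
```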
Riccardo Pittau
2c6cf7cf1f Test advertised ip reachability before assigning it
The advertised IP for the Ironic API is only checked for routability,
but it could still be unreachable. We need to check actual
connectivity before assigning it.
Assisted-By: Claude Sonnet 4
Change-Id: I0adca5ad00ba419a7e2aa6883b3690b4507c25e5
Signed-off-by: Riccardo Pittau <elfosardo@gmail.com>
2025-11-10 16:17:30 +01:00
Morten Stephansen
487f069ee6 Implement functionality for the is_root_volume RAID config
The is_root_volume config option has been listed in the documentation
for a while, but has not been supported by the IPA.
With this patch, if there is a logical disk in the target_raid_config
with the setting is_root_volume: True, it will be picked up as the root
device (root_device hints will not even be checked). Additionally, if a
volume has is_root_volume: False, it will be excluded from the list of
possible root devices used by the root_device hints.
Depends-On: https://review.opendev.org/c/openstack/ironic-python-agent/+/965797
Change-Id: If195b8f2c471cd7cf3f690664c7f13b6cef10ce2
Signed-off-by: Jakub Jelinek <jakub.jelinek@cern.ch>
Signed-off-by: Morten Stephansen <morten.kaastrup.stephansen@cern.ch>
2025-11-04 13:11:31 +00:00
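The selection rule from 487f069ee6 can be sketched as below. This is a hedged illustration only: it assumes logical disks are identified by their `volume_name`, and the helper name `split_root_candidates` is hypothetical. How the agent maps a logical disk to an actual block device is not shown.

```python
def split_root_candidates(logical_disks):
    """Return (forced_root, excluded) from a target_raid_config list.

    forced_root: volume name of the first disk with is_root_volume True,
                 or None if no disk sets it.
    excluded:    volume names with is_root_volume False, which should be
                 dropped before root_device hints are evaluated.
    """
    forced_root = None
    excluded = set()
    for disk in logical_disks:
        flag = disk.get('is_root_volume')
        name = disk.get('volume_name')
        if flag is True and forced_root is None:
            forced_root = name
        elif flag is False:
            excluded.add(name)
    return forced_root, excluded
```

When `forced_root` is set, the root device hints would be bypassed entirely; otherwise the hints run against the remaining, non-excluded volumes.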
Morten Stephansen
bb4b4fdb38 Fix for matching hints with lists of strings
Added logic for matching hints with lists of WWN/Serial. These lists
appear when both lsblk and udev are used to fetch the information about
a device. One consequence of this is that it allows a device on the
skip list to be used as root device, thus overwriting the protected
data. This has previously been handled before matching the hints,
e.g. the removed section in hardware.py. This patch aims to fix the
problem globally by handling the issue inside the find_devices_by_hints
function.
Closes-bug: #2130410
Change-Id: I28129f2ededb37474025f35164d5dc9ece21ec8e
Signed-off-by: Morten Stephansen <morten.kaastrup.stephansen@cern.ch>
Signed-off-by: Jakub Jelinek <jakub.jelinek@cern.ch>
2025-11-03 16:37:56 +00:00
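As a rough sketch of the idea in bb4b4fdb38, a device attribute such as WWN or serial may be either a single string or a list of strings, and a hint should match if it equals any of the reported values. The helper below is hypothetical and deliberately ignores the operator syntax that real root device hints support; it only shows the list handling.

```python
def hint_matches(device_value, hint_value):
    """Return True if hint_value equals any value reported for the device."""
    if isinstance(device_value, (list, tuple, set)):
        values = device_value
    else:
        values = [device_value]
    # Compare case-insensitively, ignoring surrounding whitespace.
    normalized = {str(v).strip().lower() for v in values if v is not None}
    return str(hint_value).strip().lower() in normalized
```

Handling the list case in one place means a skip-list entry and a root device hint are compared against the same set of values, regardless of whether lsblk or udev supplied them.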
6fdb13d6d4 reno: Update master for unmaintained/2024.1
Update the 2024.1 release notes configuration to build from
unmaintained/2024.1.
Change-Id: I10313828ec74436cb1cc1111f40b719e77b0767d
Signed-off-by: OpenStack Release Bot <infra-root@openstack.org>
Generated-By: openstack/project-config:roles/copy-release-tools-scripts/files/release-tools/change_reno_branch_to_unmaintained.sh
2025-10-31 12:04:46 +00:00
Zuul
f0888131c1 Merge "Fix skip block devices for RAID arrays" 2025-09-30 09:17:30 +00:00
Zuul
f770d6004e Merge "Fix erasable devices check" 2025-09-29 23:16:39 +00:00
kubajj
d43913453b Fix skip block devices for RAID arrays
The original implementation of the skip block devices for RAID arrays:
https://review.opendev.org/c/openstack/ironic-python-agent/+/852999
introduced several bugs that went uncaught:
1. Key error when a holder disk contains just logical disks on the skip list.
2. RAID arrays on skip list throw "Failed to remove partitions" because they are not removed from the list of remaining RAID devices when running wipefs.
3. list_block_devices_check_skip_list does not match volume names to RAID arrays
4. MD superblock wrongly checked (detail instead of examine)
5. Partition tables are being created when a partition is on a skip list
6. EFI partition handling in a scenario when a partition on the same physical disk is not deleted
Closes-bug: #2080871
Signed-off-by: Jakub Jelinek <jakub.jelinek@cern.ch>
Signed-off-by: Morten Stephansen <morten.kaastrup.stephansen@cern.ch>
Change-Id: I59b65c6b69af2385ed8a5dcd427e4d9c91f90abe
2025-09-26 12:17:55 +00:00
Riccardo Pittau
26b7d6f300 Remove support for Python 3.9
Support for Python 3.9 has been removed. Now Python 3.10 is the minimum
version supported.
Change-Id: If1066d05e905ef4c639278d0457177875e97871d
Signed-off-by: Riccardo Pittau <elfosardo@gmail.com>
2025-09-24 09:27:53 +02:00
Jakub Jelinek
f14c187a64 Fix erasable devices check
There is a conditional which is supposed to check whether there are
any erasable devices. However, in the current state, the conditional
is wrong as the call is missing the node as a parameter.
Signed-off-by: Jakub Jelinek <jakub.jelinek@cern.ch>
Signed-off-by: Morten Stephansen <morten.kaastrup.stephansen@cern.ch>
Change-Id: I38768b9ba3dc1bb5160e5841865450a8d7df5466
2025-09-17 14:22:52 +00:00
ae3dda4e5b Update master for stable/2025.2
Add file to the reno documentation build to show release notes for
stable/2025.2.
Use pbr instruction to increment the minor version number
automatically so that master versions are higher than the versions on
stable/2025.2.
Sem-Ver: feature
Change-Id: Ia5e895659db836f43c214568bc249bcb529ed079
Signed-off-by: OpenStack Release Bot <infra-root@openstack.org>
Generated-By: openstack/project-config:roles/copy-release-tools-scripts/files/release-tools/add_release_note_page.sh
2025-09-12 08:46:44 +00:00
Zuul
ffafce66ca Merge "Support transport type as a root device hint" 2025-09-03 22:24:16 +00:00
Zuul
f46f56decc Merge "Hard stop on image download duration threshold" 2025-07-31 20:03:08 +00:00
Kaifeng Wang
2e4172a024 Support transport type as a root device hint
Adds a tran field to the block device and allows using it
as a root device hint.
Change-Id: I3fc83730a6100abb2b2aa98fc894713ecbbe3043
Closes-Bug: #2100951
Signed-off-by: Kaifeng Wang <kaifeng.w@gmail.com>
2025-07-24 16:36:19 +08:00
Zuul
5a96e0a937 Merge "Vendor own option for tls cert file and key file" 2025-07-23 19:02:17 +00:00
Afonne-CID
e1a31eb97a Hard stop on image download duration threshold
Adds a wall timeout `image_download_max_timeout` to enforce an upper
bound on total download duration.
While the per-chunk timeout protects against stalled reads, downloads
that trickle in just under the timeout threshold (e.g., due to heavy
TCP retransmits) can hang for longer than intended.
Now, if the total allowed time is exceeded, the download is aborted with
a non-retryable `ImageDownloadTimeoutError` regardless of per-chunk
retry or connection success.
A value of 0 (the default) disables this feature.
Closes-Bug: #2115995
Change-Id: I3b56d21abae0488853bfed14072ba21116d47baf
Signed-off-by: Afonne-CID <afonnepaulc@gmail.com>
2025-07-21 22:56:05 +01:00
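The wall-clock limit from e1a31eb97a can be illustrated with a generator that wraps a chunk iterator. This is a sketch under assumptions: `ImageDownloadTimeoutError` is defined locally as a stand-in for the agent's exception, `read_chunks` and its `clock` parameter are hypothetical, and the check runs once per received chunk.

```python
import time


class ImageDownloadTimeoutError(Exception):
    """Local stand-in for the non-retryable timeout error."""


def read_chunks(chunks, max_timeout=0, clock=time.monotonic):
    """Yield chunks, aborting once the total elapsed time exceeds the limit.

    A max_timeout of 0 disables the check, mirroring the default of
    image_download_max_timeout.
    """
    start = clock()
    for chunk in chunks:
        if max_timeout and clock() - start > max_timeout:
            raise ImageDownloadTimeoutError(
                'download exceeded %s seconds' % max_timeout)
        yield chunk
```

Because the limit applies to the whole download, a stream that keeps each chunk just under the per-chunk timeout is still cut off once the total budget is spent.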
Zuul
c3ef9a563d Merge "Fix software RAID creation on different physical devices" 2025-07-15 18:29:00 +00:00
Takashi Kajinami
a2739f7e56 Vendor own option for tls cert file and key file
... instead of using oslo.service. Current usage of oslo.service is
too limited to add the dependency, because
 - oslo.service registers multiple options but only two of these are
 used
 - the wrap implementation from oslo.service is not actually used
Change-Id: I4e8f18951d73e329a54cf6546344c5704fe4aa90
Signed-off-by: Takashi Kajinami <kajinamit@oss.nttdata.com>
2025-07-05 22:07:34 +09:00
Dmitry Tantsur
9db3cd1e4d Graceful way for hardware managers to ignore certain devices
My use case for this feature is to exclude network devices that use
the cdc_ether driver. These USB network interfaces often cause all sorts
of issues. For example, some models have the same hardcoded MAC address,
which breaks inspection.
Currently, to exclude a certain device, a hardware manager must override
the entire listing function (in my case, list_interfaces). Not only is
it tedious, but it also requires constantly updating the hardware
managers to match the implementation in GenericHardware. Realistically,
it will cause hardware manager authors to inherit GenericHardware, which
is the opposite of how hardware managers should be written.
Note that the node-level skip list only affects root device selection
and cleaning for block devices. This feature affects everything that
uses list_block_devices and is applied before the node-level skip list.
This change adds a new hardware manager call filter_device. For each
network, block or USB device, it allows a hardware manager to do one
of four things:
1. Delegate the decision to a lower level hardware manager by raising
 IncompatibleHardwareMethodError
2. Remove the device by returning None
3. Change the device by returning a modified instance
4. Return the device unchanged to keep it in the listing.
Note that I'm removing debug logging when IncompatibleHardwareMethodError
is raised. Not only is the log message incorrect (the error does not
necessarily mean that the method is not implemented at all), it already
takes up noticeable space in the logs, and with this change it would
become very noisy.
Change-Id: I5437343af6c6157882bcf0600dd89bd20478c948
Signed-off-by: Dmitry Tantsur <dtantsur@protonmail.com>
2025-07-04 16:31:02 +02:00
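The four outcomes listed in 9db3cd1e4d can be sketched with a toy dispatcher. Everything here is illustrative: the exception class is a local stand-in, devices are plain dicts, the `filter_device(device)` signature is assumed, and the rule that the first manager to make a decision wins is an assumption of this sketch rather than documented behavior.

```python
class IncompatibleHardwareMethodError(Exception):
    """Local stand-in: the manager delegates the decision."""


class SkipCdcEther:
    """Hypothetical manager that drops cdc_ether network devices."""

    def filter_device(self, device):
        if device.get('driver') == 'cdc_ether':
            return None  # remove the device from the listing
        raise IncompatibleHardwareMethodError()  # delegate to the next manager


class KeepEverything:
    """Hypothetical lowest-level manager that keeps devices unchanged."""

    def filter_device(self, device):
        return device


def apply_filters(managers, devices):
    """Run each device through the managers, highest priority first."""
    kept = []
    for device in devices:
        for manager in managers:
            try:
                device = manager.filter_device(device)
            except IncompatibleHardwareMethodError:
                continue  # this manager has no opinion
            break  # a decision was made: removed, changed or kept
        if device is not None:
            kept.append(device)
    return kept
```

A manager like `SkipCdcEther` only needs to express its one rule and delegate everything else, instead of overriding a whole listing function such as `list_interfaces`.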
Dmitry Tantsur
9426df9ab3 Split hardware manager initialize out of evaluate_hardware_support
The current code in GenericHardware.evaluate_hardware_support ends up
using hardware manager calls, which then use a partly initialized
hardware manager list and can even cause a recursion.
This change introduces a new optional call initialize() which is
guaranteed to run:
1) After all hardware managers have been evaluated
2) After the hardware manager cache is populated
3) In the order of the support level of hardware managers
Change-Id: I068d3d73483c161062aa3b48f3154a2d99941382
Signed-off-by: Dmitry Tantsur <dtantsur@protonmail.com>
2025-07-04 16:30:40 +02:00
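The ordering guarantees in 9426df9ab3 can be pictured as a two-phase load. The sketch below is an assumption-heavy illustration: `load_managers` is a hypothetical helper, a support level of 0 is treated as "unsupported", and "in the order of the support level" is read as highest level first.

```python
def load_managers(candidates):
    """Evaluate all managers first, then initialize them in priority order."""
    # Phase 1: evaluate every manager without calling into the others.
    evaluated = [(m.evaluate_hardware_support(), m) for m in candidates]
    supported = [(level, m) for level, m in evaluated if level > 0]
    supported.sort(key=lambda pair: pair[0], reverse=True)

    # The cache is complete before any initialize() runs.
    cache = [m for _, m in supported]

    # Phase 2: initialize() is optional, so only call it where defined.
    for manager in cache:
        init = getattr(manager, 'initialize', None)
        if callable(init):
            init()
    return cache
```

Splitting the phases means code in `initialize()` can safely dispatch to other hardware managers, since the full list already exists by then.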
Dmitry Tantsur
521811cbcc Fix software RAID creation on different physical devices
When creating multiple software RAID logical disks that use different
sets of physical devices, the partition indices were incorrectly shared
across all devices. This caused the second RAID array creation to fail
because it tried to use partition indices that didn't exist on those
specific devices.
This change fixes the issue by tracking partition indices separately for
each physical device, ensuring that each device's partitions are numbered
correctly starting from their first available index.
Closes-Bug: #2115211
Change-Id: I440db4654f3d1d54274d1eee8c4b21c2b0a18d22
Signed-off-by: Mohammed Naser <mnaser@vexxhost.com>
2025-06-25 16:15:14 +00:00
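The fix in 521811cbcc amounts to keeping one partition counter per physical device instead of one shared counter. A minimal sketch, assuming a hypothetical `assign_partitions` helper, a `physical_disks` list of device paths per logical disk, and simple `<device><index>` partition naming (which does not cover devices that need a `p` separator):

```python
from collections import defaultdict


def assign_partitions(logical_disks, first_index=1):
    """Return, per logical disk, the partition names to use on each device."""
    # One counter per physical device, each starting at first_index.
    next_index = defaultdict(lambda: first_index)
    plan = []
    for disk in logical_disks:
        partitions = []
        for device in disk['physical_disks']:
            partitions.append('%s%d' % (device, next_index[device]))
            next_index[device] += 1
        plan.append(partitions)
    return plan
```

With a shared counter, the second array in the test case below would have been asked to use partition 2 on devices that only have partition 1.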
Zuul
b51cc75ff3 Merge "netutils: Use ethtool ioctl to get permanent mac address" 2025-05-07 21:53:20 +00:00
Nicolas Belouin
48422a532f netutils: Use ethtool ioctl to get permanent mac address
Fetching the permanent MAC address of the interface instead of the
default one makes it possible to get the right one in case it was
changed during setup (likely with a bonding setup).
In order to fetch the permanent MAC address of a given interface, one
can either use Netlink (either rtnetlink or ethtool), or use ethtool
ioctl.
The use of ioctl feels simpler and requires no additional dependency.
The implementation falls back to older behavior should an error occur.
Closes-Bug: #2103450
Change-Id: I54151990e396ddcf775128ca24d3db08e45c256d
Signed-off-by: Nicolas Belouin <nicolas.belouin@suse.com>
2025-04-25 12:06:29 +02:00
cid
c03021fee2 Remove eventlet from Ironic Python Agent
This change removes several usages of eventlet from IPA:
- Upgrades all requirements on oslo library versions to new ones that
 support non-eventlet use.
- Removes use of the eventlet wsgi server (via oslo_service.wsgi) and
 replaces it with the cheroot wsgi server.
- Removes explicit patching of python modules with eventlet
Note that some oslo libraries still use ``eventlet`` to detect and
work around its use. This means that it is still installed in
environments alongside IPA, even if it's not used or patched into any
modules.
Depends-On: https://review.opendev.org/c/openstack/requirements/+/947727
Change-Id: I9accab2d5e9529a88ef5d3db85e76901f14114eb
2025-04-23 11:01:10 -07:00
Zuul
53349cc7cf Merge "Remove agent_token_required upgrade knob" 2025-04-08 20:38:18 +00:00
ac85195b7a Update master for stable/2025.1
Add file to the reno documentation build to show release notes for
stable/2025.1.
Use pbr instruction to increment the minor version number
automatically so that master versions are higher than the versions on
stable/2025.1.
Sem-Ver: feature
Change-Id: I259249774c39e95b214e77b2ae632c7278e78754
2025-03-18 17:14:28 +00:00
Julia Kreger
94fde4b3b4 Remove agent_token_required upgrade knob
To help ease upgrades to Victoria, IPA had a knob added
to enable operators to express if agent tokens were required
in their deployment. Since then, the feature has become required; however,
we left in place the logic enabling the fun upgrade case handling.
At this point, this knob serves no further use, and can be removed.
Change-Id: I202f06e1b6598a802c9853fb99201c55e7a40cb1
2025-02-18 14:36:18 +00:00
Julia Kreger
a6ca65201a Lockout agent command results if a token is received
This is a second attempt at securing the get command output endpoint,
which can return data such as logs that may contain sensitive details
after the agent has completed one or more actions.
Now, if a token is received, the agent locks out the command results
endpoint, and requires all future calls to include it.
This allows for the agent to be backwards compatible.
Special thanks go to cid for his first attempt at this, which I took
for the basis of some of the testing required.
Closes-Bug: #2086866
Co-Authored-By: cid@gr-oss.io
Change-Id: Ia39a3894ef5efaffd7e1d22cc6244059a32175ff
2025-02-18 06:32:48 -08:00
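The lockout behavior in a6ca65201a can be sketched as a small state machine. This is an illustration under assumptions, not the agent's API: `CommandResultsGate` and `allow` are hypothetical names, and the sketch assumes that only a valid token triggers the lockout.

```python
import hmac


class CommandResultsGate:
    """Allow anonymous reads until a valid token is first presented."""

    def __init__(self, agent_token):
        self._agent_token = agent_token
        self.locked = False

    def allow(self, presented=None):
        """Decide whether a request for command results may proceed."""
        if presented is not None:
            # Constant-time comparison of the presented token.
            valid = hmac.compare_digest(presented, self._agent_token)
            if valid:
                # From now on every caller must send the token.
                self.locked = True
            return valid
        # No token: fine for older callers, unless already locked out.
        return not self.locked
```

An older Ironic that never sends a token keeps working, while a newer one locks the endpoint with its first authenticated call. That is the backwards compatibility the commit message refers to.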
Zuul
8ab0bfbd9b Merge "Revert "Add token validation to GET command endpoints"" 2025-02-17 18:35:53 +00:00
Dmitry Tantsur
3968715908 Revert "Add token validation to GET command endpoints"
This reverts commit 6f860995c6.
Reason for revert: the change has broken virtually everyone who
has not updated Ironic before IPA. To make matters worse, the
attached release note is not descriptive and does not explain
the upgrade impact.
The reverted change should be reworked to allow a grace period.
Change-Id: I2a2a03dd8409af900b938494ceafd45a89e0c197
2025-02-17 13:40:19 +00:00
Zuul
3261052f5d Merge "follow-up: update release note for bootable container work" 2025-02-14 22:46:58 +00:00
Zuul
2e9964e126 Merge "Add token validation to GET command endpoints" 2025-02-14 22:46:56 +00:00
cid
a42980a016 Ensure IPA is locked down in rescue mode
Securely handle state transition by locking down IPA at the final
stage of rescue operation to prevent restarts on tenant networks.
Closes-Bug: #2086865
Change-Id: I8e1be8da93a8c3fdf3cff7ad386c702d970d15f1
2025-02-14 18:18:50 +01:00
cid
6f860995c6 Add token validation to GET command endpoints
Currently, we only validate authentication tokens for POST but not
for GET requests, which could mean anyone can retrieve command results
without authentication. This change adds that validation uniformly
across all command-related endpoints.
Closes-Bug: #2086866
Depends-On: https://review.opendev.org/c/openstack/ironic/+/941607
Change-Id: Ib7f58b1694273beeb25314984c6e049376244d86
2025-02-13 23:28:56 +00:00
Julia Kreger
c8763bba06 follow-up: update release note for bootable container work
Updates the release note for the bootable container work to
clarify the existence of the configuration option which can
be utilized to disable bootable container deployments in the
ramdisk.
Change-Id: I5b269947884c015db38cf98ac782472a62858455
2025-02-12 06:39:47 -08:00
Zuul
a6d1921056 Merge "Bootable container support" 2025-02-10 19:26:34 +00:00
Julia Kreger
1508cc4cd0 Bootable container support
Adds support for bootable containers to be deployed by the agent.
Related: https://review.opendev.org/c/openstack/ironic/+/937897
Change-Id: I66cb37d117d2afc335f015fb1fc31bdbd5c3cee5
2025-02-07 15:59:48 -08:00
Kaifeng Wang
96bf1ef012 Collect bus and driver for interfaces
It's useful to have the PCI bus address and driver collected; the operator
can use the information to configure portgroup in a consistent way.
Change-Id: I432bca881ad881bae6d5e67c9b6fb52fe55b4e1e
2025-02-01 15:22:26 +08:00
Zuul
0c35e7e2da Merge "Add support for burnin-gpu" 2025-01-29 19:20:10 +00:00
kubajj
018a5f6253 Fix errors in the function erase_devices_express
Prevents the UnboundLocalError in erase_devices_express clean step.
Closes-Bug: #2095499
Change-Id: I01ce5005a62638ff960d2a75f225f882b2d56973
2025-01-22 14:17:30 +00:00
Zuul
ca07e941cf Merge "Add a release note for 939340" 2025-01-17 19:40:39 +00:00
cid
c222626b01 Treat 'No space left on device' error as fatal
Fail without retries when Errno 28 - "No space left
on device" error is encountered.
Closes-Bug: #2094854
Change-Id: Ie84b422916ddc02f2474164fe3da083324ef4824
2025-01-17 11:13:01 +01:00
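The behavior in c222626b01 can be sketched as a retry loop that treats `errno.ENOSPC` (Errno 28) as fatal. The `with_retries` helper is hypothetical and much simpler than a real download loop; it only shows that a full disk is re-raised immediately while other `OSError` cases are retried.

```python
import errno


def with_retries(operation, attempts=3):
    """Call operation(), retrying on OSError except for ENOSPC.

    attempts must be at least 1.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except OSError as exc:
            if exc.errno == errno.ENOSPC:
                # Retrying cannot help when the device is full.
                raise
            last_error = exc
    raise last_error
```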
kubajj
2ece938671 Add a release note for 939340
Follow-up to 939340 to add a release note about the bug-fix.
Change-Id: I202f22d40776ab5d3245b8e14021d1404a9f478d
2025-01-16 09:34:08 +00:00
cid
dfcb86d738 Add support for burnin-gpu
Adds support for running burnin tests on GPUs
using gpu-burn[1]. Also refactors stress-ng code
to be a bit cleaner.
Requires gpu-burn to be pre-installed within the IPA.
[1] https://github.com/wilicc/gpu-burn
Co-Authored-By: Scott Solkhon <scottsolkhon@gmail.com>
Closes-Bug: #2069085
Change-Id: I8f8cace6ebc2b7f1c245c82a64609cdfc1c492f9
2025-01-03 17:59:31 +00:00
Zuul
06077cb88e Merge "Inventoried MAC address for only ipv6 addresses" 2024-12-04 19:09:09 +00:00