
132 Commits

Author SHA1 Message Date
Marek Skrobacki
82fef2db31 fix: docs(troubleshooting) update deprecated password hash algorithm
MD5 password hashes are no longer supported on modern Debian-based systems,
as PAM removed MD5 support starting with Debian 11 (2021) [1].
When MD5 is used, no error is shown, but authentication to IPA silently fails.
This change updates the documentation to use the more modern SHA-512 algorithm.
[1]: https://metadata.ftp-master.debian.org/changelogs/main/p/pam/unstable_changelog
Change-Id: I398b771c28ed65e71b279d48cb504afb8383e525
Signed-off-by: Marek Skrobacki <skrobul@skrobul.com>
2025-10-07 14:39:27 +01:00
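To make the MD5 versus SHA-512 distinction above concrete, here is a minimal sketch that classifies a crypt(3)-style password hash by its prefix. The helper names (`crypt_scheme`, `is_deprecated`) are hypothetical and are not part of the agent or its documentation; only the `$1$`, `$5$` and `$6$` prefixes are standard.

```python
# Prefixes used by common crypt(3) password hash schemes.
CRYPT_PREFIXES = {
    "$1$": "md5crypt",
    "$5$": "sha256crypt",
    "$6$": "sha512crypt",
}


def crypt_scheme(password_hash: str) -> str:
    """Return the scheme name for a crypt(3)-style hash, or "unknown"."""
    for prefix, name in CRYPT_PREFIXES.items():
        if password_hash.startswith(prefix):
            return name
    return "unknown"


def is_deprecated(password_hash: str) -> bool:
    """Hypothetical check: flag MD5-based hashes, which the commit moves away from."""
    return crypt_scheme(password_hash) == "md5crypt"
```

A check like this could be run against a configured hash before building an image, since the failure described above is silent.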
kubajj
d43913453b Fix skip block devices for RAID arrays
The original implementation of the skip block devices for RAID arrays:
https://review.opendev.org/c/openstack/ironic-python-agent/+/852999
introduced several bugs that went uncaught:
1. A key error occurs when a holder disk contains only logical disks that are on the skip list.
2. RAID arrays on the skip list fail with "Failed to remove partitions" because they are not removed from the list of remaining RAID devices when running wipefs.
3. list_block_devices_check_skip_list does not match volume names to RAID arrays.
4. The MD superblock is checked the wrong way (detail instead of examine).
5. Partition tables are created even when a partition is on the skip list.
6. EFI partition handling breaks in the scenario where a partition on the same physical disk is not deleted.
Closes-bug: #2080871
Signed-off-by: Jakub Jelinek <jakub.jelinek@cern.ch>
Signed-off-by: Morten Stephansen <morten.kaastrup.stephansen@cern.ch>
Change-Id: I59b65c6b69af2385ed8a5dcd427e4d9c91f90abe
2025-09-26 12:17:55 +00:00
Afonne-CID
b4ae46d24a doc: How hardware managers ignore certain devices
Overriding or implementing `filter_device` activates filtering.
Otherwise, GenericHardwareManager returns the device unchanged,
effectively skipping filtering.
Related-Bug: #2117234
Change-Id: Ifdda007e0c5001ab7df38c2c510e3f40c110d03c
Signed-off-by: Afonne-CID <afonnepaulc@gmail.com>
2025-07-21 14:47:11 +01:00
Zuul
943cb8afff Merge "Graceful way for hardware managers to ignore certain devices" 2025-07-11 17:45:25 +00:00
Julia Kreger
f9ae319fdd docs: remove tinyipa references
We have started down the path of eradicating tinyipa usage and testing
because it was bifurcating our contributor resources and focus now that
we have also been able to fix some of the CI jobs to be a bit more
scalable.
This does mean we're doing more with dib-based images and they are larger,
but we're willing to pay that tax as a project for consistency and
CI job stability.
Change-Id: I8f96d106a85f6ab4493785e88955196da08af8e9
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
2025-07-09 09:53:19 -07:00
Dmitry Tantsur
9db3cd1e4d Graceful way for hardware managers to ignore certain devices
My use case for this feature is to exclude network devices that use
the cdc_ether driver. These USB network interfaces often cause all sorts
of issues. For example, some models have the same hardcoded MAC address,
which breaks inspection.
Currently, to exclude a certain device, a hardware manager must override
the entire listing function (in my case, list_interfaces). Not only is
it tedious, but it also requires constantly updating the hardware
managers to match the implementation in GenericHardwareManager.
Realistically, it will cause hardware manager authors to inherit from
GenericHardwareManager, which is the opposite of how hardware managers
should be written.
Note that the node-level skip list only affects root device selection
and cleaning for block devices. This feature affects everything that
uses list_block_devices and is applied before the node-level skip list.
This change adds a new hardware manager call filter_device. For each
network, block or USB device, it allows a hardware manager to do one
of four things:
1. Delegate the decision to a lower level hardware manager by raising
 IncompatibleHardwareMethodError
2. Remove the device by returning None
3. Change the device by returning a modified instance
4. Return the device unchanged to keep it in the listing.
Note that I'm removing debug logging when IncompatibleHardwareMethodError
is raised. Not only is the log message incorrect (the error does not
necessarily mean that the method is not implemented at all), it already
takes up noticeable space in the logs, and with this change it would
become very noisy.
Change-Id: I5437343af6c6157882bcf0600dd89bd20478c948
Signed-off-by: Dmitry Tantsur <dtantsur@protonmail.com>
2025-07-04 16:31:02 +02:00
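The four outcomes of `filter_device` listed in commit 9db3cd1e4d can be sketched roughly as follows. This is a self-contained illustration, not the agent's code: `Interface`, `SkipCdcEtherManager`, `FallbackManager` and `apply_filters` are hypothetical stand-ins, the exception class is redefined locally so the example runs without ironic-python-agent installed, and the dispatch loop is a simplification of how managers are consulted in priority order.

```python
from dataclasses import dataclass, replace


class IncompatibleHardwareMethodError(Exception):
    """Local stand-in for the agent's error of the same name."""


@dataclass(frozen=True)
class Interface:
    """Hypothetical, simplified network device record."""
    name: str
    driver: str
    mac_address: str


class SkipCdcEtherManager:
    """Hypothetical manager that drops cdc_ether interfaces."""

    def filter_device(self, device):
        if not isinstance(device, Interface):
            # Outcome 1: delegate the decision to a lower-level manager.
            raise IncompatibleHardwareMethodError()
        if device.driver == "cdc_ether":
            # Outcome 2: remove the device from the listing.
            return None
        if device.mac_address != device.mac_address.lower():
            # Outcome 3: return a modified copy of the device.
            return replace(device, mac_address=device.mac_address.lower())
        # Outcome 4: keep the device unchanged.
        return device


class FallbackManager:
    """Stand-in for a generic manager that never filters anything."""

    def filter_device(self, device):
        return device


def apply_filters(devices, managers):
    """Ask managers in priority order; the first one that answers decides."""
    kept = []
    for device in devices:
        for manager in managers:
            try:
                result = manager.filter_device(device)
            except IncompatibleHardwareMethodError:
                continue  # this manager declined; try the next one
            if result is not None:
                kept.append(result)
            break
        else:
            # No manager answered at all: keep the device as it is.
            kept.append(device)
    return kept
```

With this shape, a vendor-specific manager only states its opinion about the devices it cares about and never has to reimplement the listing function itself.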
cid
91f520356d Doc: Fix incorrect function in example code
The referenced bug looks mostly fixed. This patch is basically
closing it.
Closes-Bug: #2039072
Change-Id: I22b80f2c995c365e9f19c3a06c80656cb6ce8922
2025-03-07 15:44:00 +01:00
Doug Goldstein
fbb12a2f22 fix sphinx errors with incorrect backticks
In these cases two backticks must be used instead of one.
Change-Id: I85b00742a06ad1137a2d8f761432af97338995bb
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
2025-01-24 23:07:51 -05:00
Jay Faulkner
75abdb4148 Vendor metrics library from Ironic-Lib & deprecate
We are phasing out use of ironic-lib, and as such are removing the
metrics module from it. However, due to its requirement of having
a statsd instance on the same subnet as the agent and there being no
support for Prometheus exporting of metrics from IPA, these metrics are
no longer valuable (in the agent).
We are vendoring the module for the deprecation in order to facilitate
its removal from ironic-lib.
Change-Id: Ie50e078bc3f78d65cfa53680dc4116d1119ce155
2024-11-04 20:02:11 +00:00
Jay Faulkner
b173ce9202 [doc] Clarify Step return values
Clarifying what we require for a return value in a cleaning step;
basically not much.
Change-Id: I28c26d5b2d32d7af8d97900eb029741c8dbb166f
2024-08-19 15:35:34 +00:00
Doug Goldstein
4cea26f185 update dynamic-login to mention the sshkey option
The docs mentioned using the SSH key option but didn't say what it was.
Added it and reflowed the section to make it more clear that the options
are one or the other and the steps that need to happen.
Change-Id: I8663379d51e5e946915cb9236ccbccb26660bcc4
2024-07-31 12:22:31 -05:00
Takashi Kajinami
cb58c31c84 Remove old excludes
These are detected as errors since the clean up was done[1] in
the requirements repository.
[1] 314734e938f107cbd5ebcc7af4d9167c11347406
Also remove the note about old pip's behavior because the resolver
in recent pip no longer requires specific order.
Change-Id: If927d65ff67527cab349e5d5249aa97ef5b0aca4
2024-04-30 22:46:45 +09:00
Zuul
b6075156b3 Merge "USB device discovery" 2024-03-28 21:22:53 +00:00
Damien Rannou
3fd68c0848 USB device discovery
The idea is to retrieve USB device information via 'lshw' and
return the list to ironic in order to be able to create introspection
rules based on USB devices.
Change-Id: I39d60cb467614fca7a7f701dbe576154213580a5
2024-02-19 14:49:52 +01:00
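As a rough illustration of the idea in commit 3fd68c0848, the sketch below walks an `lshw -json` style tree and collects the nodes that sit on a USB bus. The sample data is invented, the `find_usb_devices` helper is hypothetical, and selecting nodes by a `businfo` value starting with `usb@` is an assumption about how one might pick them out, not necessarily what the agent does.

```python
import json

# Invented sample shaped like the nested output of `lshw -json`.
SAMPLE_LSHW_JSON = """
{
  "id": "host", "class": "system",
  "children": [
    {"id": "core", "class": "bus",
     "children": [
       {"id": "usbhost", "class": "bus", "businfo": "usb@1",
        "product": "xHCI Host Controller", "vendor": "Linux",
        "children": [
          {"id": "usb", "class": "input", "businfo": "usb@1:2",
           "product": "Example Keyboard", "vendor": "Example Corp"}
        ]},
       {"id": "memory", "class": "memory"}
     ]}
  ]
}
"""


def find_usb_devices(node):
    """Recursively collect nodes whose bus info marks them as USB."""
    found = []
    if str(node.get("businfo", "")).startswith("usb@"):
        found.append({
            "product": node.get("product"),
            "vendor": node.get("vendor"),
            "handle": node.get("handle"),
        })
    for child in node.get("children", []):
        found.extend(find_usb_devices(child))
    return found


usb_devices = find_usb_devices(json.loads(SAMPLE_LSHW_JSON))
```

A flat list like `usb_devices` is the kind of structure that introspection rules could then match on.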
Zuul
6d35c1e949 Merge "Make inspection URL optional if the collectors are provided" 2024-02-07 23:06:34 +00:00
Zuul
1e107bd625 Merge "Add support for reporting CPU socket number" 2024-01-22 11:52:06 +00:00
Kaifeng Wang
9cafe76225 Add support for reporting CPU socket number
IPA reports a few cpu fields including cores, arch, flags etc.
Some users need the number of physical CPU sockets in a bare metal
node, since cores are just a logical representation of the
compute resource.
The socket number is more suitable for the quota control in some
use cases.
Change-Id: I94be86d6b12a3a7e7ca1041d948427a073412a31
2024-01-19 21:24:37 +00:00
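One way to obtain a socket count, shown here only as a sketch, is to parse the `Socket(s)` field from `lscpu` output. The sample text is invented, the helper names are hypothetical, and whether the agent reads the value this way is an assumption.

```python
# Invented sample in the "Key: value" layout that lscpu prints.
SAMPLE_LSCPU = """\
Architecture:        x86_64
CPU(s):              16
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           2
Model name:          Example CPU @ 2.00GHz
"""


def parse_lscpu(output: str) -> dict:
    """Turn "Key: value" lines into a dictionary of stripped strings."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def socket_count(output: str) -> int:
    """Return the number of physical sockets reported in the output."""
    return int(parse_lscpu(output)["Socket(s)"])
```

In this sample, 2 sockets with 4 cores each and 2 threads per core account for the 16 logical CPUs, which is why the socket count can be a better basis for quotas than the core count.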
Dmitry Tantsur
6cd36a750f Make inspection URL optional if the collectors are provided
With the new in-band inspection, we can derive the callback URL from
the Ironic URL, there is no need to duplicate it. This change uses
the presence of collectors as a sign to run inspection.
The previous approach of setting an inspection URL, with or without
explicitly setting collectors, still works for compatibility with
ironic-inspector.
Change-Id: Ie4279ee6d2995c9686f1dcdef1d6e5dc1dd20871
2024-01-10 08:55:42 +01:00
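The decision described in commit 6cd36a750f could be sketched like this. The function name, its parameters and the `CALLBACK_PATH` value are assumptions made for illustration; the real option names and endpoint should be taken from the agent and Ironic documentation.

```python
from typing import Optional, Sequence

# Assumed path of the inspection callback relative to the Ironic API URL.
CALLBACK_PATH = "/v1/continue_inspection"


def inspection_target(api_url: Optional[str],
                      inspection_callback_url: Optional[str] = None,
                      collectors: Optional[Sequence[str]] = None) -> Optional[str]:
    """Return the URL to send inspection data to, or None to skip inspection."""
    if inspection_callback_url:
        # Previous approach: an explicit URL, kept for compatibility.
        return inspection_callback_url
    if collectors and api_url:
        # New approach: collectors alone enable inspection and the
        # callback URL is derived from the Ironic URL.
        return api_url.rstrip("/") + CALLBACK_PATH
    return None
```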
Zuul
d298e06b49 Merge "[codespell] Fix spelling issues in IPA" 2024-01-08 17:22:02 +00:00
Jay Faulkner
36e5993a04 [codespell] Fix spelling issues in IPA
This fixes several spelling issues identified by codespell. In some
cases, I may have manually modified a line to make the output more clear
or to correct grammatical issues which were obvious in the codespell
output.
Later changes in this chain will provide the codespell config used to
generate this, as well as adding this commit's SHA, once landed, to a
.git-blame-ignore-revs file to ensure it will not pollute git history
for modern clients.
Related-Bug: 2047654
Change-Id: I240cf8484865c9b748ceb51f3c7b9fd973cb5ada
2023-12-28 10:54:46 -08:00
Dmitry Tantsur
91b7ae96c9 Reformat and update the section on injecting root credentials
Change-Id: I49ad9979daad11bf7a54069564c6b7919de0ea7c
2023-12-15 12:34:31 +01:00
Michal Nasiadka
c23c913fc2 docs: improve rootpwd password generation command
Currently the command is a bit misleading because you need to escape
dollar ($) signs.
Command copied from DIB dynamic-login element docs [1].
[1]: https://docs.openstack.org/diskimage-builder/latest/elements/dynamic-login/README.html
Change-Id: I7d5dc60aec373372f8faae4242a79f18d8a26d14
2023-11-15 07:52:08 +00:00
Julia Kreger
eb95273ffb Add get_service_steps logic to the agent
Initial code patches for service steps have merged in
ironic, and it is now time to add support into the
agent which allows service steps to be raised to
the service.
Updates the default hardware manager version to 1.2,
which has *rarely* been incremented due to oversight.
Change-Id: Iabd2c6c551389ec3c24e94b71245b1250345f7a7
2023-08-31 06:22:22 -07:00
Zuul
4845fd04ba Merge "Follow-up Add documentation for MellanoxDeviceHardwareManager" 2023-05-25 15:03:19 +00:00
waleedm
406c844aac Follow-up Add documentation for MellanoxDeviceHardwareManager
Add follow-up documentation for
"update NVIDIA NIC firmware images and settings by ironic-python-agent"
(Icfaffd7c58c3c73c3fa28cfc2a6c954d2c93c16e).
Change-Id: I481cdd622f360cbba3312c6f3d4af45383bb7e1b
2023-05-25 10:55:11 +00:00
Jay Faulkner
6098747ec5 Ironic (and IPA) use launchpad now
Correct links to point to launchpad bug tracker, correct docs config
Change-Id: I5d46af2a9d94f3b2e05e4f937e0619a89fe04d4c
2023-05-17 15:38:57 -07:00
Dmitry Tantsur
9ed232e77e Add network interface speed to the inventory
This is another fact that Metal3's baremetal-operator is currently
consuming from extra-hardware.
Change-Id: I2ec9d5e9369f5508e7583a4e13c2083f5c8b28ba
2023-05-03 12:20:35 +02:00
Zuul
f37ea85a27 Merge "Disable MD5 image checksums" 2023-05-02 06:41:25 +00:00
Dmitry Tantsur
3e05a03f7c Deprecate LLDP in inventory in favour of a new collector
Binary LLDP data is bloating inventory causing us to disable its collection
by default. For other similar low-level information, such as PCI devices
or DMI data, we already use inspection collectors instead. Now that the
inventory format is shared with out-of-band inspection, having LLDP
there makes even less sense.
This change adds a new collector ``lldp`` to replace the now-deprecated
inventory field.
Change-Id: I56be06a7d1db28407e1128c198c12bea0809d3a3
2023-04-26 19:33:51 +00:00
Julia Kreger
32df26a22a Disable MD5 image checksums
MD5 image checksums have long been superseded by the use of the
``os_hash_algo`` and ``os_hash_value`` fields as part of the
properties of an image.
In the process of doing this, we determined that checksum via
URL usage was non-trivial and determined that an appropriate
path was to allow the checksum type to be determined as needed.
Change-Id: I26ba8f8c37d663096f558e83028ff463d31bd4e6
2023-04-24 16:54:42 -07:00
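One plausible way to "determine the checksum type as needed", sketched under the assumption that the type is inferred from the length of the hex digest, is shown below. The function names and the `md5_enabled` flag are hypothetical; the agent's actual option name and logic may differ.

```python
import hashlib


def pick_algorithm(checksum: str, md5_enabled: bool = False) -> str:
    """Infer the hash algorithm from the hex digest length."""
    length = len(checksum)
    if length == 128:
        return "sha512"
    if length == 64:
        return "sha256"
    if length == 32:
        if md5_enabled:
            return "md5"
        raise ValueError("MD5 checksums are disabled")
    raise ValueError("unrecognized checksum length: %d" % length)


def verify(data: bytes, checksum: str, md5_enabled: bool = False) -> bool:
    """Hash the data with the inferred algorithm and compare digests."""
    algorithm = pick_algorithm(checksum, md5_enabled)
    return hashlib.new(algorithm, data).hexdigest() == checksum.lower()
```

Rejecting 32-character digests by default is what makes MD5 opt-in rather than silently accepted.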
Dmitry Tantsur
0304c73c0e Report system firmware information in the inventory
Change-Id: I5b6ceb9cdcf4baa97a6f0482d1030d14f3f2ecff
2023-03-31 14:28:32 +02:00
Dmitry Tantsur
2ddb693491 Trivial: formatting issue in the inventory docs
Double ticks don't work if followed by a symbol without space.
Change-Id: Ia455650b5e601dadb2b0ab91f71e1d9286d26071
2023-03-30 13:33:39 +02:00
liuyuanfeng
1846d6f776 modify error word node
Change-Id: Ie5c9fa7489eb891ef1bbe57c7d51ecb64e1c0db8
2022-12-30 01:18:36 -08:00
Jakub Jelinek
a99bf274e4 SoftwareRAID: Enable skipping RAIDS
Extend the ability to skip disks to RAID devices.
This allows users to specify the volume name of
a logical device in the skip list, which is then not cleaned
or created again during the create/apply configuration phase.
The volume name can be specified in the target RAID config, provided
the change https://review.opendev.org/c/openstack/ironic-python-agent/+/853182/
passes.
Story: 2010233
Change-Id: Ib9290a97519bc48e585e1bafb0b60cc14e621e0f
2022-09-05 20:43:51 +00:00
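The effect of naming a RAID volume in the skip list can be sketched as a split of the target RAID config into logical disks to leave alone and logical disks to create. The helper name and the sample values are hypothetical; only the `volume_name` key and the idea of matching it against the skip list come from the commit above.

```python
def split_logical_disks(target_raid_config: dict, skip_list: list):
    """Return (kept, to_create) logical disks based on skipped volume names."""
    skipped_names = {
        entry["volume_name"] for entry in skip_list if "volume_name" in entry
    }
    kept, to_create = [], []
    for disk in target_raid_config.get("logical_disks", []):
        if disk.get("volume_name") in skipped_names:
            kept.append(disk)  # neither cleaned nor created again
        else:
            to_create.append(disk)
    return kept, to_create


# Invented example values.
target_raid_config = {
    "logical_disks": [
        {"size_gb": 100, "raid_level": "1", "volume_name": "system"},
        {"size_gb": "MAX", "raid_level": "0", "volume_name": "data"},
    ]
}
skip_list = [{"volume_name": "data"}]
kept, to_create = split_logical_disks(target_raid_config, skip_list)
```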
niuke
4bf88b204f remove unicode prefix from code
Change-Id: I70f0112f1ee3066ffd9316d10b84b9ea5b7fc306
2022-08-23 19:44:10 +08:00
Jakub Jelinek
1ac61e1dbd Improve function list_block_devices_check_skip_list
Fix minor issues suggested by dtantsur
Add an example of skip list specification to the documentation
A follow-up patch to I3bdad3cca8acb3e0a69ebb218216e8c8419e9d65
Change-Id: Ic94a33b7bc0572a1cc8f92b330474ec63a173e81
2022-08-16 15:17:15 +00:00
Jakub Jelinek
0212337bd5 Enable skipping disks for cleaning
Introduce a field skip_block_devices in properties - this is a list of dictionaries
Create a helper function list_block_devices_check_skip_list
Update tests of erase_devices_express to use node when calling _list_erasable_devices
Add tests covering various options of the skip list definition
Use the helper function in get_os_install_device when node is cached
Story: 2009914
Change-Id: I3bdad3cca8acb3e0a69ebb218216e8c8419e9d65
2022-08-11 09:30:00 +00:00
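As a minimal sketch of how a `skip_block_devices` list of dictionaries might be applied, the example below drops every device that matches all key/value pairs of at least one skip entry. The exact-equality matching, the helper names and the sample devices are assumptions for illustration; the real helper, list_block_devices_check_skip_list, may match hints differently.

```python
def is_skipped(device: dict, skip_list: list) -> bool:
    """True if every key/value of at least one non-empty entry matches."""
    return any(
        bool(entry)
        and all(device.get(key) == value for key, value in entry.items())
        for entry in skip_list
    )


def devices_to_clean(devices: list, node: dict) -> list:
    """Filter out devices listed in the node's skip_block_devices property."""
    skip_list = node.get("properties", {}).get("skip_block_devices", [])
    return [dev for dev in devices if not is_skipped(dev, skip_list)]


# Invented example values.
node = {
    "properties": {
        "skip_block_devices": [{"name": "/dev/sdb"}, {"serial": "S123"}]
    }
}
devices = [
    {"name": "/dev/sda", "serial": "S001"},
    {"name": "/dev/sdb", "serial": "S002"},
    {"name": "/dev/sdc", "serial": "S123"},
]
remaining = devices_to_clean(devices, node)
```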
Julia Kreger
beb7484858 Guard shared device/cluster filesystems
Certain filesystems are sometimes used in specialty computing
environments where a shared storage infrastructure or fabric exists.
These filesystems allow for multi-host shared concurrent read/write
access to the underlying block device by *not* locking the entire
device for exclusive use. Generally ranges of the disk are reserved
for each interacting node to write to, and locking schemes are used
to prevent collisions.
These filesystems are common for use cases where high availability
is required or ability for individual computers to collaborate on a
given workload is critical, such as a group of hypervisors supporting
virtual machines because it can allow for nearly seamless transfer
of workload from one machine to another.
Similar technologies are also used for cluster quorum and cluster
durable state sharing, however that is not specifically considered
in scope.
Where things get difficult is that the entire device is not
exclusively locked by the storage fabric, and in some cases locking
is handled by a Distributed Lock Manager on the network, or via special
sector interactions amongst the cluster members which understand
and support the filesystem.
As a result of this IO/Interaction model, an Ironic-Python-Agent
performing cleaning can effectively destroy the cluster just by
attempting to clean storage which it perceives as attached locally.
This is not IPA's fault; often this case occurs when a Storage
Administrator forgot to update LUN masking or volume settings on
a SAN as it relates to an individual host in the overall
computing environment. The net result of one node cleaning the
shared volume may include restoration from snapshot, backup
storage, or may ultimately cause permanent data loss, depending
on the environment and the usage of that environment.
Included in this patch:
- IBM GPFS - Can be used on a shared block device... apparently according
  to IBM's documentation. The standard use of GPFS is more Ceph
  like in design... however GPFS is also a specially licensed
  commercial offering, so it is a red flag if this is encountered,
  and should be investigated by the environment's systems operator.
- Red Hat GFS2 - Is used with shared common block devices in clusters.
- VMware VMFS - Is used with shared SAN block devices, as well as
  local block devices. With shared block devices, ranges of the disk
  are locked instead of the whole disk, and the ranges are mapped to
  virtual machine disk interfaces.
  It is unknown, due to lack of information, if this will detect and
  prevent erasure of VMFS logical extent volumes.
Co-Authored-by: Jay Faulkner <jay@jvf.cc>
Change-Id: Ic8cade008577516e696893fdbdabf70999c06a5b
Story: 2009978
Task: 44985
2022-07-19 13:24:03 -07:00
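The guard described in commit beb7484858 can be pictured as a check that runs before any erase operation and refuses to continue when a shared or cluster filesystem is detected. Everything in this sketch is hypothetical: the exception name, the helper name and the filesystem type strings are illustrative values, and the agent's real detection logic is not reproduced here.

```python
# Illustrative identifiers only; the real detection may rely on other
# signatures (for example partition type IDs) than these type strings.
GUARDED_FS_TYPES = {
    "gpfs": "IBM GPFS",
    "gfs2": "Red Hat GFS2",
    "VMFS_volume_member": "VMware VMFS",
}


class ProtectedDeviceError(Exception):
    """Hypothetical error raised when a device must not be erased."""


def assert_safe_to_erase(device_name: str, detected_fs_types: list) -> None:
    """Raise if any detected filesystem type is on the guarded list."""
    for fs_type in detected_fs_types:
        label = GUARDED_FS_TYPES.get(fs_type)
        if label is not None:
            raise ProtectedDeviceError(
                "%s appears to hold a %s filesystem; refusing to erase it"
                % (device_name, label)
            )
```

Failing loudly here lets an operator check LUN masking before any shared data is touched.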
waleedm
eb07839bd4 Fix passing kwargs in clean steps
Pass kwargs to dispatch_to_managers method in execute_clean_step
Change-Id: Ida4ed4646659b2ee3f8f92b0a4d73c0266dd5a99
Story: 2010123
Task: 45705
2022-07-01 23:03:55 +00:00
Arne Wiebalck
cacdd9bab3 Burn-in: Add network step
Add a clean step for network burn-in via fio. Get basic
run parameters from the node's driver_info.
Story: #2007523
Task: #42385
Change-Id: I2861696740b2de9ec38f7e9fc2c5e448c009d0bf
2021-07-13 11:36:31 +02:00
Arne Wiebalck
20c5894bc2 Burn-in: Add disk step
Add a clean step for disk burn-in via fio. Get basic
run parameters from the node's driver_info.
Story: #2007523
Task: #42384
Change-Id: I5f5e336bd629846b3d779fd0fc7a2060b385b035
2021-05-21 16:33:11 +02:00
Arne Wiebalck
5c222560f0 Burn-in: Add memory step
Add a clean step for memory burn-in via stress-ng. Get basic
run parameters from the node's driver_info.
Story: #2007523
Task: #42383
Change-Id: I33a83968c9f87cf795ec7ec922bce98b52c5181c
2021-05-01 10:36:58 +02:00
Arne Wiebalck
6702fcaa43 Burn-in: Add CPU step
Add a clean step for CPU burn-in via stress-ng. Get basic
run parameters from the node's driver_info.
Story: #2007523
Task: #42382
Change-Id: I14fd4164991fb94263757244f716b6bfe8edf875
2021-05-01 10:36:20 +02:00
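The burn-in steps above all read their run parameters from the node's driver_info. A minimal sketch for the CPU case, which only builds the stress-ng command line without running it, might look like this. The driver_info key names and the defaults are assumptions made for the example; `--cpu`, `--timeout` and `--metrics-brief` are regular stress-ng options.

```python
def stress_ng_cpu_command(node: dict) -> list:
    """Build a stress-ng CPU burn-in command from the node's driver_info."""
    info = node.get("driver_info", {})
    # Assumed key names; 0 workers asks stress-ng to use all CPUs.
    workers = info.get("agent_burnin_cpu_cpu", 0)
    timeout = info.get("agent_burnin_cpu_timeout", 86400)
    return [
        "stress-ng",
        "--cpu", str(workers),
        "--timeout", str(timeout),
        "--metrics-brief",
    ]


# Invented example node.
example_node = {"driver_info": {"agent_burnin_cpu_timeout": 600}}
command = stress_ng_cpu_command(example_node)
```

The memory, disk and network steps follow the same pattern with their own tools (stress-ng or fio) and their own driver_info keys.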
Jay Faulkner
de726d4acf Do not permit IPA standalone to be enabled by conf
IPA standalone mode is a developer-only option, and if enabled
accidentally on a production agent could cause undesired behavior.
Developers who need this behavior should build a purpose-built agent,
with standalone hardcoded to True in cmd/agent.py.
Change-Id: Icc67dbe15acbbf6fee886f274d2169a0769a5053
2021-03-25 12:45:28 +01:00
Mohammed Naser
2220aaae57 Added comment about IPA logs being uploaded to Ironic
Change-Id: I983ad3bd6fff539e877844e54788f63689ce8a84
2021-03-01 11:40:51 -05:00
Dmitry Tantsur
59cb08fd28 New deploy step for injecting arbitrary files
This change adds a deploy step inject_files that adds a flexible
way to inject files into the instance.
Change-Id: I0e70a2cbc13744195c9493a48662e465ec010dbe
Story: #2008611
Task: #41794 
2021-02-16 16:56:52 +01:00
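To give a feel for what an argument to the inject_files step could look like, the sketch below builds a list of file entries with base64-encoded content. The entry keys (`path`, `content`, `mode`), the helper name and the sample file are assumptions for illustration and should be checked against the agent's documentation for the step.

```python
import base64


def file_entry(path: str, text: str, mode: int = 0o644) -> dict:
    """Build one hypothetical file entry with base64-encoded content."""
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return {"path": path, "content": encoded, "mode": mode}


# Invented example: a single message-of-the-day file.
files_argument = [file_entry("/etc/motd", "Deployed by Ironic\n")]
```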
Zuul
4762aca077 Merge "Add clean step 'erase_pstore'" 2020-11-18 17:38:00 +00:00
Arne Wiebalck
92e26b01e9 Add clean step 'erase_pstore'
Add an automatic clean step to clean the Linux kernel's pstore.
The step is disabled by default.
Story: #2008317
Task: #41214
Change-Id: Ie1a42dfff4c7e1c7abeaf39feca956bb9e2ea497
2020-11-17 18:00:16 +01:00
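Cleaning the kernel's pstore amounts to removing the entries the kernel exposes under its pstore mount point (commonly `/sys/fs/pstore`). The sketch below shows that idea against an arbitrary directory so it can be tried safely; the function name and its return value are hypothetical and do not mirror the agent's implementation.

```python
import os


def erase_pstore(pstore_dir: str) -> list:
    """Remove regular files in pstore_dir and return the names removed."""
    removed = []
    for name in sorted(os.listdir(pstore_dir)):
        path = os.path.join(pstore_dir, name)
        if os.path.isfile(path):
            os.unlink(path)
            removed.append(name)
    return removed
```

Pointing this at a temporary directory first is a cheap way to confirm the behaviour before touching a real pstore mount.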
Vladyslav Drok
c7858d3cc8 Add UUID to BlockDevice object
It'd allow for example custom ansible playbooks to use UUIDs of the
introspected node's disks. In future it might also enable agent
to use UUID (or by_path value) to refer to a device instead of
name, as it happens currently.
Change-Id: Id00437d2295c39fb12f3c25a92b30b56a58eef13
2020-11-11 17:25:59 +00:00
Dmitry Tantsur
565d596dae Document ramdisk TLS and update existing TLS docs
Story: #2007214
Task: #40945
Change-Id: I1a930a0e52ab860edcd597df4d95a4e4eb51da96
2020-09-23 15:07:49 +02:00