2c6cf7cf1f181d60a8abac1b6e910c516dfa73b6
132 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
Marek Skrobacki
|
82fef2db31 |
fix: docs(troubleshooting) update deprecated password hash algorithm
MD5 password hashes are no longer supported on modern Debian-based systems, as PAM removed MD5 support starting with Debian 11 (2021) [1]. When MD5 is used, no error is shown, but authentication to IPA silently fails. This change updates the documentation to use the more modern SHA-512 algorithm. [1]: https://metadata.ftp-master.debian.org/changelogs/main/p/pam/unstable_changelog Change-Id: I398b771c28ed65e71b279d48cb504afb8383e525 Signed-off-by: Marek Skrobacki <skrobul@skrobul.com> |
||
|
kubajj
|
d43913453b |
Fix skip block devices for RAID arrays
The original implementation of the skip block devices for RAID arrays: https://review.opendev.org/c/openstack/ironic-python-agent/+/852999 introduced a couple bugs which were uncaught: 1. Key error when a holder disk contains just logical disks on the skip list. 2. RAID arrays on skip list throw "Failed to remove partitions" because they are not removed from the list of remaining RAID devices when running wipefs 3. list_block_devices_check_skip_list does not match volume names to RAID arrays 4. MD superblock wrongly checked (detail instead of examine) 5. Partition tables are being created when a partition is on a skip list 6. EFI partition handling in a scenario when a partition on the same physical disk is not deleted Closes-bug: #2080871 Signed-off-by: Jakub Jelinek <jakub.jelinek@cern.ch> Signed-off-by: Morten Stephansen <morten.kaastrup.stephansen@cern.ch> Change-Id: I59b65c6b69af2385ed8a5dcd427e4d9c91f90abe |
||
|
Afonne-CID
|
b4ae46d24a |
doc: How hardware managers ignore certain devices
Overriding or implementing `filter_device` activates filtering. Otherwise, GenericHardwareManager returns the device unchanged, effectively skipping filtering. Related-Bug: #2117234 Change-Id: Ifdda007e0c5001ab7df38c2c510e3f40c110d03c Signed-off-by: Afonne-CID <afonnepaulc@gmail.com> |
||
|
Zuul
|
943cb8afff | Merge "Graceful way for hardware managers to ignore certain devices" | ||
|
Julia Kreger
|
f9ae319fdd |
docs: remove tinyipa references
We have started down the path of eradicating tinyipa usage and testing because it was bifurcating our contributor resources and focus now that we have also been able to fix some of the CI jobs to be a bit more scalable. This does mean we're doing more with dib based images and they are larger, but we're willing to pay that tax as a project for consistency and CI job stability. Change-Id: I8f96d106a85f6ab4493785e88955196da08af8e9 Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com> |
||
|
Dmitry Tantsur
|
9db3cd1e4d |
Graceful way for hardware managers to ignore certain devices
My use case for this feature is to exclude network devices that use the cdc_ether driver. These USB network interfaces often cause all sorts of issues. For example, some models have the same hardcoded MAC address, which breaks inspection. Currently, to exclude a certain device, a hardware manager must override the entire listing function (in my case, list_interfaces). Not only is it tedious, but it also requires constantly updating the hardware managers to match the implementation in GenericHardware. Realistically, it will cause hardware manager authors to inherit GenericHardware, which is the opposite of how hardware managers should be written. Note that the node-level skip list only affects root device selection and cleaning for block devices. This feature affects everything that uses list_block_devices and is applied before the node-level skip list. This change adds a new hardware manager call filter_device. For each network, block or USB device, it allows a hardware manager to do one of four things: 1. Delegate the decision to a lower level hardware manager by raising IncompatibleHardwareMethodError 2. Remove the device by returning None 3. Change the device by returning a modified instance 4. Return the device unchanged to keep it in the listing. Note that I'm removing debug logging when IncompatibleHardwareMethodError is raised. Not only is the log message incorrect (the error does not necessarily mean that the method is not implemented at all), it already takes up noticeable space in the logs, and with this change it would become very noisy. Change-Id: I5437343af6c6157882bcf0600dd89bd20478c948 Signed-off-by: Dmitry Tantsur <dtantsur@protonmail.com> |
||
|
cid
|
91f520356d |
Doc: Fix incorrect function in example code
The referenced bug looks mostly fixed. This patch is basically closing it. Closes-Bug: #2039072 Change-Id: I22b80f2c995c365e9f19c3a06c80656cb6ce8922 |
||
|
Doug Goldstein
|
fbb12a2f22 |
fix sphinx errors with incorrect backticks
In these cases two backticks must be used instead of one. Change-Id: I85b00742a06ad1137a2d8f761432af97338995bb Signed-off-by: Doug Goldstein <cardoe@cardoe.com> |
||
|
Jay Faulkner
|
75abdb4148 |
Vendor metrics library from Ironic-Lib & deprecate
We are phasing out use of ironic-lib, and as such are removing the metrics module from it. However, due to its requirement of having a statsd instance on the same subnet as the agent and there being no support for prometheus exporting of metrics from IPA, these metrics are no longer valuable (in the agent). We are vendoring the module for the deprecation in order to facilitate its removal from ironic-lib. Change-Id: Ie50e078bc3f78d65cfa53680dc4116d1119ce155 |
||
|
Jay Faulkner
|
b173ce9202 |
[doc] Clarify Step return values
Clarifying what we require for a return value in a cleaning step; basically not much. Change-Id: I28c26d5b2d32d7af8d97900eb029741c8dbb166f |
||
|
Doug Goldstein
|
4cea26f185 |
update dynamic-login to mention the sshkey option
The docs mentioned using the SSH key option but didn't say what it was. Added it and reflowed the section to make it more clear that the options are one or the other and the steps that need to happen. Change-Id: I8663379d51e5e946915cb9236ccbccb26660bcc4 |
||
|
Takashi Kajinami
|
cb58c31c84 |
Remove old excludes
These are detected as errors since the clean up was done[1] in the requirements repository. [1] 314734e938f107cbd5ebcc7af4d9167c11347406 Also remove the note about old pip's behavior because the resolver in recent pip no longer requires specific order. Change-Id: If927d65ff67527cab349e5d5249aa97ef5b0aca4 |
||
|
Zuul
|
b6075156b3 | Merge "USB device discovery" | ||
|
Damien Rannou
|
3fd68c0848 |
USB device discovery
The idea is to retrieve USB device information via 'lshw' and return the list to ironic in order to be able to create introspection rules based on USB devices. Change-Id: I39d60cb467614fca7a7f701dbe576154213580a5 |
||
|
Zuul
|
6d35c1e949 | Merge "Make inspection URL optional if the collectors are provided" | ||
|
Zuul
|
1e107bd625 | Merge "Add support for reporting CPU socket number" | ||
|
Kaifeng Wang
|
9cafe76225 |
Add support for reporting CPU socket number
IPA reports a few CPU fields including cores, arch, flags etc. Some users want to use the number of physical sockets in a bare metal node, since cores are just a logical representation of the compute resource. The socket number is more suitable for quota control in some use cases. Change-Id: I94be86d6b12a3a7e7ca1041d948427a073412a31 |
||
|
Dmitry Tantsur
|
6cd36a750f |
Make inspection URL optional if the collectors are provided
With the new in-band inspection, we can derive the callback URL from the Ironic URL, there is no need to duplicate it. This change uses the presence of collectors as a sign to run inspection. The previous approach of setting an inspection URL, with or without explicitly setting collectors, still works for compatibility with ironic-inspector. Change-Id: Ie4279ee6d2995c9686f1dcdef1d6e5dc1dd20871 |
||
|
Zuul
|
d298e06b49 | Merge "[codespell] Fix spelling issues in IPA" | ||
|
Jay Faulkner
|
36e5993a04 |
[codespell] Fix spelling issues in IPA
This fixes several spelling issues identified by codespell. In some cases, I may have manually modified a line to make the output more clear or to correct grammatical issues which were obvious in the codespell output. Later changes in this chain will provide the codespell config used to generate this, as well as adding this commit's SHA, once landed, to a .git-blame-ignore-revs file to ensure it will not pollute git histories for modern clients. Related-Bug: 2047654 Change-Id: I240cf8484865c9b748ceb51f3c7b9fd973cb5ada |
||
|
Dmitry Tantsur
|
91b7ae96c9 |
Reformat and update the section on injecting root credentials
Change-Id: I49ad9979daad11bf7a54069564c6b7919de0ea7c |
||
|
Michal Nasiadka
|
c23c913fc2 |
docs: improve rootpwd password generation command
Currently the command is a bit misleading because you need to escape dollar ($) signs. Command copied from DIB dynamic-login element docs [1]. [1]: https://docs.openstack.org/diskimage-builder/latest/elements/dynamic-login/README.html Change-Id: I7d5dc60aec373372f8faae4242a79f18d8a26d14 |
||
|
Julia Kreger
|
eb95273ffb |
Add get_service_steps logic to the agent
Initial code patches for service steps have merged in ironic, and it is now time to add support into the agent which allows service steps to be raised to the service. Updates the default hardware manager version to 1.2, which has *rarely* been incremented due to oversight. Change-Id: Iabd2c6c551389ec3c24e94b71245b1250345f7a7 |
||
|
Zuul
|
4845fd04ba | Merge "Follow-up Add documentation for MellanoxDeviceHardwareManager" | ||
|
waleedm
|
406c844aac |
Follow-up Add documentation for MellanoxDeviceHardwareManager
Add a follow-up documentation for "update NVIDIA NIC firmware images and settings by ironic-python-agent" Icfaffd7c58c3c73c3fa28cfc2a6c954d2c93c16e Change-Id: I481cdd622f360cbba3312c6f3d4af45383bb7e1b |
||
|
Jay Faulkner
|
6098747ec5 |
Ironic (and IPA) use launchpad now
Correct links to point to launchpad bug tracker, correct docs config Change-Id: I5d46af2a9d94f3b2e05e4f937e0619a89fe04d4c |
||
|
Dmitry Tantsur
|
9ed232e77e |
Add network interface speed to the inventory
This is another fact that Metal3's baremetal-operator is currently consuming from extra-hardware. Change-Id: I2ec9d5e9369f5508e7583a4e13c2083f5c8b28ba |
||
|
Zuul
|
f37ea85a27 | Merge "Disable MD5 image checksums" | ||
|
Dmitry Tantsur
|
3e05a03f7c |
Deprecate LLDP in inventory in favour of a new collector
Binary LLDP data is bloating inventory causing us to disable its collection by default. For other similar low-level information, such as PCI devices or DMI data, we already use inspection collectors instead. Now that the inventory format is shared with out-of-band inspection, having LLDP there makes even less sense. This change adds a new collector ``lldp`` to replace the now-deprecated inventory field. Change-Id: I56be06a7d1db28407e1128c198c12bea0809d3a3 |
||
|
Julia Kreger
|
32df26a22a |
Disable MD5 image checksums
MD5 image checksums have long been superseded by the use of the ``os_hash_algo`` and ``os_hash_value`` fields as part of the properties of an image. In the process of doing this, we determined that checksum via URL usage was non-trivial and determined that an appropriate path was to allow the checksum type to be determined as needed. Change-Id: I26ba8f8c37d663096f558e83028ff463d31bd4e6 |
||
|
Dmitry Tantsur
|
0304c73c0e |
Report system firmware information in the inventory
Change-Id: I5b6ceb9cdcf4baa97a6f0482d1030d14f3f2ecff |
||
|
Dmitry Tantsur
|
2ddb693491 |
Trivial: formatting issue in the inventory docs
Double ticks don't work if followed by a symbol without space. Change-Id: Ia455650b5e601dadb2b0ab91f71e1d9286d26071 |
||
|
liuyuanfeng
|
1846d6f776 |
modify error word node
Change-Id: Ie5c9fa7489eb891ef1bbe57c7d51ecb64e1c0db8 |
||
|
Jakub Jelinek
|
a99bf274e4 |
SoftwareRAID: Enable skipping RAIDS
Extend the ability to skip disks to RAID devices. This allows users to specify the volume name of a logical device in the skip list, which is then not cleaned or created again during the create/apply configuration phase. The volume name can be specified in target raid config provided the change https://review.opendev.org/c/openstack/ironic-python-agent/+/853182/ passes. Story: 2010233 Change-Id: Ib9290a97519bc48e585e1bafb0b60cc14e621e0f |
||
|
niuke
|
4bf88b204f |
remove unicode prefix from code
Change-Id: I70f0112f1ee3066ffd9316d10b84b9ea5b7fc306 |
||
|
Jakub Jelinek
|
1ac61e1dbd |
Improve function list_block_devices_check_skip_list
Fix minor issues suggested by dtantsur. Add an example of skip list specification to the documentation. A follow-up patch to I3bdad3cca8acb3e0a69ebb218216e8c8419e9d65. Change-Id: Ic94a33b7bc0572a1cc8f92b330474ec63a173e81 |
||
|
Jakub Jelinek
|
0212337bd5 |
Enable skipping disks for cleaning
Introduce a field skip_block_devices in properties; this is a list of dictionaries. Create a helper function list_block_devices_check_skip_list. Update tests of erase_devices_express to use node when calling _list_erasable_devices. Add tests covering various options of the skip list definition. Use the helper function in get_os_install_device when node is cached. Story: 2009914 Change-Id: I3bdad3cca8acb3e0a69ebb218216e8c8419e9d65 |
||
|
Julia Kreger
|
beb7484858 |
Guard shared device/cluster filesystems
Certain filesystems are sometimes used in specialty computing environments where a shared storage infrastructure or fabric exists. These filesystems allow for multi-host shared concurrent read/write access to the underlying block device by *not* locking the entire device for exclusive use. Generally ranges of the disk are reserved for each interacting node to write to, and locking schemes are used to prevent collisions. These filesystems are common for use cases where high availability is required or ability for individual computers to collaborate on a given workload is critical, such as a group of hypervisors supporting virtual machines because it can allow for nearly seamless transfer of workload from one machine to another. Similar technologies are also used for cluster quorum and cluster durable state sharing, however that is not specifically considered in scope. Things get difficult because the entire device is not exclusively locked with the storage fabrics, and in some cases locking is handled by a Distributed Lock Manager on the network, or via special sector interactions amongst the cluster members which understand and support the filesystem. As a result of this IO/Interaction model, an Ironic-Python-Agent performing cleaning can effectively destroy the cluster just by attempting to clean storage which it perceives as attached locally. This is not IPA's fault; often this case occurs when a Storage Administrator forgot to update LUN masking or volume settings on a SAN as it relates to an individual host in the overall computing environment. The net result of one node cleaning the shared volume may include restoration from snapshot, backup storage, or may ultimately cause permanent data loss, depending on the environment and the usage of that environment. Included in this patch: - IBM GPFS - Can be used on a shared block device... apparently according to IBM's documentation. The standard use of GPFS is more Ceph like in design... however GPFS is also a specially licensed commercial offering, so it is a red flag if this is encountered, and should be investigated by the environment's systems operator. - Red Hat GFS2 - Is used with shared common block devices in clusters. - VMware VMFS - Is used with shared SAN block devices, as well as local block devices. With shared block devices, ranges of the disk are locked instead of the whole disk, and the ranges are mapped to virtual machine disk interfaces. It is unknown, due to lack of information, if this will detect and prevent erasure of VMFS logical extent volumes. Co-Authored-by: Jay Faulkner <jay@jvf.cc> Change-Id: Ic8cade008577516e696893fdbdabf70999c06a5b Story: 2009978 Task: 44985 |
||
|
waleedm
|
eb07839bd4 |
Fix passing kwargs in clean steps
Pass kwargs to dispatch_to_managers method in execute_clean_step Change-Id: Ida4ed4646659b2ee3f8f92b0a4d73c0266dd5a99 Story: 2010123 Task: 45705 |
||
|
Arne Wiebalck
|
cacdd9bab3 |
Burn-in: Add network step
Add a clean step for network burn-in via fio. Get basic run parameters from the node's driver_info. Story: #2007523 Task: #42385 Change-Id: I2861696740b2de9ec38f7e9fc2c5e448c009d0bf |
||
|
Arne Wiebalck
|
20c5894bc2 |
Burn-in: Add disk step
Add a clean step for disk burn-in via fio. Get basic run parameters from the node's driver_info. Story: #2007523 Task: #42384 Change-Id: I5f5e336bd629846b3d779fd0fc7a2060b385b035 |
||
|
Arne Wiebalck
|
5c222560f0 |
Burn-in: Add memory step
Add a clean step for memory burn-in via stress-ng. Get basic run parameters from the node's driver_info. Story: #2007523 Task: #42383 Change-Id: I33a83968c9f87cf795ec7ec922bce98b52c5181c |
||
|
Arne Wiebalck
|
6702fcaa43 |
Burn-in: Add CPU step
Add a clean step for CPU burn-in via stress-ng. Get basic run parameters from the node's driver_info. Story: #2007523 Task: #42382 Change-Id: I14fd4164991fb94263757244f716b6bfe8edf875 |
||
|
Jay Faulkner
|
de726d4acf |
Do not permit IPA standalone to be enabled by conf
IPA standalone mode is a developer-only option, and if enabled accidentally on a production agent could cause undesired behavior. Developers who need this behavior should build a purpose-built agent, with standalone hardcoded to True in cmd/agent.py. Change-Id: Icc67dbe15acbbf6fee886f274d2169a0769a5053 |
||
|
Mohammed Naser
|
2220aaae57 |
Added comment about IPA logs being uploaded to Ironic
Change-Id: I983ad3bd6fff539e877844e54788f63689ce8a84 |
||
|
Dmitry Tantsur
|
59cb08fd28 |
New deploy step for injecting arbitrary files
This change adds a deploy step inject_files that adds a flexible way to inject files into the instance. Change-Id: I0e70a2cbc13744195c9493a48662e465ec010dbe Story: #2008611 Task: #41794 |
||
|
Zuul
|
4762aca077 | Merge "Add clean step 'erase_pstore'" | ||
|
Arne Wiebalck
|
92e26b01e9 |
Add clean step 'erase_pstore'
Add an automatic clean step to clean the Linux kernel's pstore. The step is disabled by default. Story: #2008317 Task: #41214 Change-Id: Ie1a42dfff4c7e1c7abeaf39feca956bb9e2ea497 |
||
|
Vladyslav Drok
|
c7858d3cc8 |
Add UUID to BlockDevice object
It'd allow for example custom ansible playbooks to use UUIDs of the introspected node's disks. In future it might also enable agent to use UUID (or by_path value) to refer to a device instead of name, as it happens currently. Change-Id: Id00437d2295c39fb12f3c25a92b30b56a58eef13 |
||
|
Dmitry Tantsur
|
565d596dae |
Document ramdisk TLS and update existing TLS docs
Story: #2007214 Task: #40945 Change-Id: I1a930a0e52ab860edcd597df4d95a4e4eb51da96 |