d298e06b4996f9efe41cee541e34d21a789d014b
2483 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
Zuul
|
d298e06b49 | Merge "[codespell] Fix spelling issues in IPA" | ||
|
Zuul
|
f1a4aeb29a | Merge "Update to latest pep8/code style versions" | ||
|
Zuul
|
7422a27de4 | Merge "Reformat and update the section on injecting root credentials" | ||
|
Jay Faulkner
|
dcaed43ef9 |
Update to latest pep8/code style versions
Update various linting programs to their latest version, and fix any issues created by the update. Change-Id: I014c846560663a76a1663b568ef48659d0ab6d4d |
||
|
Jay Faulkner
|
36e5993a04 |
[codespell] Fix spelling issues in IPA
This fixes several spelling issues identified by codespell. In some cases, I may have manually modified a line to make the output more clear or to correct grammatical issues which were obvious in the codespell output. Later changes in this chain will provide the codespell config used to generate this, as well as adding this commit's SHA, once landed, to a .git-blame-ignore-revs file to ensure it will not pollute git histories for modern clients. Related-Bug: 2047654 Change-Id: I240cf8484865c9b748ceb51f3c7b9fd973cb5ada |
||
|
Iury Gregory Melo Ferreira
|
03b6b0a4ab |
Fix inspector retries to not take a long time
Since we moved to exponential wait, the time needed to run unit tests increased. The max time to wait is now configurable. - before: Ran: 33 tests in 22.6581 sec. - after: Ran: 33 tests in 4.0256 sec. Change-Id: Ibdcfebacad0489d17183e43ceb0d603fce67e72b |
||
|
Dmitry Tantsur
|
91b7ae96c9 |
Reformat and update the section on injecting root credentials
Change-Id: I49ad9979daad11bf7a54069564c6b7919de0ea7c |
||
|
Zuul
|
3a757f721f | Merge "docs: improve rootpwd password generation command" | ||
|
Dmitry Tantsur
|
2ab8364649 |
Add a jitter to heartbeat retries
Currently, if heartbeat fails, we reschedule it after 5 seconds. This is fine for the first retry, but it can cause a thundering herd problem when a lot of nodes fail to heartbeat at once. This change adds jitter to the minimum wait of 5 seconds. The jitter is not applied for forced heartbeats: they still have a minimum wait of exactly 5 seconds from the last heartbeat. The code is re-ordered to move the interval calculation to one place. Bonus: correctly logging the next interval. The unit tests have been rewritten to test the heartbeat process step by step and not rely on the exact sequence of the calls. Closes-Bug: #2038438 Change-Id: I4c4207b15fb3d48b55e340b7b3b54af833f92cb5 |
||
|
Zuul
|
62041d6d9e | Merge "Fix referencing to the raid_device var which is not set" | ||
|
Iury Gregory Melo Ferreira
|
801da9ec1f |
Retry in ProxyError during post inspector data
* ProxyError is derived from ConnectionError, but it's necessary to check the Response object to identify it. - Added ProxyError in retry_if_exception_type - Updated _post_to_inspector to properly handle ProxyError - Updated the wait to use wait_exponential instead of wait_fixed. Closes-Bug: 2045429 Change-Id: Iefe3fe581cd4e7c91a0da708e6f6d0fdaacab6fe |
||
|
Zuul
|
beccfe8c92 | Merge "Revert "Fix vmedia network config drive handling"" | ||
|
Dmitry Tantsur
|
c57deb7e76 |
Revert "Fix vmedia network config drive handling"
This reverts commit
|
||
|
Maryna Savchenko
|
f80330839d |
Fix referencing to the raid_device var which is not set
Change-Id: I11180e5d61d893a78583ace555f6e90ba8845950 |
||
|
Zuul
|
61d17e2225 | Merge "Parse efibootmgr type and details" | ||
|
Zuul
|
eea9917023 | Merge "Fix vmedia network config drive handling" | ||
|
Steve Baker
|
352df0bc54 |
Parse efibootmgr type and details
This change improves the regex to match an exact entry name, and to also match with the entry type from a set of recognised types. The boot entry details start from the recognised type onwards. This can be used by a step which deletes all entries of type 'HW' and UsbClass. Related-Bug: #2041901 Change-Id: I5d879f724efc2919b541fd3fef0f931df67ff9c7 |
||
|
Zuul
|
768aa17442 | Merge "Add mlnx deploy_step entry to enable deploy time firmware" 9.8.0 | ||
|
Zuul
|
7a4114512c | Merge "Handle different device outputs for multipath" | ||
|
Zuul
|
9f9940efdc | Merge "Test coverage for efi_utils.get_boot_record" | ||
|
Iury Gregory Melo Ferreira
|
0a29206b8d |
Handle different device outputs for multipath
In some cases the output of the multipath can differ and we would return a wrong parent device. Closes-Bug: 2043992 Change-Id: I848d7df798cc736bd5a55eed8fa46110caea1dc3 |
||
|
Michal Nasiadka
|
c23c913fc2 |
docs: improve rootpwd password generation command
Currently the command is a bit misleading because you need to escape dollar ($) signs. Command copied from DIB dynamic-login element docs [1]. [1]: https://docs.openstack.org/diskimage-builder/latest/elements/dynamic-login/README.html Change-Id: I7d5dc60aec373372f8faae4242a79f18d8a26d14 |
||
|
Adam Rozman
|
7a52314695 |
fix multipathd error handling release notes
This commit: - fixes some "multipathd error handling improvement" release notes - fixes a related comment in the code Related launchpad issue https://bugs.launchpad.net/ironic-python-agent/+bug/2031092 Change-Id: Ie3ba0601fa117b053cb8db6284e47249ca9c9134 Signed-off-by: Adam Rozman <adam.rozman@est.tech> |
||
|
Zuul
|
845df338f8 | Merge "improve multipathd error handling" | ||
|
Julia Kreger
|
33f01fa3c2 |
Fix vmedia network config drive handling
When performing DHCP-less deployments, the agent can start and discover more than one configuration drive present on a host. For example, a host was previously deployed using Ironic, and is now being re-deployed again. If Glean was present in the ramdisk, the glean-early.sh would end up mounting the folder based upon label. If cloud-init is somehow still in the ramdisk, the other folder could somehow get mounted. This patch, which is intended to be backportable, causes the agent to unmount any configuration drive folders, mount the most likely candidate based upon device type, partition, and overall state of the machine, and then utilize that configuration, if present, to re-configure and reload networking. Thus allowing dhcp-less re-deployments to be fixed without forcing any breaking changes. It should also be noted that this fix was generated in concert with an additional tempest test case, because this overall failure case needed to be reproduced to ensure we had a workable non-breaking path forward. Closes-Bug: 2032377 Change-Id: I9a3b3dbb9ca98771ce2decf893eba7a4c1890eee |
||
|
Zuul
|
9d9568ba23 | Merge "Get numa_node info when collecting pci devices info" | ||
|
Steve Baker
|
26be55f763 |
Test coverage for efi_utils.get_boot_record
A step will be developed to delete all EFI entries of type HD. As part of this get_boot_record will need to parse more of the output of `efibootmgr -v`. This change asserts the existing behaviour of get_boot_record, and the test can evolve with the changes in get_boot_record. Related-Bug: #2041901 Change-Id: I0c5ac4adc1044c528c27a4eaf580c619ceef47e0 |
||
|
Jay Faulkner
|
3d42298619 |
Remove standby.cache_image support
Image caching was never fully supported in Ironic or IPA; this is vestigial code left over from a partial implementation. Even if we implemented it today, we'd likely use a completely different methodology. Change-Id: Id4ab7b3c4f106b209585dbd090cdcb229b1daa73 |
||
|
Zhou Ya
|
76ad06225a |
Get numa_node info when collecting pci devices info
IPA now includes information about numa node id when collecting information about PCI devices. Closes-bug: #1622940 Co-Authored-By: Jay Faulkner <jay@jvf.cc> Change-Id: I70b0cb3eff66d67bb8168982acbbf335de0599cd |
||
|
Adam Rozman
|
13537db293 |
improve multipathd error handling
This commit: - Adds the ability to ignore an inconsequential OS error caused by starting the multipathd service when an instance of the service is already running. Related launchpad issue https://bugs.launchpad.net/ironic-python-agent/+bug/2031092 Change-Id: Iebf486915bfdc2546451e6b38a450b4c241e43a8 |
||
|
Zuul
|
b42f0be422 | Merge "implement basic-auth support for user-image download process" | ||
|
Boushra Bettir
|
dbf3e5408d |
Replace shlex module with helper function
Used helper function, `parse_device_tags` from ironic_lib instead of the shlex module for their identical functionality. Updated mock_execute.side_effect for lsblk compatibility in utils.execute. Closes-Bug: #2037572 Change-Id: I6600e054f9644c67ab003f0e0f6c380b5c217223 |
||
|
Julia Kreger
|
cb61a8d6c0 |
Retry on checksum failures
HTTP is a fun protocol. Size is basically optional. And clients implicitly trust the server and socket has transferred all the bytes. Which *really* means you should always checksum. But... previously we didn't checksum as part of retrying. So if anything happened with python-requests, or lower level library code or the system itself causing bytes to be lost off the buffer, creating an incomplete transfer situation, then we wouldn't know until the checksum. So now, we checksum and re-trigger the download if there is a failure of the checksum. This involved a minor shift in the download logic, and resulted in a needful minor fix to an image checksum test as it would loop for 90 seconds as well. Closes-Bug: 2038934 Change-Id: I543a60555a2621b49dd7b6564bd0654a46db2e9a |
||
|
Adam Rozman
|
70961789a6 |
implement basic-auth support for user-image download process
This feature was proposed in https://bugs.launchpad.net/ironic-python-agent/+bug/2021947 Change-Id: I9dbfc1402240beb75b6736214753fd86dccae676 |
||
|
Zuul
|
89be7bd420 | Merge "Conditional creation of RAIDed ESP for UEFI Software RAID" | ||
|
Boushra Bettir
|
25704d2555 |
Add additional mock tests to unit tests for read only devices.
Change ordering to ensure mock tests work correctly. Closes-Bug: #2037690 Change-Id: Ie9b884e58e4677a47e57c3ad39cadd65db8eec75 |
||
|
Zuul
|
23c8427224 | Merge "Extend the lookup timeout to 600 seconds" | ||
| db9545eeec |
Update master for stable/2023.2
Add file to the reno documentation build to show release notes for stable/2023.2. Use pbr instruction to increment the minor version number automatically so that master versions are higher than the versions on stable/2023.2. Sem-Ver: feature Change-Id: I8150eb8f35a444ef5a2bc7a648ec301e5094e52d |
|||
|
Zuul
|
73b76da5fe | Merge "Add get_service_steps logic to the agent" 9.7.0 | ||
|
Zuul
|
1581f91826 | Merge "preserve/handle config drives on 4k block devices" | ||
|
Julia Kreger
|
f86975d53c |
Add mlnx deploy_step entry to enable deploy time firmware
Follow-up from service steps addition change to add a deploy steps alias for the Nvidia Mellanox network device firmware update clean steps. This allows deploy time firmware updates to be codified as part of a deployment with custom steps. Change-Id: I9d80447dee7cfde4d3f8d81d9d39e738916b7824 |
||
|
Julia Kreger
|
eb95273ffb |
Add get_service_steps logic to the agent
Initial code patches for service steps have merged in ironic, and it is now time to add support into the agent which allows service steps to be raised to the service. Updates the default hardware manager version to 1.2, which has *rarely* been incremented due to oversight. Change-Id: Iabd2c6c551389ec3c24e94b71245b1250345f7a7 |
||
|
Zuul
|
667abee812 | Merge "Use sparkingly new metalsmith cs9 job" | ||
|
Zuul
|
5a3c8bd138 | Merge "tox: Remove basepython" | ||
|
Julia Kreger
|
4efcce5310 |
Extend the lookup timeout to 600 seconds
Changes the default lookup timeout to be 600 seconds which reduces the risk of lookup failing as a write operation to the backing database is performed upon lookup thanks to generation of an agent token. Overall, this is fairly harmless since by default ramdisks restart the agent if they were not able to successfully start. Change-Id: I35c64c0b4f9b3b607df1bc0c4c2a852aa3595cbd |
||
|
Julia Kreger
|
b6c263a5dc |
preserve/handle config drives on 4k block devices
When an underlying block device (or driver) only supports 4KB IO, this can cause some issues with aspects like using an ISO9660 filesystem which can only support a maximum of 2KB IO. The agent will now attempt to mount the filesystem *before* deleting the supplied file, and should that fail it will mount the configuration drive file from the ramdisk utilizing a loopback, and then extract the contents of the ramdisk into a newly created VFAT filesystem which supports 4KB block IO. Closes-Bug: #2028002 Change-Id: I336acb8e8eb5a02dde2f5e24c258e23797d200ee |
||
|
Riccardo Pittau
|
51f2115c56 |
Use sparkingly new metalsmith cs9 job
Instead of the old dusty cs8 one. Depends-On: I56a0473ecbff8ab8fc143954d3c493037765cdf1 Change-Id: I7bf9cbff9d10299c1a6b9b19fddd8124c1b185ba |
||
|
Julia Kreger
|
5ed520df89 |
Handle the node being locked
If the node is locked, a lookup cannot be performed when an agent token needs to be generated, which tends to error like this: ironic_python_agent.ironic_api_client [-] Failed looking up node with addresses '00:6f:bb:34:b3:4d,00:6f:bb:34:b3:4b' at https://172.22.0.2:6385. Error 409: Node c25e451b-d2fb-4168-b690-f15bc8365520 is locked by host 172.22.0.2, please retry after the current operation is completed.. Check if inspection has completed. Problem is, if we keep pounding on the door, we can actually worsen the situation, and previously we would just let tenacity retry. We will now hold for 30 seconds before proceeding, so we have hopefully allowed the operation to complete. Also fixes the error logging to help humans' sanity. Change-Id: I97d3e27e2adb731794a7746737d3788c6e7977a0 |
||
|
Arne Wiebalck
|
286d66709a |
Conditional creation of RAIDed ESP for UEFI Software RAID
Rebuilding an instance on RAIDed ESPs will fail due to sgdisk running against a non-clean disk and bailing out. Check if there is a RAIDed ESP already and skip creation if it exists. Change-Id: I13617ae77515a9d34bc4bb3caf9fae73d5e4e578 |
||
|
Julia Kreger
|
b68a4c8a92 |
minor: fix release notes file path
Change-Id: I458d88bf14b55253179488cb771ae42e7b8c84d7 9.6.0 |
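
The retry behaviour described in commit 2ab8364649 ("Add a jitter to heartbeat retries") can be illustrated with a minimal sketch. It is not the agent's actual code: the function name `next_heartbeat_delay`, the `jitter_fraction` parameter, and the choice of a uniform distribution are assumptions made for illustration. Only the 5-second minimum wait and the rule that forced heartbeats get no jitter come from the commit message.

```python
import random

# Minimum wait between heartbeat attempts, per the commit message.
MIN_WAIT_SECONDS = 5.0


def next_heartbeat_delay(forced, jitter_fraction=0.5):
    """Return the delay in seconds before the next heartbeat attempt.

    Hypothetical helper: the name, signature, and jitter_fraction
    default are assumptions, not taken from ironic-python-agent.
    """
    if forced:
        # Forced heartbeats keep the exact minimum wait, with no jitter.
        return MIN_WAIT_SECONDS
    # Spread retries over [MIN_WAIT, MIN_WAIT * (1 + jitter_fraction)]
    # so that many nodes failing at once do not retry in lockstep.
    return MIN_WAIT_SECONDS * (1.0 + random.uniform(0.0, jitter_fraction))
```

With many nodes failing at the same moment, each one picks a slightly different retry time, which is what softens the thundering herd the commit message mentions.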