Re: watchdog: pcf2127: systemd fails on 5.11
From: Bruno Thomsen
Date: Wed Feb 24 2021 - 10:31:49 EST
On Mon, Feb 22, 2021 at 23:43, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> On Thu, Feb 18, 2021 at 01:35:36PM +0100, Bruno Thomsen wrote:
> > Hi,
> >
> > After updating the kernel from 5.8.17 to 5.11 systemd (246.6) is
> > unable to init watchdog in pcf2127 during boot. Kernel option
> > CONFIG_WATCHDOG_OPEN_TIMEOUT=300 is working as expected.
> > It's possible to get watchdog from userspace working in
> > the following 2 ways.
> > 1) Disable watchdog in systemd and use busybox watchdog.
> > 2) Restart systemd after boot with "kill 1".
> >
> > During boot setting the system clock from RTC is working.
> > RTC read/write from userland with hwclock is also working.
> >
> > DTS: imx7d-flex-concentrator-mfg.dts
> > SOC: NXP i.MX7D
> > Drivers: rtc-pcf2127, spi-imx
> > Communication: SPI
> >
> > There are no patches applied to the kernel.
> >
> > When systemd changes watchdog timeout it receives an
> > error that to our best knowledge comes from spi-imx[1].
> >
> > We suspect it's a race condition between drivers or
> > incompatible error handling.
> >
> > Any help in investigating the issue is appreciated.
> >
> Difficult to say without access to hardware. The code does have a
> potential problem, though: It calls pcf2127_wdt_ping not only from
> watchdog code but also from various rtc related functions, but there
> is no access protection. This is even more concerning because the ping
> function is called from an interrupt handler. At the same time, the
> watchdog initialization sets min_hw_heartbeat_ms to 500, which suggests
> that there may be a minimum time between heartbeats (which is clearly
> violated by the current code).

Hi Guenter

Thanks for the input.

You could be right about that. I don't think the watchdog feature should
be available for use if the alarm feature is enabled, due to how the CTRL2
register behaves.

The hardware I am testing on is a custom board, but it's actually
possible to get a Raspberry Pi module called RasClock that has
the chip.

I will test some locking around WD_VAL register access, as that is used
in the pcf2127_wdt_ping function.

My initial test shows that spin_lock_irqsave around regmap calls is not
a good idea, as it results in:
BUG: scheduling while atomic: watchdog/70/0x00000002
BUG: scheduling while atomic: systemd/1/0x00000002
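
That splat is what one would expect if the regmap is backed by SPI: the
transfer can sleep, and sleeping is not allowed while a spinlock is held
with interrupts disabled. A sleeping lock would avoid it. Below is a
minimal userspace sketch of that idea, not driver code. A pthread mutex
stands in for a kernel struct mutex, and fake_wdt, fake_bus_write,
fake_wdt_ping and run_demo are hypothetical names made up for
illustration. It assumes every caller, including the interrupt path,
runs in a context that is allowed to sleep.

```c
/* Userspace analogy only; all names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

#define NUM_THREADS      4
#define PINGS_PER_THREAD 100

struct fake_wdt {
	pthread_mutex_t lock;   /* sleeping lock, stands in for struct mutex */
	uint8_t wd_val;         /* stands in for the WD_VAL register */
	unsigned long writes;   /* completed register writes */
	int in_transfer;        /* nonzero while a simulated transfer runs */
	int overlaps;           /* transfers that started during another one */
};

/* Simulated bus write that may sleep, as an SPI transfer can. */
static void fake_bus_write(struct fake_wdt *w, uint8_t val)
{
	struct timespec ts = { 0, 1000 }; /* about 1 us */

	if (w->in_transfer)
		w->overlaps++;
	w->in_transfer = 1;
	nanosleep(&ts, NULL);   /* sleeping is fine while holding a mutex */
	w->wd_val = val;
	w->writes++;
	w->in_transfer = 0;
}

/* Every caller goes through the same lock before touching the register. */
static void fake_wdt_ping(struct fake_wdt *w, uint8_t timeout)
{
	pthread_mutex_lock(&w->lock);
	fake_bus_write(w, timeout);
	pthread_mutex_unlock(&w->lock);
}

static void *ping_worker(void *arg)
{
	struct fake_wdt *w = arg;

	for (int i = 0; i < PINGS_PER_THREAD; i++)
		fake_wdt_ping(w, 60);
	return NULL;
}

/* Run several concurrent callers; returns the number of overlaps seen. */
int run_demo(struct fake_wdt *w)
{
	pthread_t threads[NUM_THREADS];
	int ret;

	w->wd_val = 0;
	w->writes = 0;
	w->in_transfer = 0;
	w->overlaps = 0;
	ret = pthread_mutex_init(&w->lock, NULL);
	assert(ret == 0);

	for (int i = 0; i < NUM_THREADS; i++) {
		ret = pthread_create(&threads[i], NULL, ping_worker, w);
		assert(ret == 0);
	}
	for (int i = 0; i < NUM_THREADS; i++)
		pthread_join(threads[i], NULL);

	pthread_mutex_destroy(&w->lock);
	return w->overlaps;
}
```

run_demo() starts four concurrent callers of fake_wdt_ping(). With the
mutex in place, no two simulated transfers should overlap. This only
covers serializing access; it does not address the minimum time between
heartbeats.
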
/Bruno