JP Vossen via plug on 12 Jun 2024 14:14:28 -0700



[PLUG] Solved: root "pkill: killing pid * failed: Operation not permitted"


On 6/10/24 09:14 PM, JP Vossen wrote:
On 6/7/24 04:07 PM, JP Vossen wrote:
What could cause "pkill: killing pid * failed: Operation not permitted" *when run by root*?
After patching and reboots the other day I started getting daily Anacron emails from Logrotate on most (but not all) of 50+ VMs saying:
```
/etc/cron.daily/logrotate:
pkill: killing pid NNN failed: Operation not permitted
```
The culprit is the (quite horrible, but mandatory) CrowdStrike `falcon-sensor` service, running from the stock vendor RPM. That RPM has not changed since April, and we've had patching reboots since then.
The really confusing thing is that *most* of them are doing this, but *not* all, and I can't find any differences!  The 50+ VMs are a mix of (quite horrible, but mandatory) Oracle Linux 7.9 (EoL soon, thus migrating) and 8.10, but the problem doesn't follow the distro.  Also, a few of the ones that complained on Wed did not complain on Thu, so they "fixed" themselves?
...
Solved!
The difference I was unable to find is vendor-side:
	BAD:	/opt/CrowdStrike/falcon-sensor -> falcon-sensor16703
	Good:	/opt/CrowdStrike/falcon-sensor -> falcon-sensor16604
So the installed RPM version is a lie, and I never thought to look into `/opt/CrowdStrike/`, which is a mess, by the way.
It's really annoying that the vendor decided that slip-stream updating the agent without updating the RPM was OK, especially since that is what caused this bug.
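Since the RPM version can't be trusted, the symlink target is the thing to compare across VMs. Here is a minimal sketch of pulling the build number out of it. It assumes the target is always named like `falcon-sensor16703`, as in the listing above, and `sensor_build` is just a helper name made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: report the sensor build that is actually on disk.

# Print the digits embedded in the symlink target's name.
# Assumption: the target looks like "falcon-sensor16703".
sensor_build() {
    local link="${1:-/opt/CrowdStrike/falcon-sensor}"
    local target
    target="$(readlink "$link")" || return 1
    target="${target##*/}"                    # drop any directory part
    printf '%s\n' "${target#falcon-sensor}"   # strip the name prefix
}

# Show the on-disk build next to what the RPM database claims.
if [ -L /opt/CrowdStrike/falcon-sensor ]; then
    printf 'on-disk build: %s\n' "$(sensor_build)"
    if command -v rpm >/dev/null 2>&1; then
        rpm -q falcon-sensor || true
    fi
fi
```

Run on each VM (for example via an Ansible ad-hoc command), this would have separated the 16703 hosts from the 16604 hosts even though the RPM query looks the same everywhere.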
Also, it turns out that CrowdStrike `falcon-sensor` is a root-kit. I understand why it has anti-tampering features, but they are infuriating from an administrative perspective. Fortunately, it's not a very _good_ root-kit.
Ironically, for the misbehaving (newer falcon-sensor16703) VMs, this Just Worked:
## On VM: `yum erase falcon-sensor`
## Ansible: `crowdstrike_deploy.yml --limit '...'`
Really annoyingly, on the 5 or so servers (older falcon-sensor16604) that were NOT misbehaving, that process failed because of the root-kit behaviors. `yum erase falcon-sensor` removed the RPM metadata and such, but was unable to remove `/opt/CrowdStrike/`. But it did remove `/usr/lib/systemd/system/falcon-sensor.service`, so you see where this is going, right?
Of course, the reinstall failed with an obscure message:
```
...
Error unpacking rpm package falcon-sensor-7.01.0-15604.el7.x86_64
error: unpacking of archive failed on file /opt/CrowdStrike/KernelModuleArchive;6669fed4: cpio: symlink
...
```
What that really means is that even root can't `rm` files inside `/opt/CrowdStrike/`. I'm not sure how they did that, but I suspect eBPF features, as Will mentioned on the PLUG N call. (See https://www.reddit.com/r/crowdstrike/comments/187p03v/linux_sensor_tamper_protection/.)
But once I rebooted, since the unit file was missing (I assume), the root-kit, err, I mean `falcon-sensor` didn't run, so `rm -rf /opt/CrowdStrike/` worked and *then* I could re-install via Ansible! I probably could have re-installed without the `rm` but a clean install has a *lot less crap* in `/opt/CrowdStrike/`.
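The order that worked (reboot so the sensor is not running, remove the leftover directory, then reinstall) can be sketched as a guard around the `rm`. This is only a sketch under assumptions: the process name `falcon-sensor` is a guess based on the package and unit names, and `sensor_running` and `cleanup_sensor_dir` are helper names made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: refuse to remove the sensor directory while the sensor runs.

# Return 0 if any process has the given command name.
# Assumption: the sensor's process name is "falcon-sensor".
sensor_running() {
    local name="${1:-falcon-sensor}" f
    for f in /proc/[0-9]*/comm; do
        if [ -r "$f" ] && [ "$(cat "$f" 2>/dev/null)" = "$name" ]; then
            return 0
        fi
    done
    return 1
}

# Remove the leftover directory only when the sensor is not running.
cleanup_sensor_dir() {
    local dir="${1:-/opt/CrowdStrike}"
    if sensor_running; then
        echo "falcon-sensor is still running; reboot first" >&2
        return 1
    fi
    rm -rf -- "$dir"
}
```

After the reboot, calling `cleanup_sensor_dir` and then running the Ansible deploy mirrors the steps above. The guard just makes the "reboot first" requirement explicit instead of letting `rm` fail partway through.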
So thanks for troubleshooting, everyone!
And thanks to Walt for some (more) cool Perl one-liners.
Later,
JP
-- -------------------------------------------------------------------
JP Vossen, CISSP | http://www.jpsdomain.org/ | http://bashcookbook.com/
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug
