I have a Raspberry Pi 3B with a USB microphone. I have used a combination of Bash and Python scripts to detect noise levels above a certain threshold that trigger a notification on my phone. The actual sound is not streamed.
The scripts are called using a command line alias for the following:
bash /home/pi/babymonitor.sh | /home/pi/monitor.py
Bash script - babymonitor.sh
This uses arecord to repeatedly record 2-second snippets via a temporary file, sox and grep to get the RMS (root mean square) amplitude (see this example for sox stat output) and then tail to get only the numerical data at the end of the line. This is then piped to the Python script.
#!/bin/bash
trap break INT
while true; do
    arecord --device=hw:1,0 --format S16_LE --rate 44100 -d 2 /dev/shm/tmp_rec.wav ; sox -t .wav /dev/shm/tmp_rec.wav -n stat 2>&1 | grep "RMS amplitude" | tail -c 9
done
Python processing script - monitor.py
This receives the volume data and compares it to an arbitrarily defined threshold that I chose based on testing. If a noise is detected in one of the 2-second snippets, a push notification is sent and it suppresses further notifications for 10 seconds (5 cycles of 2-second snippets).
#!/usr/bin/env python
import sys
import push

THRESHOLD = 0.01
count = 0
suppress = False

while True:
    try:
        line = sys.stdin.readline().strip() # e.g. "0.006543"
        number = float(line)
        if number > THRESHOLD and not suppress:
            p = push.PushoverSender("user_key", "api_token")
            p.send_notification("There's a noise in the nursery!")
            count = 0
            suppress = True
        elif suppress:
            print("Suppressing output")
        else:
            print("All quiet")
        # Count 5 cycles after a trigger
        if suppress:
            count += 1
            if count >= 5:
                count = 0
                suppress = False
    except ValueError:
        # Cannot coerce to float or error in stdin
        print("Value error: " + line)
        break
    except KeyboardInterrupt:
        # Cancelled by user
        print("Baby monitor script ending")
        break
sys.exit()
Script for push notification service - push.py
This sends an HTTP request to the Pushover service, which results in a push notification on my phone.
#!/usr/bin/env python
import httplib
import urllib

class PushoverSender:
    def __init__(self, user_key, api_key):
        self.user_key = user_key
        self.api_key = api_key

    def send_notification(self, text):
        conn = httplib.HTTPSConnection("api.pushover.net:443")
        post_data = {'user': self.user_key, 'token': self.api_key, 'message': text}
        conn.request("POST", "/1/messages.json", urllib.urlencode(post_data), {"Content-type": "application/x-www-form-urlencoded"})
        print(conn.getresponse().read())
I would be grateful for any feedback on how I can improve my code or the conceptual aspects of this project. I am not very familiar with Bash scripting so feedback is particularly welcome on this.
Is it acceptable to have the Bash script and Python script called together with a pipe between them and both running while True loops? It certainly seems to work.
Is there a better way to exit? Currently, pressing ctrl+c will terminate both the Bash script (due to trap break INT) and the Python script (due to except KeyboardInterrupt).
2 Answers
Bash
arecord --device=hw:1,0 --format S16_LE --rate 44100 -d 2 /dev/shm/tmp_rec.wav ; sox -t .wav /dev/shm/tmp_rec.wav -n stat 2>&1 | grep "RMS amplitude" | tail -c 9
You can use newlines instead of ; and after | to make it easier to see the separate commands:
arecord --device=hw:1,0 --format S16_LE --rate 44100 -d 2 /dev/shm/tmp_rec.wav
sox -t .wav /dev/shm/tmp_rec.wav -n stat 2>&1 |
grep "RMS amplitude" |
tail -c 9
It's not clear to me why you're using /dev/shm rather than a pipe. Since you say you're not very familiar with bash, I wonder whether it's because you saw in the man page that sox requires a filename and didn't know that it's common for - to be used as a special filename to signify stdin. Untested, but on my reading of the man pages the following should work:
arecord --device=hw:1,0 --format S16_LE --rate 44100 -d 2 |
sox -t .wav - -n stat 2>&1 |
grep "RMS amplitude" |
tail -c 9
I think the extraction of the amplitude could be more robust in two ways:
- As a minor point, I suggest changing the regex to "RMS *amplitude" as a future-proofing precaution against the stats gaining a new stat with a longer name.
- More importantly, tail -c 9 is a very bold assumption: it only works while the number occupies exactly the last 8 characters of the line (plus the newline). The Python program uses strip(), so you don't care about leading whitespace. I propose replacing the tail with cut -d: -f2.
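If the parsing were ever moved to the Python side, the same two ideas (match the label loosely, take whatever follows the colon) could be sketched roughly as below. This is only an illustration: extract_rms_amplitude is a hypothetical helper, and the sample text merely imitates the layout of sox stat output with made-up values.

```python
import re

# Illustrative sample imitating a few lines of `sox ... -n stat` output.
# The values are made up for the example.
SAMPLE_STAT_OUTPUT = """\
Maximum amplitude:     0.012329
Mean    amplitude:    -0.000012
RMS     amplitude:     0.006543
RMS     delta:         0.000700
"""


def extract_rms_amplitude(stat_output):
    """Return the RMS amplitude as a float, or None if the line is missing."""
    for line in stat_output.splitlines():
        # Accept one or more whitespace characters between the two words.
        if re.match(r"RMS\s+amplitude", line):
            # Everything after the first colon is the value; float() ignores
            # surrounding whitespace, so the field width no longer matters.
            return float(line.split(":", 1)[1])
    return None


print(extract_rms_amplitude(SAMPLE_STAT_OUTPUT))
```

Splitting on the colon plays the same role as cut -d: -f2: the result does not depend on how many digits sox prints.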
Final version of the bash script if you agree with all of my suggestions:
arecord --device=hw:1,0 --format S16_LE --rate 44100 -d 2 |
sox -t .wav - -n stat 2>&1 |
grep "RMS *amplitude" |
cut -d: -f2
Python
I don't have much to say here except that count and suppress look to me to be at least one variable too many. If you replace them with e.g. lines_to_suppress then if suppress becomes if lines_to_suppress > 0 and instead of incrementing count you decrement lines_to_suppress.
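A minimal sketch of the single-counter idea, under some assumptions: the readings arrive as an iterable of floats and the notification is passed in as a callback, purely so the logic can be tried without a microphone or a Pushover account. The names monitor, notify and SUPPRESS_LINES are hypothetical, and suppressing exactly five readings after a trigger is a guess at the intended behaviour.

```python
THRESHOLD = 0.01
SUPPRESS_LINES = 5  # assumption: number of readings to ignore after a trigger


def monitor(readings, notify):
    """Call notify() for loud readings, then ignore the next SUPPRESS_LINES readings."""
    lines_to_suppress = 0
    for number in readings:
        if lines_to_suppress > 0:
            # Still inside the quiet period that follows a notification.
            lines_to_suppress -= 1
        elif number > THRESHOLD:
            notify("There's a noise in the nursery!")
            lines_to_suppress = SUPPRESS_LINES


# Example run with made-up readings: the first and seventh readings trigger,
# the five in between are suppressed.
sent = []
monitor([0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.001], sent.append)
print(len(sent))
```

In the real script, the readings would come from sys.stdin and notify would wrap PushoverSender.send_notification.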
However, it might be clearest to just eliminate them both and replace
count = 0
suppress = True
with
for _ in range(0, 5):
    sys.stdin.readline()
    print("Suppressing output")
- Thanks for these suggestions. I did indeed think sox needed a filename. Your points about extraction of amplitude data are great - I’m not good with regex! :) I’ll try all of these later when I’m back at my computer. – Chris, Aug 29, 2018 at 12:48
One thing you might do to the shell script is to move the commands into the condition of your while:
#!/bin/sh
while arecord --device=hw:1,0 --format S16_LE --rate 44100 -d 2 \
              /dev/shm/tmp_rec.wav \
        && sox -t .wav /dev/shm/tmp_rec.wav -n stat 2>&1 \
        | grep "RMS amplitude" | tail -c 9
do true
done
That means that you don't have to set a trap (interrupting the commands gives a false status that will exit the while). Note that because we're not using any Bash extensions, we can run it with plain /bin/sh, which often has a smaller footprint than Bash.
There's no need for a temporary file - both arecord and sox will use standard in/out streams if not given a file name, so we can pipe them together. This reduces fragility and may also reduce latency (as we can start processing the audio before we've finished recording it):
# untested
while arecord --device=hw:1,0 --format S16_LE --rate 44100 -d 2 \
        | sox -t .wav - -n stat 2>&1 \
        | grep "RMS amplitude" | tail -c 9
do true
done
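Should the loop ever be brought into Python, the same "pipe instead of temporary file" idea could be sketched with the standard subprocess module. This is an untested-on-the-Pi illustration: run_pipeline is a hypothetical helper, and the demo uses harmless stand-in commands so that it runs without arecord or sox installed.

```python
import subprocess
import sys


def run_pipeline(producer_cmd, consumer_cmd):
    """Feed producer_cmd's stdout into consumer_cmd's stdin; return the consumer's output as text."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd,
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same role as 2>&1 in the shell version
    )
    # Close our copy of the pipe so the producer is not kept alive by us
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.decode()


# Stand-in commands so the sketch runs anywhere. On the Pi the two lists would
# presumably be the arecord and sox invocations from the question, e.g.
#   ["arecord", "--device=hw:1,0", "--format", "S16_LE", "--rate", "44100", "-d", "2"]
#   ["sox", "-t", ".wav", "-", "-n", "stat"]
demo = run_pipeline(
    [sys.executable, "-c", "print('0.006543')"],
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"],
)
print(demo.strip())
```

No file is written at any point; the consumer starts reading as soon as the producer produces data.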
You might also be able to use a smaller audio format - U8 should be sufficient for simple amplitude measurement, and the rate can be more in the speech range (the arecord default of 8kHz should be fine).
I believe that sox or rec ought to be able to read directly from the microphone, and to do the silence-detection itself using the silence effect. I'm not a frequent user of sox so you should consult the manpage yourself for the details.
- I also wondered about sox -d as a way of eliminating arecord from the pipeline, but I couldn't figure out from the manpage how to specify the 2 second time limit. – Peter Taylor, Aug 29, 2018 at 7:48
- One of the examples in the man page "stops after it sees 10 minutes of silence" - I think perhaps that could be adapted to this purpose? There's also the trim filter, which might warrant investigation. Another avenue of research would be to bring the script's function into Python, using pysox. – Toby Speight, Aug 29, 2018 at 8:17
- Thanks for these suggestions. I’m going to try them later. – Chris, Aug 29, 2018 at 12:50