Is it even possible to do what I'm trying with Audio Tools and an ESP32 Audio dev kit? #2191

dandeliondandy asked this question in Q&A

Oct 8, 2025

1 comment · 1 reply

dandeliondandy
Oct 8, 2025

I'm not asking for a full assist, but I would love to know, first of all: is what I'm trying to do even possible with the ESP32-A1S? Or am I barking up the wrong tree (module)? I've been going through the wiki and examples, and it always feels like I'm very close but something isn't clicking.

I've been scratching my head for the last week or so trying to get a working sketch of a basic four-track recorder. I have a working sketch that records one good .wav file to SD, with a solid sound. What I've been trying to find is a way to mix up to three already-recorded .wav files, and ideally a live monitor of the mic input, to a headphone out, while also recording the new fourth track to SD.

I've had a little luck getting everything working, but it sounded awful. Live monitoring, while recording (to memory), has worked fine in other sketches.

My thinking now is that it might be best to load the existing tracks (for instance track_1.wav and track_2.wav) into memory, mix in a split of the mic signal, and stream that mix to the headphones, where it is thrown away as soon as it's heard. This way nothing reads from or writes to the SD card while recording, other than the track_3.wav that is being recorded straight to SD. That seems more likely to work than trying to read the first two files off the SD card while recording to it at the same time. The question is: is that possible? And can the tracks stay in sync?

Sorry for the long post/question. And thanks for any advice or wiki topics to research.

Here is my current working sketch for recording one track, no playback:

#include "AudioTools.h"
#include "AudioTools/AudioLibs/AudioBoardStream.h"
#include "AudioTools/CoreAudio/AudioStreams.h" // REQUIRED: Header for the Throttle class
#include "SD.h"
// --- Config ---
AudioInfo info(32000, 1, 16);
#define SPI_SPEED 20000000 
// --- Global Objects ---
AudioBoardStream kit(AudioKitEs8388V1);
File audioFile;
WAVEncoder wavEncoder;
EncodedAudioStream *outStream = nullptr;
Throttle *throttle = nullptr; // Correct class name: Throttle
StreamCopy *copier = nullptr;
bool isRecording = false;
// Variables for sequential file naming
char fileName[30];
int fileIndex = 0;
// Variable for button debounce/cooldown
unsigned long lastButtonPressTime = 0;
// ======================================================================================
// BUTTON HANDLER
// ======================================================================================
void handleRecordButton(bool active, int pin, void* ptr) {
  if (millis() - lastButtonPressTime < 500) {
    return;
  }

  if (active) {
    lastButtonPressTime = millis();
    if (!isRecording) {
      // --- START RECORDING ---
      do {
        sprintf(fileName, "/rec_%d.wav", fileIndex++);
      } while (SD.exists(fileName));

      Serial.print("Starting recording to file: ");
      Serial.println(fileName);

      audioFile = SD.open(fileName, FILE_WRITE);
      if (audioFile) {
        outStream = new EncodedAudioStream(&audioFile, &wavEncoder);
        outStream->begin(info);

        // Create the throttle, wrapping the audio kit (our fast source)
        throttle = new Throttle(kit);
        throttle->begin(info); // Configure the throttle with our audio settings

        // Tell the copier to copy from the throttle, not the raw kit
        copier = new StreamCopy(*outStream, *throttle);

        isRecording = true;
        Serial.println("--> RECORDING MONO at 32kHz");
      } else {
        Serial.println("Failed to open file!");
      }
    } else {
      // --- STOP RECORDING ---
      Serial.println("Stopping recording...");
      isRecording = false;

      if (copier != nullptr) {
        copier->copy();
        delete copier;
        copier = nullptr;
      }
      // Clean up the throttle to prevent memory leaks
      if (throttle != nullptr) {
        delete throttle;
        throttle = nullptr;
      }
      if (outStream != nullptr) {
        outStream->end();
        delete outStream;
        outStream = nullptr;
      }
      if (audioFile) {
        audioFile.close();
        Serial.println("File saved.");
        File f = SD.open(fileName, FILE_READ);
        if (f) {
          Serial.print("Final file size for ");
          Serial.print(fileName);
          Serial.print(": ");
          Serial.print(f.size());
          Serial.println(" bytes");
          f.close();
        }
      }
    }
  }
}
// ======================================================================================
// SETUP
// ======================================================================================
void setup() {
  Serial.begin(115200);
  // Set log level to Error for the quietest possible operation.
  AudioLogger::instance().begin(Serial, AudioLogger::Error);
  Serial.println("\n--- Final Mono WAV Recorder ---");
  auto cfg = kit.defaultConfig(RXTX_MODE);
  cfg.input_device = ADC_INPUT_LINE2;
  cfg.sd_active = true;
  cfg.copyFrom(info);
  kit.begin(cfg);
  if (!SD.begin(PIN_AUDIO_KIT_SD_CARD_CS, SPI, SPI_SPEED)) {
    Serial.println("FATAL: SD Card failed to mount!");
    while(1);
  }
  Serial.println("SD Card Initialized.");

  kit.audioActions().setDebounceDelay(20);
  kit.audioActions().add(kit.getKey(1), handleRecordButton);

  Serial.println("Ready. Press REC button to start/stop recording.");
}
// ======================================================================================
// LOOP
// ======================================================================================
void loop() {
  kit.processActions();
  if (isRecording && copier != nullptr) {
    copier->copy();
    // We can now use flush() safely because the Throttle is protecting the SD card from being overwhelmed.
    audioFile.flush();
  }
}

Is this a reasonable foundation to build on, if it's possible, or am I misunderstanding something basic?

Here is my failed attempt to build the full sketch:

#include "AudioTools.h"
#include "AudioTools/AudioLibs/AudioBoardStream.h"
#include "AudioTools/AudioLibs/MemoryManager.h" 
#include <SD_MMC.h>
using namespace audio_tools;
// --- Config ---
AudioInfo info(32000, 1, 16);
const char* trackFilenames[] = {"/track_1.wav", "/track_2.wav", "/track_3.wav", "/track_4.wav"};
const int NUM_TRACKS = 4;
MemoryManager memory(512); 
// --- State Machine ---
enum State { STATE_IDLE, STATE_ARMED, STATE_RECORDING, STATE_PLAYING, STATE_MONITOR_RECORD };
State currentState = STATE_IDLE;
int armedTrack = 0;
// --- Global Audio Objects ---
AudioBoardStream kit(AudioKitEs8388V1);
StreamCopy* recordCopier = nullptr;
StreamCopy* monitorCopier = nullptr;
// Recording Objects
File recordingFile;
WAVEncoder wavEncoder;
EncodedAudioStream* outStream = nullptr;
Throttle* throttle = nullptr;
// Playback & Monitoring Objects
InputMixer<int16_t> playbackMixer;
InputMixer<int16_t> finalMixer;
DynamicMemoryStream* ramPlaybackStreams[NUM_TRACKS] = {nullptr};
// ======================================================================================
// Load and decode WAV files from SD into PSRAM
// ======================================================================================
void loadTracksToRAM(int trackToExclude) {
  Serial.println("Loading playback tracks to RAM...");
  for (int i = 0; i < NUM_TRACKS; i++) {
    if (i == trackToExclude || !SD_MMC.exists(trackFilenames[i])) {
      continue;
    }
    Serial.printf("-> Loading track %d\n", i + 1);
    File playFile = SD_MMC.open(trackFilenames[i]);
    if (!playFile) {
      Serial.printf("Failed to open %s\n", trackFilenames[i]);
      continue;
    }
    WAVDecoder tempDecoder;
    EncodedAudioStream decoderStream(&playFile, &tempDecoder);
    if (!decoderStream.begin()) {
      Serial.println("Failed to begin decoder stream");
      playFile.close();
      continue;
    }
    ramPlaybackStreams[i] = new DynamicMemoryStream(false, playFile.size(), PS_RAM);
    if (!ramPlaybackStreams[i]) {
      Serial.println("Failed to allocate RAM for track");
      playFile.close();
      continue;
    }
    StreamCopy copier(*ramPlaybackStreams[i], decoderStream);
    copier.copy();

    decoderStream.end();
    playFile.close();
    Serial.printf("-> Track %d loaded to RAM (%u bytes)\n", i + 1, ramPlaybackStreams[i]->available());
  }
}
// ======================================================================================
// CLEANUP
// ======================================================================================
void stopAllAudio() {
  Serial.println("Stopping all audio...");
  if (recordCopier != nullptr) {
    delete recordCopier;
    recordCopier = nullptr;
  }
  if (monitorCopier != nullptr) {
    delete monitorCopier;
    monitorCopier = nullptr;
  }
  if (throttle != nullptr) {
    delete throttle;
    throttle = nullptr;
  }
  if (outStream != nullptr) {
    outStream->end();
    delete outStream;
    outStream = nullptr;
  }
  if (recordingFile) {
    recordingFile.close();
  }
  finalMixer.end();
  playbackMixer.end();
  for (int i = 0; i < NUM_TRACKS; i++) {
    if (ramPlaybackStreams[i] != nullptr) {
      delete ramPlaybackStreams[i];
      ramPlaybackStreams[i] = nullptr;
    }
  }
  currentState = STATE_IDLE;
  Serial.println("System is IDLE.");
}
// ======================================================================================
// BUTTON HANDLERS
// ======================================================================================
void handleArmTrack(bool active, int pin, void* trackNumPtr) {
  if (active && (currentState == STATE_IDLE || currentState == STATE_ARMED)) {
    stopAllAudio();
    armedTrack = (intptr_t)trackNumPtr;
    currentState = STATE_ARMED;
    Serial.printf("==> Track %d Armed <==\n", armedTrack + 1);
    monitorCopier = new StreamCopy(kit, kit);
  }
}
void handleRecordButton(bool active, int pin, void* ptr) {
  if (!active) return;
  if (currentState == STATE_ARMED) {
    if (monitorCopier != nullptr) {
      delete monitorCopier;
      monitorCopier = nullptr;
    }
    Serial.printf("Starting recording on Track %d...\n", armedTrack + 1);
    // --- Create ONE throttled source from the Mic ---
    throttle = new Throttle(kit);
    throttle->begin(info);
    // --- Path 1: Setup clean recording path (throttle -> encoder -> file) ---
    recordingFile = SD_MMC.open(trackFilenames[armedTrack], FILE_WRITE);
    if (!recordingFile) {
      Serial.println("File open for recording failed!");
      stopAllAudio();
      return;
    }
    outStream = new EncodedAudioStream(&recordingFile, &wavEncoder);
    outStream->begin(info);
    recordCopier = new StreamCopy(*outStream, *throttle);
    // --- Path 2: Setup monitoring path ---
    loadTracksToRAM(armedTrack);
    bool tracksFoundForPlayback = false;
    playbackMixer.begin(info);
    for (int i = 0; i < NUM_TRACKS; i++) {
      if (ramPlaybackStreams[i] != nullptr) {
        ramPlaybackStreams[i]->begin();
        playbackMixer.add(*ramPlaybackStreams[i]);
        tracksFoundForPlayback = true;
      }
    }
    finalMixer.begin(info);
    // CORRECTED: Use the 'throttle' as the live mic source, NOT the raw 'kit'
    finalMixer.add(*throttle);
    if (tracksFoundForPlayback) {
      finalMixer.add(playbackMixer);
      currentState = STATE_MONITOR_RECORD;
      Serial.println("-> Monitoring existing tracks (from RAM) + live input.");
    } else {
      currentState = STATE_RECORDING;
      Serial.println("-> Monitoring live input only.");
    }
    monitorCopier = new StreamCopy(kit, finalMixer);
    Serial.println("--> RECORDING...");
  } else if (currentState == STATE_RECORDING || currentState == STATE_MONITOR_RECORD) {
    stopAllAudio();
  }
}
void handlePlayButton(bool active, int pin, void* ptr) {
  if (!active) return;
  if (currentState == STATE_IDLE || currentState == STATE_ARMED) {
    stopAllAudio();
    Serial.println("Scanning for tracks to play...");
    loadTracksToRAM(-1);
    bool tracksFound = false;
    playbackMixer.begin(info);
    for (int i = 0; i < NUM_TRACKS; i++) {
      if (ramPlaybackStreams[i] != nullptr) {
        ramPlaybackStreams[i]->begin();
        playbackMixer.add(*ramPlaybackStreams[i]);
        tracksFound = true;
      }
    }
    if (tracksFound) {
      monitorCopier = new StreamCopy(kit, playbackMixer);
      currentState = STATE_PLAYING;
      Serial.printf("--> PLAYING from RAM.\n");
    } else {
      Serial.println("No tracks found to play.");
      stopAllAudio();
    }
  } else if (currentState == STATE_PLAYING) {
    stopAllAudio();
  }
}
// ======================================================================================
// SETUP
// ======================================================================================
void setup() {
  Serial.begin(115200);
  delay(2000);
  Serial.println("\n--- 4-Track Recorder (RAM Playback) ---");
  auto cfg = kit.defaultConfig(RXTX_MODE);
  cfg.input_device = ADC_INPUT_LINE2;
  cfg.sd_active = true;
  cfg.copyFrom(info);
  cfg.buffer_size = 512;
  cfg.buffer_count = 8;
  kit.begin(cfg);
  kit.setVolume(1.0);
  if (!SD_MMC.begin()) {
    Serial.println("FATAL: SD Card failed to mount!");
    while (true);
  }
  Serial.println("SD Card Initialized.");
  kit.audioActions().setDebounceDelay(50);
  auto act_logic = AudioActions::ActiveLow;
  kit.audioActions().add(kit.getKey(1), handleArmTrack, nullptr, act_logic, (void*)0);
  kit.audioActions().add(kit.getKey(2), handleArmTrack, nullptr, act_logic, (void*)1);
  kit.audioActions().add(kit.getKey(3), handleArmTrack, nullptr, act_logic, (void*)2);
  kit.audioActions().add(kit.getKey(4), handleArmTrack, nullptr, act_logic, (void*)3);
  kit.audioActions().add(kit.getKey(5), handlePlayButton, nullptr, act_logic, nullptr);
  kit.audioActions().add(kit.getKey(6), handleRecordButton, nullptr, act_logic, nullptr);

  Serial.printf("PSRAM available: %s\n", psramFound() ? "yes" : "no");
  Serial.printf("Total PSRAM: %d bytes\n", ESP.getPsramSize());
  Serial.printf("Free PSRAM: %d bytes\n", ESP.getFreePsram());
  Serial.println("Ready. Arm a track, then press Record or Play.");
}
// ======================================================================================
// LOOP
// ======================================================================================
void loop() {
  kit.processActions();
  if (recordCopier != nullptr) {
    recordCopier->copy();
  }
  if (monitorCopier != nullptr) {
    monitorCopier->copy();
  }
}

Answered by pschatzmann

Oct 8, 2025

Replies: 1 comment 1 reply

pschatzmann
Oct 8, 2025
Maintainer

Microcontrollers are slow and have only a limited amount of memory, so the challenge is always to make things fit within the available memory and computing resources!

Build your sketch in several steps and test each step separately. I would start with a simple mixing of files to figure out the maximum sample rate (using files vs PSRAM vs PROGMEM).

A couple of hints:

Use fast disk access: SDMMC in 4-bit mode gives you the maximum throughput.
WAV requires a lot of space: double-check that you have enough PSRAM to store what you need. If the data does not change, you can also store it in PROGMEM. This will definitely save you the disk overhead.
If the mixing can't provide the data fast enough for playback, reduce the sample rate (you can resample dynamically).
Writing to disk means you need to revisit the supported sample rate.
Don't use a throttle in this scenario!
Warning: activating disk access means you are losing some buttons!
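
To put rough numbers on the PSRAM hint, here is a minimal sketch in plain C++ of the arithmetic for uncompressed PCM. The helper names `pcmBytesPerSecond` and `secondsPerTrack` are hypothetical and not part of AudioTools, and the memory budget you pass in is an assumption: on the board, use whatever ESP.getFreePsram() actually reports, minus some headroom.

```cpp
#include <cstdint>

// Bytes per second of uncompressed PCM audio (hypothetical helper).
constexpr uint32_t pcmBytesPerSecond(uint32_t sampleRate, uint32_t channels,
                                     uint32_t bitsPerSample) {
  return sampleRate * channels * (bitsPerSample / 8);
}

// Whole seconds of audio per track when `tracks` equally long tracks
// share a memory budget of `budgetBytes` (hypothetical helper).
constexpr uint32_t secondsPerTrack(uint32_t budgetBytes, uint32_t tracks,
                                   uint32_t bytesPerSecond) {
  return (tracks == 0 || bytesPerSecond == 0)
             ? 0
             : budgetBytes / (tracks * bytesPerSecond);
}
```

With the 32 kHz, mono, 16-bit format used in the sketches above, pcmBytesPerSecond(32000, 1, 16) is 64000 bytes per second. With an assumed budget of 4 MiB shared by three playback tracks, secondsPerTrack(4 * 1024 * 1024, 3, 64000) comes out at 21 seconds per track, so longer takes would have to be streamed from disk or stored at a lower sample rate.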

dandeliondandy Oct 8, 2025
Author

Thanks for the quick response.

Is there a simple mixing or memory-based output example you could point me to that I could build on? I feel like I found one last week, merging a generated tone with something for the SD card, but I can't for the life of me remember which one, or whether it would work with the AudioKit module. Would you suggest MP3 instead of .wav? Or maybe just a dynamic resample of the WAV down to a lower bitrate for the monitoring?

No Throttle, noted. Thanks.

And yes, a bummer about the buttons but I have an expansion board that will hopefully take care of that for now.
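
For reference, the core of mixing several mono 16-bit tracks can be sketched in plain C++ as below. This is not the AudioTools implementation and does not use its API; `mixTracks` and its parameters are hypothetical names chosen for illustration. It assumes every track is already decoded to int16_t PCM at the same sample rate and that each buffer holds at least `frames` samples.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Mixes `trackCount` mono 16-bit buffers into `out`, one frame at a time.
// Samples are summed in 32 bits and clamped to the int16_t range so that
// loud passages saturate instead of wrapping around.
void mixTracks(const int16_t* const* tracks, size_t trackCount,
               size_t frames, int16_t* out) {
  for (size_t i = 0; i < frames; ++i) {
    int32_t sum = 0;
    for (size_t t = 0; t < trackCount; ++t) {
      sum += tracks[t][i];
    }
    sum = std::max<int32_t>(INT16_MIN, std::min<int32_t>(INT16_MAX, sum));
    out[i] = static_cast<int16_t>(sum);
  }
}
```

Because output frame i is built from frame i of every input, the tracks stay sample-aligned as long as all playback positions advance by the same number of frames per call. Scaling each input down before summing (for example by the number of tracks) would trade loudness for headroom instead of clamping.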

Answer selected by pschatzmann
