Synchronize ESP32 BLE serial frame queues #2063

Open
robekl wants to merge 1 commit into meshcore-dev:dev from robekl:fix-esp32-ble-queue-synchronization

robekl commented Mar 17, 2026

Summary

  • replace the ESP32 BLE serial send/receive queues with bounded ring buffers
  • protect queue head/tail publication with a small ESP32 critical section
  • stop shifting queue arrays in checkRecvFrame() while callbacks can append to them

Issue

SerialBLEInterface currently shares recv_queue, send_queue, and their length counters between BLE callback context and the main loop.

The problematic pattern is:

  • onWrite() appends a received BLE frame directly into recv_queue and increments recv_queue_len
  • writeFrame() appends outbound frames directly into send_queue and increments send_queue_len
  • checkRecvFrame() consumes the first element of each queue and compacts the remaining array entries in place

That means the BLE callback can be writing queue entries and lengths at the same time that the main loop is reading and shifting those same arrays. On ESP32, those BLE callbacks are not guaranteed to run in the same execution context as the app loop, so the current code has a real producer/consumer race:

  • a callback can append while the loop is compacting the array
  • queue length can change while the loop is consuming the front element
  • partially moved or overwritten frames can be observed under load
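For illustration, the "array plus compaction" pattern described above can be sketched roughly as follows. This is a simplified reconstruction, not the actual SerialBLEInterface code: `CompactingQueue`, `Frame`, `kQueueSize`, and `kMaxFrame` are hypothetical names and sizes chosen for the example.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical sizes; the real queue depth and frame size may differ.
constexpr int kQueueSize = 4;
constexpr int kMaxFrame = 172;

struct Frame {
  uint8_t len;
  uint8_t buf[kMaxFrame];
};

// Simplified model of the old design: a flat array plus a length counter.
struct CompactingQueue {
  Frame entries[kQueueSize];
  int count = 0;

  // Producer side (the BLE callback in the old design): append at the tail.
  bool append(const uint8_t* data, uint8_t len) {
    if (count >= kQueueSize || len > kMaxFrame) return false;
    entries[count].len = len;
    memcpy(entries[count].buf, data, len);
    count++;  // not synchronized with popFront()
    return true;
  }

  // Consumer side (the main loop): copy out entry 0, then shift the rest down.
  // dest must have room for kMaxFrame bytes.
  int popFront(uint8_t* dest) {
    if (count == 0) return 0;
    int len = entries[0].len;
    memcpy(dest, entries[0].buf, len);
    count--;
    for (int i = 0; i < count; i++) {
      entries[i] = entries[i + 1];  // rewrites slots a concurrent append() may be filling
    }
    return len;
  }
};
```

Run from a single context this behaves as a plain FIFO. The hazard only appears when `append()` runs in a different context from `popFront()`: the slot index that `append()` computed from `count` can be shifted or overwritten by the compaction loop before the append completes.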

Fix

This change keeps the existing queue sizes and behavior, but changes the queue implementation so callback and loop code stop mutating the same array layout concurrently.

Specifically:

  • both send and receive paths now use bounded ring buffers instead of "array plus compaction"
  • queue reads and writes publish head/tail/count updates inside a short portENTER_CRITICAL / portEXIT_CRITICAL section
  • onWrite() only enqueues a complete received frame
  • writeFrame() only enqueues a complete outbound frame
  • checkRecvFrame() dequeues stable frames and no longer shifts queue contents in place
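A minimal sketch of the ring-buffer handoff listed above, assuming one producer and one consumer per queue. It is not the code in this PR: `FrameRing`, `QueueLock`, `Frame`, `kQueueSize`, and `kMaxFrame` are hypothetical names and sizes. On ESP32 the lock wraps `portENTER_CRITICAL` / `portEXIT_CRITICAL`; the `std::mutex` branch is only a stand-in so the sketch can be compiled and exercised on a host machine.

```cpp
#include <cstdint>
#include <cstring>

#if defined(ESP_PLATFORM)
#include "freertos/FreeRTOS.h"
// ESP32: short critical section guarded by a spinlock.
struct QueueLock {
  portMUX_TYPE mux = portMUX_INITIALIZER_UNLOCKED;
  void lock() { portENTER_CRITICAL(&mux); }
  void unlock() { portEXIT_CRITICAL(&mux); }
};
#else
#include <mutex>
// Host-only stand-in (assumption for testing, not part of the firmware).
struct QueueLock {
  std::mutex m;
  void lock() { m.lock(); }
  void unlock() { m.unlock(); }
};
#endif

constexpr int kQueueSize = 4;   // hypothetical queue depth
constexpr int kMaxFrame = 172;  // hypothetical max frame size

struct Frame {
  uint8_t len;
  uint8_t buf[kMaxFrame];
};

class FrameRing {
  Frame slots[kQueueSize];
  int head = 0;   // next slot to read
  int tail = 0;   // next slot to write
  int count = 0;  // committed frames
  QueueLock guard;

public:
  // Producer: fill the free slot first, then publish it under the lock.
  bool push(const uint8_t* data, uint8_t len) {
    if (len == 0 || len > kMaxFrame) return false;
    guard.lock();
    bool full = (count == kQueueSize);
    int slot = tail;
    guard.unlock();
    if (full) return false;  // bounded: drop when no slot is free

    slots[slot].len = len;   // consumer cannot see this slot until count is bumped
    memcpy(slots[slot].buf, data, len);

    guard.lock();
    tail = (tail + 1) % kQueueSize;
    count++;
    guard.unlock();
    return true;
  }

  // Consumer: read the oldest committed slot, then release it under the lock.
  // dest must have room for kMaxFrame bytes. Returns 0 when the queue is empty.
  int pop(uint8_t* dest) {
    guard.lock();
    bool empty = (count == 0);
    int slot = head;
    guard.unlock();
    if (empty) return 0;

    int len = slots[slot].len;
    memcpy(dest, slots[slot].buf, len);

    guard.lock();
    head = (head + 1) % kQueueSize;
    count--;
    guard.unlock();
    return len;
  }
};
```

In this sketch the payload copy happens outside the critical section, so interrupts are only masked for the index and count updates. That choice relies on the single-producer, single-consumer assumption; with multiple producers the slot reservation and the copy would need to be covered by the same lock.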

Why this fixes it

The old race existed because the producer and consumer both rewrote the queue storage layout. Once checkRecvFrame() removed the first element, it had to copy every remaining entry down by one slot, which directly overlapped with callback-time appends.

With the ring-buffer handoff:

  • producers only write the next free slot and publish it atomically
  • consumers only read the next committed slot and advance the read index atomically
  • no in-place array compaction happens anymore

That removes the window where callback and loop code can corrupt each other's queue state or frame contents.

Validation

  • pio run -e Heltec_v3_companion_radio_ble
