Atomic Sensor Frames: Sending 9-Axis Data with Minimal Skew over Serial

How to pass C structs over the Arduino serial interface


            

If you have ever worked with inertial measurement units or any other multi-axis sensor, you have probably run into a subtle but important problem: how do you ensure that all nine axis values you read and transmit actually belong to the same moment in time?

Naive implementations read axis 1, then axis 2, then axis 3, and so on, sending each one as it comes. By the time axis 9 is transmitted, a non-trivial amount of time has passed. For slow processes this is irrelevant, but for motion tracking, attitude estimation, or any real-time control loop, that inter-sample skew can corrupt your fusion algorithm. This article explains one sound approach to the problem and provides clean implementations in C and C++.

The Core Idea: Read Everything First, Then Send Once

The solution is conceptually simple. You treat all nine values as a single atomic snapshot. You read all of them into a tightly packed structure in memory as fast as the bus allows, and only after the last byte has arrived from the sensor do you begin transmitting. This way, the temporal spread of the data is bounded by the sensor read time alone, which is dominated by the I2C or SPI transaction, not by your serial baud rate or your loop overhead.

The transmission itself then becomes a single bulk write of raw bytes. No formatting, no ASCII conversion, no per-value delays. The receiver on the computer side reassembles the frame, optionally validates a checksum, and gets a coherent snapshot.
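
To put a rough number on the serial side, the sketch below estimates how long a frame occupies the wire. It assumes 8N1 framing (10 bit times per byte), and the helper name frame_tx_time_us is made up for illustration. Under that assumption a 21-byte frame at 115200 baud takes about 1.8 ms, which is the spread a send-as-you-read loop would smear across the axes.

```c
#include <assert.h>
#include <stdint.h>

/* Bits on the wire per data byte for 8N1 framing:
 * 1 start bit + 8 data bits + 1 stop bit (an assumption; adjust for parity). */
#define BITS_PER_BYTE_8N1 10u

/* Hypothetical helper: time in microseconds needed to shift n_bytes out of
 * a UART running at the given baud rate, rounded down. */
static uint32_t frame_tx_time_us(uint32_t n_bytes, uint32_t baud)
{
    uint64_t bits = (uint64_t)n_bytes * BITS_PER_BYTE_8N1;
    return (uint32_t)((bits * 1000000u) / baud);
}
```

Calling frame_tx_time_us(21, 115200) yields 1822 microseconds; a burst read over a fast bus is usually much shorter than that.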

Struct Packing and the __attribute__ Directive in C

When you define a C struct to hold your sensor frame and then cast its address to a byte pointer for transmission, you must be aware of structure padding. By default, the C compiler is allowed to insert padding bytes between fields, and after the last one, to satisfy alignment requirements. A struct with three int16_t fields and one uint8_t checksum field holds seven bytes of data, yet sizeof typically reports eight on a 32-bit platform because a trailing padding byte is added, while on an 8-bit AVR it reports seven. If you naively transmit sizeof(frame) bytes and the receiver expects a different layout, you will get garbage.

The GCC extension __attribute__((packed)) tells the compiler to suppress all padding and lay out the struct members back to back with no gaps. The syntax is placed either after the struct keyword or after the closing brace, and it applies to the entire struct.
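
As a small illustration of the difference, the sketch below declares the same three members twice, once with the default layout and once packed. The struct and member names are invented for this example. The packed size is fixed by the member sizes; the padded size depends on the target, so it is only bounded here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler may pad so that each member is aligned. */
typedef struct {
    uint8_t  seq;        /* 1 byte                                      */
    uint32_t timestamp;  /* often preceded by padding on 32-bit targets */
    int16_t  value;      /* often followed by trailing padding          */
} PaddedSample;

/* Packed layout: members are laid out back to back. */
typedef struct __attribute__((packed)) {
    uint8_t  seq;
    uint32_t timestamp;
    int16_t  value;
} PackedSample;

/* The packed size is the plain sum of the member sizes: 1 + 4 + 2. */
_Static_assert(sizeof(PackedSample) == 7, "packed struct has no padding");
_Static_assert(offsetof(PackedSample, timestamp) == 1, "no gap after seq");
_Static_assert(offsetof(PackedSample, value) == 5, "no gap after timestamp");

/* Padding can only add bytes, never remove them. */
_Static_assert(sizeof(PaddedSample) >= sizeof(PackedSample),
               "padding adds bytes");
```

Printing sizeof(PaddedSample) on your own target is a quick way to see how much padding its ABI introduces.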

There is a real trade-off here. Packed structs may cause unaligned memory accesses, which raise a hardware fault on some architectures (ARM Cortex-M0 and M0+, classic MIPS, older SPARC). On x86, and on Cortex-M3 and later unless the UNALIGN_TRP bit in the Configuration and Control Register is set, ordinary unaligned loads and stores work but carry a small performance penalty. On an AVR Arduino, all accesses are byte-wide anyway, so packing costs you nothing. GCC emits safe byte-wise code when you access a packed member by name; the trouble usually starts when you take the address of a member and dereference it through an ordinary pointer. A sound practice is to use __attribute__((packed)) only for wire-format structs that you access through memcpy or byte-pointer iteration, and to copy into a naturally aligned local struct for computation.
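
The copy-then-compute practice could look like the sketch below. WireSample, NativeSample and wire_to_native are hypothetical names chosen for this example, not part of the frame format used later in the article.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Wire format: packed, only ever filled bytewise or via memcpy. */
typedef struct __attribute__((packed)) {
    uint8_t  seq;
    int16_t  accel[3];
    uint32_t timestamp;   /* sits at the odd offset 7 */
} WireSample;

/* Working copy: natural alignment, safe for arithmetic on any target. */
typedef struct {
    uint32_t timestamp;
    int16_t  accel[3];
    uint8_t  seq;
} NativeSample;

/* Hypothetical helper: unpack received bytes into the aligned struct. */
static void wire_to_native(const uint8_t *raw, NativeSample *out)
{
    WireSample w;
    int i;

    memcpy(&w, raw, sizeof w);        /* bytewise copy, no alignment needed */
    out->seq       = w.seq;           /* by-name access to packed members   */
    out->timestamp = w.timestamp;
    for (i = 0; i < 3; ++i) {
        out->accel[i] = w.accel[i];
    }
}
```

After the call, all further arithmetic uses NativeSample, so no pointer to a packed member ever escapes.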

Other useful __attribute__ specifiers relevant to embedded work include __attribute__((aligned(N))), which forces the struct to start at an N-byte boundary in memory; __attribute__((section("name"))), which places a variable or function in a named linker section; and __attribute__((unused)), which suppresses the unused-variable warning without requiring a cast.
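
A brief sketch of two of these specifiers follows. The 16-byte requirement and the names dma_buffer and on_sample are assumptions for illustration only; the section attribute is left out because section naming depends on the linker script and object format.

```c
#include <assert.h>
#include <stdint.h>

/* Buffer forced onto a 16-byte boundary, e.g. for a DMA engine that
 * requires it (the requirement is an assumption for illustration). */
static uint8_t dma_buffer[64] __attribute__((aligned(16)));

/* Hypothetical callback whose second parameter is intentionally ignored;
 * the attribute silences the unused-parameter warning. */
static int on_sample(int value, void *context __attribute__((unused)))
{
    return value * 2;
}

/* Returns 1 if the buffer really starts on a 16-byte boundary. */
static int dma_buffer_is_aligned(void)
{
    return ((uintptr_t)dma_buffer % 16u) == 0;
}
```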

In the context of interrupt-driven serial code, __attribute__((interrupt)) or __attribute__((signal)) (AVR-specific) marks a function as an interrupt service routine, causing the compiler to emit a proper prologue and epilogue with register save and restore and the RETI instruction.

The __attribute__ syntax is a GCC extension that Clang also supports. It is not part of standard C. The C11 standard introduced _Alignas and _Alignof for alignment, and C23 adopts the [[...]] attribute syntax from C++, although it defines no standard packing attribute. For portability you can define a PACKED macro that expands to __attribute__((packed)) on GCC and Clang. MSVC has no such attribute; there you wrap the definition in #pragma pack(push, 1) and #pragma pack(pop), using the __pragma keyword if the directive has to come from a macro, since a macro cannot expand to a #pragma line. For embedded work targeting GCC you can use the attribute directly without apology.
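
One way such a portability layer might be written is sketched below. The macro names PACKED_BEGIN, PACKED_END and PACKED_ATTR and the PortableRecord struct are invented for this example; only the GCC/Clang branch has been reasoned through in detail, and the MSVC branch relies on its __pragma keyword.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portability macros: the names are illustrative. */
#if defined(_MSC_VER)
  /* MSVC: packing is a pragma, so it has to wrap the definition. */
  #define PACKED_BEGIN __pragma(pack(push, 1))
  #define PACKED_END   __pragma(pack(pop))
  #define PACKED_ATTR
#elif defined(__GNUC__) || defined(__clang__)
  /* GCC and Clang: packing is an attribute on the struct itself. */
  #define PACKED_BEGIN
  #define PACKED_END
  #define PACKED_ATTR  __attribute__((packed))
#else
  #error "No known packing mechanism for this compiler"
#endif

PACKED_BEGIN
typedef struct PACKED_ATTR {
    uint8_t  id;
    uint32_t counter;
} PortableRecord;
PACKED_END

/* 1 + 4 bytes with no padding in between. */
_Static_assert(sizeof(PortableRecord) == 5, "PortableRecord must be packed");
```

The size assertion is what makes the macro trustworthy: if a new compiler falls through to a branch that does not pack, the build stops.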

Implementation in C

The example below targets an Arduino-compatible environment but the principles apply to any microcontroller with a UART. The sensor read functions are stubs; these should be replaced with actual I2C or SPI driver calls. The frame format uses a two-byte start-of-frame marker, nine int16_t axis values, and a one-byte XOR checksum over the payload bytes.

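#include <stddef.h>
#include <stdint.h>

/* Driver hooks, assumed to be implemented elsewhere in the project. */
int16_t sensor_read_accel_x(void), sensor_read_accel_y(void), sensor_read_accel_z(void);
int16_t sensor_read_gyro_x(void),  sensor_read_gyro_y(void),  sensor_read_gyro_z(void);
int16_t sensor_read_mag_x(void),   sensor_read_mag_y(void),   sensor_read_mag_z(void);
void    Serial_write_bytes(const uint8_t *data, size_t len);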

/*
 * Wire-format frame. __attribute__((packed)) ensures no padding bytes
 * are inserted by the compiler. This struct is used only for serialisation;
 * do not perform arithmetic on its fields directly on platforms that fault
 * on unaligned access.
 */
typedef struct __attribute__((packed)) {
    uint8_t  sof[2];      /* Start-of-frame marker: 0xAA 0x55          */
    int16_t  accel[3];    /* Accelerometer X, Y, Z  (raw ADC counts)   */
    int16_t  gyro[3];     /* Gyroscope     X, Y, Z  (raw ADC counts)   */
    int16_t  mag[3];      /* Magnetometer  X, Y, Z  (raw ADC counts)   */
    uint8_t  checksum;    /* XOR of all payload bytes                  */
} SensorFrame;

/* Compute XOR checksum over the payload region of the frame.            */
static uint8_t compute_checksum(const SensorFrame *f)
{
    const uint8_t *p   = (const uint8_t *)f->accel;
    const uint8_t *end = (const uint8_t *)&f->checksum;
    uint8_t csum = 0;
    while (p < end) {
        csum ^= *p++;
    }
    return csum;
}

/*
 * Read all nine axes from the sensor into a SensorFrame.
 * All reads happen before any byte is placed on the wire.
 * Replace the stub bodies with your actual driver calls.
 */
static void read_all_axes(SensorFrame *f)
{
    f->sof[0] = 0xAA;
    f->sof[1] = 0x55;

    /* --- Read accelerometer (single I2C/SPI burst preferred) --- */
    f->accel[0] = sensor_read_accel_x();
    f->accel[1] = sensor_read_accel_y();
    f->accel[2] = sensor_read_accel_z();

    /* --- Read gyroscope ---------------------------------------- */
    f->gyro[0]  = sensor_read_gyro_x();
    f->gyro[1]  = sensor_read_gyro_y();
    f->gyro[2]  = sensor_read_gyro_z();

    /* --- Read magnetometer ------------------------------------- */
    f->mag[0]   = sensor_read_mag_x();
    f->mag[1]   = sensor_read_mag_y();
    f->mag[2]   = sensor_read_mag_z();

    f->checksum = compute_checksum(f);
}

/*
 * Transmit the frame as a single binary blob.
 * Serial_write_bytes() is a thin wrapper over the UART TX register
 * or ring buffer; it does not block waiting for the line to clear.
 */
static void send_frame(const SensorFrame *f)
{
    Serial_write_bytes((const uint8_t *)f, sizeof(SensorFrame));
}

/* Main loop ------------------------------------------------------------ */
void loop(void)
{
    SensorFrame frame;
    read_all_axes(&frame);   /* atomic snapshot */
    send_frame(&frame);      /* single bulk write */
}

Key observations: sizeof(SensorFrame) is exactly 2 + 18 + 1 = 21 bytes with packing. Without __attribute__((packed)), on a platform where int16_t has 2-byte alignment, no padding is needed between the fields because each one already falls on an even offset, but the compiler rounds the struct size up to a multiple of its alignment and appends one trailing byte after checksum, giving 22. On AVR, where every type has 1-byte alignment, the size is 21 either way. A receiver that expects 21-byte frames would therefore slip by one byte per frame as soon as the unpacked firmware moves to a 32-bit MCU. The danger grows in more complex structs mixing 8-bit, 16-bit, and 32-bit fields, which is common when adding timestamp fields or sequence numbers. Applying __attribute__((packed)) defensively and consistently to all wire-format structs avoids surprises as the struct evolves.
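
The size difference can be checked directly. The sketch below declares the frame twice, with and without packing, under the invented names UnpackedFrame and PackedFrame; tail_padding() is a hypothetical helper that reports how many bytes the default layout adds. On targets where int16_t is 2-byte aligned it should report 1, and on AVR 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same members as the wire frame, but with the default layout. */
typedef struct {
    uint8_t  sof[2];
    int16_t  accel[3];
    int16_t  gyro[3];
    int16_t  mag[3];
    uint8_t  checksum;
} UnpackedFrame;

/* The packed variant used on the wire. */
typedef struct __attribute__((packed)) {
    uint8_t  sof[2];
    int16_t  accel[3];
    int16_t  gyro[3];
    int16_t  mag[3];
    uint8_t  checksum;
} PackedFrame;

_Static_assert(sizeof(PackedFrame) == 21, "wire frame is 21 bytes");

/* Every int16_t array already starts on an even offset, so the default
 * layout needs no padding between members. */
_Static_assert(offsetof(UnpackedFrame, checksum) == 20,
               "no internal padding expected");

/* Number of trailing padding bytes the default layout appends, if any. */
static size_t tail_padding(void)
{
    return sizeof(UnpackedFrame) - sizeof(PackedFrame);
}
```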

Implementation in C++

The C++ version wraps the same ideas in a class with a cleaner interface. It also demonstrates static_assert to catch layout surprises at compile time, which is idiomatic modern embedded C++.

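#include <cstddef>
#include <cstdint>

/* Driver hooks, assumed to be implemented elsewhere in the project. */
int16_t sensor_read_accel_x(), sensor_read_accel_y(), sensor_read_accel_z();
int16_t sensor_read_gyro_x(),  sensor_read_gyro_y(),  sensor_read_gyro_z();
int16_t sensor_read_mag_x(),   sensor_read_mag_y(),   sensor_read_mag_z();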

/*
 * Wire-format frame, packed for serialisation.
 * static_assert enforces the expected size at compile time.
 */
struct __attribute__((packed)) SensorFrame {
    uint8_t  sof[2];
    int16_t  accel[3];
    int16_t  gyro[3];
    int16_t  mag[3];
    uint8_t  checksum;

    static constexpr uint8_t SOF0 = 0xAA;
    static constexpr uint8_t SOF1 = 0x55;
};

static_assert(sizeof(SensorFrame) == 21,
    "SensorFrame layout mismatch: check compiler packing");

/*
 * ImuSensor encapsulates the read-all-then-send pattern.
 * Template parameter SerialPort must provide:
 *   void write(const uint8_t* data, size_t len);
 */
template<typename SerialPort>
class ImuSensor {
public:
    explicit ImuSensor(SerialPort &port) : port_(port) {}

    /*
     * Capture a coherent snapshot and transmit it.
     * No data leaves the MCU until every axis has been read.
     */
    void capture_and_send()
    {
        SensorFrame frame;
        read_snapshot(frame);
        port_.write(reinterpret_cast<const uint8_t *>(&frame),
                    sizeof(frame));
    }

private:
    SerialPort &port_;

    static void read_snapshot(SensorFrame &f)
    {
        f.sof[0] = SensorFrame::SOF0;
        f.sof[1] = SensorFrame::SOF1;

        /* Ideally replaced by a single burst read from the sensor's
         * auto-increment register address, minimising bus idle time. */
        f.accel[0] = sensor_read_accel_x();
        f.accel[1] = sensor_read_accel_y();
        f.accel[2] = sensor_read_accel_z();

        f.gyro[0]  = sensor_read_gyro_x();
        f.gyro[1]  = sensor_read_gyro_y();
        f.gyro[2]  = sensor_read_gyro_z();

        f.mag[0]   = sensor_read_mag_x();
        f.mag[1]   = sensor_read_mag_y();
        f.mag[2]   = sensor_read_mag_z();

        f.checksum = compute_checksum(f);
    }

    static uint8_t compute_checksum(const SensorFrame &f)
    {
        const auto *begin = reinterpret_cast<const uint8_t *>(f.accel);
        const auto *end   = &f.checksum;
        uint8_t csum = 0;
        for (const auto *p = begin; p < end; ++p) {
            csum ^= *p;
        }
        return csum;
    }
};

/* Usage example -------------------------------------------------------- */
void loop()
{
    static ImuSensor<HardwareSerial> imu(Serial);
    imu.capture_and_send();
}

The static_assert on sizeof(SensorFrame) is not optional decoration. If someone later adds, removes, or resizes a field, or builds with a compiler that pads the struct differently, the build fails instead of the link producing cryptic runtime data corruption, and the failure is a prompt to update the receiver as well. A size check cannot catch edits that keep the total at 21 bytes, such as swapping two fields. In the C version you would use _Static_assert (C11) or a legacy trick like typedef char size_check[(sizeof(SensorFrame)==21)?1:-1].
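
A C rendering of these checks might look like the following sketch. It repeats the frame definition so that it stands alone, and it adds offsetof assertions, which go one step further than the size check by pinning each field's position. The typedef name sensor_frame_size_check is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct __attribute__((packed)) {
    uint8_t  sof[2];
    int16_t  accel[3];
    int16_t  gyro[3];
    int16_t  mag[3];
    uint8_t  checksum;
} SensorFrame;

/* C11 compile-time checks: total size plus the offset of every field. */
_Static_assert(sizeof(SensorFrame) == 21, "frame size changed");
_Static_assert(offsetof(SensorFrame, accel) == 2, "accel moved");
_Static_assert(offsetof(SensorFrame, gyro) == 8, "gyro moved");
_Static_assert(offsetof(SensorFrame, mag) == 14, "mag moved");
_Static_assert(offsetof(SensorFrame, checksum) == 20, "checksum moved");

/* Pre-C11 fallback: a negative array size is a compile error. */
typedef char sensor_frame_size_check[(sizeof(SensorFrame) == 21) ? 1 : -1];
```

With the offsets pinned, swapping gyro and mag breaks the build even though the total size is unchanged.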

Receiver Side: Python Sketch

For completeness, here is how the host software reassembles frames. It scans for the SOF marker, reads exactly sizeof(SensorFrame) – 2 more bytes, verifies the checksum, and hands a coherent nine-value snapshot to the processing layer.

import serial
import struct

SOF       = b'\xAA\x55'
FRAME_FMT = '<9h'          # nine little-endian signed 16-bit integers
PAYLOAD_SZ = struct.calcsize(FRAME_FMT)   # 18 bytes
FRAME_SZ   = 2 + PAYLOAD_SZ + 1          # SOF + payload + checksum

def xor_checksum(data: bytes) -> int:
    result = 0
    for b in data:
        result ^= b
    return result

def read_frame(port: serial.Serial) -> tuple | None:
    # Synchronise to SOF, keeping only the last two bytes seen
    window = b''
    while window != SOF:
        byte = port.read(1)
        if not byte:
            return None      # read timed out before a marker arrived
        window = (window + byte)[-2:]

    payload_and_csum = port.read(PAYLOAD_SZ + 1)
    if len(payload_and_csum) != PAYLOAD_SZ + 1:
        return None          # timed out mid-frame, discard
    payload  = payload_and_csum[:-1]
    received = payload_and_csum[-1]
    expected = xor_checksum(payload)
    if received != expected:
        return None          # checksum mismatch, discard

    values = struct.unpack(FRAME_FMT, payload)
    return values            # (ax, ay, az, gx, gy, gz, mx, my, mz)
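
For hosts or gateways that do not run Python, the same resynchronisation idea can be sketched in C over a buffer of already received bytes. The function name find_frame and the FRAME_* constants are assumptions made for this example. Because the marker bytes 0xAA 0x55 can also occur inside the payload, a candidate is accepted only when its checksum matches too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_SOF0     0xAAu
#define FRAME_SOF1     0x55u
#define FRAME_PAYLOAD  18u                        /* nine int16_t values     */
#define FRAME_SIZE     (2u + FRAME_PAYLOAD + 1u)  /* SOF + payload + checksum */

/* Hypothetical helper: scan buf for the first complete, checksum-valid
 * frame. Returns its start offset, or -1 if none is found. */
static long find_frame(const uint8_t *buf, size_t len)
{
    size_t i, k;

    for (i = 0; i + FRAME_SIZE <= len; ++i) {
        uint8_t csum = 0;

        if (buf[i] != FRAME_SOF0 || buf[i + 1] != FRAME_SOF1) {
            continue;                 /* not a start-of-frame marker */
        }
        for (k = 0; k < FRAME_PAYLOAD; ++k) {
            csum ^= buf[i + 2 + k];   /* XOR over the payload only */
        }
        if (csum == buf[i + 2 + FRAME_PAYLOAD]) {
            return (long)i;           /* marker and checksum both match */
        }
        /* Marker bytes occurred inside data: keep scanning. */
    }
    return -1;
}
```

A caller would append incoming bytes to a buffer, call find_frame, and drop everything before the returned offset plus one frame length.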

Endianness

AVR Arduinos are little-endian, and most ARM Cortex-M targets are little-endian as well. The struct format string on the Python side uses the < prefix to explicitly request little-endian unpacking. If you are targeting a big-endian MCU (some older DSPs, PowerPC-based boards, certain MIPS configurations), either byte-swap each value on the MCU with __builtin_bswap16 before sending or change the host format string to use the > prefix. Making endianness explicit and documented in the frame specification prevents subtle bugs when porting.
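
An alternative to swapping conditionally is to serialise each value with shifts, which produces the same bytes regardless of the MCU's native byte order. The helpers put_le16 and get_le16 below are illustrative names, not part of the code above.

```c
#include <assert.h>
#include <stdint.h>

/* Write a signed 16-bit value as two little-endian bytes.
 * Shifts operate on values, so the result is the same on any host. */
static void put_le16(uint8_t *dst, int16_t value)
{
    uint16_t u = (uint16_t)value;
    dst[0] = (uint8_t)(u & 0xFFu);   /* low byte first   */
    dst[1] = (uint8_t)(u >> 8);      /* high byte second */
}

/* Read two little-endian bytes back into a signed 16-bit value.
 * The final cast relies on two's complement conversion, which GCC and
 * Clang provide. */
static int16_t get_le16(const uint8_t *src)
{
    uint16_t u = (uint16_t)(src[0] | ((uint16_t)src[1] << 8));
    return (int16_t)u;
}
```

The cost is a few extra instructions per value compared with sending the struct's memory image directly.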

Getting the Sensor Read Time as Low as Possible

The remaining source of timestamp skew is the sensor bus transaction itself. Most IMU sensors (MPU-6050, ICM-42688, LSM6DSO, BMI270 and their relatives) expose their axis registers at consecutive addresses. Use your I2C or SPI driver’s burst-read call to pull all registers in a single transaction with the address auto-incrementing. This collapses the read from nine separate transactions to one, or to two when the magnetometer is a separate device, as it is alongside the six-axis parts named above, and removes most of the per-transaction overhead that makes up the inter-sample skew. Some sensor families also provide a data-ready interrupt and latch a complete sample set into shadow registers, so that a burst read returns values from a single sampling instant.
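
A burst read might be structured like the sketch below, which uses the MPU-6050 register layout as its example: 14 consecutive bytes starting at ACCEL_XOUT_H (0x3B), each value high byte first, with the temperature reading sitting between accelerometer and gyroscope. The function i2c_read_regs is a hypothetical driver call, mocked here with a fake register file so the sketch can run on a host; a magnetometer would need its own read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMU_ADDR       0x68u  /* MPU-6050 default I2C address (AD0 low)   */
#define IMU_DATA_START 0x3Bu  /* ACCEL_XOUT_H, first of 14 data registers */
#define IMU_DATA_LEN   14u    /* accel (6) + temperature (2) + gyro (6)   */

/* Stand-in for a real driver call: i2c_read_regs() is a hypothetical name.
 * This mock copies bytes from a fake register file. */
static uint8_t fake_regs[256];
static int i2c_read_regs(uint8_t dev, uint8_t reg, uint8_t *buf, size_t len)
{
    (void)dev;
    memcpy(buf, &fake_regs[reg], len);
    return 0;
}

/* The registers hold each value high byte first. */
static int16_t be16(const uint8_t *p)
{
    return (int16_t)(uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* One bus transaction for accel and gyro; returns 0 on success. */
static int read_accel_gyro(int16_t accel[3], int16_t gyro[3])
{
    uint8_t raw[IMU_DATA_LEN];
    int i;

    if (i2c_read_regs(IMU_ADDR, IMU_DATA_START, raw, sizeof raw) != 0) {
        return -1;
    }
    for (i = 0; i < 3; ++i) {
        accel[i] = be16(&raw[2 * i]);       /* bytes 0..5                     */
        gyro[i]  = be16(&raw[8 + 2 * i]);   /* bytes 8..13, after temperature */
    }
    return 0;
}
```

In real firmware the mock would be replaced by the platform's I2C driver, and the decoded values would be stored into the frame before the checksum is computed.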

Summary

The pattern is: burst-read all axes into a packed struct, compute checksum, transmit the whole struct as raw bytes in one write call. Use __attribute__((packed)) on the wire-format struct to eliminate padding surprises, and validate the layout with static_assert. On the receiver, synchronise to the SOF marker, verify the checksum, and pass the entire coherent snapshot to your signal processing code. This ensures that all nine axis values your software sees belong to the same sensor integration window, with no inter-axis skew introduced by the transmission layer.

Happy coding!

73
