| name | firmware-review |
| description | Reviews Rust firmware code for the BVR (Base Vectoring Rover) with focus on safety-critical systems, CAN bus protocol compliance, motor control logic, state machine correctness, and embedded testing patterns. Use when reviewing BVR firmware changes, debugging actuator control, testing motor communication, validating safety mechanisms, checking async patterns, or evaluating control system modifications. Covers watchdog implementation, e-stop handling, rate limiting, VESC motor controller integration, and Tokio async runtime patterns. |
| allowed-tools | Read, Grep, Glob, Bash(cargo:test), Bash(cargo:check), Bash(cargo:build) |
BVR Firmware Code Review Skill
This skill provides comprehensive code review for the BVR (Base Vectoring Rover) firmware, focusing on safety-critical systems, real-time control, and CAN bus communication.
Overview
The BVR firmware runs on a Jetson Orin NX and controls a four-wheel differential drive rover using VESC motor controllers over CAN bus. The system is safety-critical with multiple layers of protection including watchdogs, e-stop mechanisms, and rate limiting.
Key Architecture:
- Platform: Jetson Orin NX (aarch64-unknown-linux-gnu)
- Language: Rust Edition 2021
- Runtime: Tokio async multi-threaded
- Communication: SocketCAN (can0 interface at 500kHz)
- Workspace: 18 crates in
bvr/firmware/ - Main daemon: bvrd (1191 lines in
bins/bvrd/src/main.rs)
Critical Files:
- State machine:
crates/state/src/lib.rs(130 lines) - Control logic:
crates/control/src/lib.rs(243 lines) - CAN abstraction:
crates/can/src/lib.rs - VESC protocol:
crates/can/src/vesc.rs - LED control:
crates/can/src/leds.rs - Main daemon:
bins/bvrd/src/main.rs(1191 lines) - Configuration:
config/bvr.toml(152 lines)
Safety-Critical Review Checklist
1. Watchdog Implementation
Purpose: Detect communication timeouts and transition to safe state (Idle) when commands stop arriving.
Location: bvr/firmware/crates/control/src/lib.rs:162-193
Key Points to Review:
- Watchdog initialized with appropriate timeout (default: 500ms)
-
feed()called on every valid command reception -
is_timed_out()checked in main control loop - Timeout triggers
Event::CommandTimeoutin state machine - Watchdog reset on e-stop or mode transitions
Example Pattern:
// Good: Watchdog properly integrated
let mut watchdog = Watchdog::new(Duration::from_millis(500));
loop {
if let Some(cmd) = recv_command() {
watchdog.feed(); // Reset timer on command
}
if watchdog.is_timed_out() {
state_machine.handle(Event::CommandTimeout);
// Motors automatically stop in Idle mode
}
}
Red Flags:
- Watchdog created but never checked
- Timeout value too long (>1 second for teleop)
feed()called unconditionally (bypasses timeout detection)- No transition to safe state on timeout
See: safety-checklist.md for detailed verification steps.
2. E-Stop State Machine
Purpose: Immediate safety override from any operational mode, requiring explicit release to resume.
Location: bvr/firmware/crates/state/src/lib.rs:7-130
State Machine Modes:
Disabled- Initial state, motors offIdle- Ready state, motors enabled but no movementTeleop- Human control via gamepad/keyboardAutonomous- Autonomous navigationEStop- Emergency stop, requires releaseFault- Error state, requires fault clear
Valid E-Stop Transitions:
- ANY mode →
EStop(viaEvent::EStop) EStop→Idle(viaEvent::EStopReleaseonly)
Key Points to Review:
-
Event::EStophandled from all active modes (Idle, Teleop, Autonomous) - Cannot exit e-stop without explicit
Event::EStopRelease -
force_estop()method available for critical overrides - LED feedback tied to e-stop state (red pulsing at 200ms)
- Motor commands blocked in e-stop mode
- State logged at each transition with old/new mode
Example Pattern:
// Good: Proper e-stop handling
match (self.mode, event) {
(_, Event::EStop) => {
info!(?self.mode, "E-stop triggered");
self.mode = Mode::EStop;
// Motors automatically stopped by is_driving() check
}
(Mode::EStop, Event::EStopRelease) => {
info!("E-stop released, transitioning to Idle");
self.mode = Mode::Idle;
}
(Mode::EStop, Event::Enable) => {
warn!("Cannot enable directly from e-stop, must release first");
// No state change - requires explicit release
}
// ...
}
Red Flags:
- Direct transition from e-stop to driving modes without release
- E-stop event not handled in all modes
- State machine allows bypassing e-stop
- Missing logging on e-stop transitions
See: state-machine.md for complete transition diagram.
3. Rate Limiting and Acceleration Control
Purpose: Progressive acceleration/deceleration to prevent wheel slip, mechanical stress, and loss of control.
Location: bvr/firmware/crates/control/src/lib.rs:101-160
Rate Limiter Configuration:
- Max acceleration: 50.0 m/s² (effectively unlimited for teleop responsiveness)
- Max deceleration: 15.0 m/s² (higher for quicker stops)
- Max linear velocity: 5.0 m/s (from chassis config)
- Max angular velocity: 2.5 rad/s (from chassis config)
Key Points to Review:
- Rate limiter applied before motor commands sent
- Direction change detection (applies deceleration rate)
- Velocity clamped to absolute limits
- Rate limiter reset on e-stop or mode change
- Delta time (dt) used for rate calculations
- No divide-by-zero when dt is very small
Example Pattern:
// Good: Rate limiting with direction change detection
impl RateLimiter {
pub fn limit(&mut self, target: Twist, dt: f32) -> Twist {
// Detect direction change (requires deceleration)
let accel = if self.prev.linear * target.linear < 0.0 {
self.max_decel // Braking
} else {
self.max_accel // Accelerating
};
// Apply rate limit
let delta = target.linear - self.prev.linear;
let max_delta = accel * dt;
let limited = self.prev.linear + delta.clamp(-max_delta, max_delta);
// Clamp to absolute limits
Twist {
linear: limited.clamp(-self.max_linear, self.max_linear),
angular: /* similar for angular */,
boost: target.boost,
}
}
}
Red Flags:
- Raw target velocity sent directly to motors
- No rate limiting on angular velocity
- Acceleration limits too high (>100 m/s²)
- Rate limiter bypassed in autonomous mode
- Missing dt parameter or using fixed dt
See: safety-checklist.md for validation tests.
4. State Machine Transitions
Purpose: Enforce valid operational mode transitions and prevent unsafe state combinations.
Key Points to Review:
- All state transitions explicitly defined
- Invalid transitions logged and rejected
-
is_driving()check used to gate motor commands -
is_safe()check used for configuration changes - Unit tests cover all transition paths
- Logging includes both old and new modes
Critical Checks:
// is_driving() - only Teleop and Autonomous should drive
pub fn is_driving(&self) -> bool {
matches!(self.mode, Mode::Teleop | Mode::Autonomous)
}
// is_safe() - only allow config changes in safe modes
pub fn is_safe(&self) -> bool {
matches!(self.mode, Mode::Disabled | Mode::Idle | Mode::EStop)
}
Example Pattern:
// Good: Explicit transition handling with logging
match (self.mode, event) {
(Mode::Idle, Event::Enable) => {
self.mode = Mode::Teleop;
info!(old = ?Mode::Idle, new = ?Mode::Teleop, "Mode transition");
}
(Mode::Disabled, Event::Enable) => {
warn!("Cannot enable from Disabled, transition to Idle first");
// Reject transition
}
_ => {
debug!(?self.mode, ?event, "Unhandled event");
}
}
See: state-machine.md for full transition table and unit tests.
CAN Bus Protocol Review
1. VESC Motor Controller Communication
Purpose: Send velocity/duty commands to 4 VESC motor controllers (FL, FR, RL, RR) and receive status.
Location: bvr/firmware/crates/can/src/vesc.rs
VESC CAN IDs: 0, 1, 2, 3 (configured in bvr.toml)
- ID 0: Front Left (FL)
- ID 1: Front Right (FR)
- ID 2: Rear Left (RL)
- ID 3: Rear Right (RR)
Command Frame Format (Extended CAN):
- Frame ID:
(command_id << 8) | vesc_id - Data Length: 4 bytes (duty), 4 bytes (RPM), 4 bytes (current)
- Byte Order: Big-endian
Command Types:
SetDuty (0): Duty cycle control, -1.0 to 1.0 scaled to ±100,000SetRpm (3): Electrical RPM (divide by pole pairs for mechanical)SetCurrent (1): Current control in milliamps
Key Points to Review:
- VESC IDs match configuration (0-3 for 4 motors)
- Command ID correctly shifted to bits [8:15]
- Big-endian byte order used (VESC protocol requirement)
- Duty cycle clamped to ±1.0 before scaling
- Status parsing includes bounds checking (
data.len() >= 8) - Heartbeat timeout configured (100ms default)
- Status frames parsed correctly (STATUS1, STATUS4, STATUS5)
Example Pattern:
// Good: VESC command with proper encoding
pub fn set_duty(&mut self, duty: f32) -> Result<Frame, CanError> {
let clamped_duty = duty.clamp(-1.0, 1.0);
let duty_value = (clamped_duty * 100_000.0) as i32;
let frame_id = (CMD_SET_DUTY << 8) | self.id; // Extended ID
let data = duty_value.to_be_bytes(); // Big-endian
Frame::new(Id::Extended(frame_id), &data)
}
// Good: Safe status parsing with bounds check
fn parse_status1(&mut self, data: &[u8]) {
if data.len() < 8 {
warn!("STATUS1 frame too short");
return;
}
self.state.status.erpm = i32::from_be_bytes([data[0], data[1], data[2], data[3]]);
self.state.status.current = i16::from_be_bytes([data[4], data[5]]) as f32 / 10.0;
self.state.status.duty = i16::from_be_bytes([data[6], data[7]]) as f32 / 1000.0;
}
Red Flags:
- Little-endian byte order used (VESC expects big-endian)
- No bounds checking on status frame parsing (array panic risk)
- Duty cycle not clamped (can send invalid values >1.0)
- VESC IDs hardcoded instead of using configuration
- Missing timeout detection for lost VESCs
See: can-protocol.md for complete VESC protocol reference.
2. MCU LED Peripheral Communication
Purpose: Send LED mode commands to RP2350/ESP32-S3 MCU for rover status indication.
Location: bvr/firmware/crates/can/src/leds.rs
LED CAN ID Range: 0x0B00-0x0BFF (extended IDs for peripherals)
- LED_CMD: 0x0B00 (Jetson → MCU)
- LED_STATUS: 0x0B01 (MCU → Jetson)
LED Modes:
StateLinked (0x10): MCU automatically sets color based on rover modeSolid: Fixed colorPulse: Breathing effect (configurable period)Flash: Strobe effect (configurable period)Chase: Moving pattern
Rover State Colors:
- Disabled: Red solid
- Idle: Blue solid
- Teleop: Green pulse (2s period)
- Autonomous: Cyan pulse (1.5s period)
- EStop: Red flash (200ms, full brightness)
- Fault: Orange flash (500ms)
Key Points to Review:
- LED commands use 0x0B00 base address
- StateLinked mode used for automatic state indication
- LED state synchronized with state machine transitions
- Color choices distinct and recognizable
- MCU communication timeout handled gracefully
Example Pattern:
// Good: State-linked LED control
pub fn update_leds(&mut self, mode: Mode) {
let led_mode = match mode {
Mode::Idle => LedMode::idle(), // Blue solid
Mode::Teleop => LedMode::teleop(), // Green pulse 2s
Mode::Autonomous => LedMode::autonomous(), // Cyan pulse 1.5s
Mode::EStop => LedMode::estop(), // Red flash 200ms
Mode::Fault => LedMode::fault(), // Orange flash 500ms
_ => LedMode::off(),
};
self.send_led_command(led_mode)?;
}
See: can-protocol.md for LED protocol details.
3. CAN Bus Error Handling
Key Points to Review:
- Timeout handling for lost CAN frames (10ms read/write timeout)
- Invalid frame ID detection and rejection
- Proper use of
Resulttypes withCanError - Mock CAN bus for non-Linux development builds
- Error logging includes context (frame ID, data length, timeout)
Example Pattern:
// Good: Comprehensive error handling
#[derive(Error, Debug)]
pub enum CanError {
#[error("Socket error: {0}")]
Socket(String),
#[error("Invalid CAN ID: {0}")]
InvalidId(u32),
#[error("Timeout waiting for response")]
Timeout,
#[error("Invalid frame data")]
InvalidFrame,
}
// Good: Non-blocking receive with timeout
match self.socket.read_frame_timeout(Duration::from_millis(10)) {
Ok(frame) => { /* process */ }
Err(e) if e.kind() == ErrorKind::WouldBlock => { /* no data */ }
Err(e) => {
error!(?e, "CAN socket read error");
return Err(CanError::Socket(e.to_string()));
}
}
Code Pattern Review
1. Error Handling with thiserror
Convention: Use thiserror for library crates, anyhow for binary crates.
Key Points to Review:
- Library crates define custom error enums with
thiserror - Error variants include context with
#[error("...")] -
#[from]attribute used for error conversion - Binary crates use
anyhow::Resultand.context() - Errors logged with
?e(Debug) not%e(Display)
Example Pattern:
// Library crate (crates/teleop/src/lib.rs)
use thiserror::Error;
#[derive(Error, Debug)]
pub enum TeleopError {
#[error("Network error: {0}")]
Network(#[from] std::io::Error),
#[error("Serialization error: {0}")]
Serialization(String),
#[error("Connection timeout")]
Timeout,
}
// Binary crate (bins/bvrd/src/main.rs)
use anyhow::{Context, Result};
fn load_config() -> Result<Config> {
let contents = fs::read_to_string("bvr.toml")
.context("Failed to read bvr.toml")?;
toml::from_str(&contents)
.context("Failed to parse bvr.toml")
}
Red Flags:
- String-based errors in libraries (
Result<T, String>) - Missing error context in anyhow chains
- Panic on recoverable errors (
unwrap(),expect()in libraries) - Display format used in logs (should use Debug:
?e)
2. Async/Tokio Patterns
Runtime: Multi-threaded Tokio (#[tokio::main])
Key Points to Review:
-
#[tokio::main]on main function - Channels for inter-task communication (mpsc, watch)
- Tasks spawned with
tokio::spawnfor concurrent subsystems - Proper error handling in spawned tasks (log errors, don't panic)
- Graceful shutdown with signal handling
- No blocking operations in async context
Example Pattern:
// Good: Async main with spawned tasks
#[tokio::main]
async fn main() -> Result<()> {
let (cmd_tx, cmd_rx) = mpsc::channel::<Command>(100);
let (telem_tx, telem_rx) = watch::channel(Telemetry::default());
// Spawn teleop server task
tokio::spawn(async move {
if let Err(e) = teleop_server.run(cmd_tx, telem_rx).await {
error!(?e, "Teleop server error");
}
});
// Spawn metrics task
tokio::spawn(async move {
if let Err(e) = metrics.run(telem_rx).await {
error!(?e, "Metrics task error");
}
});
// Main control loop
loop {
tokio::select! {
Some(cmd) = cmd_rx.recv() => { /* handle */ }
_ = tokio::signal::ctrl_c() => {
info!("Shutdown signal received");
break;
}
}
}
Ok(())
}
Red Flags:
- Blocking calls (std::thread::sleep, blocking I/O) in async functions
- Missing error handling in spawned tasks
- Tasks not joined/awaited on shutdown (resource leaks)
- Unbounded channels (prefer bounded with capacity)
3. Module Documentation
Convention: Module-level //! docs at top of lib.rs files, function-level /// docs.
Key Points to Review:
-
//!module-level docs describe crate purpose -
///function docs include examples for public API - Struct fields documented with
/// - No missing_docs warnings in library crates
Example Pattern:
//! State machine and mode management for bvr.
//!
//! This crate implements the rover's operational modes and state
//! transitions, including safety mechanisms like e-stop handling.
/// Represents the rover's current operational mode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
/// Motors disabled, awaiting initialization
Disabled,
/// Ready state, motors enabled but stationary
Idle,
/// Under human control via teleop
Teleop,
// ...
}
/// Handles a state machine event and transitions to a new mode if valid.
///
/// # Example
/// ```
/// let mut sm = StateMachine::new();
/// sm.handle(Event::Enable);
/// assert_eq!(sm.mode(), Mode::Idle);
/// ```
pub fn handle(&mut self, event: Event) {
// ...
}
4. Workspace Dependencies
Convention: Use version.workspace = true for shared dependencies.
Key Points to Review:
- Workspace dependencies defined in root
Cargo.toml - Crate
Cargo.tomlusesversion.workspace = true - Consistent versions across all crates (tokio, serde, etc.)
- Optional features documented
Example Pattern:
# Root bvr/firmware/Cargo.toml
[workspace.dependencies]
tokio = { version = "1.41", features = ["full"] }
serde = { version = "1.0", features = ["derive"] }
thiserror = "2.0"
# Crate Cargo.toml
[dependencies]
tokio = { workspace = true }
serde = { workspace = true }
thiserror = { workspace = true }
Testing Requirements
1. Unit Tests
Convention: Tests in #[cfg(test)] mod tests block at bottom of file.
Key Points to Review:
- Safety-critical functions have unit tests
- State machine transitions tested
- Edge cases covered (zero dt, negative values, etc.)
- Test names describe behavior (
test_watchdog_timeout_triggers) - Use
assert!with helpful messages
Example Pattern:
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_watchdog_timeout_triggers() {
let mut wd = Watchdog::new(Duration::from_millis(100));
assert!(wd.is_timed_out(), "Watchdog should timeout initially");
wd.feed();
assert!(!wd.is_timed_out(), "Watchdog should not timeout after feed");
std::thread::sleep(Duration::from_millis(150));
assert!(wd.is_timed_out(), "Watchdog should timeout after duration");
}
#[test]
fn test_estop_requires_explicit_release() {
let mut sm = StateMachine::new();
sm.handle(Event::Enable);
assert_eq!(sm.mode(), Mode::Teleop);
sm.handle(Event::EStop);
assert_eq!(sm.mode(), Mode::EStop);
sm.handle(Event::Enable);
assert_eq!(sm.mode(), Mode::EStop, "Cannot enable directly from e-stop");
sm.handle(Event::EStopRelease);
assert_eq!(sm.mode(), Mode::Idle, "Release transitions to Idle");
}
}
2. Integration Tests
Key Points to Review:
- CAN bus simulation/mocking for development
- Teleop protocol tests with mock clients
- Configuration loading tested with fixtures
See: Run tests with cargo test (native) or cargo test --target aarch64-unknown-linux-gnu (cross).
References and Additional Resources
For more detailed information, see:
- safety-checklist.md - Line-by-line safety verification
- can-protocol.md - Complete VESC and LED protocol reference
- state-machine.md - State transition diagram and LED feedback
- CLAUDE.md - Project-wide conventions and architecture
Quick Review Commands
# Run all tests
cargo test
# Check compilation without building
cargo check
# Build for Jetson (requires cross)
cargo build --release --target aarch64-unknown-linux-gnu
# Deploy to rover
./deploy.sh <rover-hostname>