| name | ebpf-packet-redirect |
| description | Implement packet redirection and routing in eBPF programs using bpf_redirect and bpf_redirect_neigh helpers. Includes source-based policy routing, map-based routing tables, load balancing, and CNF router patterns. Use when building routers, gateways, load balancers, or any CNF that needs to control packet forwarding paths. |
eBPF Packet Redirect Skill
This skill covers how to implement packet redirection and routing in eBPF-based CNFs, from single-interface redirects to source-based policy routers.
What This Skill Does
Generates code for:
- Basic packet redirection with bpf_redirect
- Neighbor-aware redirection with bpf_redirect_neigh
- Source-based policy routing
- Map-based routing tables
- Dynamic route updates from userspace
- Load balancing across multiple paths
- Complete CNF router implementations
When to Use
- Building software routers or gateways
- Implementing source-based policy routing
- Creating load balancers
- Handling virtual IPs or anycast addresses
- Building service mesh sidecars
- Solving asymmetric routing problems
- Implementing multi-homing or multiple ISP scenarios
- Creating VPN/tunnel endpoints with custom routing
Redirection Helpers Comparison
bpf_redirect - Basic Redirection
long bpf_redirect(u32 ifindex, u64 flags);
What it does:
- Changes output interface only
- You must handle everything else:
- MAC address rewriting
- ARP/NDP resolution
- Route lookups
- Next-hop determination
Use when:
- Redirecting within same L2 domain
- You control all network configuration
- Simple interface switching needed
Example:
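SEC("xdp")
int xdp_redirect_simple(struct xdp_md *ctx) {
    // XDP programs cannot dereference struct net_device, so the target
    // ifindex must come from userspace (a constant or a map entry).
    // 3 is only an example value; check yours with `ip link show eth1`.
    const __u32 eth1_ifindex = 3;
    // Simple redirect to eth1 (returns XDP_REDIRECT on success)
    return bpf_redirect(eth1_ifindex, 0);
}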
bpf_redirect_neigh - Neighbor-Aware Redirection (Preferred)
long bpf_redirect_neigh(u32 ifindex, struct bpf_redir_neigh *params,
int plen, u64 flags);
What it does:
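- Automatic ARP/NDP resolution - Resolves the next hop's MAC address through the kernel neighbor tables
- Route lookup - With params set to NULL, looks up the next hop in the kernel FIB; with params supplied, uses the given next hop and skips the lookup
- MAC rewriting - Fills in the Ethernet header before transmission
- Gateway-aware - Can forward through routers, not only to directly connected hosts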
Advantages:
- ✅ Reuses the kernel's FIB and neighbor tables from eBPF
- ✅ Handles all L2/L3 details
- ✅ Production-ready forwarding
- ✅ Works across L3 boundaries
Use when:
- Building routers or gateways (most CNF use cases)
- Need L3 forwarding capability
- Want automatic neighbor resolution
- Forwarding across subnets
Structure:
struct bpf_redir_neigh {
__u32 nh_family; // AF_INET or AF_INET6
union {
__be32 ipv4_nh; // IPv4 next hop
__u32 ipv6_nh[4]; // IPv6 next hop
};
};
Source-Based Policy Routing
The Problem: Asymmetric Routing
Scenario:
Client (10.0.2.2)
↓
Router (10.0.2.1 / 10.0.1.1) [eBPF CNF]
↓
Server (10.0.1.2)
- Default gateway: 192.168.0.1
- Virtual IP: 192.168.100.5 on lo
Issue:
- Client → 192.168.100.5 (server's virtual IP)
- Router forwards to server ✅
- Server replies, but routing table says "use default gateway"
- Reply goes to 192.168.0.1 instead of back through router ❌
- Connection fails - asymmetric routing
eBPF Solution: Attach program to server's egress that redirects based on source IP
Complete CNF Router Example
C Code - eBPF Router (kernel space)
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/in.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
// AF_INET is not defined by the UAPI headers above
#ifndef AF_INET
#define AF_INET 2
#endif
// Routing policy record
struct route_policy {
__u32 interface_id; // Output interface
__u32 next_hop; // Next hop IP (network byte order)
};
// Map: Source IP → Routing Policy
struct {
__uint(type, BPF_MAP_TYPE_HASH);
__type(key, __u32); // Source IPv4 address
__type(value, struct route_policy);
__uint(max_entries, 1024);
} policy_routes_v4 SEC(".maps");
// Statistics
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__type(key, __u32);
__type(value, __u64);
__uint(max_entries, 3);
} stats SEC(".maps");
#define STAT_TOTAL_PACKETS 0
#define STAT_POLICY_MATCHES 1
#define STAT_REDIRECTS 2
static __always_inline void update_stat(__u32 key) {
__u64 *counter = bpf_map_lookup_elem(&stats, &key);
if (counter)
__sync_fetch_and_add(counter, 1);
}
SEC("tc")
int policy_router(struct __sk_buff *skb) {
void *data = (void *)(long)skb->data;
void *data_end = (void *)(long)skb->data_end;
update_stat(STAT_TOTAL_PACKETS);
// Bounds check for Ethernet header
if (data + sizeof(struct ethhdr) > data_end)
return TC_ACT_OK;
struct ethhdr *eth = data;
// Only handle IPv4
if (bpf_ntohs(eth->h_proto) != ETH_P_IP)
return TC_ACT_OK;
// Bounds check for IP header
if (data + sizeof(struct ethhdr) + sizeof(struct iphdr) > data_end)
return TC_ACT_OK;
struct iphdr *iph = data + sizeof(struct ethhdr);
// Look up source-based policy
__u32 src_key = iph->saddr; // Already in network byte order
struct route_policy *policy = bpf_map_lookup_elem(&policy_routes_v4, &src_key);
if (!policy) {
// No policy match, use normal routing
return TC_ACT_OK;
}
update_stat(STAT_POLICY_MATCHES);
// Apply policy routing with neighbor-aware redirect
struct bpf_redir_neigh nh = {
.nh_family = AF_INET,
.ipv4_nh = policy->next_hop, // Already in network byte order
};
// Returns TC_ACT_REDIRECT on success, TC_ACT_SHOT if the arguments are rejected
long ret = bpf_redirect_neigh(policy->interface_id, &nh, sizeof(nh), 0);
if (ret == TC_ACT_REDIRECT) {
update_stat(STAT_REDIRECTS);
// Debug logging
bpf_printk("Policy redirect: src=%pI4 iface=%u nexthop=%pI4",
&iph->saddr, policy->interface_id, &policy->next_hop);
}
return ret;
}
char _license[] SEC("license") = "GPL";
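The parsing steps of the program above (Ethernet bounds check, EtherType check, IPv4 bounds check, read saddr) can be mirrored in userspace, which is handy for unit-testing which map key a given frame would produce. This is a sketch under that assumption; sourceKey and the constants are hypothetical names, and VLAN tags and IP options are ignored just as in the eBPF code.

```go
package main

import (
	"encoding/binary"
	"errors"
)

const (
	ethHeaderLen  = 14     // sizeof(struct ethhdr)
	ipv4MinHdrLen = 20     // sizeof(struct iphdr)
	etherTypeIPv4 = 0x0800 // ETH_P_IP
)

var errNotIPv4 = errors.New("not a parsable IPv4 frame")

// sourceKey returns the four source-address bytes exactly as they appear on
// the wire, i.e. the bytes the eBPF program uses as its map key.
func sourceKey(frame []byte) ([4]byte, error) {
	var key [4]byte
	// Bounds check for the Ethernet header.
	if len(frame) < ethHeaderLen {
		return key, errNotIPv4
	}
	// Only handle IPv4 (EtherType is big-endian on the wire).
	if binary.BigEndian.Uint16(frame[12:14]) != etherTypeIPv4 {
		return key, errNotIPv4
	}
	// Bounds check for the fixed part of the IPv4 header.
	if len(frame) < ethHeaderLen+ipv4MinHdrLen {
		return key, errNotIPv4
	}
	// saddr sits at offset 12 inside the IPv4 header.
	copy(key[:], frame[ethHeaderLen+12:ethHeaderLen+16])
	return key, nil
}
```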
Go Code - Policy Configuration (userspace)
package main
import (
"encoding/binary"
"fmt"
"log"
"net"
"os"
"os/signal"
"syscall"
"time"
"github.com/cilium/ebpf"
"github.com/cilium/ebpf/link"
)
//go:generate go run github.com/cilium/ebpf/cmd/bpf2go -type route_policy PolicyRouter policy_router.c
type RoutingPolicy struct {
SourceIP net.IP
Interface string
NextHop net.IP
}
func main() {
// Load eBPF program
spec, err := LoadPolicyRouter()
if err != nil {
log.Fatalf("loading spec: %v", err)
}
objs := &PolicyRouterObjects{}
if err := spec.LoadAndAssign(objs, nil); err != nil {
log.Fatalf("loading objects: %v", err)
}
defer objs.Close()
// Attach to default gateway interface (egress)
iface, err := net.InterfaceByName("eth0")
if err != nil {
log.Fatalf("finding interface: %v", err)
}
l, err := link.AttachTCX(link.TCXOptions{
Program: objs.PolicyRouter,
Attach: ebpf.AttachTCXEgress,
Interface: iface.Index,
})
if err != nil {
log.Fatalf("attaching program: %v", err)
}
defer l.Close()
log.Printf("Policy router attached to %s egress", iface.Name)
// Configure routing policies
policies := []RoutingPolicy{
{
SourceIP: net.ParseIP("192.168.100.5"),
Interface: "eth1",
NextHop: net.ParseIP("10.0.1.1"),
},
// Add more policies as needed
}
for _, policy := range policies {
if err := addRoutingPolicy(objs.PolicyRoutesV4, policy); err != nil {
log.Fatalf("adding policy: %v", err)
}
log.Printf("Added policy: %s via %s (next hop %s)",
policy.SourceIP, policy.Interface, policy.NextHop)
}
// Monitor statistics
go monitorStats(objs.Stats)
// Wait for signal
sig := make(chan os.Signal, 1)
signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
<-sig
log.Println("Shutting down policy router...")
}
func addRoutingPolicy(m *ebpf.Map, policy RoutingPolicy) error {
// Get interface index
iface, err := net.InterfaceByName(policy.Interface)
if err != nil {
return fmt.Errorf("interface %s not found: %w", policy.Interface, err)
}
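// Convert source IP to the map key. cilium/ebpf marshals integers in host
// byte order, so NativeEndian (Go 1.21+) keeps the key's in-memory bytes in
// network order, matching iph->saddr as the eBPF program reads it.
srcIP := policy.SourceIP.To4()
if srcIP == nil {
return fmt.Errorf("invalid IPv4 address: %s", policy.SourceIP)
}
key := binary.NativeEndian.Uint32(srcIP)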
// Convert next hop so its in-memory bytes stay in network byte order
// (NativeEndian, because the struct is marshaled in host byte order)
nextHopIP := policy.NextHop.To4()
if nextHopIP == nil {
return fmt.Errorf("invalid next hop IPv4 address: %s", policy.NextHop)
}
nextHop := binary.NativeEndian.Uint32(nextHopIP)
// Create policy record
value := PolicyRouterRoutePolicy{
InterfaceId: uint32(iface.Index),
NextHop: nextHop,
}
// Insert into map
if err := m.Put(&key, &value); err != nil {
return fmt.Errorf("map insert failed: %w", err)
}
return nil
}
func removeRoutingPolicy(m *ebpf.Map, sourceIP net.IP) error {
srcIP := sourceIP.To4()
if srcIP == nil {
return fmt.Errorf("invalid IPv4 address: %s", sourceIP)
}
key := binary.NativeEndian.Uint32(srcIP) // in-memory bytes stay in network order
if err := m.Delete(&key); err != nil {
return fmt.Errorf("map delete failed: %w", err)
}
return nil
}
func monitorStats(m *ebpf.Map) {
ticker := time.NewTicker(5 * time.Second)
defer ticker.Stop()
for range ticker.C {
var (
key uint32
value uint64
)
// Read statistics
stats := make(map[string]uint64)
key = 0 // STAT_TOTAL_PACKETS
if err := m.Lookup(&key, &value); err == nil {
stats["total_packets"] = value
}
key = 1 // STAT_POLICY_MATCHES
if err := m.Lookup(&key, &value); err == nil {
stats["policy_matches"] = value
}
key = 2 // STAT_REDIRECTS
if err := m.Lookup(&key, &value); err == nil {
stats["redirects"] = value
}
log.Printf("Stats: total=%d matches=%d redirects=%d",
stats["total_packets"], stats["policy_matches"], stats["redirects"])
}
}
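The counters read by monitorStats are cumulative, so a per-interval view needs the difference between two snapshots. A minimal sketch of that arithmetic follows; the counters type and the delta and missed functions are hypothetical names, not part of the generated bindings.

```go
package main

// counters holds one snapshot of the three cumulative counters.
type counters struct {
	Total, Matches, Redirects uint64
}

// delta returns the change between two snapshots taken one interval apart.
func delta(prev, cur counters) counters {
	return counters{
		Total:     cur.Total - prev.Total,
		Matches:   cur.Matches - prev.Matches,
		Redirects: cur.Redirects - prev.Redirects,
	}
}

// missed reports policy matches in the interval that did not end in a
// redirect. It clamps at zero because the counters are read one at a time
// and may be slightly out of step with each other.
func missed(d counters) uint64 {
	if d.Redirects > d.Matches {
		return 0
	}
	return d.Matches - d.Redirects
}
```

A non-zero missed value over several intervals is a hint that bpf_redirect_neigh is rejecting its arguments for some policies.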
Use Cases
1. Virtual IP / Anycast Handling
// Server has 192.168.100.5 on loopback
// Route traffic from this IP via specific interface
policy := RoutingPolicy{
SourceIP: net.ParseIP("192.168.100.5"),
Interface: "eth1",
NextHop: net.ParseIP("10.0.1.1"),
}
2. Multi-Homing / Multiple ISPs
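// The policy map is an exact-match hash: each entry matches one source
// address. Matching a whole subnet would need an LPM trie map instead.
// Route a customer A host via ISP 1 (error handling omitted)
addRoutingPolicy(objs.PolicyRoutesV4, RoutingPolicy{
    SourceIP:  net.ParseIP("10.10.1.10"), // Host in customer A subnet
    Interface: "isp1",
    NextHop:   net.ParseIP("203.0.113.1"),
})
// Route a customer B host via ISP 2
addRoutingPolicy(objs.PolicyRoutesV4, RoutingPolicy{
    SourceIP:  net.ParseIP("10.10.2.10"), // Host in customer B subnet
    Interface: "isp2",
    NextHop:   net.ParseIP("198.51.100.1"),
})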
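Per-subnet policies like these need prefix matching, which the hash map used in this document does not provide. One common approach, not used by the original code and shown here only as an assumption, is a BPF_MAP_TYPE_LPM_TRIE whose key is a u32 prefix length followed by the address bytes. The sketch below builds such a key in userspace; lpmKeyV4 and newLPMKeyV4 are hypothetical names.

```go
package main

import (
	"errors"
	"net/netip"
)

// lpmKeyV4 mirrors the key layout an LPM trie expects for IPv4: a u32 prefix
// length (host byte order) followed by the address in network byte order.
type lpmKeyV4 struct {
	PrefixLen uint32
	Addr      [4]byte
}

// newLPMKeyV4 parses a CIDR string such as "10.10.1.0/24" into an LPM key.
func newLPMKeyV4(cidr string) (lpmKeyV4, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return lpmKeyV4{}, err
	}
	if !p.Addr().Is4() {
		return lpmKeyV4{}, errors.New("not an IPv4 prefix")
	}
	p = p.Masked() // zero the host bits so equal prefixes give equal keys
	return lpmKeyV4{PrefixLen: uint32(p.Bits()), Addr: p.Addr().As4()}, nil
}
```

On the eBPF side the lookup key would carry prefix length 32 and the packet's source address, and the trie returns the longest matching prefix.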
3. Service Mesh Sidecar
// Pod has multiple interfaces:
// - Service traffic → service network
// - Management traffic → management network
servicePolicies := []RoutingPolicy{
{
SourceIP: net.ParseIP("10.96.0.10"), // Service IP
Interface: "net1",
NextHop: net.ParseIP("10.96.0.1"),
},
{
SourceIP: net.ParseIP("192.168.1.10"), // Management IP
Interface: "net0",
NextHop: net.ParseIP("192.168.1.1"),
},
}
4. VPN / Tunnel Endpoint
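// Traffic from a tunnel IP → tunnel interface
// (exact-match entry: add one per tunnel address, error handling omitted)
addRoutingPolicy(objs.PolicyRoutesV4, RoutingPolicy{
    SourceIP:  net.ParseIP("172.16.0.2"), // Address inside the VPN subnet
    Interface: "tun0",
    NextHop:   net.ParseIP("172.16.0.1"),
})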
Advanced Patterns
Load Balancing Across Multiple Paths
#define MAX_BACKENDS 4
struct backend {
__u32 interface_id;
__u32 next_hop;
__u32 weight; // For weighted load balancing
};
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__type(key, __u32);
__type(value, struct backend);
__uint(max_entries, MAX_BACKENDS);
} backends SEC(".maps");
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__type(key, __u32);
__type(value, __u32);
__uint(max_entries, 1);
} backend_count SEC(".maps");
static __always_inline struct backend *select_backend(void) {
    __u32 key = 0;
    __u32 *count = bpf_map_lookup_elem(&backend_count, &key);
    if (!count || *count == 0)
        return NULL;
    // Uniform random selection per packet (not round-robin). The weight
    // field is not used here, and packets of one flow may take different paths.
    __u32 idx = bpf_get_prandom_u32() % (*count);
    return bpf_map_lookup_elem(&backends, &idx);
}
SEC("tc")
int load_balancer(struct __sk_buff *skb) {
// ... parse packet ...
struct backend *backend = select_backend();
if (!backend)
return TC_ACT_OK;
struct bpf_redir_neigh nh = {
.nh_family = AF_INET,
.ipv4_nh = backend->next_hop,
};
return bpf_redirect_neigh(backend->interface_id, &nh, sizeof(nh), 0);
}
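The backend struct carries a weight field that the selection code above never reads. One way to use it, sketched here in userspace as an assumption rather than the original design, is to map a random value onto the cumulative weights. pickWeighted is a hypothetical name; in an eBPF program the same loop would need a fixed bound such as MAX_BACKENDS.

```go
package main

// pickWeighted maps a random value onto backends in proportion to their
// weights and returns the chosen index, or -1 if no backend has any weight.
func pickWeighted(weights []uint32, r uint32) int {
	var total uint64
	for _, w := range weights {
		total += uint64(w)
	}
	if total == 0 {
		return -1
	}
	point := uint64(r) % total
	for i, w := range weights {
		if point < uint64(w) {
			return i
		}
		point -= uint64(w)
	}
	return -1 // not reached when total > 0
}
```

With weights 3 and 1, three out of every four values of r select the first backend. Feeding a hash of the flow's addresses and ports instead of a random number would keep each flow on a single path.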
Conditional Routing Based on Multiple Criteria
// Route based on source IP + destination port
struct policy_key {
__u32 src_ip;
__u16 dst_port;
__u8 protocol;
__u8 _pad;
} __attribute__((packed));
struct {
__uint(type, BPF_MAP_TYPE_HASH);
__type(key, struct policy_key);
__type(value, struct route_policy);
__uint(max_entries, 1024);
} complex_policies SEC(".maps");
SEC("tc")
int complex_router(struct __sk_buff *skb) {
// ... parse packet to get IP + TCP/UDP ...
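struct policy_key key = {
.src_ip = iph->saddr,
.dst_port = tcp->dest, // or udp->dest; stays in network byte order
.protocol = iph->protocol,
};
struct route_policy *policy = bpf_map_lookup_elem(&complex_policies, &key);
if (!policy)
return TC_ACT_OK; // No match: fall back to normal routing
// Redirect based on policy
struct bpf_redir_neigh nh = {
.nh_family = AF_INET,
.ipv4_nh = policy->next_hop,
};
return bpf_redirect_neigh(policy->interface_id, &nh, sizeof(nh), 0);
}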
Policy Chaining (Fallback Policies)
SEC("tc")
int chained_router(struct __sk_buff *skb) {
struct iphdr *iph = /* ... parse ... */;
struct route_policy *policy = NULL;
// Try source-based policy first
policy = bpf_map_lookup_elem(&src_policies, &iph->saddr);
// Fall back to destination-based policy
if (!policy)
policy = bpf_map_lookup_elem(&dst_policies, &iph->daddr);
// Fall back to default route
if (!policy) {
__u32 default_key = 0;
policy = bpf_map_lookup_elem(&default_route, &default_key);
}
if (!policy)
return TC_ACT_OK;
// Apply policy
struct bpf_redir_neigh nh = {
.nh_family = AF_INET,
.ipv4_nh = policy->next_hop,
};
return bpf_redirect_neigh(policy->interface_id, &nh, sizeof(nh), 0);
}
IPv6 Support
// IPv6 routing policy
struct route_policy_v6 {
__u32 interface_id;
__u32 next_hop[4]; // IPv6 address (16 bytes)
};
struct {
__uint(type, BPF_MAP_TYPE_HASH);
__type(key, struct in6_addr); // IPv6 source address
__type(value, struct route_policy_v6);
__uint(max_entries, 1024);
} policy_routes_v6 SEC(".maps");
SEC("tc")
int policy_router_v6(struct __sk_buff *skb) {
// ... parse IPv6 packet ...
struct ipv6hdr *ip6h = /* ... */;
struct route_policy_v6 *policy = bpf_map_lookup_elem(&policy_routes_v6, &ip6h->saddr);
if (!policy)
return TC_ACT_OK;
struct bpf_redir_neigh nh = {
.nh_family = AF_INET6,
};
__builtin_memcpy(nh.ipv6_nh, policy->next_hop, sizeof(nh.ipv6_nh));
return bpf_redirect_neigh(policy->interface_id, &nh, sizeof(nh), 0);
}
Return Value Handling
TC Return Values for Redirection
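// Returned by bpf_redirect() and bpf_redirect_neigh()
TC_ACT_REDIRECT // Redirect accepted; forwarding happens after the program returns
TC_ACT_SHOT // Helper rejected its arguments; returning this drops the packet
// Chosen by the program itself
TC_ACT_OK // Continue normal processing (fallback when no redirect applies)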
Error Handling Pattern
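// Assumes STAT_REDIRECT_ERRORS is defined as 3 and the stats map's
// max_entries is raised from 3 to 4
long ret = bpf_redirect_neigh(ifindex, &nh, sizeof(nh), 0);
if (ret != TC_ACT_REDIRECT) {
// The helper rejected its arguments. Failures during the actual transmit
// (for example an unresolved neighbor) happen after the program returns
// and cannot be detected here.
bpf_printk("Redirect failed: ifindex=%d ret=%ld", ifindex, ret);
// Increment error counter
__u32 key = STAT_REDIRECT_ERRORS;
__u64 *counter = bpf_map_lookup_elem(&stats, &key);
if (counter)
__sync_fetch_and_add(counter, 1);
// Fall back to normal routing
return TC_ACT_OK;
}
return ret;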
Best Practices
- Always validate interfaces exist before adding policies
- Use network byte order for IP addresses in maps
- Add statistics to monitor policy effectiveness
- Implement fallback to normal routing if no policy matches
- Log redirections during development (remove in production)
- Bounds check before accessing packet data
- Use bpf_redirect_neigh for L3 forwarding (handles ARP/MAC automatically)
- Test with real network topologies (not just localhost)
- Monitor for redirect failures and investigate root causes
- Document your policies in userspace code comments
Common Pitfalls
- Forgetting to attach to egress (not ingress) for source-based routing
- Using host byte order instead of network byte order in maps
- Not handling the case where policy lookup fails
- Redirecting without checking if interface is up
- Not validating that next hop is reachable
- Forgetting IPv6 support if dual-stack
- Not implementing statistics/monitoring
Debugging
Enable Debug Logging
// In eBPF program
bpf_printk("Redirect: src=%pI4 dst=%pI4 iface=%d",
&iph->saddr, &iph->daddr, policy->interface_id);
View Logs
# Using tc
sudo tc exec bpf dbg
# Or using bpftool
sudo bpftool prog tracelog
# Or directly
sudo cat /sys/kernel/debug/tracing/trace_pipe
Check Redirect Stats
# View eBPF program stats
bpftool prog show
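# Check map contents (the kernel truncates object names to 15 characters,
# so policy_routes_v4 is listed as policy_routes_v)
sudo bpftool map dump name policy_routes_v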
Verify Interface and Routes
# Check interface is up
ip link show eth1
# Verify next hop is reachable
ip route get 10.0.1.1
Integration with Other Skills
- ebpf-packet-parser: Use to extract flow information before routing decision
- ebpf-map-handler: Manage routing policy maps
- ebpf-attach-hook: Attach router to TC egress
- cnf-networking: Set up test topology with netkit/veth
- ebpf-test-harness: Validate routing with multi-container tests