BPF_MAP_TYPE_DEVMAP and BPF_MAP_TYPE_DEVMAP_HASH

Note

  • BPF_MAP_TYPE_DEVMAP was introduced in kernel version 4.14

  • BPF_MAP_TYPE_DEVMAP_HASH was introduced in kernel version 5.4

BPF_MAP_TYPE_DEVMAP and BPF_MAP_TYPE_DEVMAP_HASH are BPF maps primarily used as backend maps for the XDP BPF helper call bpf_redirect_map(). BPF_MAP_TYPE_DEVMAP is backed by an array that uses the key as the index to look up a reference to a net device, while BPF_MAP_TYPE_DEVMAP_HASH is backed by a hash table that uses the key to look up a reference to a net device. The user provides either <key, ifindex> or <key, struct bpf_devmap_val> pairs to update the maps with new net devices.

Note

  • The key to a hash map doesn't have to be an ifindex.

  • While BPF_MAP_TYPE_DEVMAP_HASH allows for densely packing the net devices, it comes at the cost of hashing the key on every lookup.

The setup and packet enqueue/send code is shared between the two types of devmap; only the lookup and insertion are different.
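The difference between the two lookup strategies can be sketched in plain user-space C. This is a toy model under stated assumptions, not the kernel implementation: the names toy_array_lookup(), toy_hash_insert(), and toy_hash_lookup(), the sizes, and the modulo "hash" are all invented for illustration, and the model stores bare ifindexes where the real maps hold net device references.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ENTRIES 4   /* hypothetical map size */
#define TOY_BUCKETS     4   /* hypothetical bucket count */

/* Array-backed model: the key is the slot index, so it must be
 * smaller than the map size. A value of 0 marks an empty slot. */
static uint32_t toy_array[TOY_MAX_ENTRIES];

static uint32_t toy_array_lookup(uint32_t key)
{
    return key < TOY_MAX_ENTRIES ? toy_array[key] : 0;
}

/* Hash-backed model: any 32-bit key is accepted, at the cost of
 * hashing the key and walking a bucket chain. */
struct toy_entry {
    uint32_t key;
    uint32_t ifindex;
    struct toy_entry *next;
};

static struct toy_entry *toy_buckets[TOY_BUCKETS];

static void toy_hash_insert(struct toy_entry *e)
{
    uint32_t b = e->key % TOY_BUCKETS;  /* stand-in for a real hash function */

    e->next = toy_buckets[b];
    toy_buckets[b] = e;
}

static uint32_t toy_hash_lookup(uint32_t key)
{
    for (struct toy_entry *e = toy_buckets[key % TOY_BUCKETS]; e; e = e->next)
        if (e->key == key)
            return e->ifindex;
    return 0;   /* no entry for this key */
}
```

In this model a key such as 1000 cannot be used with the four-slot array, but it can be stored in the hash variant, which is the "dense packing" trade-off described in the note above.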

Usage

Kernel BPF

bpf_redirect_map()

long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)

Redirect the packet to the endpoint referenced by map at index key. For BPF_MAP_TYPE_DEVMAP and BPF_MAP_TYPE_DEVMAP_HASH this map contains references to net devices (for forwarding packets through other ports).

The lower two bits of flags are used as the return code if the map lookup fails. This is so that the return value can be one of the XDP program return codes up to XDP_TX, as chosen by the caller. The higher bits of flags can be set to BPF_F_BROADCAST or BPF_F_EXCLUDE_INGRESS as defined below.
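The fallback behaviour can be illustrated with a small user-space model. This is only a sketch of the return-value rule described above: toy_redirect_result() and the TOY_ prefixed names are hypothetical, and the enum is a local copy of the XDP action values so that the example compiles without kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the XDP action values (assumed to match the UAPI enum). */
enum toy_xdp_action {
    TOY_XDP_ABORTED  = 0,
    TOY_XDP_DROP     = 1,
    TOY_XDP_PASS     = 2,
    TOY_XDP_TX       = 3,
    TOY_XDP_REDIRECT = 4,
};

/* Mask covering the lower two bits of flags (values 0..3, up to XDP_TX). */
#define TOY_ACTION_MASK 0x3ULL

/* Model of the helper's return value: XDP_REDIRECT when the lookup
 * succeeds, otherwise the action encoded in the lower two bits of flags. */
static long toy_redirect_result(bool lookup_ok, uint64_t flags)
{
    if (lookup_ok)
        return TOY_XDP_REDIRECT;
    return (long)(flags & TOY_ACTION_MASK);
}
```

In a BPF program this corresponds to passing, for example, XDP_PASS as the flags argument so that packets without a matching map entry continue up the normal network stack, whereas passing 0 yields XDP_ABORTED on a failed lookup.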

With BPF_F_BROADCAST, the packet is broadcast to all the interfaces in the map; with BPF_F_EXCLUDE_INGRESS, the ingress interface is excluded from the broadcast.

Note

  • The key is ignored if BPF_F_BROADCAST is set.

  • The broadcast feature can also be used to implement multicast forwarding: simply create multiple DEVMAPs, each one corresponding to a single multicast group.
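The effect of the two broadcast flags can be sketched as a user-space model that computes which interfaces a packet would be sent to. This is an illustration only: toy_broadcast_targets() is a hypothetical function, the map is modelled as a plain array of ifindexes with 0 marking an empty slot, and the TOY_F_ constants are local stand-ins (real programs should use BPF_F_BROADCAST and BPF_F_EXCLUDE_INGRESS from <linux/bpf.h>).

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the helper flags; they sit above the two
 * low bits that carry the fallback return code. */
#define TOY_F_BROADCAST       (1ULL << 3)
#define TOY_F_EXCLUDE_INGRESS (1ULL << 4)

/* Collect the egress ifindexes a broadcast would target.
 * devs: modelled map contents (0 marks an empty slot).
 * out:  caller-provided buffer with room for n entries.
 * Returns the number of targets written to out. */
static size_t toy_broadcast_targets(const uint32_t *devs, size_t n,
                                    uint32_t ingress_ifindex, uint64_t flags,
                                    uint32_t *out)
{
    size_t count = 0;

    if (!(flags & TOY_F_BROADCAST))
        return 0;   /* not a broadcast: the single-key path applies instead */

    for (size_t i = 0; i < n; i++) {
        if (devs[i] == 0)
            continue;   /* empty slot */
        if ((flags & TOY_F_EXCLUDE_INGRESS) && devs[i] == ingress_ifindex)
            continue;   /* do not send the packet back out of the ingress port */
        out[count++] = devs[i];
    }
    return count;
}
```

Note that the function takes no key at all, mirroring the first bullet above. For multicast, each group would be modelled by its own devs array, in the same way that each group gets its own DEVMAP.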

This helper will return XDP_REDIRECT on success, or the value of the two lower bits of the flags argument if the map lookup fails.

More information about redirection can be found in the Redirect documentation.

bpf_map_lookup_elem()

void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Net device entries can be retrieved using the bpf_map_lookup_elem() helper.

User space

Note

DEVMAP entries can only be updated/deleted from user space and not from an eBPF program. Trying to call these functions from a kernel eBPF program will result in the program failing to load and a verifier warning.

bpf_map_update_elem()

int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags);

Net device entries can be added or updated using the bpf_map_update_elem() helper. This helper replaces existing elements atomically. The value parameter can be struct bpf_devmap_val or a simple int ifindex for backwards compatibility.

struct bpf_devmap_val {
    __u32 ifindex;   /* device index */
    union {
        int   fd;  /* prog fd on map write */
        __u32 id;  /* prog id on map read */
    } bpf_prog;
};

The flags argument can be one of the following:

  • BPF_ANY: Create a new element or update an existing element.

  • BPF_NOEXIST: Create a new element only if it did not exist.

  • BPF_EXIST: Update an existing element.

DEVMAPs can associate a program with a device entry by setting bpf_prog.fd in struct bpf_devmap_val. The program associated with the fd must have type XDP with expected attach type xdp_devmap. When a program is associated with a device index, it runs after XDP_REDIRECT, before the buffer is added to the per-cpu queue, and has access to both the Rx device and the Tx device. Examples of how to attach/use xdp_devmap progs can be found in the kernel selftests:

  • tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c

  • tools/testing/selftests/bpf/progs/test_xdp_with_devmap_helpers.c
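As a rough sketch of the user-space side, the helper below fills in a struct bpf_devmap_val that carries both the target ifindex and a program fd. The function name make_devmap_val() is hypothetical, and treating a non-positive prog_fd as "no program" is an assumption of this sketch; the selftests listed above remain the reference for a complete attach flow.

```c
#include <linux/bpf.h>   /* struct bpf_devmap_val */
#include <string.h>

/* Build a devmap value that redirects to ifindex and, optionally,
 * attaches an XDP program to the entry.
 * prog_fd <= 0 is taken to mean "no program" (assumption of this sketch). */
static struct bpf_devmap_val make_devmap_val(int ifindex, int prog_fd)
{
    struct bpf_devmap_val val;

    memset(&val, 0, sizeof(val));   /* start from a fully zeroed value */
    val.ifindex = (__u32)ifindex;
    if (prog_fd > 0)
        val.bpf_prog.fd = prog_fd;  /* prog fd on map write */
    return val;
}
```

The resulting value would then be passed as the value argument of bpf_map_update_elem() on a map whose value type is struct bpf_devmap_val. With libbpf, the program fd would typically come from bpf_program__fd() on a loaded xdp_devmap program.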

bpf_map_lookup_elem()

int bpf_map_lookup_elem(int fd, const void *key, void *value);

Net device entries can be retrieved using the bpf_map_lookup_elem() helper.

bpf_map_delete_elem()

int bpf_map_delete_elem(int fd, const void *key);

Net device entries can be deleted using the bpf_map_delete_elem() helper. This helper will return 0 on success, or negative error in case of failure.

Examples

Kernel BPF

The following code snippet shows how to declare a BPF_MAP_TYPE_DEVMAP called tx_port.

struct {
    __uint(type, BPF_MAP_TYPE_DEVMAP);
    __type(key, __u32);
    __type(value, __u32);
    __uint(max_entries, 256);
} tx_port SEC(".maps");

The following code snippet shows how to declare a BPF_MAP_TYPE_DEVMAP_HASH called forward_map.

struct {
    __uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
    __type(key, __u32);
    __type(value, struct bpf_devmap_val);
    __uint(max_entries, 32);
} forward_map SEC(".maps");

Note

The value type of forward_map above is struct bpf_devmap_val.

The following code snippet shows a simple xdp_redirect_map program. This program would work with a user space program that populates the devmap forward_map based on ingress ifindexes. The BPF program below redirects packets using the ingress ifindex as the key.

SEC("xdp")
int xdp_redirect_map_func(struct xdp_md *ctx)
{
    int index = ctx->ingress_ifindex;

    return bpf_redirect_map(&forward_map, index, 0);
}

The following code snippet shows a BPF program that broadcasts packets to all the interfaces in the tx_port devmap except the one the packet arrived on.

SEC("xdp")
int xdp_redirect_map_func(struct xdp_md *ctx)
{
    return bpf_redirect_map(&tx_port, 0, BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS);
}

User space

The following code snippet shows how to update a devmap called tx_port.

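#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <bpf/bpf.h>
#include <bpf/libbpf.h>

/* tx_port is assumed to be a struct bpf_map * obtained elsewhere,
 * for example from the loaded BPF object. */
int update_devmap(int ifindex, int redirect_ifindex)
{
    int ret;

    /* Key: ingress ifindex; value: ifindex to redirect to. */
    ret = bpf_map_update_elem(bpf_map__fd(tx_port), &ifindex, &redirect_ifindex, BPF_ANY);
    if (ret < 0) {
        fprintf(stderr, "Failed to update devmap value: %s\n",
            strerror(errno));
    }

    return ret;
}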

The following code snippet shows how to update a BPF_MAP_TYPE_DEVMAP_HASH called forward_map.

/* forward_map is assumed to be a struct bpf_map * obtained elsewhere. */
int update_devmap(int ifindex, int redirect_ifindex)
{
    /* Only ifindex is set; bpf_prog.fd stays zero-initialized. */
    struct bpf_devmap_val devmap_val = { .ifindex = redirect_ifindex };
    int ret;

    ret = bpf_map_update_elem(bpf_map__fd(forward_map), &ifindex, &devmap_val, BPF_ANY);
    if (ret < 0) {
        fprintf(stderr, "Failed to update devmap value: %s\n",
            strerror(errno));
    }

    return ret;
}

References