Blog/More random notes about network programming (Episode 2)
As I mentioned in the previous article (Blog/Random notes about network programming), I've decided to split these notes into multiple pages (I'll be calling them "Episodes").
In this episode, I'm taking notes about receiving and sending UDP traffic.
The goal for this page is to build up a basic UDP server and client. I'm particularly interested in using the sendmsg and recvmsg functions. The next step would then be to use the Linux-specific sendmmsg and recvmmsg.
Creating a server socket
I did skip this topic in the previous episode, mostly because it was still a bit unclear to me, but in order to create a server socket the steps are fairly simple:
- instantiate a socket
  - use socket from sys/socket.h with the appropriate parameters
- bind the socket to an address
  - use bind from sys/socket.h
  - in order to tell the kernel what address you want the socket to be bound to, you have to use the sockaddr_* structures described in the previous episode
Optionally, if you are working on a connection-oriented socket (most often, this means you are going to do TCP traffic), you'll also need to call listen and then accept as many times as you need.
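The steps above can be sketched roughly like this. It's a minimal sketch, not the one true way: make_bound_socket is a name I made up for illustration, the backlog of 16 is arbitrary, and binding to INADDR_ANY (the same thing as "0.0.0.0") is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: create an IPv4 socket of the given type
 * (SOCK_DGRAM or SOCK_STREAM) and bind it to 0.0.0.0:port.
 * Port 0 lets the kernel pick a free port.
 * Returns the file descriptor, or -1 on error. */
int make_bound_socket(int type, uint16_t port)
{
    assert(type == SOCK_DGRAM || type == SOCK_STREAM);

    /* step 1: instantiate the socket */
    int fd = socket(AF_INET, type, 0);
    if (fd == -1)
        return -1;

    /* step 2: describe the local address with a sockaddr_* structure */
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    /* step 3: bind the socket to that address */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        close(fd);
        return -1;
    }

    /* optional step: only connection-oriented sockets need listen()
     * (and, later, one accept() per incoming connection) */
    if (type == SOCK_STREAM && listen(fd, 16) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling make_bound_socket(SOCK_DGRAM, 5050) would give a UDP socket that is ready for recvmsg, while make_bound_socket(SOCK_STREAM, 5050) would give a TCP socket that is ready for accept.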
The msghdr structure
The msghdr structure is central to both sendmsg and recvmsg. It's a fairly big structure in order to limit the number of parameters that have to be passed to those two functions.
It's also a value-result parameter, meaning that those functions will take inputs from it and also deposit some outputs in it.
Its fields are:
void* msg_name
- the name is absolutely misleading
- this is a pointer to something like a struct sockaddr_storage (but could as well be a sockaddr_in or sockaddr_in6 if you know you'll be receiving a specific type of traffic)
- the caller is supposed to allocate/handle the memory pointed to by this field
- with recvmsg, the pointed memory will be filled with the address information of the sender; with sendmsg, it holds the destination address
- with recvmsg it's actually optional (it can be NULL), if you don't care about the information contained in this struct
socklen_t msg_namelen
- the length of the memory area pointed to by msg_name
struct iovec* msg_iov
- this is an array of struct iovec objects. This is the area of memory where recvmsg will copy the contents of packets received.
- it's exploiting the array-pointer equivalence in C
- it would have been more intuitive if the type had been struct iovec msg_iov[]
- more on struct iovec later
int msg_iovlen
- this is the length of the msg_iov array (the number of entries, not bytes)
void* msg_control
- this is a buffer holding one or more struct cmsghdr objects, each followed by its data
- from what I understand, it's mostly useful when using msghdr with recvmsg
- From the recvmsg syscall manpage: «As an example, Linux uses this ancillary data mechanism to pass extended errors, IP options, or file descriptors over UNIX domain sockets»[1]
- it's often described as "ancillary data"
- sys/socket.h contains some macros (CMSG_*) to correctly parse this data
- caller allocated/managed
socklen_t msg_controllen
- the length of the msg_control buffer
int msg_flags
- Flags on received message. Set by the kernel.
As you can see, some of the naming here is not really helpful. Such is life, I guess.
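To make the "inputs and outputs in the same structure" idea a bit more concrete, here is a small sketch that fills in a msghdr, calls recvmsg, and then reads back msg_flags to find out whether the datagram was cut to fit the buffer. The recv_one helper and its truncated parameter are names I invented for this example.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: receive one datagram into buf and report, through
 * *truncated, whether the kernel had to cut it to fit (MSG_TRUNC set in
 * msg_flags). Returns the number of bytes stored, or -1 on error. */
ssize_t recv_one(int fd, void *buf, size_t buflen, int *truncated)
{
    assert(buf != NULL && truncated != NULL);

    struct sockaddr_storage sender;
    struct iovec iov = { .iov_base = buf, .iov_len = buflen };

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_name = &sender;            /* input: where to store the sender */
    msg.msg_namelen = sizeof(sender);  /* input: room available; output: size used */
    msg.msg_iov = &iov;                /* input: where to store the payload */
    msg.msg_iovlen = 1;                /* input: number of iovec entries */

    ssize_t n = recvmsg(fd, &msg, 0);
    if (n == -1)
        return -1;

    /* output: msg_flags is written by the kernel */
    *truncated = (msg.msg_flags & MSG_TRUNC) != 0;
    return n;
}
```

Since msg_namelen is overwritten on every call, a msghdr that gets reused in a loop should have that field (and msg_controllen, if ancillary data is in use) reset before each recvmsg.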
The iovec structure
This structure is defined in sys/uio.h and it's meant to support a technique called "scatter-gather I/O", where essentially you send data from (or receive data into) multiple memory locations using a single syscall. Or at least, that's what I understood.
It has two fields:
void* iov_base
size_t iov_len
Each structure of this kind is essentially a buffer, an area of memory that starts at iov_base and is iov_len bytes long.
From my understanding, when an array of these structures is used for sending/receiving (or reading/writing), the buffers are used in the order they appear in the array.
It seems mostly like a "standard" way to have a buffer and its length in a single structure.
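Here is a small sketch of what scatter-gather looks like in practice: one sendmsg call that gathers two separate strings into a single datagram, and one recvmsg call that scatters a datagram over two buffers. The helper names (send_gathered, recv_scattered) and the header/body split are just assumptions for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send header and body as ONE datagram, without
 * copying them into a single contiguous buffer first ("gather"). */
ssize_t send_gathered(int fd, const char *header, const char *body)
{
    assert(header != NULL && body != NULL);

    struct iovec parts[2] = {
        { .iov_base = (void *)header, .iov_len = strlen(header) },
        { .iov_base = (void *)body,   .iov_len = strlen(body)   },
    };

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = parts;
    msg.msg_iovlen = 2;

    return sendmsg(fd, &msg, 0);
}

/* Hypothetical helper: receive one datagram split over two buffers
 * ("scatter"). The first buffer is filled completely before the second
 * one is touched. */
ssize_t recv_scattered(int fd, char *first, size_t first_len,
                       char *rest, size_t rest_len)
{
    assert(first != NULL && rest != NULL);

    struct iovec parts[2] = {
        { .iov_base = first, .iov_len = first_len },
        { .iov_base = rest,  .iov_len = rest_len  },
    };

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = parts;
    msg.msg_iovlen = 2;

    return recvmsg(fd, &msg, 0);
}
```

Both helpers leave msg_name empty, so they only work on a socket that already has a peer (for example one created with socketpair, or a UDP socket that has been through connect).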
The cmsghdr structure and its usage
The msg_control field of struct msghdr is supposed to host "ancillary data". What is meant by "ancillary data" isn't well defined (or even exemplified). However, I did find something.
The manpage for cmsg(3) contains a somewhat vague description of the macros and functions needed to handle such data.
The cmsghdr structure has the following definition:
struct cmsghdr {
    size_t cmsg_len;    /* Data byte count, including header
                           (type is socklen_t in POSIX) */
    int    cmsg_level;  /* Originating protocol */
    int    cmsg_type;   /* Protocol-specific type */
    unsigned char cmsg_data[]; /* the actual data */
};
The cmsg_level field is essentially the layer ("layer" as in "layer of the ISO/OSI model"), and the cmsg_type is a layer-specific type of data.
This means, supposedly ("supposedly" as in "supposing that my understanding is correct"), that each layer a packet goes through can leave some data attached to the packet.
An example of this (from the manpage) is getting the TTL field of the IP packet received through recvmsg.
In the example from the cmsg manpage, IPPROTO_IP (from netinet/in.h) is used to select messages from the IP layer (so below UDP), and IP_TTL to select the data about the TTL value.
The question now is obvious... What other values can I use instead of IP_TTL? And what other values can I use when I swap IPPROTO_IP with another protocol?
So far I've found some other definitions near the definition of IP_TTL in linux/in.h (/usr/include/linux/in.h on my system) but no mention of anything that could answer my questions.
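For reference, this is roughly what the TTL lookup looks like when written as a function, following the same idea as the cmsg(3) example: walk the control messages with the CMSG_* macros and pick the one whose level and type match. The ttl_from_msghdr name is mine, and returning -1 for "not found" is an arbitrary choice.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: walk the ancillary data attached to a received
 * message and return the IP TTL if it is there, or -1 otherwise. */
int ttl_from_msghdr(struct msghdr *msg)
{
    assert(msg != NULL);

    for (struct cmsghdr *c = CMSG_FIRSTHDR(msg);
         c != NULL;
         c = CMSG_NXTHDR(msg, c)) {
        /* cmsg_level selects the layer, cmsg_type the kind of data */
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_TTL) {
            int ttl;
            /* CMSG_DATA may not be suitably aligned: copy, don't cast */
            memcpy(&ttl, CMSG_DATA(c), sizeof(ttl));
            return ttl;
        }
    }
    return -1;
}
```

As far as I can tell from ip(7), the kernel only attaches this control message if the receiving socket asked for it beforehand with setsockopt(fd, IPPROTO_IP, IP_RECVTTL, &on, sizeof(on)), and the buffer behind msg_control has to be at least CMSG_SPACE(sizeof(int)) bytes for the value to fit.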
Putting it all together
UDP Server
This simple C program creates a UDP socket and binds it to port 5050.
It reads one packet at a time and prints its content, along with the address of the sender.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/uio.h>

// 65535 bytes: no udp datagram can be larger than this
#define RECEIVE_BUFFER_SIZE ((2<<15)-1)
// if we ever wanted to have more than one receive buffer..
#define RECEIVE_BUFFER_NENTRIES 1
// room for ancillary data (arbitrary size, nothing is requested here anyway)
#define CONTROL_BUFFER_SIZE 256

int main(int argc, char **argv){
    // information about where do we want to be bound
    // (zeroed first, so that no field is left uninitialized)
    struct sockaddr_in local_address;
    memset(&local_address, 0, sizeof(local_address));
    local_address.sin_family = AF_INET;
    local_address.sin_port = htons(5050);
    // translating "0.0.0.0" to network-byte-order
    if (inet_pton(AF_INET, "0.0.0.0", &(local_address.sin_addr)) != 1){
        fprintf(stderr, "Failed to fill local_address structure\n");
        exit(1);
    }

    // instantiating an udp datagram socket on ipv4
    int sockfd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (sockfd == -1) {
        perror("Error creating socket");
        exit(1);
    }

    // bind that socket to the address we prepared earlier
    if (bind(sockfd,
             (struct sockaddr*)&local_address,
             sizeof(struct sockaddr_in))) {
        perror("Error binding socket to port");
        exit(1);
    }

    printf("Setting up receive buffer sized at %d bytes\n", RECEIVE_BUFFER_SIZE);
    struct iovec receive_buffers[RECEIVE_BUFFER_NENTRIES];
    // if we had more than one receive buffer, they would all be set up here
    for (int i=0 ; i < RECEIVE_BUFFER_NENTRIES; i++){
        receive_buffers[i].iov_base = malloc(RECEIVE_BUFFER_SIZE);
        if (receive_buffers[i].iov_base == NULL) {
            perror("Error allocating receive buffer");
            exit(1);
        }
        receive_buffers[i].iov_len = RECEIVE_BUFFER_SIZE;
    }

    // memory for address information about the sender
    struct sockaddr_storage sender;
    // memory for the "ancillary data": a bare struct cmsghdr would only
    // have room for the header, not for the data that follows it
    union { char buf[CONTROL_BUFFER_SIZE]; struct cmsghdr align; } control;
    // a string large enough to hold an ipv4 address
    char remote_addr[INET_ADDRSTRLEN];
    struct msghdr inmsg;

    while( 1 ) {
        // the kernel overwrites msg_namelen, msg_controllen and msg_flags
        // on every call, so the msghdr is prepared again for each packet
        memset(&inmsg, 0, sizeof(struct msghdr));
        inmsg.msg_name = &sender;
        inmsg.msg_namelen = sizeof(sender);
        inmsg.msg_control = control.buf;
        inmsg.msg_controllen = sizeof(control.buf);
        // tell the msghdr where the receive buffers are...
        inmsg.msg_iov = receive_buffers;
        // ...and how many of them are there
        inmsg.msg_iovlen = RECEIVE_BUFFER_NENTRIES;

        // we can finally receive packets!
        // (MSG_WAITFORONE is a recvmmsg flag, so no flags are passed here)
        ssize_t recv_size = recvmsg(sockfd, &inmsg, 0);
        if (recv_size == -1 ) {
            perror("Something went wrong with recvmsg");
            continue ;
        }

        // let's make sure this string doesn't keep data from previous operations
        memset(remote_addr, 0, sizeof(remote_addr));
        inet_ntop(AF_INET,
                  // this was messy to get right
                  &((struct sockaddr_in*) inmsg.msg_name)->sin_addr,
                  remote_addr,
                  sizeof(remote_addr));
        printf("Received 1 packet from '%s' containing %zd bytes\n", remote_addr, recv_size);
        // eh, we should not assume all the content is in the first receive buffer...
        // ... but whatever
        // the payload is not NUL-terminated, so print exactly recv_size bytes
        printf("Message in the udp packet: '%.*s'\n",
               (int)recv_size, (char *)inmsg.msg_iov[0].iov_base);
        // i should have some kind of exit condition here...
    }

    // never reached until the loop above gets an exit condition
    shutdown(sockfd, SHUT_RDWR);
    close(sockfd);
    exit(0);
}
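The server code only ever looks at the first receive buffer. If there were more than one, the received bytes would have to be collected by walking the iovec array in order. A possible sketch of that follows; flatten_iov is a made-up name and copying everything into one flat buffer is just one of several ways to do it.

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical helper: copy the first n received bytes out of an iovec
 * array into one flat buffer, walking the buffers in array order.
 * Returns the number of bytes copied (never more than out_len). */
size_t flatten_iov(const struct iovec *iov, int iovcnt, size_t n,
                   char *out, size_t out_len)
{
    assert(iov != NULL && out != NULL);

    if (n > out_len)
        n = out_len;

    size_t copied = 0;
    for (int i = 0; i < iovcnt && copied < n; i++) {
        /* take the whole buffer, or only what is still missing */
        size_t chunk = iov[i].iov_len;
        if (chunk > n - copied)
            chunk = n - copied;
        memcpy(out + copied, iov[i].iov_base, chunk);
        copied += chunk;
    }
    return copied;
}
```

In the server this would be called with inmsg.msg_iov, inmsg.msg_iovlen and the value returned by recvmsg as n.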
Testing the udp server
I'll be using the good old netcat, as I haven't written the udp client yet, and also netcat works for sure.
Compiling and running the server:
[manu@astrolabio network-programming]$ ./a.out
Setting up receive buffer sized at 65535 bytes
Received 1 packet from '127.0.0.1' containing 12 bytes
Message in the udp packet: 'hello world!'
Received 1 packet from '172.12.3.248' containing 12 bytes
Message in the udp packet: 'hello world'
Sending udp packets with netcat:
[manu@astrolabio ~]$ echo -n 'hello world!' | nc -u 127.0.0.1 5050
UDP client
The client part has a fairly simple job: assemble a UDP packet and send it.
The target IP address and the target port will be taken from command line arguments, and the payload will be read from stdin.
Needless to say, I'm capping the possible length to 64KB (roughly the maximum size of a UDP datagram; over IPv4 the usable payload is 65507 bytes once the IP and UDP headers are subtracted) and I'm not bothering too much with error and bounds checking (that's not really the point of this exercise, so it's low effort).
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/uio.h>

// large enough for any udp payload (over ipv4 the limit is 65507 bytes)
#define SEND_BUFFER_SIZE ((2<<15)-1)

int main(int argc, char **argv){
    if (argc < 3) {
        fprintf(stderr, "usage: ./client host port\n");
        fprintf(stderr, "payload will be read from stdin\n");
        exit(1);
    }

    char* payload = malloc(SEND_BUFFER_SIZE);
    if (payload == NULL) {
        perror("Failed to allocate the send buffer");
        exit(1);
    }
    // fgets reads a single line, at most SEND_BUFFER_SIZE-1 bytes,
    // and always NUL-terminates it: the payload cannot overflow the buffer
    if (fgets(payload, SEND_BUFFER_SIZE, stdin) == NULL) {
        fprintf(stderr, "Failed to read payload from stdin\n");
        exit(1);
    }
    size_t payload_len = strlen(payload);

    printf("Target server: '%s'\n", argv[1]);
    printf("Target port: '%s'\n", argv[2]);
    printf("Payload: '%s'\n", payload);
    printf("Payload length: %zu\n", payload_len);

    // fields not listed here are zero-initialized
    struct sockaddr_in target = {
        .sin_family = AF_INET,
        .sin_port = htons(atoi(argv[2]))
    };
    // translating the target address to network-byte-order
    if (inet_pton(AF_INET, argv[1], &(target.sin_addr)) != 1){
        fprintf(stderr, "Failed to fill target structure\n");
        exit(1);
    }

    // instantiating an udp datagram socket on ipv4
    int sockfd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (sockfd == -1) {
        perror("socket failed");
        exit(1);
    }

    struct iovec send_buffer = {
        .iov_base = payload,
        .iov_len = payload_len
    };
    // prepare the msghdr structure; the designated initializer leaves
    // every field not mentioned here (msg_control included) set to zero
    struct msghdr outmsg = {
        .msg_name = &target,
        .msg_namelen = sizeof(struct sockaddr_in),
        // .msg_control = &(struct cmsghdr){},
        // .msg_controllen = sizeof(struct cmsghdr),
        .msg_iov = &send_buffer,
        .msg_iovlen = 1,
    };

    if ((sendmsg(sockfd, &outmsg, 0)) == -1) {
        perror("sendmsg failed");
        exit(1);
    }

    close(sockfd);
    free(payload);
    exit(0);
}
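Since the client above is deliberately sloppy about validating its arguments (atoi happily turns garbage into port 0), here is a sketch of a slightly stricter way to fill in the target address. The fill_target helper is hypothetical, and rejecting port 0 is my own choice.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper: turn "host" (dotted IPv4) and "port" (decimal)
 * strings into a sockaddr_in. Returns 0 on success, -1 if either
 * string is not valid. */
int fill_target(const char *host, const char *port, struct sockaddr_in *out)
{
    assert(host != NULL && port != NULL && out != NULL);

    /* strtol, unlike atoi, lets us detect trailing garbage and overflow */
    char *end = NULL;
    errno = 0;
    long p = strtol(port, &end, 10);
    if (errno != 0 || end == port || *end != '\0' || p < 1 || p > 65535)
        return -1;

    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons((uint16_t)p);

    /* inet_pton returns 1 only when the address was parsed successfully */
    if (inet_pton(AF_INET, host, &out->sin_addr) != 1)
        return -1;
    return 0;
}
```

In the client, this would replace the target initializer and the inet_pton call, with fill_target(argv[1], argv[2], &target) checked for -1.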
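Interestingly, setting a cmsghdr structure in the msghdr structure for sending results in sendmsg failing with "Invalid argument" as reason (it's not a very helpful reason, by the way). My best guess is that an all-zero cmsghdr is simply not a valid control message: its cmsg_len is 0, which is smaller than the header itself.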
One of the most interesting things about the UDP client code, for me, is that by comparing it to the previous code it seems to me I got a bit better at this endeavour.
Testing the UDP client
Fairly simple, as usual:
[manu@astrolabio network-programming]$ echo -n "hello world!" | ./client 127.0.0.1 9095
Target server: '127.0.0.1'
Target port: '9095'
Payload: 'hello world!'
Payload length: 12
sendmsg failed: Invalid argument
[manu@astrolabio network-programming]$
Closing remarks
I got carried away and I've written a very small sample UDP server using io_uring :)
It will appear in episode three of this series, I guess.
References
- [1] The recvmsg Linux syscall manpage: https://www.man7.org/linux/man-pages/man2/recvmsg.2.html