0x07 - Things I've learned about lwIP's TCP library
I’ve been using lwIP’s raw TCP stack in anger over the past year or so and had the idea to document some useful things I’ve discovered along the way. I’ve mostly been relying on the official documentation1, a fandom.com2 page I came across and the tcp_client example in the official Pico SDK3. It can be a bit of a faff to use, so I figured I’d summarise what I’ve learned on this page. It will be a living page that is updated when I find something relevant to add.
What the hell is lwIP?
It stands for Lightweight Internet Protocol and is a minimal TCP/IP stack designed for use on embedded platforms1. I’ve mostly been using the raw TCP functionality for sending and receiving MQTT data to/from a broker on a local network using a Raspberry Pi Pico W (henceforth known as ‘pico’ in this article).
Preliminary info
When using lwIP on a pico W there are three ways you can run it: poll, thread safe background and FreeRTOS4. For my applications I use thread safe background, where lwIP tasks are executed in an interrupt routine. The Pico SDK documentation states that with this configuration, all calls into lwIP from the normal execution context (i.e. NOT in an interrupt callback) need to be wrapped between cyw43_arch_lwip_begin() and cyw43_arch_lwip_end(), as the library is not thread safe4. These functions act as a critical section to stop lwIP’s internals getting trashed by buffer accesses from both interrupt context and normal context. Inspecting the SDK, you can see that they call into functions that block other contexts from accessing lwIP’s resources while the current context is using them5. When using lwIP on a different platform, these calls should be replaced with critical sections around the interrupt handler that drives lwIP. In my experience this is either a timer (as is the case with the pico) or an Ethernet peripheral ISR.
Initialising
Initialising the raw TCP stack begins with a call to tcp_new(), which creates a new instance of the TCP PCB (protocol control block). I found that this function would fail (it returns NULL) if I hadn’t properly freed the previous instances when attempting to restart the connection. Presumably the other PCBs were still being handled by lwIP and it had run out of memory.
With the PCB acquired, the next step is to register the callback functions that the program will use. lwIP has a registration call for each one, for example tcp_recv, tcp_sent and tcp_err.
Once a PCB has been acquired and the callbacks registered, you can attempt to establish a connection. In my application I’m connecting to a local server which has a fixed IP, so there is no need to perform any DNS requests to resolve a domain name. However, if the IP is stored in a string you’ll need to convert it from the ASCII form (with dot notation) to network byte order6 7 format first; lwIP provides ipaddr_aton for this.
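To show what that conversion actually involves, here is a minimal host-side sketch in plain C. parse_ipv4_dotted is a hypothetical helper of my own naming, not an lwIP function, and it assumes the strict four-decimal-octet form only. On target you would normally lean on lwIP’s own parser rather than something like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of lwIP): parse "a.b.c.d" into four bytes
 * stored most-significant octet first, i.e. network byte order.
 * Returns false on malformed input; out[] may be partially written then. */
bool parse_ipv4_dotted(const char *text, uint8_t out[4])
{
    assert(out != NULL);
    if (text == NULL) {
        return false;
    }
    for (int i = 0; i < 4; i++) {
        unsigned value = 0;
        int digits = 0;
        /* Accumulate up to three decimal digits for this octet. */
        while (*text >= '0' && *text <= '9') {
            value = value * 10u + (unsigned)(*text - '0');
            digits++;
            if (digits > 3 || value > 255u) {
                return false;
            }
            text++;
        }
        if (digits == 0) {
            return false;          /* empty octet, e.g. "1..2.3" */
        }
        out[i] = (uint8_t)value;
        if (i < 3) {
            if (*text != '.') {
                return false;      /* too few octets */
            }
            text++;
        }
    }
    return *text == '\0';          /* reject trailing characters */
}
```

Because the octets are written in the order they appear in the string, the resulting byte array is already in network byte order regardless of the host’s endianness. With lwIP itself, ip4addr_aton("192.168.1.10", &addr) fills an ip4_addr_t and returns non-zero on success.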
Connecting
With all the various gubbins initialised we can attempt a connection with tcp_connect. The return value of the function indicates whether the attempt to connect was started successfully, not whether the connection itself was established. We pass tcp_connect the pointer to the PCB instance, our remote address (in the right format), the port number and finally a pointer to a callback function which lwIP will call upon a successful connection. After attempting the connection, check the return code; if it is not ERR_OK, de-allocate the PCB instance using tcp_abort and presumably get the program to try again.
Aborting the connection
In the previous section I introduced tcp_abort, which I think of as the nuclear approach to ending the TCP connection. You want to use this function to kill the tcp_pcb instance in the event of an ungraceful severing of the connection. By ungraceful I mean scenarios such as physically removing the Ethernet cable, turning off the router etc., essentially any scenario where a graceful close cannot be carried out on the wire (or in the air).
For scenarios where the TCP connection is closed gracefully, such as the MQTT broker shutting down or your client/server intentionally ending the connection with a normal close (a FIN rather than a reset, which is the abortive way to end a connection), you would use tcp_close and, if that fails, call tcp_abort, which the documentation states “never fails”8.
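The close-then-abort fallback can be sketched as below. To keep it buildable on a PC, demo_pcb, demo_tcp_close and demo_tcp_abort are hypothetical stand-ins for struct tcp_pcb, tcp_close and tcp_abort; only the control flow is the point, and it is my assumption of a sensible pattern rather than anything lifted from the SDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins so the sketch builds without lwIP. On target these would be
 * struct tcp_pcb, tcp_close() and tcp_abort() from lwip/tcp.h. */
typedef struct demo_pcb {
    bool close_fails;   /* test knob: force the graceful close to fail */
    bool closed;
    bool aborted;
} demo_pcb;

static int demo_tcp_close(demo_pcb *pcb)
{
    if (pcb->close_fails) {
        return -1;      /* stand-in for any result other than ERR_OK */
    }
    pcb->closed = true;
    return 0;           /* stand-in for ERR_OK */
}

static void demo_tcp_abort(demo_pcb *pcb)
{
    pcb->aborted = true;
}

/* Try a graceful close, fall back to abort, and clear the caller's pointer
 * either way so a stale PCB can never be reused.
 * Returns true if the graceful path worked. */
bool demo_shutdown(demo_pcb **pcb_ref)
{
    assert(pcb_ref != NULL);
    demo_pcb *pcb = *pcb_ref;
    if (pcb == NULL) {
        return true;    /* already shut down, nothing to do */
    }
    bool graceful = (demo_tcp_close(pcb) == 0);
    if (!graceful) {
        demo_tcp_abort(pcb);
    }
    *pcb_ref = NULL;
    return graceful;
}
```

Clearing the pointer in one place is the design choice that matters here: after either call succeeds the PCB no longer belongs to the application. One caveat from the lwIP documentation is that a callback which calls tcp_abort must return ERR_ABRT.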
Sending Data
Sending data is pretty straightforward. First you call tcp_write, which will either copy the data into lwIP’s own buffers or simply point to your data, depending on the flags you provide. I use the flag that copies the data (TCP_WRITE_FLAG_COPY). Once the data is written, you can call tcp_output, which will attempt to transmit the enqueued data. For my usage the data packets have always been small, so my Send function calls tcp_write and tcp_output one after the other.
Receiving Data
Receiving data is a bit more involved than the previous sections. It relies on a callback function which lwIP calls when there is incoming TCP data to process. The received data is referenced by a singly-linked list of packet buffers (pbuf)9. This linked list represents a single data packet9. If the data is needed outside of the callback, the callback copies it out of the list into a local buffer using pbuf_copy_partial, which walks the list for you. The pbuf data can also be accessed directly if necessary. Once the data has been consumed, tcp_recved is called with the number of bytes processed, which tells lwIP that the application is done with them so it can advertise more receive window to the sender. Finally, the linked list is de-allocated using pbuf_free. This last step is crucial or else you will eventually run out of memory.
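To make the linked-list part concrete, here is a small host-side sketch of walking a pbuf-style chain and copying it into a flat buffer. demo_pbuf is a hypothetical cut-down stand-in for lwIP’s struct pbuf (the field names next, payload, tot_len and len mirror the real ones), and demo_copy_chain plays roughly the role pbuf_copy_partial has on target when called with an offset of 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for lwIP's struct pbuf, reduced to the fields a receive
 * callback usually touches. The type itself is hypothetical. */
typedef struct demo_pbuf {
    struct demo_pbuf *next;  /* next segment, NULL at the end of the chain */
    const void *payload;     /* this segment's bytes (const for the demo) */
    uint16_t tot_len;        /* bytes in this segment plus all that follow */
    uint16_t len;            /* bytes in this segment only */
} demo_pbuf;

/* Copy at most cap bytes out of the chain into dst.
 * Returns the number of bytes actually copied. */
uint16_t demo_copy_chain(const demo_pbuf *p, void *dst, uint16_t cap)
{
    assert(dst != NULL);
    uint8_t *out = dst;
    uint16_t copied = 0;
    for (; p != NULL && copied < cap; p = p->next) {
        uint16_t room = (uint16_t)(cap - copied);
        uint16_t n = (p->len < room) ? p->len : room;
        memcpy(out + copied, p->payload, n);
        copied = (uint16_t)(copied + n);
    }
    return copied;
}
```

In a real receive callback the order would be roughly: treat a NULL pbuf as the remote end having closed the connection, otherwise copy with pbuf_copy_partial(p, buf, p->tot_len, 0), call tcp_recved(tpcb, p->tot_len), then pbuf_free(p).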
Handling Errors
Registering a callback for errors is useful for driving a comms state machine. I typically use the Error callback function to detect a connection reset from the server so that a re-connection routine can be initiated.
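A minimal sketch of how an error callback might drive such a state machine is shown below. The comms_state enum, the comms_ctx struct and both functions are hypothetical names for illustration. The callback has the same shape as lwIP’s tcp_err_fn, with int standing in for err_t so it builds off-target.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    COMMS_DISCONNECTED,
    COMMS_CONNECTING,
    COMMS_CONNECTED,
    COMMS_RECONNECT_PENDING
} comms_state;

typedef struct {
    comms_state state;
    void *pcb;            /* would be struct tcp_pcb * on target */
    int last_err;         /* would be err_t on target */
    unsigned reconnects;  /* how many reconnect attempts have been started */
} comms_ctx;

/* Same shape as lwIP's tcp_err_fn: void fn(void *arg, err_t err).
 * lwIP has already freed the PCB by the time this runs, so the pointer
 * is only forgotten here, never closed or aborted. */
void comms_on_error(void *arg, int err)
{
    comms_ctx *ctx = arg;
    assert(ctx != NULL);
    ctx->pcb = NULL;
    ctx->last_err = err;
    ctx->state = COMMS_RECONNECT_PENDING;
}

/* Polled from the main loop: returns 1 exactly once per error, at which
 * point the caller should start a fresh tcp_new()/tcp_connect() cycle. */
int comms_should_reconnect(comms_ctx *ctx)
{
    if (ctx->state != COMMS_RECONNECT_PENDING) {
        return 0;
    }
    ctx->state = COMMS_CONNECTING;
    ctx->reconnects++;
    return 1;
}
```

Keeping the callback this small means the actual reconnect happens in the normal execution context, where the lwIP calls can be wrapped in the critical section described earlier.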