Common Lisp socket client - reconnect on failure

I'm working on an IMAP client for Common Lisp. Details about that in a future post, but I'd like to share here a few bits of code that I had to dig for.

The CL socket tutorials that I could find focus on how to connect to a server, fetch some data and then disconnect. In my case, however, the connection is supposed to be long lived. There's a thread that continuously reads data from the server (it fetches new emails or flag changes and updates a local maildir). Of course, in the wild there is no such thing as a 100% live connection — it will go down for a variety of reasons out of our control, so we need to handle errors and reconnect.

Here is a simplified version of what I came up with. It's probably not perfect, but it works. If you have any ideas for improvements, or if you know some other tutorial about this, please drop me a note!

To play with it, load the libraries first:

CL-USER> (ql:quickload "usocket" "cl+ssl" "flexi-streams" "bordeaux-threads")

Then, you can paste the following in some temporary buffer and evaluate it.

(defpackage :test
  (:use :cl))

(in-package :test)

(defparameter sock nil)
(defparameter binstr nil)
(defparameter txt nil)

(defun connect-internal ()
  (format *error-output* "Connecting...~%")

  ;; Here we connect to the server and request a binary stream
  (setf sock (usocket:socket-connect "imap.gmail.com" 993
                                     :element-type '(unsigned-byte 8)))

  ;; I'm not sure the following two are actually needed; I think
  ;; they're both on by default.
  (setf (usocket:socket-option sock :tcp-nodelay) t
        (usocket:socket-option sock :tcp-keepalive) t)

  #+sbcl
  (progn
    ;; For these, `usocket' doesn't offer a portable `socket-option'
    ;; implementation, so I'm setting them here only for SBCL. But
    ;; they are important!
    (setf (sb-bsd-sockets:sockopt-tcp-keepidle (usocket:socket sock)) 5)
    (setf (sb-bsd-sockets:sockopt-tcp-keepintvl (usocket:socket sock)) 3)
    (setf (sb-bsd-sockets:sockopt-tcp-keepcnt (usocket:socket sock)) 3))

  ;; SSL-ify our connection. We get here a binary stream
  (setf binstr (cl+ssl:make-ssl-client-stream (usocket:socket-stream sock)
                                              :verify nil))

  ;; And now get a text stream for it via flexi-streams. I could have
  ;; used the :external-format argument on the above line as well, but
  ;; in my case I'd like to have both a binary, and a text stream.
  (setf txt (flex:make-flexi-stream
             binstr :external-format '(:iso-8859-1 :eol-style :crlf))))

;; This macro takes one form (`on-error') and a `body'. The body is
;; wrapped in a `handler-case' that catches network error conditions,
;; and calls `on-error' when they occur.
(defmacro with-network-conditions (on-error &body body)
  `(handler-case
       (progn
         ,@body)
     (usocket:ns-error (ex)
       (format *error-output* "ERROR: ~A~%" ex)
       ,on-error)
     (usocket:socket-error (ex)
       (format *error-output* "ERROR: ~A~%" ex)
       ,on-error)
     (cl+ssl::ssl-error (ex)
       (format *error-output* "ERROR: ~A~%" ex)
       ,on-error)))

;; Connect to the server and start the read loop thread.
(defun connect ()
  (with-network-conditions (progn
                             (format t "Sleeping 2 seconds~%")
                             (sleep 2)
                             (connect))
    (connect-internal)
    ;; start the read loop
    (bt2:make-thread
     (lambda ()
       (with-network-conditions (when sock (connect))
         (loop for i from 1 while sock do
           (loop while (listen txt)
                 do (format t "~&~A~%" (read-line txt)))
           (usocket:wait-for-input sock :timeout 1)
           (format t ".")
           (when (zerop (mod i 70))
             (format t "~%"))))))))

;; Close the connection; by setting `sock' to nil we're also telling
;; the read loop thread to exit.
(defun stop ()
  (let ((s sock))
    (setf sock nil)
    (usocket:socket-close s)))

;; Send one line to the server.
(defun send (str)
  (write-string str txt)
  (write-char #\Newline txt)
  (force-output txt))

Now in the REPL, switch to the :test package and try it:

TEST> (connect)
Connecting...
.
* OK Gimap ready for requests from ********
.........
TEST> (send "1 capability")
.
* CAPABILITY IMAP4rev1 UNSELECT IDLE NAMESPACE QUOTA ID XLIST CHILDREN X-GM-EXT-1 XYZZY SASL-IR AUTH=XOAUTH2 AUTH=PLAIN AUTH=PLAIN-CLIENTTOKEN AUTH=OAUTHBEARER
1 OK Thats all she wrote! ********
.......

The "OK" lines are printed from the read thread, which also ticks one dot every second, so that I know it's running. Now if you shut down the network — or, say, switch to another wifi network — the socket will no longer work. After about 10 seconds it should print out an error and attempt to reconnect, and it'll do that every 2 seconds until it succeeds and starts another read thread.

TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT

Note the SBCL-specific lines in the code:

(setf (sb-bsd-sockets:sockopt-tcp-keepidle (usocket:socket sock)) 5)
(setf (sb-bsd-sockets:sockopt-tcp-keepintvl (usocket:socket sock)) 3)
(setf (sb-bsd-sockets:sockopt-tcp-keepcnt (usocket:socket sock)) 3)

Without setting these values, it took something between half an hour and two hours for the system to realize that the socket is dead and send out an error. What I found weird is that I could even send out commands during this time, and I still got no error - just nothing happened. See “when TCP sockets refuse to die” for more information about these parameters.

For now, usocket:socket-option doesn't support setting them, and I don't know how to do it for implementations other than SBCL.

No comments yet. Wanna add one? Please?

Add your comment

Feb
3
2025

Common Lisp socket client - reconnect on failure

  • Published: 2025-02-03
  • Modified: 2025-02-21 15:34
  • By: Mishoo
  • Comments: 0 (add)