Less Than Exceptional

26/02/2017

As of a few weeks ago, I’ve become part of Evil’s new maintainer team. Most of the bugs we get are not quite Evil’s fault, but rather that of other components it relies upon, such as Emacs. #785 is a perfect example of this observation. Take a look at the sources of the function responsible for the reported behavior and see whether you can find what’s wrong about it:

DEFUN ("lookup-key", Flookup_key, Slookup_key, 2, 3, 0,
       doc: ...)
  (Lisp_Object keymap, Lisp_Object key, Lisp_Object accept_default)
{
  ptrdiff_t idx;
  Lisp_Object cmd;
  Lisp_Object c;
  ptrdiff_t length;
  bool t_ok = !NILP (accept_default);

  keymap = get_keymap (keymap, 1, 1);

  length = CHECK_VECTOR_OR_STRING (key);
  if (length == 0)
    return keymap;

  idx = 0;
  while (1)
    {
      c = Faref (key, make_number (idx++));

      if (CONSP (c) && lucid_event_type_list_p (c))
        c = Fevent_convert_list (c);

      /* Turn the 8th bit of string chars into a meta modifier.  */
      if (STRINGP (key) && XINT (c) & 0x80 && !STRING_MULTIBYTE (key))
        XSETINT (c, (XINT (c) | meta_modifier) & ~0x80);

      /* Allow string since binding for `menu-bar-select-buffer'
         includes the buffer name in the key sequence.  */
      if (!INTEGERP (c) && !SYMBOLP (c) && !CONSP (c) && !STRINGP (c))
        message_with_string ("Key sequence contains invalid event %s", c, 1);

      cmd = access_keymap (keymap, c, t_ok, 0, 1);
      if (idx == length)
        return cmd;

      keymap = get_keymap (cmd, 0, 1);
      if (!CONSP (keymap))
        return make_number (idx);

      maybe_quit ();
    }
}

If you haven’t spotted it, the function never signals an exception for the exceptional case of an invalid key sequence. Instead, it returns one of the few types that aren’t allowed in a definition, namely an integer. It’s up to the caller to handle this case on their own, something the majority of Emacs packages rightfully neglect to do as even + informs you about invalid input. This design decision reeks of the many ways of signalling errors in C and worse, cannot be properly fixed as there’s code inside Emacs relying on the integer return type. There is even a lookup-key-ignore-too-long in Emacs core that looks a lot like what I wrote, but it’s part of menu-bar.el for whatever reason and not advertised at all.

tl;dr: In-band signalling sucks big time.


Strange Code

08/02/2017
(defface w3m-history-current-url
  ;; The following strange code compounds the attributes of the
  ;; `secondary-selection' face and the `w3m-arrived-anchor' face,
  ;; and generates the new attributes for this face.
  (let ((base 'secondary-selection)
        (fn (if (featurep 'xemacs)
                'face-custom-attributes-get
              'custom-face-attributes-get));; What a perverseness it is.
        ;; Both `face-custom-attributes-get' in XEmacs and
        ;; `custom-face-attributes-get' in CUSTOM 1.9962 attempt to
        ;; require `font' in Emacs/w3 and `cl' arbitrarily. :-/
        (features (cons 'font features))
        base-attributes attributes attribute)
    (setq base-attributes (funcall fn base nil)
          attributes (funcall fn 'w3m-arrived-anchor nil))
    (while base-attributes
      (setq attribute (car base-attributes))
      (unless (memq attribute '(:foreground :underline))
        (setq attributes (plist-put attributes attribute
                                    (cadr base-attributes))))
      (setq base-attributes (cddr base-attributes)))
    (list (list t attributes)))
  "Face used to highlight the current url in the \"about://history/\" page."
  :group 'w3m-face)

These days one would simply use :inherit for both w3m-arrived-anchor and secondary-selection.


No Obvious Way To Do It Properly

05/01/2017

You can often catch me griping way more about an “Emacs API” than Emacs Lisp, often without any explanation how the interfaces Emacs provides could be possibly worse than the language used for this. Here comes a not too obvious example of the issue at hand that will not go away if you were to rewrite the code in question in <insert favorite toy language>:

(defvar flyspell-prog-text-faces
  '(font-lock-string-face font-lock-comment-face font-lock-doc-face)
  "Faces corresponding to text in programming-mode buffers.")

(defun flyspell-generic-progmode-verify ()
  "Used for `flyspell-generic-check-word-predicate' in programming modes."
  ;; (point) is next char after the word. Must check one char before.
  (let ((f (get-text-property (- (point) 1) 'face)))
    (memq f flyspell-prog-text-faces)))

The above is helper code for Flyspell to detect whether point is in a string or comment. While this is most certainly the easiest way to achieve this goal, it isn’t what the manual recommends. What you’re supposed to do instead is to use the syntax table parser by accessing its state via syntax-ppss and checking slots 3 and 4[1]. The issue with that approach is that it’s not that simple, see the sources of Smartparens for the ugly details, more specifically the sp-point-in-string-or-comment function.

Let’s take a step back. Why exactly do we need to check the syntactical state of Emacs? To understand what’s going on, you’ll need to know that the first step when building a major mode for a programming language is to set up the syntax tables so that Emacs can recognize strings and comments properly. Once this has been done, both syntax highlighting and movement work correctly. This hack has been done back then to have that special case dealt with in a performant manner, albeit it only works properly for languages looking like C or Lisp. Single characters can be marked as having a special syntax, with extra slots for two-character sequences. In Lisp, a semicolon would start the comment and every newline would terminate one, in C the comment starters and enders would be /* and */. If you’re using a language with more complicated rules than that, there is the crutch of writing a custom syntax-propertize-function that walks over an arbitrary buffer region and applies syntax properties manually. This is how triple-quoted strings are implemented in python.el, the same technique is used to highlight a S-expression comment in scheme.el which starts with #; and extends to the following S-expression.

Unfortunately, not all modes play well. Some resort to just painting over the thing to be a comment or string by applying the right face to it and sacrifice proper movement over symbols and S-expressions. It shouldn’t be surprising that given the choice to fix faulty modes or circumvent a cumbersome API, plenty Emacs hackers pick the latter and get more robust handling for free. After all, practicality beats purity, right?

[1]I’ve seen at least one mode checking slot 8 instead. A detailed analysis of the difference is left to the reader as exercise.

When Data Becomes Code

28/11/2016

If you’ve hung out on #emacs for a while, chances are you’ve been recommended ibuffer for advanced buffer management (like, killing many buffers at once). There is a horror lurking beneath its pretty interface though and it starts out with a customizable:

(defcustom ibuffer-always-compile-formats (featurep 'bytecomp)
  "If non-nil, then use the byte-compiler to optimize `ibuffer-formats'.
This will increase the redisplay speed, at the cost of loading the
elisp byte-compiler."
  :type 'boolean
  :group 'ibuffer)

Uh-oh:

(defun ibuffer-compile-make-eliding-form (strvar elide from-end-p)
  (let ((ellipsis (propertize ibuffer-eliding-string 'font-lock-face 'bold)))
    (if (or elide (with-no-warnings ibuffer-elide-long-columns))
        `(if (> strlen 5)
             ,(if from-end-p
                  ;; FIXME: this should probably also be using
                  ;; `truncate-string-to-width' (Bug#24972)
                  `(concat ,ellipsis
                           (substring ,strvar
                                      (string-width ibuffer-eliding-string)))
                `(concat
                  (truncate-string-to-width
                   ,strvar (- strlen (string-width ,ellipsis)) nil ?.)
                  ,ellipsis))
           ,strvar)
      strvar)))

(defun ibuffer-compile-make-substring-form (strvar maxvar from-end-p)
  (if from-end-p
      ;; FIXME: not sure if this case is correct (Bug#24972)
      `(truncate-string-to-width str strlen (- strlen ,maxvar) nil ?\s)
    `(truncate-string-to-width ,strvar ,maxvar nil ?\s)))

(defun ibuffer-compile-make-format-form (strvar widthform alignment)
  (let* ((left `(make-string tmp2 ?\s))
         (right `(make-string (- tmp1 tmp2) ?\s)))
    `(progn
       (setq tmp1 ,widthform
             tmp2 (/ tmp1 2))
       ,(pcase alignment
          (:right `(concat ,left ,right ,strvar))
          (:center `(concat ,left ,strvar ,right))
          (:left `(concat ,strvar ,left ,right))
          (_ (error "Invalid alignment %s" alignment))))))

(defun ibuffer-compile-format (format)
  (let ((result nil)
        ;; We use these variables to keep track of which variables
        ;; inside the generated function we need to bind, since
        ;; binding variables in Emacs takes time.
        (vars-used ()))
    (dolist (form format)
      (push
       ;; Generate a form based on a particular format entry, like
       ;; " ", mark, or (mode 16 16 :right).
       (if (stringp form)
           ;; It's a string; all we need to do is insert it.
           `(insert ,form)
         (let* ((form (ibuffer-expand-format-entry form))
                (sym (nth 0 form))
                (min (nth 1 form))
                (max (nth 2 form))
                (align (nth 3 form))
                (elide (nth 4 form)))
           (let* ((from-end-p (when (cl-minusp min)
                                (setq min (- min))
                                t))
                  (letbindings nil)
                  (outforms nil)
                  minform
                  maxform
                  min-used max-used strlen-used)
             (when (or (not (integerp min)) (>= min 0))
               ;; This is a complex case; they want it limited to a
               ;; minimum size.
               (setq min-used t)
               (setq strlen-used t)
               (setq vars-used '(str strlen tmp1 tmp2))
               ;; Generate code to limit the string to a minimum size.
               (setq minform `(progn
                                (setq str
                                      ,(ibuffer-compile-make-format-form
                                        'str
                                        `(- ,(if (integerp min)
                                                 min
                                               'min)
                                            strlen)
                                        align)))))
             (when (or (not (integerp max)) (> max 0))
               (setq max-used t)
               (cl-pushnew 'str vars-used)
               ;; Generate code to limit the string to a maximum size.
               (setq maxform `(progn
                                (setq str
                                      ,(ibuffer-compile-make-substring-form
                                        'str
                                        (if (integerp max)
                                            max
                                          'max)
                                        from-end-p))
                                (setq strlen (string-width str))
                                (setq str
                                      ,(ibuffer-compile-make-eliding-form
                                        'str elide from-end-p)))))
             ;; Now, put these forms together with the rest of the code.
             (let ((callform
                    ;; Is this an "inline" column?  This means we have
                    ;; to get the code from the
                    ;; `ibuffer-inline-columns' alist and insert it
                    ;; into our generated code.  Otherwise, we just
                    ;; generate a call to the column function.
                    (ibuffer-aif (assq sym ibuffer-inline-columns)
                        (nth 1 it)
                      `(,sym buffer mark)))
                   ;; You're not expected to understand this.  Hell, I
                   ;; don't even understand it, and I wrote it five
                   ;; minutes ago.
                   (insertgenfn
                    (if (get sym 'ibuffer-column-summarizer)
                        ;; I really, really wish Emacs Lisp had closures.
                        ;; FIXME: Elisp does have them now.
                        (lambda (arg sym)
                          `(insert
                            (let ((ret ,arg))
                              (put ',sym 'ibuffer-column-summary
                                   (cons ret (get ',sym
                                                  'ibuffer-column-summary)))
                              ret)))
                      (lambda (arg _sym)
                        `(insert ,arg))))
                   (mincompform `(< strlen ,(if (integerp min)
                                                min
                                              'min)))
                   (maxcompform `(> strlen ,(if (integerp max)
                                                max
                                              'max))))
               (if (or min-used max-used)
                   ;; The complex case, where we have to limit the
                   ;; form to a maximum or minimum size.
                   (progn
                     (when (and min-used (not (integerp min)))
                       (push `(min ,min) letbindings))
                     (when (and max-used (not (integerp max)))
                       (push `(max ,max) letbindings))
                     (push
                      (if (and min-used max-used)
                          `(if ,mincompform
                               ,minform
                             (if ,maxcompform
                                 ,maxform))
                        (if min-used
                            `(when ,mincompform
                               ,minform)
                          `(when ,maxcompform
                             ,maxform)))
                      outforms)
                     (push `(setq str ,callform
                                  ,@(when strlen-used
                                      `(strlen (string-width str))))
                           outforms)
                     (setq outforms
                           (append outforms
                                   (list (funcall insertgenfn 'str sym)))))
                 ;; The simple case; just insert the string.
                 (push (funcall insertgenfn callform sym) outforms))
               ;; Finally, return a `let' form which binds the
               ;; variables in `letbindings', and contains all the
               ;; code in `outforms'.
               `(let ,letbindings
                  ,@outforms)))))
       result))
    ;; We don't want to unconditionally load the byte-compiler.
    (funcall (if (or ibuffer-always-compile-formats
                     (featurep 'bytecomp))
                 #'byte-compile
               #'identity)
             ;; Here, we actually create a lambda form which
             ;; inserts all the generated forms for each entry
             ;; in the format string.
             `(lambda (buffer mark)
                (let ,vars-used
                  ,@(nreverse result))))))

(defun ibuffer-recompile-formats ()
  "Recompile `ibuffer-formats'."
  (interactive)
  (setq ibuffer-compiled-formats
        (mapcar #'ibuffer-compile-format ibuffer-formats))
  (when (boundp 'ibuffer-filter-format-alist)
    (setq ibuffer-compiled-filter-formats
          (mapcar (lambda (entry)
                    (cons (car entry)
                          (mapcar (lambda (formats)
                                    (mapcar #'ibuffer-compile-format formats))
                                  (cdr entry))))
                  ibuffer-filter-format-alist))))

(defun ibuffer-clear-summary-columns (format)
  (dolist (form format)
    (when (and (consp form)
               (get (car form) 'ibuffer-column-summarizer))
      (put (car form) 'ibuffer-column-summary nil))))

(defun ibuffer-check-formats ()
  (when (null ibuffer-formats)
    (error "No formats!"))
  (let ((ext-loaded (featurep 'ibuf-ext)))
    (when (or (null ibuffer-compiled-formats)
              (null ibuffer-cached-formats)
              (not (eq ibuffer-cached-formats ibuffer-formats))
              (null ibuffer-cached-eliding-string)
              (not (equal ibuffer-cached-eliding-string ibuffer-eliding-string))
              (eql 0 ibuffer-cached-elide-long-columns)
              (not (eql ibuffer-cached-elide-long-columns
                        (with-no-warnings ibuffer-elide-long-columns)))
              (and ext-loaded
                   (not (eq ibuffer-cached-filter-formats
                            ibuffer-filter-format-alist))
                   (and ibuffer-filter-format-alist
                        (null ibuffer-compiled-filter-formats))))
      (message "Formats have changed, recompiling...")
      (ibuffer-recompile-formats)
      (setq ibuffer-cached-formats ibuffer-formats
            ibuffer-cached-eliding-string ibuffer-eliding-string
            ibuffer-cached-elide-long-columns (with-no-warnings ibuffer-elide-long-columns))
      (when ext-loaded
        (setq ibuffer-cached-filter-formats ibuffer-filter-format-alist))
      (message "Formats have changed, recompiling...done"))))

Another weird one is that the extracted autoloads for ibuffer-ext.el reside in ibuffer.el, but that’s the lesser evil of the two.

Credits go to holomorph for discovering that maintenance nightmare.


PSA: Emacs is not a proper GTK application

29/10/2016

Update: I’ve been pointed to an emacs-devel discussion about giving Emacs a proper GTK frontend, most comparable to the Win32 and NS frontends (which are free from X11isms). This would be a better approach than what’s been outlined below, with the Cairo code being repurposed for the drawing bits.

Daniel Colascione’s excellent write-up on bringing double-buffered rendering to Emacs has prompted me to do the same on a set of questions that can be occasionally spotted on #emacs:

  • Which GUI build of Emacs shall I choose?
  • What’s the difference between the GTK build and the other builds of Emacs on Linux?
  • Does the GTK build run on Wayland?
  • Does the GTK build run on Broadway?
  • Why does Emacs not render as nicely as <insert more modern text editor>?

If you’ve ever programmed a GUI application with a modern/popular GUI toolkit, you’ll have noticed that while it gives you loads of desirable features, it forces you to structure your application around its idea of an event loop. In other words, your application is forced to react asynchronously to user events. Games are pretty much the only major kind of graphical application that can get away with doing their own event loop, but end up doing their own GUI instead.

Now, the issue with Emacs is that it does its own event loop, pretending that the frontend is a textual or graphical terminal. It’s pretty much the graphical equivalent of a REPL and at the time it only had a text terminal frontend, this way of doing things worked out fairly well. However, by the time users demanded having pretty GTK widgets in Emacs, it became clear that more involved hacks were needed to make that happen. This is why Emacs runs the GTK event loop one iteration at a time, pushes its own user events into it (to make widgets react) and a plethora of more hacks to reconcile their rendering with everything done by X11.

In other words, Emacs is more of a X11 application plus your favorite widgets. The choice of GUI toolkit to build it with is mostly irrelevant, save an infamous bug with the GTK3 frontend that can crash the daemon. Emacs will therefore not run on a pure Wayland system or under Broadway in the browser. If anyone would want to make that happen, either the GTK frontend would need to yank out everything X11 (unlikely as it’s deeply entrenched) or to create a new frontend doing platform-agnostic drawing (the Cairo feature being a prime candidate for that).

Further reading material: