The Dude Abides


While crafting another patch for cc-mode, I stumbled upon the following:

;; RMS says don't make these the default.
;; (April 2006): RMS has now approved these commands as defaults.
(unless (memq 'argumentative-bod-function c-emacs-features)
  (define-key c-mode-base-map "\e\C-a"    'c-beginning-of-defun)
  (define-key c-mode-base-map "\e\C-e"    'c-end-of-defun))

I’ve seen this kind of thing before, but it keeps puzzling me. Why would you:

  • Put a comment there about not making the following a default
  • Put a clarifying comment how these have been approved as defaults now
  • Keep both comments to trip up future readers

Any other project I’m involved in would immediately remove that commentary. Emacs is the only one where suggesting such drastic action would yield a lengthy bikeshedding discussion about the merits of pre-VCS traditions and how keeping them will toughen up whoever may join in the collaborative development effort. Yes, seriously.

Less Than Exceptional


As of a few weeks ago, I’ve become part of Evil’s new maintainer team. Most of the bugs we get are not quite Evil’s fault, but rather that of other components it relies upon, such as Emacs. #785 is a perfect example of this observation. Take a look at the sources of the function responsible for the reported behavior and see whether you can find what’s wrong about it:

DEFUN ("lookup-key", Flookup_key, Slookup_key, 2, 3, 0,
       doc: ...)
  (Lisp_Object keymap, Lisp_Object key, Lisp_Object accept_default)
  ptrdiff_t idx;
  Lisp_Object cmd;
  Lisp_Object c;
  ptrdiff_t length;
  bool t_ok = !NILP (accept_default);

  keymap = get_keymap (keymap, 1, 1);

  length = CHECK_VECTOR_OR_STRING (key);
  if (length == 0)
    return keymap;

  idx = 0;
  while (1)
      c = Faref (key, make_number (idx++));

      if (CONSP (c) && lucid_event_type_list_p (c))
        c = Fevent_convert_list (c);

      /* Turn the 8th bit of string chars into a meta modifier.  */
      if (STRINGP (key) && XINT (c) & 0x80 && !STRING_MULTIBYTE (key))
        XSETINT (c, (XINT (c) | meta_modifier) & ~0x80);

      /* Allow string since binding for `menu-bar-select-buffer'
         includes the buffer name in the key sequence.  */
      if (!INTEGERP (c) && !SYMBOLP (c) && !CONSP (c) && !STRINGP (c))
        message_with_string ("Key sequence contains invalid event %s", c, 1);

      cmd = access_keymap (keymap, c, t_ok, 0, 1);
      if (idx == length)
        return cmd;

      keymap = get_keymap (cmd, 0, 1);
      if (!CONSP (keymap))
        return make_number (idx);

      maybe_quit ();

If you haven’t spotted it, the function never signals an exception for the exceptional case of an invalid key sequence. Instead, it returns one of the few types that aren’t allowed in a definition, namely an integer. It’s up to the caller to handle this case on their own, something the majority of Emacs packages rightfully neglect to do as even + informs you about invalid input. This design decision reeks of the many ways of signalling errors in C and worse, cannot be properly fixed as there’s code inside Emacs relying on the integer return type. There is even a lookup-key-ignore-too-long in Emacs core that looks a lot like what I wrote, but it’s part of menu-bar.el for whatever reason and not advertised at all.

tl;dr: In-band signalling sucks big time.

Strange Code

(defface w3m-history-current-url
  ;; The following strange code compounds the attributes of the
  ;; `secondary-selection' face and the `w3m-arrived-anchor' face,
  ;; and generates the new attributes for this face.
  (let ((base 'secondary-selection)
        (fn (if (featurep 'xemacs)
              'custom-face-attributes-get));; What a perverseness it is.
        ;; Both `face-custom-attributes-get' in XEmacs and
        ;; `custom-face-attributes-get' in CUSTOM 1.9962 attempt to
        ;; require `font' in Emacs/w3 and `cl' arbitrarily. :-/
        (features (cons 'font features))
        base-attributes attributes attribute)
    (setq base-attributes (funcall fn base nil)
          attributes (funcall fn 'w3m-arrived-anchor nil))
    (while base-attributes
      (setq attribute (car base-attributes))
      (unless (memq attribute '(:foreground :underline))
        (setq attributes (plist-put attributes attribute
                                    (cadr base-attributes))))
      (setq base-attributes (cddr base-attributes)))
    (list (list t attributes)))
  "Face used to highlight the current url in the \"about://history/\" page."
  :group 'w3m-face)

These days one would simply use :inherit for both w3m-arrived-anchor and secondary-selection.

No Obvious Way To Do It Properly


You can often catch me griping way more about an “Emacs API” than Emacs Lisp, often without any explanation how the interfaces Emacs provides could be possibly worse than the language used for this. Here comes a not too obvious example of the issue at hand that will not go away if you were to rewrite the code in question in <insert favorite toy language>:

(defvar flyspell-prog-text-faces
  '(font-lock-string-face font-lock-comment-face font-lock-doc-face)
  "Faces corresponding to text in programming-mode buffers.")

(defun flyspell-generic-progmode-verify ()
  "Used for `flyspell-generic-check-word-predicate' in programming modes."
  ;; (point) is next char after the word. Must check one char before.
  (let ((f (get-text-property (- (point) 1) 'face)))
    (memq f flyspell-prog-text-faces)))

The above is helper code for Flyspell to detect whether point is in a string or comment. While this is most certainly the easiest way to achieve this goal, it isn’t what the manual recommends. What you’re supposed to do instead is to use the syntax table parser by accessing its state via syntax-ppss and checking slots 3 and 4[1]. The issue with that approach is that it’s not that simple, see the sources of Smartparens for the ugly details, more specifically the sp-point-in-string-or-comment function.

Let’s take a step back. Why exactly do we need to check the syntactical state of Emacs? To understand what’s going on, you’ll need to know that the first step when building a major mode for a programming language is to set up the syntax tables so that Emacs can recognize strings and comments properly. Once this has been done, both syntax highlighting and movement work correctly. This hack has been done back then to have that special case dealt with in a performant manner, albeit it only works properly for languages looking like C or Lisp. Single characters can be marked as having a special syntax, with extra slots for two-character sequences. In Lisp, a semicolon would start the comment and every newline would terminate one, in C the comment starters and enders would be /* and */. If you’re using a language with more complicated rules than that, there is the crutch of writing a custom syntax-propertize-function that walks over an arbitrary buffer region and applies syntax properties manually. This is how triple-quoted strings are implemented in python.el, the same technique is used to highlight a S-expression comment in scheme.el which starts with #; and extends to the following S-expression.

Unfortunately, not all modes play well. Some resort to just painting over the thing to be a comment or string by applying the right face to it and sacrifice proper movement over symbols and S-expressions. It shouldn’t be surprising that given the choice to fix faulty modes or circumvent a cumbersome API, plenty Emacs hackers pick the latter and get more robust handling for free. After all, practicality beats purity, right?

[1]I’ve seen at least one mode checking slot 8 instead. A detailed analysis of the difference is left to the reader as exercise.

When Data Becomes Code


If you’ve hung out on #emacs for a while, chances are you’ve been recommended ibuffer for advanced buffer management (like, killing many buffers at once). There is a horror lurking beneath its pretty interface though and it starts out with a customizable:

(defcustom ibuffer-always-compile-formats (featurep 'bytecomp)
  "If non-nil, then use the byte-compiler to optimize `ibuffer-formats'.
This will increase the redisplay speed, at the cost of loading the
elisp byte-compiler."
  :type 'boolean
  :group 'ibuffer)


(defun ibuffer-compile-make-eliding-form (strvar elide from-end-p)
  (let ((ellipsis (propertize ibuffer-eliding-string 'font-lock-face 'bold)))
    (if (or elide (with-no-warnings ibuffer-elide-long-columns))
        `(if (> strlen 5)
             ,(if from-end-p
                  ;; FIXME: this should probably also be using
                  ;; `truncate-string-to-width' (Bug#24972)
                  `(concat ,ellipsis
                           (substring ,strvar
                                      (string-width ibuffer-eliding-string)))
                   ,strvar (- strlen (string-width ,ellipsis)) nil ?.)

(defun ibuffer-compile-make-substring-form (strvar maxvar from-end-p)
  (if from-end-p
      ;; FIXME: not sure if this case is correct (Bug#24972)
      `(truncate-string-to-width str strlen (- strlen ,maxvar) nil ?\s)
    `(truncate-string-to-width ,strvar ,maxvar nil ?\s)))

(defun ibuffer-compile-make-format-form (strvar widthform alignment)
  (let* ((left `(make-string tmp2 ?\s))
         (right `(make-string (- tmp1 tmp2) ?\s)))
       (setq tmp1 ,widthform
             tmp2 (/ tmp1 2))
       ,(pcase alignment
          (:right `(concat ,left ,right ,strvar))
          (:center `(concat ,left ,strvar ,right))
          (:left `(concat ,strvar ,left ,right))
          (_ (error "Invalid alignment %s" alignment))))))

(defun ibuffer-compile-format (format)
  (let ((result nil)
        ;; We use these variables to keep track of which variables
        ;; inside the generated function we need to bind, since
        ;; binding variables in Emacs takes time.
        (vars-used ()))
    (dolist (form format)
       ;; Generate a form based on a particular format entry, like
       ;; " ", mark, or (mode 16 16 :right).
       (if (stringp form)
           ;; It's a string; all we need to do is insert it.
           `(insert ,form)
         (let* ((form (ibuffer-expand-format-entry form))
                (sym (nth 0 form))
                (min (nth 1 form))
                (max (nth 2 form))
                (align (nth 3 form))
                (elide (nth 4 form)))
           (let* ((from-end-p (when (cl-minusp min)
                                (setq min (- min))
                  (letbindings nil)
                  (outforms nil)
                  min-used max-used strlen-used)
             (when (or (not (integerp min)) (>= min 0))
               ;; This is a complex case; they want it limited to a
               ;; minimum size.
               (setq min-used t)
               (setq strlen-used t)
               (setq vars-used '(str strlen tmp1 tmp2))
               ;; Generate code to limit the string to a minimum size.
               (setq minform `(progn
                                (setq str
                                        `(- ,(if (integerp min)
             (when (or (not (integerp max)) (> max 0))
               (setq max-used t)
               (cl-pushnew 'str vars-used)
               ;; Generate code to limit the string to a maximum size.
               (setq maxform `(progn
                                (setq str
                                        (if (integerp max)
                                (setq strlen (string-width str))
                                (setq str
                                        'str elide from-end-p)))))
             ;; Now, put these forms together with the rest of the code.
             (let ((callform
                    ;; Is this an "inline" column?  This means we have
                    ;; to get the code from the
                    ;; `ibuffer-inline-columns' alist and insert it
                    ;; into our generated code.  Otherwise, we just
                    ;; generate a call to the column function.
                    (ibuffer-aif (assq sym ibuffer-inline-columns)
                        (nth 1 it)
                      `(,sym buffer mark)))
                   ;; You're not expected to understand this.  Hell, I
                   ;; don't even understand it, and I wrote it five
                   ;; minutes ago.
                    (if (get sym 'ibuffer-column-summarizer)
                        ;; I really, really wish Emacs Lisp had closures.
                        ;; FIXME: Elisp does have them now.
                        (lambda (arg sym)
                            (let ((ret ,arg))
                              (put ',sym 'ibuffer-column-summary
                                   (cons ret (get ',sym
                      (lambda (arg _sym)
                        `(insert ,arg))))
                   (mincompform `(< strlen ,(if (integerp min)
                   (maxcompform `(> strlen ,(if (integerp max)
               (if (or min-used max-used)
                   ;; The complex case, where we have to limit the
                   ;; form to a maximum or minimum size.
                     (when (and min-used (not (integerp min)))
                       (push `(min ,min) letbindings))
                     (when (and max-used (not (integerp max)))
                       (push `(max ,max) letbindings))
                      (if (and min-used max-used)
                          `(if ,mincompform
                             (if ,maxcompform
                        (if min-used
                            `(when ,mincompform
                          `(when ,maxcompform
                     (push `(setq str ,callform
                                  ,@(when strlen-used
                                      `(strlen (string-width str))))
                     (setq outforms
                           (append outforms
                                   (list (funcall insertgenfn 'str sym)))))
                 ;; The simple case; just insert the string.
                 (push (funcall insertgenfn callform sym) outforms))
               ;; Finally, return a `let' form which binds the
               ;; variables in `letbindings', and contains all the
               ;; code in `outforms'.
               `(let ,letbindings
    ;; We don't want to unconditionally load the byte-compiler.
    (funcall (if (or ibuffer-always-compile-formats
                     (featurep 'bytecomp))
             ;; Here, we actually create a lambda form which
             ;; inserts all the generated forms for each entry
             ;; in the format string.
             `(lambda (buffer mark)
                (let ,vars-used
                  ,@(nreverse result))))))

(defun ibuffer-recompile-formats ()
  "Recompile `ibuffer-formats'."
  (setq ibuffer-compiled-formats
        (mapcar #'ibuffer-compile-format ibuffer-formats))
  (when (boundp 'ibuffer-filter-format-alist)
    (setq ibuffer-compiled-filter-formats
          (mapcar (lambda (entry)
                    (cons (car entry)
                          (mapcar (lambda (formats)
                                    (mapcar #'ibuffer-compile-format formats))
                                  (cdr entry))))

(defun ibuffer-clear-summary-columns (format)
  (dolist (form format)
    (when (and (consp form)
               (get (car form) 'ibuffer-column-summarizer))
      (put (car form) 'ibuffer-column-summary nil))))

(defun ibuffer-check-formats ()
  (when (null ibuffer-formats)
    (error "No formats!"))
  (let ((ext-loaded (featurep 'ibuf-ext)))
    (when (or (null ibuffer-compiled-formats)
              (null ibuffer-cached-formats)
              (not (eq ibuffer-cached-formats ibuffer-formats))
              (null ibuffer-cached-eliding-string)
              (not (equal ibuffer-cached-eliding-string ibuffer-eliding-string))
              (eql 0 ibuffer-cached-elide-long-columns)
              (not (eql ibuffer-cached-elide-long-columns
                        (with-no-warnings ibuffer-elide-long-columns)))
              (and ext-loaded
                   (not (eq ibuffer-cached-filter-formats
                   (and ibuffer-filter-format-alist
                        (null ibuffer-compiled-filter-formats))))
      (message "Formats have changed, recompiling...")
      (setq ibuffer-cached-formats ibuffer-formats
            ibuffer-cached-eliding-string ibuffer-eliding-string
            ibuffer-cached-elide-long-columns (with-no-warnings ibuffer-elide-long-columns))
      (when ext-loaded
        (setq ibuffer-cached-filter-formats ibuffer-filter-format-alist))
      (message "Formats have changed, recompiling...done"))))

Another weird one is that the extracted autoloads for ibuffer-ext.el reside in ibuffer.el, but that’s the lesser evil of the two.

Credits go to holomorph for discovering that maintenance nightmare.