Originally I planned to blog about a fun hack: porting the infamous cloud-to-butt browser extension to Emacs. The idea was that whenever you interact with subprocesses, instances of “cloud” would be replaced with “butt”. I picked shell.el for ease of hacking[1]. The following snippet is loosely modeled after ansi-color-process-output, so pardon any weirdness.

(defun my-filter-shell-output (string)
  (let ((start-marker comint-last-output-start)
        (end-marker (process-mark (get-buffer-process (current-buffer)))))
    (save-excursion
      (goto-char start-marker)
      (while (search-forward "cloud" end-marker t)
        (replace-match "butt")))))

(with-eval-after-load 'shell
  (add-hook 'comint-output-filter-functions 'my-filter-shell-output t))

The API is somewhat murky. A comint output filter function receives a string argument and is expected to modify the buffer. There’s no documentation on how to retrieve the positions of the last output, so I did whatever the aforementioned exemplary function does and restricted the search and replace operations to the region between two markers. How could this possibly go wrong? See for yourself in the following test session:

[wasa@box ~]$ echo cloud
echo butt
[wasa@box ~]$ echo butt
[wasa@box ~]$ echo ponies

Something is definitely wrong here: an extra line is printed if and only if the input contains something to replace. Most curiously, it doesn’t mirror the user input, but contains the replacement as well. After debugging this a bit[2] I remembered that a long time ago I had set comint-process-echoes because M-x shell kept printing the user input after sending it to the shell. Time to gaze into the abyss:

;; Optionally delete echoed input (after checking it).
(when (and comint-process-echoes (not artificial))
  (let ((echo-len (- comint-last-input-end
                     comint-last-input-start)))
    ;; Wait for all input to be echoed:
    (while (and (> (+ comint-last-input-end echo-len)
                   (point-max))
                (accept-process-output proc)
                (zerop
                 (compare-buffer-substrings
                  nil comint-last-input-start
                  (- (point-max) echo-len)
                  ;; Above difference is equivalent to
                  ;; (+ comint-last-input-start
                  ;;    (- (point-max) comint-last-input-end))
                  nil comint-last-input-end (point-max)))))
    (if (and
         (<= (+ comint-last-input-end echo-len)
             (point-max))
         (zerop
          (compare-buffer-substrings
           nil comint-last-input-start comint-last-input-end
           nil comint-last-input-end
           (+ comint-last-input-end echo-len))))
        ;; Certain parts of the text to be deleted may have
        ;; been mistaken for prompts.  We have to prevent
        ;; problems when `comint-prompt-read-only' is non-nil.
        (let ((inhibit-read-only t))
          (delete-region comint-last-input-end
                         (+ comint-last-input-end echo-len))
          (when comint-prompt-read-only
            (save-excursion
              (goto-char comint-last-input-end)
              (comint-update-fence)))))))

Echoes are canceled by adhering to the following procedure:

  • Waiting for process output until enough characters have been emitted
  • Comparing the emitted text with the last user input
  • Deleting the echoed text only if the two match
  • Applying a hack to not delete the prompt

Unfortunately my output filter runs before that and rewrites the echoed text, so the comparison with the user input fails and the echo is never deleted. I can only wonder whether it’s even possible to use this API meaningfully and whether it will involve breaking changes. Yet everyone and their dog keep proclaiming loudly how great Emacs and its approach to text processing are…
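
To see why the comparison goes wrong, here is a toy model of the procedure in C. It is not comint code: cloud_to_butt and cancel_echo are made-up names, and a plain string stands in for the buffer text after the last input.

```c
#include <assert.h>
#include <string.h>

/* Replace every "cloud" in BUF with "butt" in place, the way the
   output filter rewrites freshly inserted process output.  */
static void cloud_to_butt (char *buf)
{
  char *hit;
  assert (buf != NULL);
  while ((hit = strstr (buf, "cloud")) != NULL)
    {
      memcpy (hit, "butt", 4);
      /* Close the one-character gap left by the shorter word.  */
      memmove (hit + 4, hit + 5, strlen (hit + 5) + 1);
    }
}

/* Drop the echo at the start of OUTPUT if it is identical to INPUT.
   Returns 1 if the echo was removed, 0 if the text did not match.  */
static int cancel_echo (char *output, const char *input)
{
  size_t len = strlen (input);
  if (strncmp (output, input, len) != 0)
    return 0;
  memmove (output, output + len, strlen (output + len) + 1);
  return 1;
}
```

Given the output "echo cloud\ncloud\n" and the input "echo cloud\n", cancel_echo strips the echo as long as it runs first. Run cloud_to_butt first and the text starts with "echo butt\n", the comparison fails and the echo stays put, which matches the extra line in the session above.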

[1] term.el is out because it doesn’t offer anything that deserves to be called an API, eshell.el doesn’t even have documentation and is huge, shell.el is small and simple.
[2] I recommend adding a (sit-for 1) between functions doing buffer manipulation to visualize what’s going on in the buffer. Note that edebug supports doing this for everything by switching to edebug-trace-mode.

Determining if the server is started, or the wonders of server-running-p


Update: Bug report thread with a workaround.

(This is a contributed post by thblt.)

Trivia: How can you determine if the current Emacs instance has the Emacs server running?

A quick search gives us three potential candidates: server-mode, (daemonp) and (server-running-p). That’s way too many, but surely one of them is the right one, isn’t it? Well, no. Because the real answer to this trivial question is: you can’t.

  • server-mode is t if, and only if, the server was started using the function with the same name. But there are other ways to run the server, like M-x server-start or emacs --daemon.
  • (daemonp) returns t if, and only if, Emacs was started in daemon mode.

What about (server-running-p), then? Well, it may look friendly, but here be monsters.

It starts by looking promising: after M-x server-start, (server-running-p) now returns t! Do we have a winner? Not yet! Let’s pop a new Emacs instance and eval (server-running-p) without starting the server. t again!

What’s happening? The truth is that (server-running-p) is not what it seems to be. Here’s its complete source code:

(defun server-running-p (&optional name)
  "Test whether server NAME is running.

Return values:
  nil              the server is definitely not running.
  t                the server seems to be running.
  something else   we cannot determine whether it's running without using
                   commands which may have to wait for a long time."
  (unless name (setq name server-name))
  (condition-case nil
      (if server-use-tcp
          (with-temp-buffer
            (insert-file-contents-literally (expand-file-name name server-auth-dir))
            (or (and (looking-at "127\\.0\\.0\\.1:[0-9]+ \\([0-9]+\\)")
                     (assq 'comm
                           (process-attributes
                            (string-to-number (match-string 1))))
                     t)
                :other))
        (delete-process
         (make-network-process
          :name "server-client-test" :family 'local :server nil :noquery t
          :service (expand-file-name name server-socket-dir)))
        t)
    (file-error nil)))

The horror starts as soon as the docstring. The -p suffix in the name promises a predicate, that is, a boolean function. But in server-running-p, non-nil is not a loud and clear “Yes!”, it’s a mumbled “well, maybe, who knows?”. Ternary logic, because Emacs is above the law of excluded middle.

But what does this function do? It tries to determine if a server called NAME is running, by assuming that this server would be configured exactly the same as the running instance. It may end up looking at the socket file of the current server, or it may try to initiate a TCP connection, which is extremely expensive. server-running-p is the kind of function you may be tempted to call while building the mode line: try it, and get an instant and unrecoverable Emacs freeze. What it’s supposed to be useful for is extremely unclear. It’s unable to determine if the running instance has a server — but it uses this server’s config to search for a potentially completely different server.
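
For illustration, the local-socket probe boils down to something like the following C sketch. socket_answers is a hypothetical name, and this is only my reading of the :family 'local part of the listing, not anything Emacs ships: it asks whether anything at all accepts connections on a given socket path.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Return 1 if some process accepts connections on the Unix domain
   socket at PATH, 0 otherwise.  Nothing here can tell whether the
   listener is the calling process or a completely different one.  */
static int socket_answers (const char *path)
{
  struct sockaddr_un addr;
  assert (path != NULL);
  if (strlen (path) >= sizeof addr.sun_path)
    return 0;

  int fd = socket (AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0)
    return 0;

  memset (&addr, 0, sizeof addr);
  addr.sun_family = AF_UNIX;
  strcpy (addr.sun_path, path);

  int ok = connect (fd, (struct sockaddr *) &addr, sizeof addr) == 0;
  close (fd);
  return ok;
}
```

A second process pointing this at the same path gets the same answer as the one that owns the socket, which is exactly the t from the fresh Emacs instance above.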



It’s Halloween, so here’s a real treat for you: the commentary in bytecomp.el! The author of that piece of code is Jamie Zawinski, who did invaluable work for both GNU Emacs and XEmacs; these days he runs a night club and blogs. Here are my favorite parts of the file:

  • ";; We successfully didn't compile this file."
  • (insert "\n") ; aaah, unix.
  •             (when old-style-backquotes
                  (byte-compile-warn "!! The file uses old-style backquotes !!
    This functionality has been obsolete for more than 10 years already
    and will be removed soon.  See (elisp)Backquote in the manual."))
  • ;; Insert semicolons as ballast, so that byte-compile-fix-header
    ;; can delete them so as to keep the buffer positions
    ;; constant for the actual compiled code.
  • ;; To avoid consing up monstrously large forms at load time, we split
    ;; the output regularly.
  • ;; If things not being bound at all is ok, so must them being
    ;; obsolete.  Note that we add to the existing lists since Tramp
    ;; (ab)uses this feature.
  • ;; If foo.el declares `toto' as obsolete, it is likely that foo.el will
    ;; actually use `toto' in order for this obsolete variable to still work
    ;; correctly, so paradoxically, while byte-compiling foo.el, the presence
    ;; of a make-obsolete-variable call for `toto' is an indication that `toto'
    ;; should not trigger obsolete-warnings in foo.el.
  • ;; FIXME: we also use this hunk-handler to implement the function's dynamic
    ;; docstring feature.  We could actually implement it more elegantly in
    ;; byte-compile-lambda so it applies to all lambdas, but the problem is that
    ;; the resulting .elc format will not be recognized by make-docfile, so
    ;; either we stop using DOC for the docstrings of preloaded elc files (at the
    ;; cost of around 24KB on 32bit hosts, double on 64bit hosts) or we need to
    ;; build DOC in a more clever way (e.g. handle anonymous elements).
  • ;; Don't reload the source version of the files below
    ;; because that causes subsequent byte-compilation to
    ;; be a lot slower and need a higher max-lisp-eval-depth,
    ;; so it can cause recompilation to fail.
  • ;; To avoid "lisp nesting exceeds max-lisp-eval-depth" when bytecomp compiles
    ;; itself, compile some of its most used recursive functions (at load time).

Don’t get me wrong, I’m aware that these are all necessary and don’t indicate deeper faults in the source code. I merely find it interesting what hacks one has to come up with for byte-code compilation and found studying the file enlightening to say the least.

Unjustified Indirection


I finally made that EPUB mode. This adventure mostly taught me that eww, or rather shr.el, isn’t quite reusable. That in itself is not really a problem, but I handed in a patch to improve the situation. An old saying among programmers is that every problem can be solved by applying an extra level of indirection, so that’s what I did after discussing it on the bug tracker. However, after my patch got merged, it was deemed too much:

;; We don't use shr-indirect-call here, since shr-descend is
;; the central bit of shr.el, and should be as fast as
;; possible.  Having one more level of indirection with its
;; negative effect on performance is deemed unjustified in
;; this case.

Had I not spoken up about the inclusion of this comment, an unsuspecting future hacker wouldn’t even know why there’s duplicated code not using the helper. I can only wonder how production-ready browser engines solve this kind of problem…



Update: Reddit points out that this has been fixed on master by replacing most of the code with a call to gnulib’s gen_tempname.

As someone not terribly experienced in writing safe programs, I can only hope that building blocks like make-temp-file are doing the right thing and cannot be subverted by a malicious third party. The general advice here is that it’s preferable to use the primitive for creating the temporary file instead of the primitive to generate its name. Now, does Emacs reuse mkstemp(3) for this? Or at least tmpnam(3)? Of course not! Where we go, we can just invent our own source of randomness. The C function behind make-temp-name looks as follows:

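static const char make_temp_name_tbl[64] =
{
  'A','B','C','D','E','F','G','H',
  'I','J','K','L','M','N','O','P',
  'Q','R','S','T','U','V','W','X',
  'Y','Z','a','b','c','d','e','f',
  'g','h','i','j','k','l','m','n',
  'o','p','q','r','s','t','u','v',
  'w','x','y','z','0','1','2','3',
  '4','5','6','7','8','9','-','_'
};

static unsigned make_temp_name_count, make_temp_name_count_initialized_p;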

/* Value is a temporary file name starting with PREFIX, a string.

   The Emacs process number forms part of the result, so there is
   no danger of generating a name being used by another process.
   In addition, this function makes an attempt to choose a name
   which has no existing file.  To make this work, PREFIX should be
   an absolute file name.

   BASE64_P means add the pid as 3 characters in base64
   encoding.  In this case, 6 characters will be added to PREFIX to
   form the file name.  Otherwise, if Emacs is running on a system
   with long file names, add the pid as a decimal number.

   This function signals an error if no unique file name could be
   generated.  */

Lisp_Object
make_temp_name (Lisp_Object prefix, bool base64_p)
{
  Lisp_Object val, encoded_prefix;
  ptrdiff_t len;
  printmax_t pid;
  char *p, *data;
  char pidbuf[INT_BUFSIZE_BOUND (printmax_t)];
  int pidlen;

  CHECK_STRING (prefix);

  /* VAL is created by adding 6 characters to PREFIX.  The first
     three are the PID of this process, in base 64, and the second
     three are incremented if the file already exists.  This ensures
     262144 unique file names per PID per PREFIX.  */

  pid = getpid ();

  if (base64_p)
    {
      pidbuf[0] = make_temp_name_tbl[pid & 63], pid >>= 6;
      pidbuf[1] = make_temp_name_tbl[pid & 63], pid >>= 6;
      pidbuf[2] = make_temp_name_tbl[pid & 63], pid >>= 6;
      pidlen = 3;
    }
  else
    {
#ifdef HAVE_LONG_FILE_NAMES
      pidlen = sprintf (pidbuf, "%"pMd, pid);
#else
      pidbuf[0] = make_temp_name_tbl[pid & 63], pid >>= 6;
      pidbuf[1] = make_temp_name_tbl[pid & 63], pid >>= 6;
      pidbuf[2] = make_temp_name_tbl[pid & 63], pid >>= 6;
      pidlen = 3;
#endif
    }

  encoded_prefix = ENCODE_FILE (prefix);
  len = SBYTES (encoded_prefix);
  val = make_uninit_string (len + 3 + pidlen);
  data = SSDATA (val);
  memcpy (data, SSDATA (encoded_prefix), len);
  p = data + len;

  memcpy (p, pidbuf, pidlen);
  p += pidlen;

  /* Here we try to minimize useless stat'ing when this function is
     invoked many times successively with the same PREFIX.  We achieve
     this by initializing count to a random value, and incrementing it
     afterwards.

     We don't want make-temp-name to be called while dumping,
     because then make_temp_name_count_initialized_p would get set
     and then make_temp_name_count would not be set when Emacs starts.  */

  if (!make_temp_name_count_initialized_p)
    {
      make_temp_name_count = time (NULL);
      make_temp_name_count_initialized_p = 1;
    }

  while (1)
    {
      unsigned num = make_temp_name_count;

      p[0] = make_temp_name_tbl[num & 63], num >>= 6;
      p[1] = make_temp_name_tbl[num & 63], num >>= 6;
      p[2] = make_temp_name_tbl[num & 63], num >>= 6;

      /* Poor man's congruential RN generator.  Replace with
         ++make_temp_name_count for debugging.  */
      make_temp_name_count += 25229;
      make_temp_name_count %= 225307;

      if (!check_existing (data))
        {
          /* We want to return only if errno is ENOENT.  */
          if (errno == ENOENT)
            return DECODE_FILE (val);
          else
            /* The error here is dubious, but there is little else we
               can do.  The alternatives are to return nil, which is
               as bad as (and in many cases worse than) throwing the
               error, or to ignore the error, which will likely result
               in looping through 225307 stat's, which is not only
               dog-slow, but also useless since eventually nil would
               have to be returned anyway.  */
            report_file_error ("Cannot create temporary name for prefix",
                               prefix);
          /* not reached */
        }
    }
}

DEFUN ("make-temp-name", Fmake_temp_name, Smake_temp_name, 1, 1, 0,
       doc: /* Generate temporary file name (string) starting with PREFIX (a string).
The Emacs process number forms part of the result, so there is no
danger of generating a name being used by another Emacs process
\(so long as only a single host can access the containing directory...).

This function tries to choose a name that has no existing file.
For this to work, PREFIX should be an absolute file name.

There is a race condition between calling `make-temp-name' and creating the
file, which opens all kinds of security holes.  For that reason, you should
normally use `make-temp-file' instead.  */)
  (Lisp_Object prefix)
{
  return make_temp_name (prefix, 0);
}

The generated file name is therefore a combination of the prefix, the Emacs PID and three characters from the above table. Since the counter is reduced modulo 225307, at most about 225,000 temporary file names can be generated with a given prefix in an Emacs session. This range can be traversed in a negligible amount of time to recreate the state of the RNG and accurately predict the next temporary file name.
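
How cheap the prediction is can be sketched in a few lines of C. The helper names are made up, and the alphabet is an assumption on my part (A-Z, a-z, 0-9, then - and _); only the two constants and the "lowest six bits first" encoding come from the listing above.

```c
#include <assert.h>
#include <string.h>

/* Assumed 64-character alphabet of make_temp_name_tbl.  */
static const char tbl[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/* Encode the low 18 bits of NUM as three characters, lowest six bits
   first, mirroring the p[0..2] assignments in make_temp_name.  */
static void encode_suffix (unsigned num, char out[4])
{
  for (int i = 0; i < 3; i++, num >>= 6)
    out[i] = tbl[num & 63];
  out[3] = '\0';
}

/* Invert encode_suffix.  Returns -1 for anything that is not exactly
   three characters from the alphabet.  */
static long decode_suffix (const char *suffix)
{
  long num = 0;
  assert (suffix != NULL);
  if (strlen (suffix) != 3)
    return -1;
  for (int i = 2; i >= 0; i--)
    {
      const char *pos = strchr (tbl, suffix[i]);
      if (pos == NULL)
        return -1;
      num = (num << 6) | (long) (pos - tbl);
    }
  return num;
}

/* One step of the generator quoted above.  */
static unsigned next_count (unsigned count)
{
  return (count + 25229) % 225307;
}

/* Given the three-character suffix of an observed temporary file
   name, write the suffix of the following one to OUT.  */
static int predict_next_suffix (const char *seen, char out[4])
{
  long num = decode_suffix (seen);
  if (num < 0)
    return -1;
  encode_suffix (next_count ((unsigned) num), out);
  return 0;
}
```

This works directly for every name after the first one in a session, because from then on the counter is below 225307 and fits into the 18 encoded bits. The very first name only leaks the low 18 bits of the startup time, so there one would have to guess the remaining bits of the clock as well.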

On the Lisp side, make-temp-file wraps it as follows:

(defun make-temp-file (prefix &optional dir-flag suffix)
  "Create a temporary file.
The returned file name (created by appending some random characters at the end
of PREFIX, and expanding against `temporary-file-directory' if necessary),
is guaranteed to point to a newly created empty file.
You can then use `write-region' to write new data into the file.

If DIR-FLAG is non-nil, create a new empty directory instead of a file.

If SUFFIX is non-nil, add that at the end of the file name."
  ;; Create temp files with strict access rights.  It's easy to
  ;; loosen them later, whereas it's impossible to close the
  ;; time-window of loose permissions otherwise.
  (with-file-modes ?\700
    (let (file)
      (while (condition-case ()
                 (progn
                   (setq file
                         (make-temp-name
                          (if (zerop (length prefix))
                              (file-name-as-directory
                               temporary-file-directory)
                            (expand-file-name prefix
                                              temporary-file-directory))))
                   (if suffix
                       (setq file (concat file suffix)))
                   (if dir-flag
                       (make-directory file)
                     (write-region "" nil file nil 'silent nil 'excl))
                   nil)
               (file-already-exists t))
        ;; the file was somehow created by someone else between
        ;; `make-temp-name' and `write-region', let's try again.
        nil)
      file)))

It’s interesting that the docstring of this function states that the return value “is guaranteed to point to a newly created empty file”. If a file existed for every possible name for a given prefix, this function would just fall into an infinite loop and block Emacs for no apparent reason. Both of these issues have been solved in a better way in glibc.

At least the impact of predicting the name is lessened if one uses make-temp-file instead of make-temp-name on its own. An attacker cannot create a symlink pointing to a rogue location with the predicted name, as that would trigger a file-already-exists error and make the function use the next random name. All they could do is read out the file afterwards iff they have the same permissions as the user Emacs runs as. A symlink attack can only be executed successfully against a careless make-temp-name user. Thankfully I’ve not been able to find one worth subverting on GitHub yet.
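
For contrast, this is roughly what leaning on mkstemp(3) looks like. make_temp_fd is a hypothetical helper of mine, not something from Emacs or gnulib. The point is that mkstemp picks the name and creates the file exclusively in one step, so there is no window between naming and creating, and it reports an error instead of spinning forever when it runs out of names to try.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create a temporary file in DIR whose name starts with PREFIX.
   On success return an open file descriptor and leave the final name
   in PATH (a buffer of SIZE bytes).  On failure return -1.  */
static int make_temp_fd (const char *dir, const char *prefix,
                         char *path, size_t size)
{
  assert (dir != NULL && prefix != NULL && path != NULL);

  /* mkstemp replaces the trailing XXXXXX with its own characters.  */
  int n = snprintf (path, size, "%s/%sXXXXXX", dir, prefix);
  if (n < 0 || (size_t) n >= size)
    return -1;

  return mkstemp (path);
}
```

The caller gets a descriptor to a file nobody else could have put there first, and should close and unlink it when done.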

Thanks to dale on #emacs for bringing this to my attention!