Today I took another stab at my abandoned Emacs Lisp IRC bot and
thought to myself that it would be nice if it were able to notify me
of RSS and Atom feed updates. Now, I’m not terribly fond of
reinventing the wheel, so like every good programmer I dared to take a
look at existing solutions, like News Ticker:

(defun newsticker--do-xml-workarounds ()
  "Fix all issues which `xml-parse-region' could be choking on."
  ;; a very very dirty workaround to overcome the
  ;; problems with the newest (20030621) xml.el:
  ;; remove all unnecessary whitespace
  (goto-char (point-min))
  (while (re-search-forward ">[ \t\r\n]+<" nil t)
    (replace-match "><" nil t))
  ;; and another brutal workaround (20031105)!  For some
  ;; reason the xml parser does not like the colon in the
  ;; doctype name "rdf:RDF"
  (goto-char (point-min))
  (if (re-search-forward "<!DOCTYPE[ \t\n]+rdf:RDF" nil t)
      (replace-match "<!DOCTYPE rdfColonRDF" nil t))
  ;; finally.... ~##^°!!!!!
  (goto-char (point-min))
  (while (search-forward "\r\n" nil t)
    (replace-match "\n" nil t))
  ;; still more brutal workarounds (20040309)!  The xml
  ;; parser does not like doctype rss
  (goto-char (point-min))
  (if (re-search-forward "<!DOCTYPE[ \t\n]+rss[ \t\n]*>" nil t)
      (replace-match "" nil t))
  ;; And another one (20050618)! (Fixed in GNU Emacs 22.0.50.18)
  ;; Remove comments to avoid this xml-parsing bug:
  ;; "XML files can have only one toplevel tag"
  (goto-char (point-min))
  (while (search-forward "<!--" nil t)
    (let ((start (match-beginning 0)))
      (unless (search-forward "-->" nil t)
        (error "Can't find end of comment"))
      (delete-region start (point))))
  ;; And another one (20050702)! If description is HTML
  ;; encoded and starts with a `<', wrap the whole
  ;; description in a CDATA expression.  This happened for
  (goto-char (point-min))
  (while (re-search-forward
          "<description>\\(<img.*?\\)</description>" nil t)
    (replace-match
     "<description><![CDATA[ \\1 ]]></description>"))
  ;; And another one (20051123)! XML parser does not
  ;; like this: <yweather:location city="Frankfurt/Main"
  ;; region="" country="GM" />
  ;; try to "fix" empty attributes
  ;; This happened for
  (goto-char (point-min))
  (while (re-search-forward "\\(<[^>]*\\)=\"\"" nil t)
    (replace-match "\\1=\" \"")))

I guess I won’t be using this and will go for a special-purpose
solution instead, one without workarounds that are over a decade old.
After all, I’m not stuck in 2005 with Emacs 21…

byte-opt.el is one of those files Jamie Zawinski laid his golden
hands on. It seems that back in the day, there wasn’t much of a
concern about Emacs Lisp execution speed until he got annoyed enough
to bolt on an optimizer. Its sources start with a wonderful quote:

“No matter how hard you try, you can’t make a racehorse out of a pig.
You can, however, make a faster pig.”

I recommend reading it to get an idea of what compiler jargon like
“peephole optimizer” could possibly mean. During my last read-through,
I found this curious piece of code:

(defun byte-optimize-approx-equal (x y)
  (<= (* (abs (- x y)) 100) (abs (+ x y))))

So, according to this, 99 and 100 are equal. Awesome!
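To see where the cutoff lies, here is the predicate evaluated by hand
for a few pairs. The function body is copied from above; the name
my-approx-equal is made up so that evaluating the snippet doesn't
clobber anything in byte-opt.el.

```elisp
;; Copy of `byte-optimize-approx-equal' under a hypothetical name.
(defun my-approx-equal (x y)
  "Return non-nil if |X - Y| is at most 1% of |X + Y|."
  (<= (* (abs (- x y)) 100) (abs (+ x y))))

(my-approx-equal 99 100)  ; |99 - 100| * 100 = 100 <= 199, so t
(my-approx-equal 98 100)  ; |98 - 100| * 100 = 200 >  198, so nil
(my-approx-equal 0 0)     ; 0 <= 0, so t
```

In other words, the difference has to be within 1% of the sum, which
works out to roughly 2% of either value.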
<JordiGH> Strictly speaking, isn’t “idiomatic lisp” whatever rms writes?
I’m afraid this is not the case. See this snippet.

;;Function that handles term messages: code by rms (and you can see the
;;difference ;-) -mm
(defun term-handle-ansi-terminal-messages (message)
  ;; Is there a command here?
  (while (string-match "\eAnSiT.+\n" message)
    ;; Extract the command code and the argument.
    (let* ((start (match-beginning 0))
           (command-code (aref message (+ start 6)))
           (argument
            (save-match-data
              (substring message
                         (+ start 8)
                         (string-match "\r?\n" message
                                       (+ start 8)))))
           ignore)
      ;; Delete this command from MESSAGE.
      (setq message (replace-match "" t t message))
      ;; If we recognize the type of command, set the appropriate variable.
      (cond ((= command-code ?c)
             (setq term-ansi-at-dir argument))
            ((= command-code ?h)
             (setq term-ansi-at-host argument))
            ((= command-code ?u)
             (setq term-ansi-at-user argument))
            ;; Otherwise ignore this one.
            (t
             (setq ignore t)))
      ;; Update default-directory based on the changes this command made.
      (if ignore
          nil
        (setq default-directory
              (file-name-as-directory
               (if (and (string= term-ansi-at-host (system-name))
                        (string= term-ansi-at-user (user-real-login-name)))
                   (expand-file-name term-ansi-at-dir)
                 (if (string= term-ansi-at-user (user-real-login-name))
                     (concat "/" term-ansi-at-host ":" term-ansi-at-dir)
                   (concat "/" term-ansi-at-user "@" term-ansi-at-host ":"
                           term-ansi-at-dir)))))
        ;; I'm not sure this is necessary,
        ;; but it's best to be on the safe side.
        (if (string= term-ansi-at-host (system-name))
            (progn
              (setq ange-ftp-default-user term-ansi-at-save-user)
              (setq ange-ftp-default-password term-ansi-at-save-pwd)
              (setq ange-ftp-generate-anonymous-password term-ansi-at-save-anon))
          (setq term-ansi-at-save-user ange-ftp-default-user)
          (setq term-ansi-at-save-pwd ange-ftp-default-password)
          (setq term-ansi-at-save-anon ange-ftp-generate-anonymous-password)
          (setq ange-ftp-default-user nil)
          (setq ange-ftp-default-password nil)
          (setq ange-ftp-generate-anonymous-password nil)))))
  message)

This isn’t bad code by any means, just clumsy and careful as opposed
to the highly compressed nature of the surrounding code. The “I’m not
sure this is necessary, but it’s best to be on the safe side.” comment
reminds me of The Daily WTF.

This is a codeless post that will instead focus on a design issue
present in all (at the time of writing) stable releases of Emacs. Be
assured that you will not have to work around it in the upcoming Emacs
release.

Have you ever wondered why some commands deactivate the region
afterwards, although there’s no explicit call to the
deactivate-mark function? It turns out that this is intentional
behavior, as can be seen in the documentation of the
deactivate-mark variable:

    If an editing command sets this to t, deactivate the mark afterward.
    The command loop sets this to nil before each command,
    and tests the value when the command returns.
    Buffer modification stores t in this variable.

So, any command modifying a buffer will deactivate the region. Makes
sense, and if you for some reason need the region again, it’s a C-x
C-x away. There is a major problem with this though: it doesn’t
matter which buffer is modified…

This bit me hard with eyebrowse. I use a modeline indicator to
visualize its state, and that indicator relies on the built-in
format-spec package. As that package uses a temporary buffer for
turning a format string into a formatted string and the modeline
indicator is recalculated very often, this led to the region being
deactivated on any command. It took me quite a while to figure this
one out. I consider it madness for anyone to expect this behavior when
writing functions that should not interfere with the region, so I’m
glad it has been fixed in Emacs 25 by making the variable buffer-local.
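
On releases without the fix, a common workaround is to let-bind
deactivate-mark around any code that modifies a scratch buffer, so the
t stored by the modification is thrown away when the binding ends. A
minimal sketch of that idiom follows; my-format-indicator is a made-up
stand-in for a modeline function, not the actual eyebrowse code.

```elisp
(require 'format-spec)

;; Hypothetical helper, for illustration only.
(defun my-format-indicator (slot)
  "Return a mode line string for the number SLOT.
The region of the current buffer is left alone."
  ;; Shadow `deactivate-mark' so that buffer modifications made by
  ;; `format-spec' in its temporary buffer cannot leak out.
  (let (deactivate-mark)
    (format-spec "[%s]"
                 (format-spec-make ?s (number-to-string slot)))))

(my-format-indicator 2)  ; returns "[2]"
```

The binding is harmless on newer releases too, so the same code can be
kept across versions.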

I’m currently writing my second mode, this time for textual
markup. As I still don’t have much experience with it, I looked at
other modes of that kind, ultimately ending up with rst.el.
It’s not unusual for older code to redefine things that might not be
supported by all Emacs versions out there. What I did not expect,
however, was an implementation of symbolic regular expressions:

(defvar rst-re-alist) ; Forward declare to use it in `rst-re'.

;; FIXME: Use `sregex' or `rx' instead of re-inventing the wheel.
;; testcover: ok.
(defun rst-re (&rest args)
  "Interpret ARGS as regular expressions and return a regex string.
Each element of ARGS may be one of the following:

A string which is inserted unchanged.

A character which is resolved to a quoted regex.

A symbol which is resolved to a string using `rst-re-alist-def'.

A list with a keyword in the car.  Each element of the cdr of such
a list is recursively interpreted as ARGS.  The results of this
interpretation are concatenated according to the keyword.

For the keyword `:seq' the results are simply concatenated.

For the keyword `:shy' the results are concatenated and
surrounded by a shy-group (\"\\(?:...\\)\").

For the keyword `:alt' the results form an alternative (\"\\|\")
which is shy-grouped (\"\\(?:...\\)\").

For the keyword `:grp' the results are concatenated and form a
referenceable group (\"\\(...\\)\").

After interpretation of ARGS the results are concatenated as for
`:seq'."
  (apply 'concat
         (mapcar
          (lambda (re)
            (cond
             ((stringp re)
              re)
             ((symbolp re)
              (cadr (assoc re rst-re-alist)))
             ((characterp re)
              (regexp-quote (char-to-string re)))
             ((listp re)
              (let ((nested
                     (mapcar (lambda (elt)
                               (rst-re elt))
                             (cdr re))))
                (cond
                 ((eq (car re) :seq)
                  (mapconcat 'identity nested ""))
                 ((eq (car re) :shy)
                  (concat "\\(?:" (mapconcat 'identity nested "") "\\)"))
                 ((eq (car re) :grp)
                  (concat "\\(" (mapconcat 'identity nested "") "\\)"))
                 ((eq (car re) :alt)
                  (concat "\\(?:" (mapconcat 'identity nested "\\|") "\\)"))
                 (t
                  (error "Unknown list car: %s" (car re))))))
             (t
              (error "Unknown object type for building regex: %s" re))))
          args)))

;; FIXME: Remove circular dependency between `rst-re' and `rst-re-alist'.
(with-no-warnings ; Silence byte-compiler about this construction.
  (defconst rst-re-alist
    ;; Shadow global value we are just defining so we can construct it step by
    ;; step.
    (let (rst-re-alist)
      (dolist (re rst-re-alist-def rst-re-alist)
        (setq rst-re-alist
              (nconc rst-re-alist
                     (list (list (car re) (apply 'rst-re (cdr re))))))))
    "Alist mapping symbols from `rst-re-alist-def' to regex strings."))

I find it hilarious that they appear to be aware of a now obsolete
alternative and a more powerful, officially supported one, yet decided
to do their own thang. At least there’s not much code around that
could be yucky, if you ignore that one circular dependency mentioned
at the bottom between the function and its look-up alist.
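
For comparison, here is roughly how the same kind of symbolic regex
looks with rx. The keywords of rst-re have close counterparts: :seq is
seq, :grp is group and :alt is or, while rx adds shy groups on its own
where they are needed. The pattern and the name my-bullet-re are made
up for illustration and not taken from rst.el.

```elisp
(require 'rx)

;; Hypothetical pattern: a bullet character, one or more spaces,
;; then the rest of the line.
(defconst my-bullet-re
  (rx line-start
      (group (or "*" "+" "-"))            ; like (:grp (:alt ...))
      (one-or-more " ")
      (group (one-or-more not-newline)))) ; the item text

(let ((line "* first item"))
  (when (string-match my-bullet-re line)
    (list (match-string 1 line)
          (match-string 2 line))))        ; returns ("*" "first item")
```

Since rx is a macro, the regex string is built at compile time, so
there is no look-up alist to keep in sync with the function.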