Vim Macros for Editing DocBook Documents
Recently, while helping Linux Journal convert its editorial process to use DocBook/XML for articles, I had occasion to convert some old Vim macros for use with the new process. The original macros were key maps or abbreviations for inserting Quark tags and special characters. The new editorial process involves marking or tagging a document in DocBook/XML. From there, a stylesheet is applied to convert the document either to Quark for publication in the print magazine or to HTML for publication on the Web site.
DocBook exists in two basic forms, an SGML version and a newer XML version. DocBook is a markup language that looks similar to HTML. It uses tags with attributes and ampersand sequences for specifying special characters and symbols. Listing 1 contains a short DocBook/XML article.
Listing 1. Sample DocBook/XML Article
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<!DOCTYPE article SYSTEM "docbookx.dtd">

<article>

  <!-- Article information. -->
  <articleinfo>

    <!-- Article title and abstract. -->
    <title>This is an Uninteresting Sample article</title>
    <abstract>
      <para>This article isn't about anything interesting.</para>
    </abstract>

    <!-- Author name and bio. -->
    <author>
      <firstname>John</firstname>
      <surname>Doe</surname>
      <authorblurb>
        <para>The author is not a very interesting person.</para>
      </authorblurb>
    </author>

  </articleinfo>

  <!-- Body of article. -->
  <simplesect>
    <title>This is the first of thankfully only one uninteresting section.</title>
    <para>True to form this is not very interesting either.</para>
  </simplesect>

</article>
As you can see, the article structure is similar to HTML, except the tag names are different. The DOCTYPE line refers to the DTD (Document Type Definition) used to validate the file. To write useful DocBook, you need the DTD and a program to validate your document so you can determine if it contains any DocBook or XML errors. See the Resources at the end of the article for sites where you can get the DTD and programs for validating your documents.
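If you prefer to validate from inside Vim, the editor used in the rest of this article, one possibility is to wrap a validator in a user command. The sketch below assumes the xmllint program from libxml2 is installed and that the DTD named in the DOCTYPE line can be located. The command name DbValidate and the function name are made up for this example; they are not part of the macro files described in this article.

```vim
" Hypothetical helper, not part of the article's macro files.
" Assumes xmllint (from libxml2) is on your PATH.
function! DbValidateSketch()
  " Write pending changes so the validator sees the current text.
  update
  " --valid checks the document against its DTD; --noout suppresses
  " the copy of the document xmllint would otherwise print.
  execute '!xmllint --noout --valid ' . shellescape(expand('%'), 1)
endfunction

" Hypothetical command name: type :DbValidate to check the current file.
command! DbValidate call DbValidateSketch()
```

Any validation errors appear in the shell output; no output generally means the document is valid against its DTD.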
Vim is primarily a work-alike replacement for the original vi editor that came with many UNIX systems. Vim also contains many enhancements, including a quite capable macro/scripting language and a GUI version. Most Linux distributions include Vim. Vim is a modal text editor; that is, keystrokes have different meanings depending on whether you're entering text or manipulating it.
The set of Vim macros used for our project is contained in the following files:
tagtmps.vim: contains tag templates. Tag templates are starting and ending tags and some predefined content that can be inserted into the file you're editing.
tfuncs.vim: contains functions for manipulating tags. Functions are available for inserting, deleting, moving and changing tags.
mfuncs.vim: contains functions that assist in setting up Vim key mappings.
maps.vim: uses the mapping functions from mfuncs.vim to define key mappings that call the tag functions in tfuncs.vim.
To use these files, start Vim and type:
:so tagtmps.vim
:so tfuncs.vim
:so mfuncs.vim
:so maps.vim
Normally, however, you would place these commands in a single file and source only that file when you start Vim. For example, place the above files in a subdirectory named vim in your home directory. Then, put the following lines in a file named editdb.vim:
so ~/vim/tagtmps.vim " Tag templates.
so ~/vim/tfuncs.vim " Tag functions.
so ~/vim/mfuncs.vim " Map functions.
so ~/vim/maps.vim " Key mappings.
Now, start Vim and type the following to load all the files with one command:
:so ~/editdb.vim
Another option is to source these files in your .vimrc file so they are loaded whenever you start Vim. Your .vimrc file is located in your home directory.
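For the .vimrc option, a fragment along the following lines is one way to do it. It is only a sketch and assumes editdb.vim sits in your home directory, as in the example above; the filereadable() guard simply keeps Vim from complaining on a machine where the file has not been installed.

```vim
" Sketch of a ~/.vimrc fragment; assumes ~/editdb.vim exists as
" described in the text.
if filereadable(expand('~/editdb.vim'))
  source ~/editdb.vim
endif
```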
Once these files have been read and processed by Vim, the macros are bound to the keyboard. The macros provide the following capabilities:
Inserting a tag template.
Inserting a start tag.
Inserting an end tag.
Tagging a word by placing it between a start tag and an end tag.
Tagging a range of lines with a start tag and an end tag.
Changing a tag.
Inserting special symbols such as the copyright symbol.
Inserting special characters such as accented characters.
Deleting a tag under the cursor.
Moving the cursor to the left (right) of the previous (next) tag.
Moving a tag to the left (right) of the previous (next) word.
Deleting whitespace to the left or right of a tag.
Inserting whitespace to the left or right of a tag.
The macros are bound to function keys (F-Keys) followed by zero, one or two modifier characters. Function keys are specified as <FN>, where N is the function key number. Shifted function keys are specified as <S-FN>. The key mappings are detailed below in Tables 1-4.
Table 1. Key Mappings
Key Sequence | Mapping |
---|---|
<F2>tag-keys | Insert tag template. Inserts a template for the tag corresponding to the specified tag-keys in the document. See Table 2 for a list of defined tag-keys. |
<F3>symbol-keys | Insert symbol character. Inserts the symbol character corresponding to the specified symbol-keys in the document. Most often, this inserts the DocBook/XML entity reference for the symbol. See Table 3 for a list of defined symbol-keys. |
<F4>foreign-char-keys | Insert foreign character. Inserts the foreign character corresponding to the specified foreign-char-keys into the document. See Table 4 for a list of defined foreign-char-keys. |
<F5>tag-keys | Insert opening tag. Inserts the opening tag corresponding to the specified tag-keys. |
<F6>tag-keys | Insert closing tag. Inserts the closing tag corresponding to the specified tag-keys. |
<F7>tag-keys | Tag word. If the cursor is on a word, this places tags corresponding to the specified tag-keys around the word. |
<F7>tag-keys | Tag range (command mode). When a range of lines is specified, this places tags corresponding to the specified tag-keys around the lines. Note: this is done in command mode; type a colon followed by the key sequence. |
<F8>tag-keys | Change tag. If the cursor is on a tag, this changes the tag to the tag that corresponds to the specified tag-keys. |
<F9> | Move cursor to the left by tags. If the cursor is on a tag, this moves to the beginning of the tag (to the <). If the cursor is not on a tag, this moves to the beginning of the next tag to the left of the cursor. |
<F10> | Move tag left of preceding word. If the cursor is on a tag, this moves the tag to the left of the preceding word. In other words, if the cursor is on a start tag, it expands the amount of tagged content to include the word to the left of the tag. If the cursor is on an end tag, it removes the word to the left of the tag from the tagged contents. |
<F11> | Move tag right of following word. If the cursor is on a tag, this moves the tag to the right of the following word. In other words, if the cursor is on a start tag, it removes the word to the right of the tag from the tagged contents. If the cursor is on an end tag, it expands the amount of tagged content to include the word to the right of the tag. |
<F12> | Move cursor to the right by tags. If the cursor is on a tag, this moves to the end of the tag (to the >). If the cursor is not on a tag, this moves to the end of the next tag to the right of the cursor. |
<S-F8> | Delete tag. If the cursor is on a tag, this deletes the tag. |
<S-F9> | Delete whitespace to the left of the tag. If the cursor is on a tag, this deletes any whitespace to the left of the tag. |
<S-F10> | Insert a space to the left of the tag. If the cursor is on a tag, this inserts a single space to the left of the tag. |
<S-F11> | Insert a space to the right of the tag. If the cursor is on a tag, this inserts a single space to the right of the tag. |
<S-F12> | Delete whitespace to the right of the tag. If the cursor is on a tag, this deletes any whitespace to the right of the tag. |
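To give a feel for how the two cursor-movement entries in Table 1 might be implemented, here is a minimal sketch built on Vim's search() function. The function names are hypothetical, and the real CursorLeftByTag() and CursorRightByTag() in tfuncs.vim may work quite differently. The sketch also treats every < or > as a tag delimiter, which is not true inside comments or CDATA sections.

```vim
" Hypothetical stand-ins for the movement functions in tfuncs.vim.

" Move to the nearest '<' at or before the cursor.
function! SketchCursorLeftByTag()
  " b = search backward, c = accept a match under the cursor,
  " W = do not wrap around the start of the file.
  return search('<', 'bcW')
endfunction

" Move to the nearest '>' at or after the cursor.
function! SketchCursorRightByTag()
  " c = accept a match under the cursor, W = do not wrap around
  " the end of the file.
  return search('>', 'cW')
endfunction
```

You can try either one with :call SketchCursorLeftByTag() while the cursor is inside or just after a tag.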
The following keys are used in combination with the function keys <F2>, <F5>, <F6>, <F7> and <F8>. The function key determines the action; the keys that follow determine the tag.
Table 2. Tag Keys
Keys | Corresponding Tag |
---|---|
a | <article> |
b | <emphasis role="bold"> |
c | <command> |
d | <entry> |
e | <email> |
f | <function> |
h | <title> |
i | <emphasis> |
jd | <remark role="web-pub-date"> |
ji | <remark role="author-image"> |
jl | <remark role="layout-info"> |
jn | <remark role="article-number"> |
jo | <remark role="output-file"> |
jp | <remark role="pull-quote"> |
js | <remark role="article-section"> |
jt | <remark role="teaser"> |
jw | <remark role="article-series"> |
k | <command role="what-to-type"> |
l | <listitem> |
m | <mediaobject> |
n | <itemizedlist> |
o | <orderedlist> |
p | <para> |
q | <quote> |
r | <row> |
s | <sidebar> |
t | <table> |
u | <ulink> |
xa | <author> |
xb | <blockquote> |
xc | <![CDATA[ |
xf | <firstname> |
xi | <articleinfo> |
xl | <surname> |
xm | <othername role="middle"> |
xn | <!-- |
xo | <screen> |
xp | <programlisting> |
xq | <question> |
xr | <answer> |
xs | <simplesect> |
For example, <F2> inserts a tag template, so <F2>p inserts <para></para>, places the cursor between the two tags and enters insert mode. Many of the templates consist of more than the start and end tag. For example, the template for a table inserts empty heading and body rows.
The following keys are used in combination with function key <F3>.
Table 3. Symbol Keys
Keys | Value Inserted | Description |
---|---|---|
3 | ¾ | Three-fourths fraction |
5 | ‘ | Left single quote |
6 | ’ | Right single quote |
7 | “ | Left double quote |
8 | ” | Right double quote |
, | < | Less than sign |
. | > | Greater than sign |
< | < | Less than sign |
> | > | Greater than sign |
a | æ | ae ligature |
c | © | Copyright symbol |
d | ° | Degrees sign |
f | ¼ | One-fourth fraction |
h | ½ | One-half fraction |
n | – | En-dash |
m | — | Em-dash |
r | ® | Registered symbol |
t | × | Times sign |
_ (underscore) | -date- | Current date |
For example, <F3>3 inserts ¾ into the text.
The following keys are used in combination with function key <F4>.
Table 4. Foreign Character Keys
Keys | Value Inserted | Description |
---|---|---|
b | β | Greek beta |
m | μ | Greek mu |
n | ñ | n with tilde |
'a (singlequote-a) | á | a with acute accent |
'c | ç | c with cedilla |
'e | é | e with acute accent |
'i | í | i with acute accent |
'o | ó | o with acute accent |
'u | ú | u with acute accent |
`a (backtick-a) | à | a with grave accent |
`e | è | e with grave accent |
`i | ì | i with grave accent |
`o | ò | o with grave accent |
`u | ù | u with grave accent |
"a (doublequote-a) | ä | a with diaeresis |
"e | ë | e with diaeresis |
"i | ï | i with diaeresis |
"o | ö | o with diaeresis |
"u | ü | u with diaeresis |
^a | â | a with circumflex |
^e | ê | e with circumflex |
^i | î | i with circumflex |
^o | ô | o with circumflex |
^u | û | u with circumflex |
For example, <F4>'a inserts á (an a with an acute accent) into the text.
DocBook is a vast markup language offering many tags, and most projects use only a subset of them. The bindings above were developed for marking up LJ articles. Your use of DocBook probably will focus on a different set of tags, so you likely will need to change the templates and key bindings.
The file tagtmps.vim contains the templates inserted by the <F2> key. Listing 2 shows the template for the article tag.
Listing 2. Article Tag Template
let g:Template_article = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\" standalone=\"no\"?><CR>"
  \."<!DOCTYPE article SYSTEM \"docbookx.dtd\"><CR><CR>"
  \."<article><CR><CR>"
  \."<simplesect><title/><CR>"
  \."<para><CR>-:-<CR>"
  \."</para><CR>"
  \."</simplesect><CR><CR>"
  \."<simplesect><title></title><CR>"
  \."<para><CR>"
  \."</para><CR>"
  \."</simplesect><CR><CR>"
  \."</article>"
All templates are stored in global string variables whose names start with Template_ and end with the name of the tag to which the template corresponds. The g: prefix in the variable name makes it a global variable. Long strings can be split across multiple lines by starting the second and each subsequent line with a backslash (\) and using the dot (.) concatenation operator to join the pieces. Each string piece is enclosed in double quotes. Double-quoted strings understand the usual C escape sequences, which is why the quotation marks inside the template are written as \". You can modify the existing templates or add new ones for other tags by following this naming convention. Within a template, the sequence -:- specifies where the cursor should be placed after the template is inserted. The macro automatically removes this sequence after inserting the template.
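As an illustration of these conventions, the sketch below defines a template for the DocBook note element and a small function that inserts a template by name. Both are hypothetical: the note template is not one of the templates shipped in tagtmps.vim, and SketchInsertTemplate is not the code behind the <F2> key. The sketch also assumes that the literal text <CR> in a template stands for a line break.

```vim
" Hypothetical template for <note>, following the Template_ naming
" convention; it is not one of the article's own templates.
let g:Template_note = "<note><CR>"
  \."<para><CR>-:-<CR>"
  \."</para><CR>"
  \."</note>"

" Sketch of a template inserter; the real <F2> macro may differ.
" Assumption: the literal text <CR> in a template marks a line break.
function! SketchInsertTemplate(name)
  let l:text = get(g:, 'Template_' . a:name, '')
  if l:text ==# ''
    echo 'No template defined for ' . a:name
    return
  endif
  " Split on the <CR> markers, keeping empty pieces as blank lines.
  let l:lines = split(l:text, '\C<CR>', 1)
  let l:first = line('.') + 1
  " Insert the template below the current line.
  call append(line('.'), l:lines)
  " Look for the cursor marker only within the inserted lines.
  call cursor(l:first, 1)
  if search('-:-', 'cW', l:first + len(l:lines) - 1) > 0
    " Delete the three marker characters; the cursor stays put.
    normal! 3x
  endif
endfunction
```

With this loaded, :call SketchInsertTemplate('note') places the four template lines below the current line and leaves the cursor on the empty line inside the para element.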
If you want to change the function keys used to execute the macros, modify the following lines in the file mfuncs.vim:
let s:InsertTagTemplateKey = "<F2>"
let s:InsertSymbolKey = "<F3>"
let s:InsertForeignCharKey = "<F4>"
let s:InsertStartTagKey = "<F5>"
let s:InsertEndTagKey = "<F6>"
let s:TagWordKey = "<F7>"
let s:TagRangeKey = "<F7>"
let s:ChangeTagKey = "<F8>"
If you want to change the tag-keys, symbol-keys or foreign-char-keys that follow the function keys or change the tags associated with the keys, change the corresponding lines in maps.vim. For example, to change b so it is associated with the tag <book> rather than the tag <emphasis role="bold">, look for the following line in maps.vim:
call MapTagKey("b", 0, 0, "emphasis", " role=\"bold\"")
and change it to:
call MapTagKey("b", 1, 1, "book", "")
The function MapTagKey is defined in mfuncs.vim. Its prototype is:
function! MapTagKey(key, snewline, enewline, tag, stagx)
Its parameters are explained in Table 5.
Table 5. MapTagKey Parameters
Parameter | Description |
---|---|
key | Keystroke(s) to bind the tag key to (for example, <F2>key). |
snewline | One if the start tag should be placed on a new line, zero otherwise. |
enewline | One if the end tag should be placed on a new line, zero otherwise. |
tag | The tag associated with the key. |
stagx | Extra attributes that should be placed in the start tag. |
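Putting the parameters in Table 5 together, new bindings could be added to maps.vim with calls such as the following. This is only a sketch: it assumes mfuncs.vim has already been sourced so that MapTagKey exists, and the key sequences xk and xe are picked for illustration simply because Table 2 does not list them.

```vim
" Hypothetical additions to maps.vim; requires MapTagKey() from
" mfuncs.vim. The keys "xk" and "xe" are illustrative choices.

" Inline tag with no extra attributes: neither the start tag nor the
" end tag is placed on a new line.
call MapTagKey("xk", 0, 0, "keycap", "")

" Inline tag with an attribute; note the leading space in stagx,
" which separates the attribute from the tag name.
call MapTagKey("xe", 0, 0, "emphasis", " role=\"strong\"")
```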
The function MapTagKey merely sets up and executes a number of Vim nmap and imap commands to make the appropriate key bindings. Similarly, symbol-keys and foreign-char-keys are mapped by the functions MapSymbolKey and MapForeignCharKey. These functions each take two arguments, the key and the text to insert. For example:
call MapSymbolKey("3", "¾")
call MapForeignCharKey("b", "β")
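To make the idea of a function that builds and executes map commands more concrete, here is a much-simplified sketch of the kind of thing MapTagKey might do. It is not the code from mfuncs.vim: the function name is hypothetical, the snewline and enewline options are left out, only the <F5> and <F6> bindings are created, and it uses the non-recursive nnoremap and inoremap variants rather than nmap and imap.

```vim
" Hypothetical, simplified stand-in for MapTagKey(). Loading it
" replaces any existing <F5>key and <F6>key bindings, so try it in
" a fresh Vim session.
function! SketchMapTagKey(key, tag, stagx)
  " Write '<' as <lt> so Vim never mistakes a tag for key notation.
  let l:start = '<lt>' . a:tag . a:stagx . '>'
  let l:end = '<lt>/' . a:tag . '>'
  " Normal mode: enter insert mode, type the tag, return to normal mode.
  execute 'nnoremap <F5>' . a:key . ' i' . l:start . '<Esc>'
  execute 'nnoremap <F6>' . a:key . ' i' . l:end . '<Esc>'
  " Insert mode: simply type the tag.
  execute 'inoremap <F5>' . a:key . ' ' . l:start
  execute 'inoremap <F6>' . a:key . ' ' . l:end
endfunction

" Bind <F5>b and <F6>b to the bold emphasis start and end tags.
call SketchMapTagKey("b", "emphasis", " role=\"bold\"")
```

Building each command as a string and passing it to execute is what lets one function call generate several bindings for the same tag.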
Near the bottom of the file maps.vim are a handful of Vim nmap commands for binding the tag manipulation and movement keys, including delete tag, change tag and move tag. All of these bindings call the functions defined in the file tfuncs.vim.
" Delete tag at cursor.
nmap <S-F8> :call DeleteTag()^M
" Move left by tags.
nmap <F9> :call CursorLeftByTag()^M
" Move tag left of preceding word.
nmap <F10> :call MoveTagLeft()^M
" Move tag right of following word.
nmap <F11> :call MoveTagRight()^M
" Move right by tags.
nmap <F12> :call CursorRightByTag()^M
" Delete whitespace left of tag.
nmap <S-F9> :call TightenTagLeft()^M
" Insert space to the left of tag.
nmap <S-F10> :call InsertStringLeftOfTag(" ")^M
" Insert space to the right of tag.
nmap <S-F11> :call InsertStringRightOfTag(" ")^M
" Delete whitespace right of tag.
nmap <S-F12> :call TightenTagRight()^M
If you cat this file, these lines are going to look strange, and in some editors, every line is going to break right after the closing parenthesis in the call. If you look at the file with Vim, you can see that each closing parenthesis is followed by ^M, a literal carriage-return character. When you cat the file, the terminal acts on the carriage return by moving back to the start of the line, so anything after it overwrites part of what was displayed. Some editors instead treat the carriage return as a line break. These mappings work in command mode, and the ^M ends the :call command, just as if you had pressed Enter.
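If the literal carriage returns cause trouble in your tools, Vim's key notation offers an alternative: with Vim's default settings, the text <CR> in a mapping is read as the Enter key. The two lines below are a sketch of how the first two mappings might be rewritten this way; they assume DeleteTag() and CursorLeftByTag() from tfuncs.vim are loaded.

```vim
" The first two mappings again, using <CR> key notation instead of a
" literal carriage-return character.
" Assumes DeleteTag() and CursorLeftByTag() from tfuncs.vim are loaded.
nmap <S-F8> :call DeleteTag()<CR>
nmap <F9> :call CursorLeftByTag()<CR>
```

Written this way, the file contains only printable characters, so it displays the same in cat and in any editor.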
If you're a Vim user and you need to edit DocBook, the macros described here can make your job easier by providing Vim key bindings for inserting tags and entities, as well as for inserting predefined tag templates. They also provide key bindings that let you move tags within your document and move through your document by way of tags.
Although originally designed for use with DocBook/XML, these macros could be used with any similar markup language, such as HTML, SGML or other XML-based markup languages.
Resources
Official Home of DocBook: provides the DocBook DTD.
On-line version of the O'Reilly book DocBook: The Definitive Guide: also provides the DocBook DTD.