2. Editing tools

Editing tools have come a long way in their support for XML (and specifically DocBook). Two types of editors are outlined in this section: word processors (OpenOffice.org, AbiWord, etc.) and text editors (Emacs, Vim, etc.). New authors who are not comfortable working with markup languages should probably choose a word processor that can output DocBook files. For this reason the word processors are listed first.

Although many editors can also validate your DocBook files, this information has been separated into Section 3, “Validation”.

More info

Check the resources section for more XML Authoring Tools.

2.1. Word Processors

Even if you are not comfortable working with DocBook's tagset in a text editor, you can still produce valid DocBook documents using a word processor. Support at this point is very limited, but it does exist in the following programs. The upside, of course, is that things like spell check are built into the program. In addition, support for DocBook (and XML) is constantly improving.

Converting Microsoft Word documents

If you want to use MS Word to write your documents, you may find w2XML useful. Note that this is not free software: the cost is around USD $130. There is, however, a trial version of the software.

Work on the content!

Remember that all formatting changes you make to your document will be ignored when your document is released by the LDP. Instead of focusing on how your document looks, focus on the content.

2.1.1. AbiWord

Through word of mouth I've heard that AbiWord can work (natively) with DocBook documents. This will need to be tested by someone (possibly me) and should definitely be included if it is the case.

2.1.2. OpenOffice.org

http://openoffice.org

As of OpenOffice.org (OOo) 1.1RC there has been support for exporting files to DocBook format.

Although OOo uses the full DocBook document type declaration, it does not actually export the full list of DocBook elements. It uses a simplified DocBook tagset which is geared to on-the-fly rendering. (This is not the official Simplified DocBook, which is described in Section 5, “DocBook DTD”.) The table of tags supported by OOo's simplified (or “special”) DocBook is available from http://xml.openoffice.org/xmerge/docbook/supported_tag_table.html.

2.1.2.1. Open Office 1.0.x

OOo has been tested by LDP volunteers with mostly positive results. Thanks to Charles Curley (charlescurley.com) for the following notes on using OOo version 1.0.x:

Check the version of your OpenOffice

These notes may not apply to the version of OOo you are using.

  • To be able to export to DocBook, you must have a Java runtime environment (JRE) installed and registered with OOo--a minimum of version 4.2.x is recommended. The configuration instructions will depend on how you installed your JRE. Visit the OOo web site for help with your setup.

    Contrary to the OOo documentation, the Linux OOo did not come with a JRE. I got one from Sun.

  • The exported file has lots of empty lines. My 54 line exported file had 5 lines of actual XML code.

  • There was no effort at pretty printing.

  • The header is:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
      "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">

  • The pull-down menu in the File → Save As dialog box for file format indicates that the export format is DocBook (simplified). There is no explanation of what “simplified” indicates. Does OOo export a subset of DocBook? If so, which elements are ignored? Is there any way to enter any of them manually?

  • There is NO documentation on the DocBook export filter or whether OOo will import it again.

Conclusions: OOo 1.1RC is worth looking at if you want a word processor for preparing DocBook documents.

However, I hope they cure the lack of documentation. For one thing, it would be nice to know which native OOo styles map to which DocBook elements. It would also be nice to know how to map one's own OOo styles to DocBook elements.
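The blank lines mentioned in the notes above can be removed after export with standard tools. The following is a minimal sketch and not part of the original notes: the function name strip_blank_lines and the file names are hypothetical, and it assumes the document has no verbatim elements (such as programlisting or screen) in which blank lines are significant.

```shell
#!/bin/sh
# Hypothetical helper: copy an exported DocBook file, dropping lines that
# are empty or contain only whitespace.
# Caution: this also removes blank lines inside verbatim elements.
strip_blank_lines() {
    # $1 = exported file, $2 = cleaned copy to write
    sed '/^[[:space:]]*$/d' "$1" > "$2"
}

# Example using a small stand-in for an exported file.
workdir=$(mktemp -d)
printf '<article>\n\n\n  <para>Hello</para>\n   \n</article>\n' > "$workdir/exported.xml"
strip_blank_lines "$workdir/exported.xml" "$workdir/cleaned.xml"
cat "$workdir/cleaned.xml"
```

The original file is left untouched, so the cleaned copy can be compared against it before it replaces the export.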

2.1.2.2. Open Office 1.1

Tabatha Marshall offers the following additional information for OOo 1.1.

The first problem was when I tried to do everything on version 1.0.1. That was obviously a problem. I have RH8, and it was installed via rpm packages, so I ripped it out and did a full, new install of OpenOffice 1.1. It took a while to find out 1.1 was a requirement for XML to work.

During the install process I believe I was offered the choice to install the XML features. I have a tendency to do full installs of my office programs, so I selected everything.

I can't offer any advice to those trying to update their current OO 1.1. Their 3 ways aren't documented very well at the site (xml.openoffice.org) and as of this writing, I can't even find THAT on their site anymore. I think more current documentation is needed there to walk people through the process. Most of this was unclear and I had to pretty much experiment to get things working.

Well, after I installed everything I had some configuration to do. I opened the application, and got started by opening a new file, choosing templates, then selecting the DocBook template. A nice menu of Paragraph Styles popped up for me, which are the names for all those tags, I noticed (you can see I don't use WYSIWYG often).

With a blank doc before me (couldn't get to the XML Filter Settings menu unless some type of doc was opened), I went into Tools → XML Filter Settings, and edited the entry for DocBook file. I configured mine as follows:

  • Doctype -//OASIS//DTD DocBook XML V4.2//EN

  • DTD http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd

  • XSLT for export /usr/local/OpenOffice.org1.1.0/share/xslt/docbook/ldp-html.xsl

  • XSLT for import /usr/local/OpenOffice.org1.1.0/share/xslt/docbook/docbooktosoffheadings.xsl (this is the default)

  • Template for import /home/tabatha/OpenOffice/user/template/DocBook File/DocBookTemplate.stw

At first, if I opened an XML file that had even one parsing error, it would just open the file anyway and display the markup in OO. I have many XML files that use &copy; and other types of entities which show up as parse errors (depending on the encoding) even though they can be processed through. But today I was unable to open any of those files. I got input/output errors instead. Still investigating that one.

However when you do successfully open a document (one parsing with no errors), it puts it automatically into WYSIWYG based on the markup, and you can then work from the paragraph styles menu like any other such editor.

To validate the document, I used Tools → XML Filter Settings, then clicked the Test XSLTs button. On my screen, I set up the XSLT for export to be ldp-html.xsl. If you test and there are errors, a new window pops up with error messages at the bottom, and the lines that need to be changed up at the top. You can change them there and progress through the errors until they're all gone, and keep testing until they're gone.

If you want to open a file to see the source instead of the processed results, go to Tools → XML Filter Settings → Test XSLTs, and then under the Import section, check the Display Source box. My import XSLT is currently docbooktosoffheadings.xsl (the default) and the template for import is DocBookTemplate.stw (also default).

I think this might work for some people, but unfortunately not for me. I've never used WYSIWYG to edit markup. Emacs with PSGML can tell me what my next tag is no matter where I am, validate by moving through the trouble spots, and I can parse and process from command line.

With OpenOffice, you have to visit http://xml.openoffice.org/filters.html to find conversion tools.

2.1.3. WordPerfect 9 (Corel Office 2000)

http://www.corel.com/

WordPerfect 9 for the MS Windows platform has support for SGML and DocBook 3.0. WordPerfect 9 for Linux has no SGML capabilities.

If you are using WordPerfect on the Linux operating system, please read: WordPerfect on Linux FAQ

2.1.4. XMLmind's XML editor

http://www.xmlmind.com/xmleditor/

Although, strictly speaking, it is not a word processor, XMLmind's XML editor allows authors to concentrate on the content, and not the markup. It has built-in spell checking and conversion utilities which allow you to transform your documents without having to install and configure an additional processing tool such as jade. There is a free standard edition, which is a simplified version of their professional edition.

2.1.5. Conglomerate

http://www.conglomerate.org

According to their web site, “Conglomerate aims to be an XML editor that everyone can use. In particular, our primary goal is to create the ultimate editor for DocBook and similar formats. It aims to hide the jargon and complexity of XML and present the information in your documents in a way that makes sense.”

2.1.6. Vex: a visual editor for XML

http://vex.sourceforge.net/

According to their web site, “The visual part comes from the fact that Vex hides the raw XML tags from the user, providing instead a wordprocessor-like interface. Because of this, Vex is best suited for document-style XML documents such as XHTML and DocBook rather than data-style XML documents.”

2.2. Text Editors

For advanced writers

The tools outlined in this section allow you to work with the DocBook tags directly. If you are not comfortable working with markup languages you may want to use a word processor instead. Word processors that support DocBook are described in Section 2.1, “Word Processors”.

If you are comfortable working with markup languages and text editors, you'll probably want to customize your current editor of choice to handle DocBook files. Below are some of the more common text editors that can, with some tweaking, handle DocBook files.

2.2.1. Emacs (PSGML)

http://www.lysator.liu.se/~lenst/about_psgml/

Emacs has a major mode called PSGML, designed for editing SGML and XML documents. It provides:

  • syntax highlighting or pretty printing features that make the tags stand out

  • a way to insert tags other than typing them by hand

  • and the ability to validate your document while writing

For users of Emacs, it's a great way to go. PSGML works with DocBook, LinuxDoc and other DTDs equally well.

2.2.1.1. Verifying PSGML is Installed

If you have installed a recent distribution, you may already have PSGML installed for use with Emacs. To check, start Emacs and look for the PSGML documentation (C+h i m psgml).

Dependencies

If you don't have PSGML installed, now might be a good time to upgrade Emacs. The rest of these instructions will assume you have PSGML installed.

2.2.1.2. Configuring Emacs for Use With PSGML

If you want GNU Emacs to enter PSGML mode when you open an .xml file, it will need to be able to find the DocBook DTD files. If your distribution already had PSGML set up for use with GNU Emacs, you probably won't need to do anything.

Tuning Emacs

For more information on how to configure Emacs, check out Software: Emacs.

Once you've configured your system to use PSGML you will need to override Emacs' default sgml-mode with the psgml-mode. This can be done by configuring your .emacs file. After you've edited the configuration file you will need to restart Emacs.

2.2.1.3. Creating New DocBook XML Files

There are a number of steps to creating a new DocBook XML file in Emacs.

  • Create a new file with an xml extension.

  • On the first line of the file enter the doctype for the version of DocBook that you would like to use. If you're not sure what a doctype is all about, check Section 5, “DocBook DTD”.

  • Enter C+c C+p. If Emacs manages to parse your DTD, you will see Parsing prolog...done in the minibuffer.

  • Enter C+c C+e Enter to auto-magically insert the parent element for your document. (New authors are typically writing articles.)

  • If things are working correctly, you should see new tags for the parent element for your document right after the document type declaration. In other words you should now see two extra tags: <article> and </article> in your document.
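For authors who prefer to prepare the file before opening it in Emacs, the end result of the steps above can be sketched from the shell. This is only an illustration: the function name new_docbook_article, the file name, and the title and paragraph text are placeholders, and the DocBook XML V4.2 doctype is just one possible choice, taken from the identifiers used elsewhere in this section.

```shell
#!/bin/sh
# Sketch: write a minimal DocBook XML article skeleton.
# The function name, file name and placeholder text are hypothetical.
new_docbook_article() {
    # $1 = path of the file to create
    cat > "$1" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<article>
  <title>Placeholder title</title>
  <para>Placeholder text.</para>
</article>
EOF
}

workdir=$(mktemp -d)
new_docbook_article "$workdir/my-howto.xml"
cat "$workdir/my-howto.xml"
```

Opening the resulting file in Emacs and entering C+c C+p should then parse the prolog in the same way as for a file typed by hand.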

2.2.1.4. Spell Checking in Emacs

Emacs can be configured to use aspell by adding the following to your ~/.emacs file. Thanks to Rob Weir for this configuration information.

;; Use aspell instead of ispell
(setq-default ispell-program-name "aspell")
;; Set the default dictionary language
(setq ispell-dictionary "british")
(setq flyspell-default-dictionary "british")

2.2.2. nedit

http://sourceforge.net/projects/nedit/

To be fair, nedit is more for programmers, so it might seem a bit of overkill for new users and especially non-programmers. All that aside, it's extremely powerful, allowing for syntax highlighting. nedit doesn't allow you to automatically insert tags or automatically validate your code. However, it does allow for shell commands to be run against the contents of the window (as opposed to saving the file, then checking).

Figure B.1. nedit screen shot

The nedit program can provide line numbers across the left side of the screen, handy for when nsgmls complains of errors


2.2.2.1. Using nedit

When you open your DocBook file, nedit should already have syntax highlighting enabled. If it does not, you can turn it on explicitly using: Preferences → Language Mode → SGML HTML

If you have line numbers turned on (using Preferences → Show Line Numbers) then finding validation errors is much simpler. nsgmls, the validation tool we'll use, lists errors by their line number.

2.2.2.2. Configuring nedit

Since you can feed the contents of your window to outside programs, you can easily extend nedit to perform repetitive functions. The example you'll see here is to validate your document using nsgmls. For more information about nsgmls and validating documents please read Section 3, “Validation”.

  • Select Preferences → Default Settings → Customize Menus → Shell Menu.... This will bring up the Shell Command dialog box, with all the shell commands nedit has listed under the Shell menu.

  • Under Menu Entry, enter Validate DocBook. This will be the entry you'll see on the screen.

  • Under Accelerator, press Alt+S. Once this menu item is set up, you can press Alt+S to run Validate DocBook.

  • Under Command Input, select window, and under Command Output, select dialog.

  • Under Command to Execute, enter nsgmls -sv. The -s option suppresses the normal output so that only errors are shown, and -v prints the version number so that you know the command has run.

    Check the PATH

    Note that nsgmls has to be in your PATH for this to work properly.

Figure B.2. Adding shell commands to nedit

  • Click OK and you'll now be back at the main nedit screen. Load up an XML file, and select Shell → Validate DocBook or press Alt+S.

  • nedit will run nsgmls against the contents of the window.

  • If all you see is a version number for nsgmls then your document is valid. Any errors are reported by line number in your document.

Figure B.3. nsgmls output on success

If nsgmls reports success, it merely reports the version of nsgmls
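
The menu entry configured above pipes the window contents to nsgmls -sv. The same check can be sketched for use in a terminal. The function name validate_docbook is hypothetical, and the sketch assumes nsgmls is in your PATH and is set up to find the DocBook DTD as described in Section 3, “Validation”.

```shell
#!/bin/sh
# Sketch: run the same check as the nedit menu entry from a terminal.
# validate_docbook is a hypothetical name, not part of nedit or nsgmls.
validate_docbook() {
    # $1 = DocBook file to check
    if ! command -v nsgmls >/dev/null 2>&1; then
        echo "nsgmls not found in PATH" >&2
        return 127
    fi
    # Feed the file on standard input, as nedit does with the window
    # contents. -s suppresses the normal output; -v prints the version.
    nsgmls -sv < "$1"
}
```

After loading the function into your shell, run it as validate_docbook my-howto.xml (the file name is a placeholder). As in nedit, any errors are reported by line number.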

2.2.3. VIM

http://www.vim.org

No mention of text editors is complete without talking about vi. The VIM (Vi IMproved) editor has the functionality of regular vi and includes syntax highlighting of tags.

2.2.3.1. Getting Started

There are many versions of vi. New authors will likely want one of the more feature-packed versions for syntax highlighting and a graphical interface including mouse control.

Red Hat users will want the following packages: vim-common, vim-minimal and vim-enhanced. Debian users will want the following package: vim. For an X interface (including GUI menus and mouse control) users will want gvim. The g in gvim is for Graphical.

VIM compiles very easily should you need to build your own. Both vim and gvim are built by default. Syntax highlighting is included but, if you build from scratch, it is not enabled by default; use the :syntax enable command in VIM to turn this feature on.

2.2.3.2. Creating New DocBook XML Files

In both vim and gvim, .xml files will be recognized and opened in SGML mode. A series of known DocBook tags and attributes have been entered into vim and will be highlighted one color if the name is known and another if it is not (for this author the colors are yellow and blue).

Having the correct document type declaration at the top of your document should add the syntax highlighting. If you do not see this highlighting you will need to force VIM into SGML mode (even for XML files) using the command :set ft=sgml. If you are working with multiple files for a single XML document you can add your document type inside a comment (<!-- ... -->) at the top of the file to get the correct syntax highlighting (you will need to restart the program to see the change in highlighting). The top line of this file (tools-text-editors.xml) looks like this:

 
<!-- <!DOCTYPE book PUBLIC '-//OASIS//DTD DocBook XML V4.2//EN'> -->

2.2.3.3. Spell Check

As with Emacs, Vim works quite happily with aspell. It can be run from within Vim with the following command: :! aspell -c %

For more sophisticated spell check alternatives, give Cream or vimspell a try.
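
Outside the editor, aspell can also report misspellings without starting an interactive session. The sketch below is an illustration rather than part of the original instructions: list_misspellings is a hypothetical name, and it relies on aspell's list command together with its sgml filter mode so that the markup itself is not spell checked.

```shell
#!/bin/sh
# Sketch: print the unique misspelled words found in a DocBook file.
# list_misspellings is a hypothetical helper name.
list_misspellings() {
    # $1 = DocBook file to check
    if ! command -v aspell >/dev/null 2>&1; then
        echo "aspell not found in PATH" >&2
        return 127
    fi
    # "aspell list" reads text on standard input and prints one misspelled
    # word per line; --mode=sgml asks it to skip the markup.
    aspell list --mode=sgml < "$1" | sort -u
}
```

After loading the function into your shell, run it as list_misspellings my-howto.xml (the file name is a placeholder). A quick pass like this can show whether a full interactive session with aspell -c is needed.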

2.2.3.4. Tag Completion

The following information is provided by Kwan Lowe.

Vim has a DocBook helper script which can be easily copied into your .vimscripts directory and used to auto complete tags while writing DocBook documents. The script can be downloaded from: http://www.vim.org/scripts/script.php?script_id=38.

Grab the file, then untar it. Copy dbhelper.vim to the .vimscripts directory in your home directory, creating the directory first if you do not already have one.

	$ mkdir -p ~/.vimscripts
	$ cp dbhelper.vim ~/.vimscripts/

You'll also have to convert the installed dbhelper.vim file to Unix line endings:

	$ dos2unix ~/.vimscripts/dbhelper.vim

Next, edit your .vimrc file and add the line: source /home/yourname/.vimscripts/dbhelper.vim

To use the scripts, enter vi and go into insert mode. Press , (comma) followed by the shortcut. For example: ,dtbk

2.2.4. XMLForm

http://www.datamech.com/XMLForm/

This web-based application allows you to put in the URL for XML source, or copy and paste the XML directly into the web form. The application then breaks down your document into a series of form fields that hide the DocBook tags so that you may edit the content directly. Version 5 is available from http://www.datamech.com/XMLForm/formGenerator5.html. This application works best on shorter documents (fewer than 20 printed pages).

As this is an on-line tool, it will be good for small updates only.

2.2.5. XMLmind XML Editor (XXE)

http://www.xmlmind.com/xmleditor

David Horton offers the following information:

I am a big fan of XMLMind's XXE editor and XFC FO converter. It is free as in beer, but not necessarily free as in speech. Very liberal license for personal use however. It's Java-based so it works on all sorts of OS's.