txt2pml.py

2004-10-10 @ 20:42:35

palmos
python


I found a heap of Palm TEXt REAd documents, but I don’t like the way they appear as (Doc) in PalmReader. So I wrote a script to convert them to PML, the format used by DropBook to make PNRd PPrs documents.

Basically, I found that on OS X the translation of txt2pdbdoc -d (decode back into text) wasn’t so good; a heap of characters needed to be changed.
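The original list of characters is not reproduced here, so the following is only a minimal sketch of that kind of clean-up. It assumes the bad characters are Windows-1252 punctuation bytes that came through as control characters. The table and the fix_characters name are hypothetical, not taken from txt2pml.py.

```python
# Hypothetical replacement table: the post does not list the real characters.
# These are Windows-1252 punctuation code points that show up as control
# characters when the bytes are read as Latin-1.
FIXES = str.maketrans({
    "\x85": "...",  # ellipsis
    "\x91": "'",    # left single quote
    "\x92": "'",    # right single quote
    "\x93": '"',    # left double quote
    "\x94": '"',    # right double quote
    "\x96": "-",    # en dash
})


def fix_characters(text):
    """Replace the characters listed in FIXES with plain ASCII equivalents."""
    return text.translate(FIXES)
```

Keeping the table in one dictionary means a newly discovered bad character is a one-line change.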

I did have them all listed here, but if I edit them, ecto fucks up the encoding!

I also changed the === to an 80% horizontal line.

I indented the [[[ - ]]] blocks, and a footer line [* <Text>] is indented as well.

I assumed the only use of a / was for italics, and _ for underlining.
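The markup rules above (=== lines, [[[ - ]]] blocks, / for italics, _ for underlining) could be sketched with a few regular expressions. This is not the code from txt2pml.py. It also assumes the PML tags are \w="80%" for the rule, \t for an indented block, \i for italics and \u for underlining, so check these against the DropBook documentation. The convert_markup name is made up for this example.

```python
import re


def convert_markup(text):
    """Rough sketch: turn the plain-text conventions into (assumed) PML tags."""
    # A line made only of three or more '=' becomes an 80% horizontal rule.
    text = re.sub(r"^={3,}[ \t]*$", lambda m: '\\w="80%"', text,
                  flags=re.MULTILINE)
    # [[[ ... ]]] blocks become indented blocks (may span several lines).
    text = re.sub(r"\[\[\[(.*?)\]\]\]", lambda m: "\\t" + m.group(1) + "\\t",
                  text, flags=re.DOTALL)
    # /word/ becomes italics; assumes '/' is never used for anything else.
    text = re.sub(r"/([^/\n]+)/", lambda m: "\\i" + m.group(1) + "\\i", text)
    # _word_ becomes underlining, under the same assumption for '_'.
    text = re.sub(r"_([^_\n]+)_", lambda m: "\\u" + m.group(1) + "\\u", text)
    return text
```

The replacements are written as functions rather than strings so that the backslashes in the PML tags are not treated as regex escapes.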

I also assume the first non-empty line that isn’t === is the Title, and that the Author line starts with By. I use this info to create a ‘Title Page’.
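A minimal sketch of that detection, under the same two assumptions, might look like the following. The function names are hypothetical. The title-page layout (\c to centre, \l for large text, \p for a page break) is a guess at suitable PML, not what the script actually emits.

```python
def find_title_and_author(lines):
    """Return (title, author) using the two assumptions described above."""
    title = None
    author = None
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and lines made only of '=' characters.
        if not stripped or set(stripped) == {"="}:
            continue
        if title is None:
            title = stripped
        elif stripped.startswith("By "):
            author = stripped
            break
    return title, author


def title_page(title, author):
    """Build a centred title page followed by a page break (assumed PML tags)."""
    body = "\\l" + title + "\\l"
    if author:
        body += "\n" + author
    return "\\c" + body + "\n\\c\n\\p\n"
```

If no line starts with By, author stays None and the title page simply omits it.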

The tricky bit was getting the Chapter Headings sorted: I needed to break the text into a list of strings and scan through it. This slows the script down a lot, but it still works okay. I might profile it a bit and see where the slowdown is.
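The post doesn’t say what a chapter heading looks like in these texts, so the sketch below assumes a heading is a line starting with "Chapter" or "Part", and that PML marks a chapter with \x. Both are assumptions, and mark_chapters is a made-up name. It shows the split-scan-join approach in a single pass, building the output with one join at the end rather than repeated string concatenation.

```python
import re

# Hypothetical heading pattern: the real script may use a different rule.
CHAPTER = re.compile(r"^(chapter|part)\s+\S+", re.IGNORECASE)


def mark_chapters(text):
    """Wrap lines that look like chapter headings in (assumed) PML \\x tags."""
    out = []
    for line in text.split("\n"):
        stripped = line.strip()
        if CHAPTER.match(stripped):
            line = "\\x" + stripped + "\\x"
        out.append(line)
    # One join at the end keeps the scan linear in the length of the text.
    return "\n".join(out)
```

To see where the time really goes, running the script under python -m cProfile is a quick first step.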

Anyway, here’s the latest version I’ve uploaded: txt2pml.py

I plan to make a version to process Project Gutenberg texts, but that’s on the back burner.