[silva-dev] Update - silva2odt
dkuhlman at rexx.com
Tue Jun 26 05:59:37 CEST 2007
On Mon, Jun 25, 2007 at 08:24:51PM +0200, Marc Petitmermet wrote:
> > - lxml-1.2.1
> this was it. i had version 0.9.1 installed. i updated to version 1.3
> and the errors were gone.
> i created a full-media export zip from http://ssb.biomaterials.ch/
> ssb/ (this is a standard silva folder) which has a size of 6 mb.
> wow!! your program converted this 6 mb zip file in less than 0.3
> seconds and i have a bunch of .odt files.
Great. I lust for your machine. It must be very fast.
> some observations about installation and usage:
> - "setup.py install" complains about a missing "__init__.py"
> but it installs nonetheless
> - i can confirm eric; the syntax: silva2odt.py infile.zip outfile.odt
> does not work
I made a fix. But the command line syntax and flags need more
thought and work.
> - i had to use the --styles-path option because it could not
> find styles.odt
I won't comment on each of the specific problems below, but I'll
definitely put them on my to-do list.
Now, you see that I was quite serious when I said that this is a
> some observations about the result:
> - nested lists do not work at all. i.e. in the .odt file of the page
> the article 2, 3, 4 and 5 are completely empty
> - bulleted lists are not interpreted as a list; the result contains a
> paragraph with a bullet followed by line-end character and then the
> content follows in the next paragraph. e.g.
> original in silva:
> ? some text
> ? some more text
> converted in .odt:
> some text
> some more text
> - am i correct that tables are converted to standard text and not
There is no support for tables in the version you are testing. I'm
working on tables now. The generated XML code is very complicated,
and it's "fun".
> - images are not supported yet
I've just implemented some simple support for images. It needs
more work and testing.
> - there seems to be a major issue with links within paragraphs. after
> a link the text is missing. e.g. on page
> [snip] of the Robert Mathys FoundationAbsorbables, Degradables and
> Resorbables". During [snip]
> the missing text is '. The topic of the meeting was "'
> another example on the same page: the first paragraph of year 2004
> contains only the following line:
> On behalf of the AO Research Institute
> dave, thanks again for this great product. may i ask if it will be
> possible to pipe the output of your program through ooconvert (http://
> sourceforge.net/projects/ooconvert/) to have even more different
The writer class (SilvaOdfWriter) has a function
(write_current_content_to_stream) that will write output to a
stream. It would be easy to ask that function to write to
Or, we could write a replacement for the function that writes the
ODF output to a temp file, then calls
os.system('ooconvert tmp.odt ... --format= ...')
to convert that to another format.
silva2odt.py is intended to be used as a Python module so that we
can implement those kinds of custom converters.
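
As a rough sketch of the temp-file approach described above (not the
actual silva2odt code): the helper below writes the ODF output to a
temporary file and then runs an external converter on it. The names
convert_with_external_tool, write_to_stream and build_command are
hypothetical. write_to_stream stands in for something like
write_current_content_to_stream, whose exact signature is assumed here
to accept a writable binary stream. The converter command is supplied
by the caller, because the real ooconvert arguments are not spelled
out in this message.

```python
import os
import subprocess
import tempfile


def convert_with_external_tool(write_to_stream, build_command):
    """Write ODF output to a temp file, then run an external converter.

    write_to_stream: callable taking a writable binary file object
        (hypothetical stand-in for the writer's stream function).
    build_command: callable taking the temp file path and returning
        the argument list to execute (for example an ooconvert call;
        its real options are an assumption left to the caller).
    Returns the exit status of the external command.
    """
    fd, tmp_path = tempfile.mkstemp(suffix='.odt')
    try:
        # Write and close the file before the converter reads it.
        with os.fdopen(fd, 'wb') as stream:
            write_to_stream(stream)
        # An argument list avoids shell quoting problems that
        # os.system would have with unusual file names.
        return subprocess.call(build_command(tmp_path))
    finally:
        # Remove the temp file whether or not the conversion worked.
        os.remove(tmp_path)
```

Passing the command in as a function keeps the helper independent of
any one converter, so the same code could drive ooconvert or any other
tool that takes an .odt path on its command line.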
Thanks, by the way, for telling me about ooconvert. I did not know
And, thanks for taking the time to write up all the helpful
comments. I'll get to work.