[silva-dev] Update - silva2odt

Dave Kuhlman dkuhlman at rexx.com
Tue Jun 26 05:59:37 CEST 2007


On Mon, Jun 25, 2007 at 08:24:51PM +0200, Marc Petitmermet wrote:
> > - lxml-1.2.1
> 
> this was it. i had version 0.9.1 installed. i updated to version 1.3  
> and the errors were gone.

Super.

> 
> i created a full-media export zip from http://ssb.biomaterials.ch/ssb/
> (this is a standard silva folder) which has a size of 6 mb.
> 
> wow!! your program converted this 6 mb zip file in less than 0.3  
> seconds and i have a bunch of .odt files.

Great.  I lust for your machine.  It must be very fast.

> 
> some observations about installation and usage:
> - "setup.py install" complains about a missing "__init__.py"
>    but it installs nonetheless
> - i can confirm eric; the syntax: silva2odt.py infile.zip outfile.odt
>    does not work

I made a fix, but the command-line syntax and flags need more
thought and work.

> - i had to use the --styles-path option because it could not
>    find styles.odt
> 

I won't comment on each of the specific problems below, but I'll
definitely put them on my to-do list.

Now, you see that I was quite serious when I said that this is a
*preliminary* version.

> some observations about the result:
> - nested lists do not work at all. i.e. in the .odt file of the page
>    http://ssb.biomaterials.ch/ssb/bylaws
>    the article 2, 3, 4 and 5 are completely empty
> - bulleted lists are not interpreted as a list; the result contains a
>    paragraph with a bullet followed by line-end character and then the
>    content follows in the next paragraph. e.g.
> 
>    original in silva:
>    • some text
>    • some more text
> 
>    converted in .odt:
>    •
>    some text
>    •
>    some more text
> 
> - am i correct that tables are converted to standard text and not
>    tables?

There is no support for tables in the version you are testing.  I'm
working on tables now.  The generated XML code is very complicated,
and it's "fun".

> - images are not supported yet

I've just implemented some simple support for images.  It needs
more work and testing.

> - there seems to be a major issue with links within paragraphs. after
>    the link the text is missing. e.g. on page
>    http://ssb.biomaterials.ch/ssb/history:
> 
>    [snip] of the Robert Mathys FoundationAbsorbables, Degradables and
>    Resorbables". During [snip]
> 
>    the missing text is '. The topic of the meeting was "'
> 
>    another example on the same page: the first paragraph of year 2004
>    contains only the following line:
> 
>    On behalf of the AO Research Institute
> 
> dave, thanks again for this great product. may i ask if it will be
> possible to pipe the output of your program through ooconvert
> (http://sourceforge.net/projects/ooconvert/) to have even more different
> formats?

The writer class (SilvaOdfWriter) has a function
(write_current_content_to_stream) that will write output to a
stream.  It would be easy to ask that function to write to
sys.stdout.
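
A minimal sketch of the stdout idea, written for current Python 3.
The call shape of write_current_content_to_stream (a single writable
binary stream argument) is an assumption, and StandInWriter is a
hypothetical placeholder for SilvaOdfWriter so that the example runs
on its own:

```python
import io
import sys


class StandInWriter:
    """Hypothetical stand-in for SilvaOdfWriter, for illustration only."""

    def __init__(self, payload):
        self._payload = payload

    def write_current_content_to_stream(self, stream):
        # Assumed signature: takes one writable binary stream.
        stream.write(self._payload)


def write_odt_to(writer, stream=None):
    """Send the writer's current document to a stream (stdout by default)."""
    if stream is None:
        # An .odt file is a zip archive, so use the binary layer of stdout.
        stream = sys.stdout.buffer
    writer.write_current_content_to_stream(stream)
    stream.flush()
    return stream


# Usage with an in-memory stream instead of stdout:
buffer = io.BytesIO()
write_odt_to(StandInWriter(b"PK\x03\x04demo"), buffer)
```

Calling write_odt_to(writer) with no stream would send the bytes to
standard output, ready to be piped into another program.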

Or, we could write a replacement for the function that writes the
ODF output to a temp file, then calls

    os.system('ooconvert tmp.odt ... --format= ...')

to convert that to another format.
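
A rough sketch of that temp-file approach, again in current Python 3.
It swaps os.system for subprocess.run so the command is passed as an
argument list. The converter command is left as a parameter because
the exact ooconvert arguments are not worked out here. StandInWriter
is a hypothetical placeholder for SilvaOdfWriter, and the
single-stream signature of write_current_content_to_stream is assumed:

```python
import os
import subprocess
import sys
import tempfile


class StandInWriter:
    """Hypothetical stand-in for SilvaOdfWriter, for illustration only."""

    def __init__(self, payload):
        self._payload = payload

    def write_current_content_to_stream(self, stream):
        # Assumed signature: takes one writable binary stream.
        stream.write(self._payload)


def convert_via_tempfile(writer, command_prefix):
    """Write the ODF output to a temp file, then run a converter on it.

    command_prefix is the converter command as a list of arguments; the
    temp file path is appended as the final argument.
    """
    fd, path = tempfile.mkstemp(suffix=".odt")
    try:
        with os.fdopen(fd, "wb") as stream:
            writer.write_current_content_to_stream(stream)
        # The file is closed before the external program reads it.
        result = subprocess.run(
            command_prefix + [path], check=True, capture_output=True
        )
        return result.stdout
    finally:
        # Always clean up the temp file, even if the converter fails.
        os.remove(path)


# Usage: a tiny Python child process stands in for the real converter
# and simply reports the size of the file it was given.
fake_converter = [
    sys.executable,
    "-c",
    "import os, sys; print(os.path.getsize(sys.argv[1]))",
]
output = convert_via_tempfile(StandInWriter(b"PK\x03\x04demo"), fake_converter)
```

A real converter that needs the input path somewhere other than the
last position would need a slightly different way of building the
command.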

silva2odt.py is intended to be used as a Python module so that we
can implement those kinds of custom converters.

Thanks, by the way, for telling me about ooconvert.  I did not know
about that.

And, thanks for taking the time to write up all the helpful
comments.  I'll get to work.

Dave


-- 
Dave Kuhlman
http://www.rexx.com/~dkuhlman


