
I want to convert PowerPoint (.ppt) and Excel (.xls) files to a plain text format (.txt) from the command line on a server running Linux. The server does not have Xorg or similar libraries installed, and installing them is not an option.

I have tried catppt from catdoc, but it did not work for me.

~$ catppt presentacion_16x9.ppt 
Violación de segmento

(The message means "Segmentation fault")

What software should I use for this conversion?

phuclv

2 Answers


Use the command-line utility of LibreOffice/OpenOffice:

soffice --headless --convert-to txt presentacion_16x9.ppt

You may need to change soffice to libreoffice, or move the --headless option before or after --convert-to, depending on your software version.

There are also output filters that let you modify the output options, such as the text encoding:

soffice --headless --convert-to "txt:Text (encoded):UTF8"    presentacion_16x9.ppt
soffice --headless --convert-to "txt:Text (encoded):UTF8,LF" sheet.xls
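If you have many files, the commands above can be wrapped in a small script. This is only a rough sketch: find_office and convert_to_txt are hypothetical helper names, OUTDIR and FORMAT are variables I made up for illustration, and the default filter string is simply the one from the example above. It tries soffice first and falls back to libreoffice, and it uses the --outdir option to collect the results in one directory.

```shell
#!/bin/sh
# Sketch only: helper names and the OUTDIR/FORMAT variables are assumptions.

# Print the name of the first LibreOffice launcher found on PATH.
find_office() {
    for candidate in soffice libreoffice; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Convert each file given as an argument to text, writing into $OUTDIR.
convert_to_txt() {
    office=$(find_office) || {
        echo "LibreOffice/OpenOffice launcher not found on PATH" >&2
        return 1
    }
    outdir=${OUTDIR:-txt}
    format=${FORMAT:-txt:Text (encoded):UTF8}
    mkdir -p "$outdir" || return 1
    for f in "$@"; do
        # Report a failed conversion but keep going with the remaining files.
        "$office" --headless --convert-to "$format" --outdir "$outdir" "$f" ||
            echo "conversion failed: $f" >&2
    done
}
```

After sourcing the script, something like convert_to_txt presentacion_16x9.ppt sheet.xls should leave the converted files in ./txt. Whether a given filter string works for spreadsheets may depend on your LibreOffice version, so test it on one file first.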

For more information about the command and filters see

phuclv

You are probably going to need to use XSLT to convert the files to anything you need. See this blog: Link for lots of details.

In short, you would write an XSLT that can handle the fields that you want out of the PPT/XLS files and print them in the format you want to a TXT file. It's a bit to learn, but that is the only way I know how to do it.

Glorfindel
sleeves