
I have an HTML file like this that can be opened with LibreOffice and then exported to Excel:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<html>
<head>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">
    <title>Tables</title>
    <meta name="generator" content="LibreOffice 4.2.8.2 (Linux)">
    <meta name="created" content="20170328;3115845446710">
    <meta name="changed" content="20170328;3152295681061">

    <style type="text/css"><!-- 
        body,div,table,thead,tbody,tfoot,tr,th,td,p { font-family:"Liberation Sans"; font-size:x-small }
         -->
    </style>

But I want to know whether there is a way to do this from the command line.

phuclv

1 Answer


If the file opens normally in LibreOffice, you can convert it with this

libreoffice --convert-to xls myfile.html

or this

libreoffice --convert-to xlsx myfile.html

depending on which format you want. If the libreoffice command isn't available on your system, use soffice instead.

Sometimes (especially if you're using an old version of LibreOffice) you also need the --headless option, which runs the conversion without opening the GUI

libreoffice --headless --convert-to xlsx myfile.html

You can also use unoconv, if it is installed

unoconv -f xlsx myfile.html
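
To tie the libreoffice/soffice fallback and the --headless option together, here is a minimal shell sketch. The helper names find_office and convert_html are made up for illustration and are not part of LibreOffice; the sketch assumes one of the two launchers is on your PATH and defaults to xlsx when no format is given.

```shell
#!/bin/sh
# Sketch only: find_office and convert_html are hypothetical helper names.

# Print the first LibreOffice launcher found on PATH (libreoffice, then soffice).
find_office() {
    for candidate in libreoffice soffice; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# $1: input HTML file
# $2: target format, xls or xlsx (assumed default: xlsx)
convert_html() {
    office=$(find_office) || { echo "LibreOffice not found on PATH" >&2; return 1; }
    "$office" --headless --convert-to "${2:-xlsx}" "$1"
}
```

After sourcing the file, convert_html myfile.html xls would run the same command as above with whichever launcher is available.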
phuclv