I want to download HTML pages (example: http://www.brpreiss.com/books/opus6/) and join them into one HTML file or some other format that I can use on an ebook reader. Sites with free books don't have standard paging, and they're not blogs or forums, so I don't know how to do automatic crawling and merging.
5 Answers
The way I used to do this was Calibre.
That became too much of a pain though so I built a Chrome Extension to make it easier.
It's called EpubPress (http://epub.press).
It allows you to build an ebook from your Chrome tabs.
Hope that helps!
Pandoc can take a link to a page (or an HTML file) and convert it to PDF/EPUB ...
I'm not sure if it'd crawl. If it doesn't, you could crawl the pages first with wget or something (or just collect the links) and give them to pandoc.
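To make the "just collect links" route concrete, here is a rough sketch, not a tested recipe for any particular site. It assumes the book has an index page whose links appear in reading order and all live under the same base URL. The names `LinkCollector`, `collect_links` and `pandoc_command` are made up for this example, and it relies on pandoc accepting several inputs (including URLs) and concatenating them.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collect the href values of <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def collect_links(index_html, base_url):
    """Return absolute URLs under base_url, de-duplicated, in page order."""
    parser = LinkCollector()
    parser.feed(index_html)
    seen, urls = set(), []
    for href in parser.hrefs:
        # Resolve relative links and drop "#fragment" parts
        url, _fragment = urldefrag(urljoin(base_url, href))
        # Assumption: chapters live under the same base URL as the index
        if url.startswith(base_url) and url != base_url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def pandoc_command(urls, output="book.epub"):
    """Build a pandoc call; the output format is inferred from the extension."""
    return ["pandoc", "-f", "html", "-o", output, *urls]
```

To actually build the book you would fetch the index page (for example with `urllib.request.urlopen`), pass its text to `collect_links`, and run the result of `pandoc_command` with `subprocess.run(..., check=True)`. If the site's index doesn't list chapters in order, you'd have to sort or hand-edit the URL list first.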
HTTrack is a good option for downloading the whole website to your computer. It is available for download from https://www.httrack.com/. HTTrack "allows you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site's relative link-structure."
You can then convert the HTML into an EPUB, AZW3 or PDF using Calibre, or any other HTML-to-EPUB conversion software.
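If you'd rather script the mirror-then-convert steps, something like the following might work as a starting point. It only builds the two command lines, one for the `httrack` CLI (with `-O` for the output directory) and one for Calibre's `ebook-convert` tool. The function names, the `mirror` directory and the `book.epub` output name are placeholders for this example, and it assumes HTTrack leaves a top-level `index.html` in the mirror directory that links to the downloaded pages.

```python
from pathlib import Path


def httrack_command(site_url, mirror_dir):
    """Build an HTTrack call; -O sets the local mirror directory."""
    return ["httrack", site_url, "-O", str(mirror_dir)]


def ebook_convert_command(entry_page, output_file):
    """Build a Calibre ebook-convert call: input file first, then output file."""
    return ["ebook-convert", str(entry_page), str(output_file)]


def mirror_and_convert_commands(site_url, mirror_dir="mirror", output_file="book.epub"):
    """Return the two commands to run in order: mirror the site, then convert it."""
    # Assumption: the mirror directory contains a top-level index.html
    entry_page = Path(mirror_dir) / "index.html"
    return [
        httrack_command(site_url, mirror_dir),
        ebook_convert_command(entry_page, output_file),
    ]
```

Each command can then be run in turn with `subprocess.run(cmd, check=True)`. Changing the output extension (for example to `.azw3` or `.pdf`) should be enough to get the other formats, since `ebook-convert` picks the format from the file name.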
A second option, which converts directly to EPUB, is EpubPress (http://epub.press).
It has extensions to allow use from Firefox (v44.0+ only) or Chrome.
To use this software you need to open a browser window. Each tab is essentially a 'chapter' in your ebook. Arrange the tabs in the desired order of appearance, then activate EpubPress - it will download the tabs and arrange them in that order, in .epub format. Hope this helps!
*However, note that EpubPress downloads discrete webpages - not a 'website', as HTTrack does. To download a website with EpubPress you must open each link on the website as a separate tab, then use EpubPress to collect these pages into .epub format.
You can use https://getpocket.com and the pocket recipe in calibre accessible via the "Fetch news" menu.
