samscudder/md2epub


MD2Epub

Epub files can be read on any reader, and the format is reversible: you can extract the source files from the .epub and edit them manually if necessary. All current ebook readers can open Epub files, although for Kindle they need to be converted. I recommend using the "Send to Kindle" option for that instead of third-party conversion software.
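The "reversible" part works because an .epub is an ordinary ZIP archive. Here is a minimal sketch of unpacking one with the Python standard library. The function name `extract_epub` and the paths in the usage comment are made up for illustration and are not part of md2epub.

```python
import zipfile


def extract_epub(epub_path: str, out_dir: str) -> list[str]:
    """Unpack an .epub (a ZIP archive) into out_dir and return its member names."""
    with zipfile.ZipFile(epub_path) as archive:
        archive.extractall(out_dir)  # creates out_dir if it does not exist
        return archive.namelist()


# Usage (hypothetical file names):
#   names = extract_epub("my_file.epub", "my_file_unpacked")
```

If you zip the folder back up after editing, keep in mind that the Epub container format expects the `mimetype` file to be the first entry in the archive and stored uncompressed.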

Creating the structure and standard content files for an Epub can be tedious, time-consuming work. This script is meant to ease that process by taking a markdown file and converting it to an Epub 3 compatible file. It's written in Python, so you need to have Python installed to run it.

I use Typora, a WYSIWYG markdown editor, to create the markdown content, and then use the script to convert it into an Epub. These instructions were written with Typora in mind.

The formatting is defined by a .css file. The cover and other illustrations are added automatically.

It is possible to produce invalid content inadvertently, so I recommend running the output through EpubCheck or another validator.

Markdown format is a little different from other formatting languages, in that paragraphs are separated by a blank line.

The script was developed using spec-driven development with Kiro AI.

YAML Front Matter

The converter uses the YAML front matter to define metadata for the Epub file. The front matter starts with a --- on the first line, and continues as key: value pairs until the second ---.

Here is a sample:

---
identifier: galaxy_v01n02_b003
title: Galaxy Science Fiction V01 N02
creator: H. L. Gold Ed.
cover: galaxy_v01n02.jpg
css: galaxy.css
language: en-US
publisher: World Editions, Inc.
subject: Science Fiction magazine
description: Galaxy Science Fiction - November 1950
---

Identifier is the internal book ID that differentiates it from other books. I normally make one up from the title of the book, using underscores instead of spaces, and add a suffix such as _b000 with a distinct build number, which I bump for each version. (I've had problems on the Kindle when permanently deleting old versions from my devices and later sending the book again with the same ID.)

The rest of the front matter is kind of self-explanatory.
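To make the front matter layout concrete, here is a minimal sketch of reading it as plain key: value pairs. This is not the code md2epub uses. The function name `parse_front_matter` is made up for illustration, and it only handles the flat, one-line-per-key form shown in the sample rather than full YAML.

```python
def parse_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a document into (metadata, body) using flat key: value front matter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter present
    metadata: dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing '---' is the book content.
            return metadata, "\n".join(lines[index + 1:])
        if ":" in line:
            key, value = line.split(":", 1)  # split on the first colon only
            metadata[key.strip()] = value.strip()
    raise ValueError("front matter is not closed by a second '---' line")


sample = """---
identifier: galaxy_v01n02_b003
title: Galaxy Science Fiction V01 N02
cover: galaxy_v01n02.jpg
---
First paragraph of the book.
"""
meta, body = parse_front_matter(sample)
print(meta["identifier"])  # galaxy_v01n02_b003
```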

To build the Epub, place all the files in the same folder and run:

python md2epub.py my_file.md

Designating Chapters

A horizontal line (---) specifies a new file inside the epub. The first line after a horizontal line specifies the file name. So if you want the file to be called "chapter_01", just add this:

---
chapter_01
... chapter content ...

The table of contents (TOC) is generated automatically, in the order the files appear in the markdown file. If you want a different contents page inside the Epub or (as in some pulps) a contents page in a different order, add a manual contents page where you want it to appear in the Epub.
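The chapter convention can be sketched in a few lines of Python. This is an illustration of the rule described above, not md2epub's own code. The function name `split_chapters` is hypothetical, and it assumes the front matter has already been removed from the text.

```python
import re


def split_chapters(body: str) -> list[tuple[str, str]]:
    """Return (file_name, content) pairs, one per section introduced by a '---' line."""
    chapters = []
    # Split on lines that contain only '---'; text before the first rule is ignored here.
    sections = re.split(r"^---[ \t]*$", body, flags=re.MULTILINE)[1:]
    for section in sections:
        lines = section.strip("\n").split("\n")
        name = lines[0].strip()  # first line after the rule is the file name
        if not name:
            continue  # skip a rule with nothing after it
        chapters.append((name, "\n".join(lines[1:]).strip()))
    return chapters


book = "---\nchapter_01\n# One\n\nText.\n---\nchapter_02\n# Two\n"
for name, content in split_chapters(book):
    print(name)
```

The list keeps the order in which the sections appear, which is the same order the automatic table of contents is described as using.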

To add a link, use the standard markdown link syntax:

[...AND IT COMES OUT HERE](and_it_comes_out_here)

This will generate a link to a chapter called and_it_comes_out_here. The converter will attempt to extract the chapter title from the first Heading 1 (# ) in the file. The toc.ncx file will be generated in the order the markdown is set.
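As a rough sketch of the title lookup, the first Heading 1 of a chapter can be found with a regular expression. The function name `chapter_title` and the fallback argument are assumptions for illustration. What md2epub does when a chapter has no Heading 1 is not stated here.

```python
import re


def chapter_title(content: str, fallback: str) -> str:
    """Return the text of the first '# ' heading, or fallback if there is none."""
    # '^# +' matches a level-1 heading only; '## ' and deeper headings do not match.
    match = re.search(r"^# +(.+?)\s*$", content, flags=re.MULTILINE)
    return match.group(1) if match else fallback


text = "## Teaser\n# ...AND IT COMES OUT HERE\n\nThe story starts."
print(chapter_title(text, "and_it_comes_out_here"))  # ...AND IT COMES OUT HERE
```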

Formatting

Italics, bold and headings are all converted to HTML. The & character is converted to &amp;amp; so DON’T write the entity out yourself. Table support should work, but I haven't used it extensively.
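The reason for not writing the entity yourself is double escaping. The standard library's `html.escape` shows the effect. Whether md2epub uses this particular function is an assumption, but any escaping step behaves the same way.

```python
from html import escape

# A plain ampersand is escaped once, which is what you want.
print(escape("Fact & Fiction", quote=False))      # Fact &amp; Fiction

# An entity typed by hand gets escaped again and shows up literally in the book.
print(escape("Fact &amp; Fiction", quote=False))  # Fact &amp;amp; Fiction
```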

Quotes are converted to smart quotes (‘ ’ and “ ”) if they aren't already.

Soft paragraph breaks (single line feeds) are converted to html line breaks.

Markdown has special uses for asterisks (bold, italic, lists, etc.), so to add a literal asterisk to the text, I normally use the Unicode Asterisk Operator (∗), U+2217.

I started using this as a quick way to make Epub files from pulp magazines, and most of the formatting for "pulp-style" content is created automatically. Pulps have a couple of quirks. A "scene" break always starts with the first word (or first few words) capitalized. The converter automatically adds a "dropcaps3" style to the first character of the first capitalized paragraph in a file, and "dropcaps" to the first character of all other occurrences, to match the original formatting:

[Screenshot: markdown source (image-20260417095838292)]

converts to something like this:

[Screenshot: converted output (image-20260417100003934)]

TIP: since you CAN have capitalized paragraphs that you DON'T want drop caps on, like the second one above, just add a space as the first character (eagle-eyed readers might have noticed that in the markdown above). The space will be suppressed in the xhtml file.
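The drop caps rule, including the leading-space opt-out, can be sketched as follows. This is an illustration and not md2epub's code. The function name `assign_drop_caps` is made up, and the test for a "capitalized" paragraph (a first word of at least two upper-case letters) is an assumption.

```python
import re
from typing import Optional

# Assumption: "capitalized" means the first word is two or more upper-case letters.
_CAPS_START = re.compile(r"^[A-Z][A-Z']+\b")


def assign_drop_caps(paragraphs: list[str]) -> list[tuple[Optional[str], str]]:
    """Pair each paragraph of one file with 'dropcaps3', 'dropcaps' or None."""
    result = []
    seen_first = False
    for paragraph in paragraphs:
        if paragraph.startswith(" "):
            # A leading space opts out of drop caps and is removed from the output.
            result.append((None, paragraph.lstrip(" ")))
        elif _CAPS_START.match(paragraph):
            style = "dropcaps" if seen_first else "dropcaps3"
            seen_first = True
            result.append((style, paragraph))
        else:
            result.append((None, paragraph))
    return result


styled = assign_drop_caps(
    ["THE ship landed.", "Nothing moved.", "HE waited.", " NASA called."]
)
for style, text in styled:
    print(style, "|", text)
```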

Images

Image support is basic. In markdown, the format is like this:

![alt text](filename)

The file should be in the same folder as the markdown, and not have any folder information in the filename.
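Before running a conversion, you could check a markdown file for image references that break this rule. The sketch below is a separate helper and not part of md2epub. The name `images_with_folders` is made up for illustration.

```python
import re

# Matches ![alt text](target) where the target contains no spaces.
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def images_with_folders(markdown: str) -> list[str]:
    """Return image targets that include folder information."""
    targets = [match.group(2) for match in IMAGE_PATTERN.finditer(markdown)]
    return [target for target in targets if "/" in target or "\\" in target]


text = "![cover](galaxy_v01n02.jpg)\n\n![figure](images/fig1.png)"
print(images_with_folders(text))  # ['images/fig1.png']
```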

Manually adding a style

You can manually define a style for a paragraph by placing a . and a style name on the first line of a paragraph, and adding a single line break before the paragraph content.

This is the first paragraph.

This is the second paragraph.

.right
This is the third paragraph, with the style "right" applied to it.

.center.thin
This is the fourth paragraph, with the styles "center" and "thin" applied to it.

In Typora, you enter the style and then insert a soft break with Shift+Enter, so it looks like this:

[Screenshot: style line followed by a soft break in Typora (image-20260417095556170)]
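Here is a minimal sketch of how a style line could be separated from the paragraph text. It is not md2epub's implementation. The function name `split_style_line` and the exact HTML printed at the end are assumptions for illustration.

```python
def split_style_line(paragraph: str) -> tuple[list[str], str]:
    """Return (style_names, text); style_names is empty when there is no style line."""
    first, _, rest = paragraph.partition("\n")
    candidate = first.strip()
    # A style line starts with '.', has no spaces, and is followed by the content.
    if candidate.startswith(".") and rest and " " not in candidate:
        names = [name for name in candidate.split(".") if name]
        if names:
            return names, rest
    return [], paragraph


names, text = split_style_line(".center.thin\nThis is the fourth paragraph.")
class_attr = " ".join(names)
print(f'<p class="{class_attr}">{text}</p>')
# <p class="center thin">This is the fourth paragraph.</p>
```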

Also, the body tag of each file has a style with the same name as the file, which allows different styling per chapter if you need it. With this you can style your content in practically any way you wish.

Here's a list of the styles that I have inside the galaxy.css file and a quick explanation of what they do. I normally have these base styles in any .css file I make up for Epubs. I configure the p tag to have no spacing between paragraphs, and to indent the first line by 1em.

| Style | Explanation |
| --- | --- |
| .fl | First line. A 1em space before the paragraph, and no indentation on the first line. |
| .center | Center the text in the paragraph. |
| .right | Right align the text. |
| .dropcaps3 | 3-line high drop caps, automatically set in the first paragraph that starts with capitalized first words. |
| .dropcaps | 2-line high drop caps, automatically set in subsequent paragraphs that start with capitalized first words. |
| .story-intro | Paragraph in bold, sans-serif type, with a 1em margin on each side. Justified. |
| .thin | 2em margin on the left and right of text. |
| .space | A 1em space before a paragraph. (Some texts don't remove the indentation after a scene break.) |
| .revindent | Reverse indentation. The first line is flush left, and the other lines are indented 1em. |
| .noindent | No indentation in the first line. |
| .smallcaps | All text in the paragraph will be in small caps. |
| .dinkus | Centered, with extra spacing between letters. Yes... the * * * is called a dinkus. |

I've added sample files in the test-data folder, for Astounding Stories, Planet Stories and Galaxy Science Fiction, along with the CSS and some of the image files extracted from pulp scans available on the Internet Archive.

The CSS can contain references to custom font files, which are also included in the Epub. The flash.ttf in planet_stories.css is an example, although the actual .TTF file isn’t here for copyright reasons.

You can take a look at the files to get an idea of what can be done.
