41

Is there an easy way to transform HTML into Markdown with Java?

I am currently using the Java MarkdownJ library to transform Markdown to HTML.

import com.petebevin.markdown.MarkdownProcessor;
...
public static String getHTML(String markdown) {
 MarkdownProcessor markdown_processor = new MarkdownProcessor();
 return markdown_processor.markdown(markdown);
}
public static String getMarkdown(String html) {
/* TODO Ask stackoverflow */
}
Riduidel
22.4k15 gold badges91 silver badges194 bronze badges
asked Sep 12, 2008 at 17:23

5 Answers

12

There is a great JavaScript library called Turndown, which you can try online here. It handles HTML that makes the library in the accepted answer throw errors.

I needed this in Java (as in the question), so I ported it. The Java library is called CopyDown. It has the same test suite as Turndown, and I've tried it on real examples for which the accepted answer threw errors.

To install with Gradle:

dependencies {
 // Older Gradle versions used 'compile', which was removed in Gradle 7
 implementation 'io.github.furstenheim:copy_down:1.0'
}

Then to use it:

import io.github.furstenheim.CopyDown;
...
CopyDown converter = new CopyDown();
String myHtml = "<h1>Some title</h1><div>Some html<p>Another paragraph</p></div>";
String markdown = converter.convert(myHtml);
System.out.println(markdown);
// Output: Some title\n==========\n\nSome html\n\nAnother paragraph\n

P.S. It is MIT licensed.

brainer
1251 silver badge7 bronze badges
answered May 30, 2020 at 16:30

10

There is a Java library called flexmark that has this feature. Maven dependency:

<dependency>
 <groupId>com.vladsch.flexmark</groupId>
 <artifactId>flexmark-html2md-converter</artifactId>
 <version>0.64.0</version>
</dependency>

Using the class com.vladsch.flexmark.html2md.converter.FlexmarkHtmlConverter you can convert an HTML String to a Markdown String in one line like this:

String md = FlexmarkHtmlConverter.builder().build().convert(html);
answered May 6, 2022 at 9:41

2 Comments

Worked like a charm, replaced my old code that was using 'Remark'!
I added the above dependency, but I see an error in my bundle: "Imported package: Cannot be resolved".
3

I am working on the same issue and experimenting with a couple of different techniques.

The answer above could work. You could use the jTidy library to do the initial cleanup and convert the HTML to XHTML, then apply the XSLT stylesheet linked above.
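As a rough illustration of the XHTML-plus-XSLT idea, here is a minimal sketch that uses only the XSLT support built into the JDK. The stylesheet is a tiny hypothetical one written for this example (it is not the stylesheet linked above) and covers only h1, h2, p, em, strong and a. It also assumes the input is already well-formed XHTML with no XML namespace, so the jTidy cleanup step is left out. The class name XhtmlToMarkdown is likewise made up.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XhtmlToMarkdown {
    // Hypothetical, deliberately tiny stylesheet: a real one needs many more rules
    // (lists, code, images, escaping) and namespace-aware match patterns.
    private static final String STYLESHEET = String.join("\n",
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">",
        "<xsl:output method=\"text\"/>",
        // Drop whitespace-only text between block-level containers
        "<xsl:strip-space elements=\"html body div\"/>",
        "<xsl:template match=\"h1\"># <xsl:apply-templates/><xsl:text>&#10;&#10;</xsl:text></xsl:template>",
        "<xsl:template match=\"h2\">## <xsl:apply-templates/><xsl:text>&#10;&#10;</xsl:text></xsl:template>",
        "<xsl:template match=\"p\"><xsl:apply-templates/><xsl:text>&#10;&#10;</xsl:text></xsl:template>",
        "<xsl:template match=\"em\">*<xsl:apply-templates/>*</xsl:template>",
        "<xsl:template match=\"strong\">**<xsl:apply-templates/>**</xsl:template>",
        "<xsl:template match=\"a\">[<xsl:apply-templates/>](<xsl:value-of select=\"@href\"/>)</xsl:template>",
        "</xsl:stylesheet>");

    // Applies the stylesheet to a well-formed XHTML string and returns the text output.
    static String convert(String xhtml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xhtml = "<html><body><h1>Some title</h1>"
            + "<p>Some <em>html</em> with a <a href=\"http://example.com\">link</a>.</p>"
            + "</body></html>";
        System.out.print(convert(xhtml));
    }
}
```

Elements without a rule fall through to XSLT's built-in templates, which keep their text and drop the markup, so unknown tags degrade to plain text rather than failing.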

Unfortunately, there is no library with a one-stop function to do this in Java. You could try running the Python script html2text under Jython, but I haven't tried this yet!

answered Oct 7, 2008 at 12:50


2

If you are using the WMD editor and want to get the Markdown source on the server side, set these options before loading the wmd.js script:

wmd_options = {
 // format sent to the server. can also be "HTML"
 output: "Markdown",
 // line wrapping length for lists, blockquotes, etc.
 lineLength: 40,
 // toolbar buttons. Undo and redo get appended automatically.
 buttons: "bold italic | link blockquote code image | ol ul heading hr",
 // option to automatically add WMD to the first textarea found.
 autostart: true
};
answered Apr 12, 2009 at 0:36


1

There is a Haskell library called pandoc that can convert between most markup formats.
Although it is not a Java library, it can be used through its CLI in Java.

You can get and install the latest version from here. Read the getting started guides here.

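import java.io.File;
...
var command = "pandoc --to=markdown_strict --output=result.md input.html";
var pandoc = new ProcessBuilder()
 .command(command.split(" "))
 .directory(new File(".")) // Working directory
 .inheritIO() // Show pandoc's own error messages on the console
 .start();
int exitCode = pandoc.waitFor(); // A non-zero exit code means the conversion failed
// On success, result.md is created in the working directory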
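If the HTML is already in memory, a variation on the snippet above is to pipe it through pandoc's standard input and read the Markdown from its standard output instead of going through files. The following is a minimal sketch under a few assumptions: pandoc is installed and on the PATH, the document is small enough that writing all the input before reading the output is safe, and the class and method names (PandocPipe, command, htmlToMarkdown) are made up for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

class PandocPipe {
    // Builds the pandoc invocation. With no input file and no --output,
    // pandoc reads from stdin and writes to stdout.
    static List<String> command(String targetFormat) {
        return List.of("pandoc", "--from=html", "--to=" + targetFormat);
    }

    // Sends the HTML to pandoc and returns the Markdown it prints.
    static String htmlToMarkdown(String html) throws IOException, InterruptedException {
        Process pandoc = new ProcessBuilder(command("markdown_strict"))
            .redirectError(ProcessBuilder.Redirect.INHERIT) // pandoc's errors go to our stderr
            .start();
        // Closing the stream signals end of input to pandoc
        try (OutputStream stdin = pandoc.getOutputStream()) {
            stdin.write(html.getBytes(StandardCharsets.UTF_8));
        }
        String markdown = new String(pandoc.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = pandoc.waitFor();
        if (exitCode != 0) {
            throw new IOException("pandoc exited with code " + exitCode);
        }
        return markdown;
    }
}
```

Calling PandocPipe.htmlToMarkdown("<h1>Some title</h1>") would then return the converted text, or throw an IOException if pandoc cannot be started or reports a failure.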

This tool can also be used in GitHub Actions workflows.

answered Dec 3, 2021 at 9:33
