Releases: dotcommander/defuddle

v0.7.3

10 Jun 23:56
@garyblankenship garyblankenship

[v0.7.3] - 2026-06-10

Fixed

  • Sanitize site-specific extractor output before returning Result.Content, matching the generic parser sanitizer path.
  • Honor ProcessCode, ProcessImages, ProcessHeadings, ProcessMath, ProcessFootnotes, and ProcessRoles options during standardization.
  • Cap ParseFromURL response reads before buffering the body, returning ErrTooLarge for oversized responses.
  • Return structured ErrHTTPStatus / HTTPStatusError for non-2xx URL fetches instead of parsing error pages.
  • Resolve implicit metadata URLs against the final redirect target while preserving an explicit caller-supplied Options.URL.
  • Sync selected upstream parser fixes from kepano/defuddle: ChatGPT split assistant messages, YouTube JSON-LD video metadata selection, markdown link destinations with spaces, and weekday-aware byline cleanup.

Changed

  • task verify now runs govulncheck ./... through the new task vuln gate.

v0.7.2

29 May 21:20
@garyblankenship garyblankenship

v0.7.2

Fixed

  • fix(extractors/grok): extract body inner HTML instead of full document wrapper

Changed

  • refactor(scoring): single-pass anchor metrics in scoreNonContentBlock

v0.7.1

25 May 01:46
@garyblankenship garyblankenship

Defuddle Go v0.7.1

Web content extraction library and CLI tool for Go.

📦 Installation

Download Pre-built Binaries

Download the appropriate binary for your platform from the assets below.

Install with Go

go install github.com/dotcommander/defuddle/cmd/defuddle@v0.7.1

Install from Source

git clone https://github.com/dotcommander/defuddle.git
cd defuddle
make build-cli

Changelog

Bug fixes

  • 6a60544: fix(lint): errcheck on test pipes and tidy go.mod

Others

  • 1223a0a: chore(taskfile): gate tag target on verify
  • 0bb9f37: test(cli): add RunE integration tests for all subcommands

🔍 Usage Examples

# Extract content from URL
defuddle parse https://example.com/article
# Convert to markdown
defuddle parse https://example.com/article --markdown
# Get JSON output with metadata
defuddle parse https://example.com/article --json
# Extract specific property
defuddle parse https://example.com/article --property title

Full Changelog: v0.7.0...v0.7.1

v0.6.0

25 Apr 18:34
@garyblankenship garyblankenship

Full Changelog: v0.5.3...v0.6.0

v0.5.3

25 Apr 18:05
@garyblankenship garyblankenship

Full Changelog: v0.5.2...v0.5.3

v0.5.2

25 Apr 17:36
@garyblankenship garyblankenship

Defuddle Go v0.5.2

Web content extraction library and CLI tool for Go.

📦 Installation

Download Pre-built Binaries

Download the appropriate binary for your platform from the assets below.

Install with Go

go install github.com/dotcommander/defuddle/cmd/defuddle@v0.5.2

Install from Source

git clone https://github.com/dotcommander/defuddle.git
cd defuddle
make build-cli

Changelog

Bug fixes

  • c748dab: fix(removals): subdomain-aware same-site hostname matching
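
As a rough illustration of what subdomain-aware same-site matching can look like, the sketch below treats two hostnames as the same site when they are equal or when one is a subdomain of the other. The function name sameSite and the exact rules are assumptions for illustration; the commit's actual logic is not shown in these notes.

```go
package main

import (
	"fmt"
	"strings"
)

// sameSite (hypothetical helper) reports whether two hostnames are equal
// or one is a subdomain of the other. Comparison is case-insensitive and
// ignores a trailing dot.
func sameSite(a, b string) bool {
	a = strings.ToLower(strings.TrimSuffix(a, "."))
	b = strings.ToLower(strings.TrimSuffix(b, "."))
	if a == "" || b == "" {
		return false
	}
	if a == b {
		return true
	}
	// Require a dot boundary so "notexample.com" does not match "example.com".
	return strings.HasSuffix(a, "."+b) || strings.HasSuffix(b, "."+a)
}

func main() {
	fmt.Println(sameSite("blog.example.com", "example.com")) // true
	fmt.Println(sameSite("notexample.com", "example.com"))   // false
}
```

This sketch would also treat a bare suffix such as "com" as the same site as "example.com"; a stricter variant could compare registrable domains using golang.org/x/net/publicsuffix.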

Performance improvements

  • 6ee7089: perf: pre-compiled CSS selectors, regex fast-path, and avoid re-parse on word count
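
The general idea behind pre-compilation and a regex fast path can be sketched as below: compile the pattern once at package level, and skip the regex engine entirely when a cheap substring check already rules out a match. The pattern and the function findISODate are hypothetical examples, not code from the commit.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// isoDateRe is compiled once at package initialization instead of on every call.
var isoDateRe = regexp.MustCompile(`\b\d{4}-\d{2}-\d{2}\b`)

// findISODate (hypothetical helper) returns the first YYYY-MM-DD date in s,
// or "" if there is none.
func findISODate(s string) string {
	// Fast path: a date needs a hyphen, so inputs without one
	// cannot match and never reach the regex engine.
	if !strings.Contains(s, "-") {
		return ""
	}
	return isoDateRe.FindString(s)
}

func main() {
	fmt.Println(findISODate("Published 2026-06-10 by the editor")) // 2026-06-10
	fmt.Println(findISODate("no date here") == "")                 // true
}
```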

🔍 Usage Examples

# Extract content from URL
defuddle parse https://example.com/article
# Convert to markdown
defuddle parse https://example.com/article --markdown
# Get JSON output with metadata
defuddle parse https://example.com/article --json
# Extract specific property
defuddle parse https://example.com/article --property title

Full Changelog: v0.5.1...v0.5.2

v0.5.1

22 Apr 20:49
@garyblankenship garyblankenship

Defuddle Go v0.5.1

Web content extraction library and CLI tool for Go.

📦 Installation

Download Pre-built Binaries

Download the appropriate binary for your platform from the assets below.

Install with Go

go install github.com/dotcommander/defuddle/cmd/defuddle@v0.5.1

Install from Source

git clone https://github.com/dotcommander/defuddle.git
cd defuddle
make build-cli

Changelog

Refactors

  • e1191d1: refactor(extractors): split registry.go into per-category files

🔍 Usage Examples

# Extract content from URL
defuddle parse https://example.com/article
# Convert to markdown
defuddle parse https://example.com/article --markdown
# Get JSON output with metadata
defuddle parse https://example.com/article --json
# Extract specific property
defuddle parse https://example.com/article --property title

Full Changelog: v0.5.0...v0.5.1

v0.5.0

22 Apr 20:25
@garyblankenship garyblankenship

Defuddle Go v0.5.0

Web content extraction library and CLI tool for Go.

📦 Installation

Download Pre-built Binaries

Download the appropriate binary for your platform from the assets below.

Install with Go

go install github.com/dotcommander/defuddle/cmd/defuddle@v0.5.0

Install from Source

git clone https://github.com/dotcommander/defuddle.git
cd defuddle
make build-cli

Changelog

Features

  • 8521924: feat(extractors): port leetcode, discourse, linkedin — complete upstream parity
  • 43bff00: feat(extractors): port lwn, c2_wiki, x_oembed (Batch B)
  • e17513b: feat(extractors): port wikipedia, medium, nytimes (Batch A)

Others

  • e837e74: chore(scripts): add upstream extractor sync checker

🔍 Usage Examples

# Extract content from URL
defuddle parse https://example.com/article
# Convert to markdown
defuddle parse https://example.com/article --markdown
# Get JSON output with metadata
defuddle parse https://example.com/article --json
# Extract specific property
defuddle parse https://example.com/article --property title

Full Changelog: v0.4.0...v0.5.0

v0.4.0

22 Apr 19:45
@garyblankenship garyblankenship

Defuddle Go v0.4.0

Web content extraction library and CLI tool for Go.

📦 Installation

Download Pre-built Binaries

Download the appropriate binary for your platform from the assets below.

Install with Go

go install github.com/dotcommander/defuddle/cmd/defuddle@v0.4.0

Install from Source

git clone https://github.com/dotcommander/defuddle.git
cd defuddle
make build-cli

Changelog

Features

  • b816f48: feat(cli): accept piped HTML in parse command
  • 70f6a46: feat(extractors): add Bluesky thread and quoted-post extraction
  • 22b4e28: feat(extractors): add Mastodon federation support with shared comment helpers
  • 5061a60: feat(extractors): add Threads extractor with dual DOM+JSON paths

Bug fixes

  • 8e6f13b: fix(cli): resolve err113 and errcheck lint findings
  • a8456fe: fix: prepare repository for public release

Refactors

  • 2291b7c: refactor(cli): extract loadResult/renderOutput, replace property switch with map, delete unused helpers
  • 391d389: refactor(cli): replace go-json-experiment with stdlib encoding/json
  • 92f8bf3: refactor(defuddle): decompose parseInternal, retry ladder as data, extract isProtectedNode
  • 5882003: refactor(extractors): DRY conversation extractors (shared title helpers, fallback selectors, single ExtractMessages pass)
  • d62ab88: refactor(scoring): extract ScoreElement sub-functions, hoist magic numbers to const
  • 79f4593: refactor: Go 1.24+ modernization (SplitSeq, new(bool), slices.Contains)
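
The "replace property switch with map" refactor mentioned above follows a common Go pattern: a lookup table of accessor functions instead of a growing switch statement. The sketch below is illustrative only; the Result fields and the property names other than title are assumptions, not the CLI's actual set.

```go
package main

import "fmt"

// Result is a hypothetical subset of an extraction result.
type Result struct {
	Title   string
	Author  string
	Content string
}

// propertyGetters maps a property name to its accessor. Adding a property
// means adding one entry here rather than another switch case.
var propertyGetters = map[string]func(Result) string{
	"title":   func(r Result) string { return r.Title },
	"author":  func(r Result) string { return r.Author },
	"content": func(r Result) string { return r.Content },
}

// property (hypothetical helper) looks up name and reports whether it is known.
func property(r Result, name string) (string, bool) {
	get, ok := propertyGetters[name]
	if !ok {
		return "", false
	}
	return get(r), true
}

func main() {
	r := Result{Title: "Example Article", Author: "A. Writer"}
	v, ok := property(r, "title")
	fmt.Println(v, ok) // Example Article true
	_, ok = property(r, "unknown")
	fmt.Println(ok) // false
}
```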

Others

  • faeba9f: chore(ci): bump actions/upload-artifact v4 → v7
  • 3b1697f: chore(deps): bump go-task/setup-task from 1 to 2
  • 6b94b3b: chore(deps): bump golang.org/x/net from 0.52.0 to 0.53.0
  • 0612afc: style(cli): align errNoURLs with sibling error conventions
  • 6311269: test(markdown): add golden file harness for 22 custom renderers

🔍 Usage Examples

# Extract content from URL
defuddle parse https://example.com/article
# Convert to markdown
defuddle parse https://example.com/article --markdown
# Get JSON output with metadata
defuddle parse https://example.com/article --json
# Extract specific property
defuddle parse https://example.com/article --property title

Full Changelog: v0.3.1...v0.4.0

v0.3.1

06 Apr 17:14
@garyblankenship garyblankenship

Defuddle Go v0.3.1

Web content extraction library and CLI tool for Go.

📦 Installation

Download Pre-built Binaries

Download the appropriate binary for your platform from the assets below.

Install with Go

go install github.com/dotcommander/defuddle/cmd/defuddle@v0.3.1

Install from Source

git clone https://github.com/dotcommander/defuddle.git
cd defuddle
make build-cli

Changelog

Bug fixes

  • 4736085: fix: rename test fixtures to avoid colons in file paths

🔍 Usage Examples

# Extract content from URL
defuddle parse https://example.com/article
# Convert to markdown
defuddle parse https://example.com/article --markdown
# Get JSON output with metadata
defuddle parse https://example.com/article --json
# Extract specific property
defuddle parse https://example.com/article --property title

Full Changelog: v0.3.0...v0.3.1
