URL.to_uri() applies IDNA encoding to IPv6 literal hosts #188

Open
@mvaught02

Description

When calling URL.to_uri() on a URL that uses a plain IPv6 literal host (e.g. http://[::1]:PORT/...), hyperlink attempts to IDNA-encode the host. This is incorrect and results in an exception from idna, since IPv6 literals are not DNS labels and must not undergo IDNA processing.

This is the same root cause as #68, which appears to have gone stale.


Reproduction

from hyperlink import EncodedURL
url = EncodedURL.from_text("http://[::1]:60993/api/v1/token")
url.to_uri().to_text().encode("ascii")

Actual result

idna.core.InvalidCodepoint: Codepoint U+003A ':' not allowed

This occurs because idna.encode("::1") is invoked.


Expected result

The URL should round-trip successfully:

"http://[::1]:60993/api/v1/token"

Why this is a bug

  • IDNA applies only to DNS labels, not IP literals
  • RFC 3986 explicitly allows IPv6 literal hosts
  • hyperlink already has parse_host() which distinguishes:
    • DNS names
    • IPv4 literals
    • IPv6 literals
  • Applying IDNA to IP literals is both unnecessary and invalid
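The distinction drawn in the list above can be illustrated without hyperlink at all. The following is a minimal sketch using only the standard library's `ipaddress` module; `classify_host` is a hypothetical helper written for this example and is not hyperlink's `parse_host()`.

```python
import ipaddress


def classify_host(host):
    """Return "ipv4", "ipv6", or "dns" for a bare host string (no brackets).

    Hypothetical helper for illustration only; not part of hyperlink.
    """
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat it as a DNS name (IDNA may apply).
        return "dns"
    return "ipv6" if addr.version == 6 else "ipv4"


for host in ("::1", "127.0.0.1", "example.com", "bücher.example"):
    print(host, "->", classify_host(host))
```

Only hosts that land in the "dns" bucket are candidates for IDNA processing; the other two are already in their final ASCII form.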

Suggested fix

In URL.to_uri(), detect IPv4/IPv6 literal hosts using parse_host() and skip IDNA encoding for those cases. IDNA should only be applied to non-IP (DNS) hosts.

Pseudo-logic:

family, host_text = parse_host(self.host)
if family is not None:
    # IPv4 / IPv6 literal: pass through unchanged
    host = host_text
else:
    # DNS name: apply IDNA encoding
    host = idna_encode(host_text, uts46=True).decode("ascii")
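For anyone who wants to try the idea outside hyperlink, here is a rough standalone sketch of the same guard. It makes two assumptions that differ from the real code: IP detection uses the standard library's `ipaddress` module rather than `parse_host()`, and DNS names are encoded with Python's built-in `"idna"` codec (IDNA 2003) rather than the `idna` package. `host_to_ascii` and `authority` are hypothetical names used only for this example.

```python
import ipaddress


def host_to_ascii(host):
    """Return the ASCII form of a host, leaving IP literals untouched.

    Hypothetical stand-in for the host handling in URL.to_uri().
    """
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # DNS name: encode with the built-in IDNA 2003 codec
        # (assumption: the real fix would keep using the idna package).
        return host.encode("idna").decode("ascii")
    # IPv4 / IPv6 literal: must not be IDNA-processed.
    return host


def authority(host, port):
    """Build "host:port", bracketing IPv6 literals as RFC 3986 requires."""
    ascii_host = host_to_ascii(host)
    if ":" in ascii_host:
        ascii_host = "[" + ascii_host + "]"
    return "%s:%d" % (ascii_host, port)


print(authority("::1", 60993))
print(authority("bücher.example", 443))
```

The IP check comes first so that an IPv6 literal never reaches the IDNA encoder, which is the behavior the suggested fix asks for.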

Regression test

def test_ipv6_literal_not_idna_encoded():
    from hyperlink import EncodedURL
    url = EncodedURL.from_text("http://[::1]:8080/path")
    assert url.to_uri().to_text() == "http://[::1]:8080/path"

Environment

  • hyperlink: current release / main
  • Python: 3.10+
  • idna: current
