-
-
Notifications
You must be signed in to change notification settings - Fork 9
feat: Add option to resolve relative links to absolute paths #26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Introduced `index_file_name` option in the plugin configuration with a default value of "index.md". - Refactored link handling in the Markdown generation process to convert relative links to absolute URLs.
- Introduced `absolute_link` option in the plugin configuration to control the conversion of relative links to absolute URLs.
Thanks a lot for the PR @AlphaBs!
Don't we always want to output absolute links? Is the option really useful here, if the "backward-compatible" behavior is actually a bug?
Also, I'm not sure to understand the purpose of the index_file_name option. Aren't index files always named index.md?
Thanks for the feedback!
-
For my use case, I always need absolute links. However, I set the default to false because I wasn't sure about other users' needs, even though I personally believe the default should be true. I'm happy to go with your opinion on this. I've never personally needed relative links, so I can't judge if the previous behavior was a required feature for someone else.
-
I created the
index_file_nameoption because I wasn't completely sure how mkdocs works internally. I added it just in case there was a configuration in mkdocs that might change the output filename (e.g., from README.md to something other than index.md). If mkdocs always converts README.md to index.md, then you are correct, and this option is unnecessary.
Let's go with always rendering absolute links then!
I created the index_file_name option because I wasn't completely sure how mkdocs works internally. I added it just in case there was a configuration in mkdocs that might change the output filename (e.g., from README.md to something other than index.md). If mkdocs always converts README.md to index.md, then you are correct, and this option is unnecessary.
I see where you come from. MkDocs will consider README.md to be an index page, but this index page will always be named index.html. From there, the equivalent Markdown page we build will always be named index.md. Well, I think. Let me check.
Yeah that's probably it:
path_md = Path(page.file.abs_dest_path).with_suffix(".md")
The absolute destination path would definitely be index.html even if the page is named readme.md.
So, you can remove both options and update the code accordingly 🙂
Thanks for the super fast answer!
Thanks for clarifying! I'll remove the absolute_link and index_file_name option and update the code to always generate absolute links right now.
Updated! Let me know if there's anything else you'd like me to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few suggestions and a request to revert the architectural changes (I prefer functions with parameters over methods conveniently depending on self).
Did you use AI by the way? Just curious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one seems a bit brittle? Any thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the ':' character wasn't allowed in paths, but I see it's possible on Linux. I'll need to think about a better approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should skip links that start with /. MkDocs will warn about those when the link is to an actual page, and will leave such links as they are. We should assume users know what they are doing and they want this exact link, and we shouldn't assume there's a corresponding llmstxt/markdown page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, that sounds like a better approach.
Did you use AI by the way? Just curious.
I used AI for writing the comments and the PR message, as I'm not a native English speaker. I wrote the code logic myself.
What do you think about separating the relative path conversion logic from plugin.py?
Currently, we're testing this logic within test_plugin.py, and it seems to make debugging difficult when tests fail, and the test code is becoming overly complicated.
We could thoroughly test the conversion logic in its own test_converter.py and leave only the very basic integration cases in test_plugin.py.
Uh oh!
There was an error while loading. Please reload this page.
Description
This PR introduces a new optional feature to resolve relative Markdown links (e.g.,
../assets/image.pngor./another-doc.md) into their full absolute paths during the conversion.Motivation
As mentioned in issue bug: Use of relative links within documentation generates incorrect links in
llms-full.txt#22 , links in the generated llms-full.txt were broken.LLM loses the original file's path context when processing the consolidated text. This can cause it to try and access incorrect or non-existent paths.
This new option ensures link integrity by converting all relative paths to absolute URLs (e.g.,
<base_url>/assets/image.png), making them accessible from anywhere and preserving the correct context for the LLM.Changes
New Config Options:
absolute_link: bool: Added a new option, defaulting to False for backward compatibility. When set to True, the absolute path conversion is activated.index_file_name: str: Added a new option, defaulting to index.md. This is used to correctly resolve directory links (e.g.,[link](./docs/)) to their corresponding index file (e.g.,[link](<base_uri>/docs/index.md)).Conversion Logic:
The logic runs before the final HTML is converted to Markdown. It parses the HTML, finds all tags, and reads their href attributes. If
absolute_link: true, it converts relative paths into absolute URLs using thebase_urland the page's directory.Example
mkdocs.yml
docs/page.md
llms.txt