add security note about accessing urls #1600

Closed
gregsdennis wants to merge 6 commits into main from gregsdennis/uri-vs-url-security-note

Conversation

Member

@gregsdennis gregsdennis commented Apr 26, 2025

What kind of change does this PR introduce?

clarification

Issue & Discussion References

Summary

Adds a security note about performing network operations when encountering URLs.

The last sentence in the addition was taken directly from @awwright's comment in the issue.

Does this PR introduce a breaking change?

no

Member

@jdesrosiers jdesrosiers left a comment

This doesn't mention any security considerations. It's a requirement that we made for security reasons, but it's not a security consideration itself.

We could talk about the security considerations that led to that decision, but that feels out-of-place to me. This section should be about things implementers need to consider and protect against. It's not supposed to be a place for us to justify decisions we made for security reasons.

Because this requirement is a "SHOULD" and not a "MUST", we could talk about the security considerations that implementers who chose to support that kind of retrieval need to be aware of. That's the only way I think this makes sense.

gregsdennis and dfgffcv reacted with thumbs up emoji
Comment on lines 1997 to 1998
the host system to various security vulnerabilities, such as man-in-the-middle
attacks or data leaks.
Member

@Relequestual Relequestual May 6, 2025

I don't want to sound alarmist, but RCEs are also a potential if there's the potential of bad parsing and malicious intent. I think MitM is a low risk, but a notable consideration.

How do you imagine data leaks might happen? By virtue of making a request to a URL from a system which should be invisible?

Member Author

@gregsdennis gregsdennis May 7, 2025

A misbehaving implementation with access to the internet could send your data to another server, unrequested. To avoid this we instruct implementations to not make network calls by default. Thus making use of the network is opt-in, suggesting that the user understands the risks.

I can add the RCE risk to the list.
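
To make the opt-in idea above concrete, here is a minimal sketch, not taken from the PR or from any real implementation. `SchemaRegistry`, `RetrievalDisabledError`, and the `allow_network` flag are hypothetical names chosen for illustration. A URI is treated as an identifier first; a network fetch is only attempted when the caller has explicitly enabled it, and (as a further assumption of this sketch) only over `https`.

```python
import json
from urllib.parse import urlsplit
from urllib.request import urlopen


class RetrievalDisabledError(Exception):
    """Raised when a schema URI is unknown and fetching it is not permitted."""


class SchemaRegistry:
    """Hypothetical registry that resolves schema URIs offline by default."""

    def __init__(self, allow_network: bool = False):
        # Offline is the default; the caller has to opt in explicitly.
        self.allow_network = allow_network
        self._schemas = {}

    def add(self, uri: str, schema: dict) -> None:
        """Register a schema under its URI so it can be resolved without I/O."""
        self._schemas[uri] = schema

    def resolve(self, uri: str) -> dict:
        # First treat the URI purely as an identifier.
        if uri in self._schemas:
            return self._schemas[uri]
        if not self.allow_network:
            raise RetrievalDisabledError(f"{uri} is not registered and network access is off")
        # Assumption of this sketch: even when opted in, only https is fetched.
        if urlsplit(uri).scheme != "https":
            raise RetrievalDisabledError(f"refusing to fetch non-https URI {uri}")
        with urlopen(uri, timeout=10) as response:
            schema = json.load(response)
        self._schemas[uri] = schema
        return schema
```

With this shape, `SchemaRegistry().resolve(...)` never touches the network unless the schema was registered beforehand or the registry was built with `allow_network=True`.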

Member

@jdesrosiers jdesrosiers May 12, 2025

I'm sure there are nuances that I'm not familiar with in this area, but I don't see any of these things as risks worth mentioning.

> A misbehaving implementation with access to the internet could send your data to another server, unrequested

I don't see how that's possible. We're talking about retrieving schemas over a network. Information is coming into the system, never out. The only data that could be leaked is what public schemas your network is accessing.

> I think MitM is a low risk, but a notable consideration.

I see MitM as essentially the same thing as data leakage. MitM is about covertly intercepting communications that are thought to be done privately. If you're retrieving a publicly available schema there's no need for MitM because the schema is already public. Again, the only information that could be exposed is which schemas you're accessing.

> RCEs are also a potential if there's the potential of bad parsing and malicious intent.

I'm not sure what you mean by this. It would need to be code sent by the attacker that gets executed by the implementation even though it isn't intended to be executed. I don't see how that's possible.

Member

Minor issue, but otherwise looks good. Thanks!

Co-authored-by: Ben Hutton <relequestual@gmail.com>
Member

@jdesrosiers jdesrosiers left a comment

Something that occurs to me is that the real risk is evaluating untrusted schemas. Retrieving a schema you control that is accessible only on a VPN is a safe practice. The risk only comes when on an untrusted network because it opens the possibility that an untrusted schema can get into your system.

I think this section should focus on specific risks of network/filesystem access related to untrusted schemas. For example, if a system accepts user schemas and one of those schemas has a filesystem reference, you don't want an untrusted schema trying to access your filesystem.

Once we've covered that, then we can simply say that accessing schemas over an untrusted network opens the possibility of unintentionally evaluating untrusted schemas due to malicious actors. I wouldn't even mention specific types of network attacks. I think that's out of scope.
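
As an illustration of the filesystem example above, the following is a rough sketch of a pre-check a system might run on a user-supplied schema before evaluating it. The function names and the `BLOCKED_SCHEMES` set are assumptions made for this example, not part of the specification or of any implementation discussed here. The walk is deliberately conservative: it also flags `$ref`-like strings inside `const`, `enum`, or `examples`, and it only sees the scheme written in the schema, so relative references still need checking after they are resolved against the base URI.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: untrusted schemas must not point at the local filesystem.
BLOCKED_SCHEMES = {"file"}


def iter_references(node):
    """Yield every string value stored under "$ref" or "$schema" anywhere in the document."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key in ("$ref", "$schema") and isinstance(value, str):
                yield value
            else:
                # Non-string values (e.g. a property literally named "$ref") are walked.
                yield from iter_references(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_references(item)


def find_blocked_references(schema):
    """Return the references whose URI scheme is in BLOCKED_SCHEMES."""
    return [
        ref
        for ref in iter_references(schema)
        if urlsplit(ref).scheme.lower() in BLOCKED_SCHEMES
    ]


untrusted = {
    "type": "object",
    "properties": {
        "name": {"$ref": "#/$defs/name"},
        "secret": {"anyOf": [{"$ref": "file:///etc/passwd"}]},
    },
    "$defs": {"name": {"type": "string"}},
}
blocked = find_blocked_references(untrusted)
```

A system accepting user schemas could refuse to evaluate any schema for which this list is non-empty, rather than trying to sanitize it.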

Member

FWIW this is what I note for security considerations in my implementation -- https://metacpan.org/pod/JSON::Schema::Modern#SECURITY-CONSIDERATIONS -- as regular expressions provide a potential vector for executing code or creating a DoS.

Member Author

@karenetheridge thank you. I notice that what you have is particularly focused on regular expressions, which are already included in the validation spec.

Member

> I notice that what you have is particularly focused on regular expressions

Yes, since I don't support fetching schemas from disk or the network, I think this is the only direct source of vulnerabilities that a user might not already be aware of.

I think the key to emphasize (and we can repeat it in a few places if relevant) is "do not trust schemas from external sources".

Member

@jdesrosiers jdesrosiers left a comment

I feel like this just duplicates what we say in the Schema References section:

> Implementations which can access the network SHOULD default to operating offline.

It doesn't need to be in both places.

Also, I still don't see this as a security consideration in itself. It's more of a recommendation/requirement to address a security consideration that goes unmentioned.

If you don't mind, I'd like to have a go at an alternate PR for this. I think I'm well positioned to write this up considering that I do support this kind of retrieval and have spent a good amount of effort thinking through the implications of that decision.

gregsdennis reacted with thumbs up emoji
Member Author

Go for it. I'm spinning wheels with this one.

jdesrosiers reacted with thumbs up emoji

Reviewers

@jdesrosiers requested changes

@Relequestual requested changes

Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

Security considerations should mention treating URIs as URLs (from $ref and $schema)