CodeQL custom model documentation improvements #18196

ajohnston9 started this conversation in General

I've been looking into the development of custom models for CodeQL, as our internal codebases use a wide variety of bespoke libraries not currently modelled by CodeQL. For clarity, here is a link to the relevant documentation for custom models for Java and Kotlin codebases.

Unfortunately, the currently available documentation for custom models, regardless of language, is significantly lacking, to the point that developing models is challenging, if not impossible, without significant research. I was hoping to get some clarity on a number of issues, including but not limited to:

  • Defining which values are supported by the different arguments. For example, sourceModel specifies a provenance parameter, but this parameter is not defined. Similarly, sinkModel specifies a kind variable, but I cannot find documentation on valid values for kind.
  • While sourceModel and sinkModel have relatively intuitive uses, summaryModel and neutralModel are not defined. No intuition is provided as to when these models should be defined or what impact having (or lacking) them has on the success of a query.
  • I cannot readily identify which models are natively supported within CodeQL and which would require new development. This information may be embedded within the code, but a cursory search did not yield results. Ideally, I would like this information documented somewhere clearly, as I would like to diff what open-source libraries are used within my codebase and the ones with existing support in CodeQL to identify which models would need to be contributed. Where appropriate, I would be happy to contribute these models back to the community.
  • I see some documentation that suggests integration with VSCode would enable auto-generation of models or simplify the process. We do not use VSCode internally, so ideally this support would be provided as a standalone script or integrated within the CodeQL binary. Are there any plans to provide this functionality? If not, what is the ideal way of developing a significant amount of models?

Apologies for the excess of questions, but I figured organizing this into a single thread is better for visibility than creating multiple interrelated threads.

Replies: 1 comment 1 reply

Hi

Thanks for your questions. I will try my best to answer them below:

For example, sourceModel specifies a provenance parameter, but this parameter is not defined.

The provenance column is mostly an internal thing, and should in your case always be set to "manual".
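
For illustration, a source model row in a data extension file might look like the sketch below. It follows the `java.net.Socket` example from the linked Java and Kotlin documentation. The column names in the comment are my reading of that page, so check them against the docs for your language. The last column is the provenance value discussed above.

```yaml
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sourceModel
    data:
      # package, type, subtypes, name, signature, ext, output, kind, provenance
      - ["java.net", "Socket", False, "getInputStream", "()", "", "ReturnValue", "remote", "manual"]
```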

Similarly, sinkModel specifies a kind variable, but I cannot find documentation on valid values for kind.

This is because the set of valid kinds is language-dependent. It is probably easiest to grep the existing models to see which kinds are supported. For example, in Java a sink model with the kind sql-injection is picked up by the SQL injection query.
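
As a sketch of how the kind column is used, the row below marks the first argument of `java.sql.Statement.execute(String)` as a `sql-injection` sink. It mirrors the sink example in the linked documentation. The column layout in the comment is an assumption based on that page, not a full specification.

```yaml
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sinkModel
    data:
      # package, type, subtypes, name, signature, ext, input, kind, provenance
      - ["java.sql", "Statement", True, "execute", "(String)", "", "Argument[0]", "sql-injection", "manual"]
```

Searching the existing model files for other values in the kind column should show which kinds the shipped queries recognize for a given language.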

While sourceModel and sinkModel have relatively intuitive uses, summaryModel and neutralModel are not defined.

There are some examples of summaryModel at https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-java-and-kotlin/#example-add-flow-through-the-concat-method and https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-java-and-kotlin/#example-add-flow-through-the-map-method, and an example of a neutralModel at https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-java-and-kotlin/#example-add-a-neutral-method.
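
For quick reference, here is roughly what those linked examples look like. As I read the docs, a summary model describes how data flows through a library method whose body is not analyzed, while a neutral model records that a method has no flow worth modelling. The rows below are taken from the `concat` and neutral-method examples on that page. Treat the column comments as my interpretation.

```yaml
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: summaryModel
    data:
      # package, type, subtypes, name, signature, ext, input, output, kind, provenance
      # Taint flows from the qualifier and from the argument to the return value.
      - ["java.lang", "String", False, "concat", "(String)", "", "Argument[this]", "ReturnValue", "taint", "manual"]
      - ["java.lang", "String", False, "concat", "(String)", "", "Argument[0]", "ReturnValue", "taint", "manual"]
  - addsTo:
      pack: codeql/java-all
      extensible: neutralModel
    data:
      # package, type, name, signature, kind, provenance
      - ["java.time", "Instant", "now", "()", "summary", "manual"]
```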

I cannot readily identify which models are natively supported within CodeQL and which would require new development.

You may be able to identify this by running the UnsupportedExternalAPIs.ql query on your codebase.

I see some documentation that suggests integration with VSCode would enable auto-generation of models or simplify the process. We do not use VSCode internally, so ideally this support would be provided as a standalone script or integrated within the CodeQL binary. Are there any plans to provide this functionality? If not, what is the ideal way of developing a significant amount of models?

I don't think we have any plans in this direction, but I think @jf205 may know more.

jf205 Dec 5, 2024
Collaborator

Hi @ajohnston9 and thanks for the feedback

The TL;DR is that the CodeQL extension for VS Code is the best place to develop custom models. We have a feature called the CodeQL model editor, which can do most of the things that you mention, including 'identifying which models are natively supported within CodeQL and which would require new development'. You can read more about it here.

We don't have any plans to provide that functionality elsewhere at the moment. Apologies if that doesn't work for you. It might be worth trying out the VS Code extension to see if it helps with your modeling tasks. If not, then I'd be glad to hear more feedback about what you think is missing.
