Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Check eugene lint for migration lint rules? #305

kaaveland started this conversation in Ideas
Discussion options

Hey! I've been working on linting migration scripts by syntax tree analysis over at eugene on-and-off for about a year. The primary use-case I've been trying to tackle has been to set up guardrails in CI/CD to help catch locking issues and the like. There's a little web demo available at the documentation site.

I think there's some overlap in functionality between what this project is trying to do, and what I've been working on. Both projects are also Rust and use pg_query.rs. Should we look into whether it's possible for us to collaborate in some way so you could easily get access to the lints that I add to eugene in a sensible way? 🤔

You must be logged in to vote

Replies: 1 comment 5 replies

Comment options

That looks super nice! Our linter is very much a copy of Squawk, but your linter seems to be more advanced.

It should be possible to integrate it for sure. I would love to compare our approach to lints to yours to see if we can integrate it directly, or run your linter next to it.

Personally, I would prefer to have just one linter in postgrestools. Otherwise it might confuse users.

You must be logged in to vote
5 replies
Comment options

I wasn't aware of Squawk from before, this looks very interesting! The documentation is very good.

I started out quite differently -- in the primary eugene trace mode, it runs the statements in the migration one at a time within a transaction, keeping track of pg_locks and the system catalogs while running, then rolls back. This enables it to do things like detect that a statement caused a table rewrite while holding a lock on some object visible to other transactions (although not necessarily why). This is fast if you've got a database that has the last N - 1 migrations applied. I think it would probably be difficult to add this sort of functionality to an lsp server. eugene trace can create ephemeral postgres servers and dispose of them after running all migrations, if the migrations are SQL scripts with a common naming convention, but that's going to take many seconds (or possibly minutes). Since I wanted it to be easy for people to use it, or support commit-hooks, I added the syntax analysis later.

To be honest, the part of the tool that actually runs SQL sparks the most joy for me to work on -- I want to add checks to detect anti-patterns on the schema level. For example warn against foreign keys where the types don't line up correctly, like text on the referencing side pointing to uuid on the referenced side (joins won't be able to use an index), cyclical foreign keys, foreign keys missing a backing index on the referencing side, that sort of thing. Some of these ideas seem pretty difficult to chase with syntax analysis only. But maybe there's enough state in postgrestools to make it possible? Or maybe there's a different part of postgrestools where schema checks like that would fit? 🤔

The implementation of the rules you have already look familiar to me from eugene lint, I could definitely take a stab at just contributing some of the ones that I have - although I think it may require more state in RuleContext . I keep track of which database objects that were created within the current migration and whether set [local] lock_timeout has been applied so that I can warn on AccessExclusiveLock on objects visible to other transactions, for example. My rules do not work directly on the pg_query ast like yours, I have a thin translation layer between, but I could just remove that, it's small.

Comment options

thanks for the detailed analysis! The eugene trace feature is indeed interesting!

I think what you describe makes most sense: we should try to bring your lint rules into our linter, but keep the trace feature separate. In a way, it's also a lint, but it's using a live db and I think people dont expect a linter to do that. Out of curiosity, how did you come up with trace as the name for it? It's a pretty unique approach! We could integrate it into postgrestools as a separate command / feature that runs next to lint.

What do you think about this:

  • I will create an issue for me to pass more context to the RuleContext, especially the "previous" statements
  • Once that's there, I would ping you so we can start migrating your lint rules over. I would prefer to keep the raw pg_query output though since it's more flexible
  • we keep trace separate for now. when the lint rules are migrated, I would get back to you for a decision whether you want to keep them separate. if not, I am happy to help in integrating it into postgrestools

What do you think? Note that this is just a hobby project too, so can't make any promises on when things start to move :D

Comment options

If you're willing to try that, it would be great! I think what I'm doing can have a bigger impact in this project than in mine alone, so I'm willing to put some hours into this to see where it takes us.

Comment options

For example warn against foreign keys where the types don't line up correctly, like text on the referencing side pointing to uuid on the referenced side (joins won't be able to use an index), cyclical foreign keys, foreign keys missing a backing index on the referencing side, that sort of thing. Some of these ideas seem pretty difficult to chase with syntax analysis only. But maybe there's enough state in postgrestools to make it possible?

Regarding this, we have a SchemaCache available that holds meta information about foreign key links, data types and so on, and it's quite easy to enhance :) So I think there's a lot of potential!

I will create an issue for me to pass more context to the RuleContext, especially the "previous" statements

I'll start working on this now. I'll give you a headsup when there's progress :)

Comment options

Done, every lint rule now has access to the schema-data and an AnalysedFileContext via its RuleContext: ref

The AnalysedFileContext is currently just a scaffold, but we can add properties to it like you did with eugene: ref

Here's the basic loop that runs over a Vec<AnalysableStatement>: ref

I'll be in on vacation for a couple of weeks, but if you need any help getting started I'm sure @psteinroe can help you out :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Category
Ideas
Labels
None yet

AltStyle によって変換されたページ (->オリジナル) /