Assorted ideas #347

wojdyr started this conversation in Ideas
Dec 19, 2024 · 0 comments

I'll edit this post if anything changes.

Requested features and ideas that have been floating around.
Each of them makes an interesting and useful scientific programming project.

French-Wilson method

This has been requested a few times over the years. Popular implementations of F-W (truncate, ctruncate, cctbx, XDSconv) differ in details, as mentioned here (I haven't found a comparison of all the implementations). In recent years, new implementations have been written in STARANISO, reciprocalspaceship, and Servalcat.
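For orientation, the core idea for acentric reflections can be sketched as a posterior mean: a Wilson prior p(J) ∝ exp(-J/Σ) for J ≥ 0 is combined with a Gaussian likelihood for the measured intensity. The sketch below integrates this numerically. It is only an illustration under those assumptions, not a reproduction of any of the programs listed above, which use their own approximations and also handle centric reflections and the estimation of Σ. The function name and the grid parameters are made up for this example.

```python
import math

def french_wilson_acentric(i_obs, sigma, mean_intensity, n_steps=20000):
    """Posterior means of J (intensity) and F = sqrt(J) for an acentric reflection.

    Assumptions of this sketch:
      prior       p(J) ~ exp(-J / mean_intensity) for J >= 0 (acentric Wilson)
      likelihood  i_obs ~ Normal(J, sigma)
    """
    if sigma <= 0 or mean_intensity <= 0:
        raise ValueError("sigma and mean_intensity must be positive")
    # Prior times likelihood is a Gaussian in J centred at mu, truncated at J = 0.
    mu = i_obs - sigma * sigma / mean_intensity
    peak = max(mu, 0.0)            # where the truncated posterior is largest
    upper = peak + 10.0 * sigma    # the tail beyond this is negligible
    step = upper / n_steps
    norm = sum_j = sum_f = 0.0
    for k in range(n_steps):
        j = (k + 0.5) * step       # midpoint rule
        # Weight relative to the peak, so the exponent is never positive.
        w = math.exp(-((j - mu) ** 2 - (peak - mu) ** 2) / (2.0 * sigma * sigma))
        norm += w
        sum_j += w * j
        sum_f += w * math.sqrt(j)
    return sum_j / norm, sum_f / norm

# A negative measurement still yields positive estimates of J and F.
j_mean, f_mean = french_wilson_acentric(i_obs=-5.0, sigma=10.0, mean_intensity=100.0)
```

For strong reflections the result is close to the measured value, while weak or negative measurements are pulled towards positive values, which is the point of the method.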

DSSP (partly done – finding hydrogen bonds has been implemented, but not much tested yet)

A classical method of assigning secondary structure, published by Kabsch and Sander in 1983. It could be added either by implementing it from scratch or by porting one of the existing implementations. Here are all the implementations that I've found:

  • dssp – C++, license BSD-2, with minor improvements to the original method, associated with PDB_REDO,
  • older version of the above,
  • ksdssp – C++, license BSD-3, from 2004, associated with Chimera,
  • pyDSSP – Python, license MIT, started in 2022, simplified to be differentiable,
  • AssigningSecondaryStructure – Julia, license MIT, started in 2023,
  • gmx-dssp – for GROMACS, C++, LGPL (incompatible license!), from 2023, reported to give the same results as the first one.
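The hydrogen-bond step that all of these share is the electrostatic criterion from the 1983 paper: E = 0.084 · 332 · (1/r_ON + 1/r_CH - 1/r_OH - 1/r_CN) kcal/mol, with distances in Å, and a bond is assigned when E < -0.5 kcal/mol. A minimal sketch of that formula follows. The function names are made up for this example, and details that implementations add on top (placing the H atom, guarding against very short distances) are left out.

```python
import math

HBOND_CUTOFF = -0.5  # kcal/mol, threshold from Kabsch & Sander (1983)

def kabsch_sander_energy(n, h, c, o):
    """Electrostatic energy (kcal/mol) between a backbone N-H and a C=O group.

    n, h: coordinates of the donor N and H atoms (in Å).
    c, o: coordinates of the acceptor C and O atoms (in Å).
    """
    q1q2 = 0.42 * 0.20   # partial charges, in units of the electron charge
    f = 332.0            # dimensional factor giving kcal/mol for Å distances
    return q1q2 * f * (1.0 / math.dist(o, n) + 1.0 / math.dist(c, h)
                       - 1.0 / math.dist(o, h) - 1.0 / math.dist(c, n))

def is_hbond(n, h, c, o):
    return kabsch_sander_energy(n, h, c, o) < HBOND_CUTOFF

# Idealized collinear geometry: N...O = 2.9 Å, N-H = 1.0 Å, C=O = 1.23 Å.
energy = kabsch_sander_energy(n=(0.0, 0.0, 0.0), h=(1.0, 0.0, 0.0),
                              c=(4.13, 0.0, 0.0), o=(2.9, 0.0, 0.0))
```

The idealized geometry above gives an energy of roughly -3 kcal/mol, well below the cutoff.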

SASA

Calculating solvent-accessible surface area. There are different methods; I don't know how they differ. We'd need to check what is used in PyMOL, BioPython, MDTraj, DSSP, FreeSASA, dr-sasa, etc.
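As a reference point, one of the well-known methods is the Shrake-Rupley algorithm: put test points on a sphere of radius (atom radius + probe radius) around each atom and count the points that are not inside the expanded sphere of any other atom. The sketch below is a brute-force version without a neighbor search. The function names, the golden-spiral point distribution and the default of 960 points are choices made for this example and are not taken from any of the programs above.

```python
import math

def sphere_points(n):
    """Roughly uniform points on a unit sphere (golden-spiral lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n        # evenly spaced in z, never +/-1
        r = math.sqrt(1.0 - z * z)
        phi = k * golden_angle
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points

def shrake_rupley(atoms, probe=1.4, n_points=960):
    """Per-atom SASA in Å^2. atoms is a list of ((x, y, z), radius) pairs."""
    unit = sphere_points(n_points)
    areas = []
    for i, (ci, ri) in enumerate(atoms):
        big_r = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            p = (ci[0] + big_r * ux, ci[1] + big_r * uy, ci[2] + big_r * uz)
            # A point is exposed if it is outside every other expanded sphere.
            if all(math.dist(p, cj) >= rj + probe
                   for j, (cj, rj) in enumerate(atoms) if j != i):
                exposed += 1
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas

# Two overlapping atoms of radius 1.8 Å, 2 Å apart along z.
areas = shrake_rupley([((0.0, 0.0, 0.0), 1.8), ((0.0, 0.0, 2.0), 1.8)])
```

A real implementation would use a cell list or similar neighbor search instead of the all-pairs loop; the choice of atomic radii and of the number of points are among the places where programs can be expected to differ.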

new parser libraries

Evaluate newer parser libraries. I've been happy with PEGTL; it's been a much better experience than using Boost.Spirit. I wrote Gemmi's CIF parser around 2017 using PEGTL and it's still the fastest open-source CIF parser. Since then, a few other C++ libraries have emerged: lexy, which is inspired by PEGTL; Boost.Parser, which is a newer alternative to Boost.Spirit; and some others. Lexy is interesting because it comes with benchmarks. But I won't have time to try it out anytime soon.

PyMOL-like selection syntax (a subset is implemented and documented)

This has been proposed several times by different people. We use the selection syntax from MMDB (a.k.a. CID), which is also used by Coot and some CCP4 programs. However, many users strongly prefer chain A and resname HIS over //A/(HIS)/. It's unfortunate that each program (PyMOL, Chimera, VMD, cctbx, ...) uses a different syntax. I don't have a strong preference; PyMOL is proposed here because it seems to be more widely known than other programs. Note that a new selection would not fit into gemmi's current Selection class. The CID selection is simpler: it is matched separately for each level of the model-chain-residue-atom hierarchy (no boolean operators). Feedback on what has been implemented is welcome.
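To make the difference concrete: a pure conjunction with one condition per hierarchy level can be rewritten as a CID string, while an expression that mixes levels with other boolean operators (say, chain A or resname HIS) has no per-level equivalent. The helper below is hypothetical and is not part of gemmi. It handles only the keywords chain, resname and name joined by and, and its use of * as the "any" wildcard is an assumption about CID that should be checked against the MMDB or gemmi documentation.

```python
def pymol_like_to_cid(expr):
    """Translate e.g. 'chain A and resname HIS' into the CID '//A/(HIS)/'.

    Hypothetical helper for illustration only: accepts a conjunction of
    'chain X', 'resname Y' and 'name Z' terms, each at most once.
    """
    values = {}
    for term in expr.split(" and "):
        words = term.split()
        if len(words) != 2 or words[0] not in ("chain", "resname", "name"):
            # Anything else ('or', 'not', parentheses, ...) is not a plain
            # per-level conjunction, so this sketch refuses to translate it.
            raise ValueError(f"unsupported term: {term!r}")
        key, value = words
        if key in values:
            raise ValueError(f"{key!r} given more than once")
        values[key] = value
    chain = values.get("chain", "*")          # '*' as wildcard: an assumption
    residue = f"({values['resname']})" if "resname" in values else "*"
    atom = values.get("name", "")             # empty field: no restriction
    return f"//{chain}/{residue}/{atom}"

cid = pymol_like_to_cid("chain A and resname HIS")
```

The leading // leaves the model field empty, as in the example in the paragraph above.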

Determining space group from symmetry operations

Currently, gemmi can match symmetry operations against the list of 560+ tabulated space group settings. This approach doesn't work in the general case (any origin shift), which requires a different method -- for example, the one described by R.W. Grosse-Kunstleve in Algorithms for deriving crystallographic space-group information (1999).
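The reason a lookup table is not enough can be shown in a few lines. If the origin is moved by p (new coordinates x' = x - p), an operation x → Rx + t becomes x' → Rx' + t + (R - I)p, so the translation parts change while the space group stays the same. The sketch below applies this to the screw axis of P2₁; the function name and the data layout are made up for this example.

```python
from fractions import Fraction as Fr

def shift_origin(op, p):
    """Express the operation (R, t) in coordinates shifted by p: x' = x - p.

    R is a 3x3 integer matrix (nested tuples), t and p are triples of Fractions.
    Returns (R, t') with t' = t + (R - I) p, reduced modulo 1.
    """
    rot, tran = op
    new_tran = []
    for i in range(3):
        delta = sum(rot[i][j] * p[j] for j in range(3)) - p[i]  # row i of (R - I) p
        new_tran.append((tran[i] + delta) % 1)
    return rot, tuple(new_tran)

# 2_1 screw axis along b, as tabulated for P2_1: -x, y+1/2, -z
screw = (((-1, 0, 0), (0, 1, 0), (0, 0, -1)), (Fr(0), Fr(1, 2), Fr(0)))

# After moving the origin by 1/4 along a, the same operation reads -x+1/2, y+1/2, -z.
shifted = shift_origin(screw, (Fr(1, 4), Fr(0), Fr(0)))
```

Since p can be arbitrary, the shifted operations generally do not coincide with any tabulated setting, which is why a general method has to recover the origin shift rather than compare translations directly.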
