Revision b8d74bed-dda7-49a1-9d1a-1800c5636c8a - Code Review Stack Exchange
Your SQL is not working as well as you think it is.
Any time you do any significant amount of post-processing on an SQL result set, that is an indicator that your query is weakly formulated. The point of the database is that it lets you query it for exactly the data that you are interested in. The way your code treats the database as a passive storage format, you could just as well have stored everything in a CSV file.
You didn't provide any details about your database schema, so I can only guess that your third column is named `position` and the fourth column is named `genome`. (Had you explicitly specified which columns you were selecting, instead of just `SELECT *`, your code would be more self-documenting.) A query such as the following should extract the relevant rows:
    SELECT *
        FROM sequence_info AS b
            JOIN sequence_info AS a
                ON a.genome = b.genome
                AND a.position = b.position - length(b.genome)
            JOIN sequence_info AS c
                ON c.genome = b.genome
                AND c.position = b.position + length(b.genome)
        WHERE b.kmer_length > 2
        ORDER BY length(b.genome) DESC, b.genome;
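To see what this self-join selects, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory table. The schema (`id`, `kmer_length`, `position`, `genome`) and the sample rows are assumptions made up for illustration, since the real schema wasn't posted. The sketch selects named columns rather than `*`, and it writes `b.kmer_length` because SQLite rejects an unqualified column name in a self-join as ambiguous.

```python
import sqlite3

# Hypothetical schema: the real table definition was not posted.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequence_info ("
    " id INTEGER PRIMARY KEY, kmer_length INTEGER,"
    " position INTEGER, genome TEXT)"
)

# Made-up rows: 'ACG' occurs back to back at positions 0, 3 and 6,
# while 'TTGA' occurs only at 10 and 20 (not adjacent).
rows = [
    (3, 0, "ACG"), (3, 3, "ACG"), (3, 6, "ACG"),
    (4, 10, "TTGA"), (4, 20, "TTGA"),
]
conn.executemany(
    "INSERT INTO sequence_info (kmer_length, position, genome)"
    " VALUES (?, ?, ?)",
    rows,
)

# Same join as above, with explicit columns and a qualified kmer_length.
query = """
    SELECT a.position, b.position, c.position, b.genome
    FROM sequence_info AS b
    JOIN sequence_info AS a
      ON a.genome = b.genome
     AND a.position = b.position - length(b.genome)
    JOIN sequence_info AS c
      ON c.genome = b.genome
     AND c.position = b.position + length(b.genome)
    WHERE b.kmer_length > 2
    ORDER BY length(b.genome) DESC, b.genome
"""
matches = conn.execute(query).fetchall()
print(matches)  # [(0, 3, 6, 'ACG')]
```

Only the middle `ACG` row has a matching neighbour exactly one sequence length before it and one after it, so it is the only row the database returns. No post-processing is needed in the application code.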
For performance, be sure to [create indexes](http://www.sqlite.org/lang_createindex.html) on the `genome` and `position` columns of your table.
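As a rough sketch of the indexing advice, the snippet below creates a single composite index on `(genome, position)` and asks SQLite for its query plan. The table schema and the index name `idx_sequence_genome_position` are assumptions for illustration. A composite index is only one option; separate single-column indexes might suit your other queries better.

```python
import sqlite3

# Hypothetical schema: the real table definition was not posted.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequence_info ("
    " id INTEGER PRIMARY KEY, kmer_length INTEGER,"
    " position INTEGER, genome TEXT)"
)

# One composite index can serve both the equality test on genome
# and the lookup of a specific position within that genome.
conn.execute(
    "CREATE INDEX idx_sequence_genome_position"
    " ON sequence_info (genome, position)"
)

# EXPLAIN QUERY PLAN reports how SQLite intends to run the join;
# the last column of each row holds the human-readable detail.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT b.genome, b.position "
    "FROM sequence_info AS b "
    "JOIN sequence_info AS a "
    "  ON a.genome = b.genome "
    " AND a.position = b.position - length(b.genome)"
).fetchall()
plan_text = "\n".join(row[-1] for row in plan)
print(plan_text)
```

The exact wording of the plan varies between SQLite versions, but the index name should appear in it. If it shows only plain table scans, the join is not benefiting from the index.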