Some newer writings can be found here.

Recent posts

    • The cost of fixing climate change - December 24, 2016
      climate change, nuclear, solar, war, economy
      Comments

      An old draft I had lying around, but which I somehow never got around to finishing. Well, here goes. TL;DR: Nuclear is probably our best option to reverse climate change. This explains why.

    • HP EliteBook Folio G1 - September 12, 2016
      linux, hardware, ubuntu
      Comments

      A review of the HP EliteBook Folio G1, and the story of how to install Linux on it.

    • CAS-based generic data store - August 3, 2016
      science, statistics, whales
      Comments

      Content-addressable storage makes a system for generic data storage slightly more opaque, but eliminates the need for any central naming authority, and comes with many desirable properties (immutability of data, verifiability, deduplication) for free.

    • Why we should stop talking, and start to prepare for climate change - May 23, 2016
      climate change, nuclear power, solar power
      Comments

      For all the focus on climate change, politicians, environmentalists, and basically everybody involved don't seem particularly interested in actually solving the problem.

    • Probabilities for heterozygote genetic markers in hybrids - March 10, 2016
      science, statistics, whales
      Comments

      With a suitable set of genetic markers, it is fairly straightforward to identify organisms as belonging to one or the other population. But how useful are such markers for identifying hybrids when migrants from one population have mixed into the other?

    • Can you trust science? - March 25, 2015
      science, statistics
      Comments

      We are bombarded by reports of scientific results that promise to revolutionize aspects of our lives, but somehow actual progress seems to go at a much lower pace. In reality, many scientific results fail to be reproduced. And there's a good reason for that.

    • Thoughts on phylogenetic trees - February 15, 2015
      phylogeny, evolution
      Comments

      Phylogenetic trees represent the evolutionary relationship between species. They are often constructed based on the sequences of genes, and different genes can give conflicting results. This is my attempt to sort things out.

    • Information content and allele frequency difference - July 17, 2014
      sequence analysis, SNP, varan
      Comments

      A quick look at the relationship between the information value and the frequency difference of alleles.

    • Expected site information from SNPs - July 2, 2014
      bioinformatics, bayesian
      Comments

      SNPs are commonly identified by calculating measures like $F_{ST}$ or $p$-values from high-throughput sequencing data. But these are proxies for what we really want to know, viz. the information to be gained from observing a site. Here's how to do that.

    • Big data revisited - May 5, 2014
      bioinformatics, opinion, big data
      Comments

      Wherein we examine the functional programming communities active in Munich, define the term "big data", and look at what it means in relation to bioinformatics, and science in general.

    • Some not-so-grand challenges in bioinformatics - April 8, 2014
      bioinformatics, opinion
      Comments

      After working some years in bioinformatics, one realizes that there are some unmet challenges out there. Our methods and tools are often not quite as good as we would wish. Here are some of the issues I've run across.

    • Parallel SNP identification - March 26, 2014
      sequence analysis, SNP, population genetics
      Comments

      I've recently been experimenting with various metrics for SNP discovery. One challenge is the time it takes to process the large amounts of data, and many existing tools are quite slow to run. The obvious answer is of course parallelism, and one would think that for a program that essentially processes a stream of records one by one, something simple like Haskell's `parMap` would work. But it turns out `parMap` builds a strict list of all the work to be done, and my feeble attempts at getting around this (by chunking and so on) didn't quite work out. Here is a rather quick and dirty hack that did, and which gets a nice speedup.

    • k-mer counting in a high-level language? - November 22, 2013
      sequence analysis, k-mer, judy arrays
      Comments

      I often argue that Haskell is a high-level language that, unlike many other HLLs, offers good tools that also result in good performance. Currently, I am toying around with k-mer indexing; here are the results so far.

    • Frequency counting in Haskell - November 11, 2013
      sequence analysis, SFF, 454, parsing
      Comments

      Frequency counting is an important task, which can be implemented with a variety of underlying data structures. Here I explore a few of them in Haskell.

    • Generic storage for heterogeneous data - October 1, 2013
      data storage, metadata, xml, bioinformatics
      Comments

      Modern biology (and science in general) tends to produce data. Lots and lots of data. Often way more than can be sensibly analyzed. To exploit the value of these data in the future, it is necessary to store them in a way so that they can be easily cataloged, searched, retrieved, and interpreted. This is an attempt to design a system that addresses these needs, yet is as simple as possible.

All posts…

Feedback? Please email ketil@malde.org.
