In the case of triangle meshes, any modern implementation of the heat method should really make use of the intrinsic Laplace operator, which adds negligible computational overhead, and has identical inputs/outputs, but is much (much) more robust on low-quality meshes. The GeometryCentral, CGAL, and libigl implementations listed below provide this option. The GeometryCentral version also works on general meshes, including non-manifold ones, by adopting the tufted Laplacian.
None of the current implementations of the heat method (listed on this page) fully exploit its potential for speed or accuracy. To substantially improve performance on multi-core or GPU-based systems, one need only link against a parallel sparse linear solver that can handle symmetric positive-definite systems. (One might also parallelize matrix construction.) To further improve accuracy, one can incorporate the very nice iterative scheme of Belyaev & Fayolle from their paper "On Variational and PDE-Based Distance Function Approximations". Like the heat method, this scheme lends itself nicely to prefactorization (hence low amortized cost relative to fast marching/fast sweeping or window-based methods), and would be easy to implement on top of the existing reference implementation.
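To make the prefactorization point concrete, here is a minimal sketch of the "factor once, solve many times" pattern for the diffusion step alone. Several things in it are assumptions made for illustration: a path-graph Laplacian stands in for a real mesh Laplacian, the names `path_laplacian`, `solve`, and `heat` are hypothetical, and SciPy's `factorized` performs a sparse LU factorization rather than the Cholesky factorization one would normally prefer for a symmetric positive-definite system. It is not taken from any of the implementations listed on this page.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import factorized


def path_laplacian(n):
    """Positive semidefinite Laplacian of a path graph with unit edge weights.

    Used here only as a small stand-in for a mesh Laplacian.
    """
    main = np.full(n, 2.0)
    main[0] = main[-1] = 1.0
    off = -np.ones(n - 1)
    return sp.diags([off, main, off], offsets=[-1, 0, 1], format="csc")


n, t = 200, 1.0

# System matrix for one backward Euler step of heat flow: (I + t L).
A = (sp.identity(n, format="csc") + t * path_laplacian(n)).tocsc()

# Factor once. This is the expensive part, and it does not depend on the source.
solve = factorized(A)

# Each additional source now costs only a pair of triangular solves.
sources = [0, 57, 199]
heat = {}
for s in sources:
    rhs = np.zeros(n)
    rhs[s] = 1.0
    heat[s] = solve(rhs)
```

Swapping in a parallel solver would, under the same assumptions, only change the line that builds `solve`; the loop over sources stays as it is.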
For a very cool application of the heat method, see Floraform.
It is also possible to use the heat method to solve the single- or multiple-source shortest path problem on general graphs.
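As a rough illustration of what a graph version might look like, the sketch below follows the same three steps as the heat method (diffuse, normalize, integrate) on a weighted graph. This is one possible analogue written for this page's readers, not necessarily the formulation referred to above: the function name `graph_heat_distance`, the choice of edge weights 1/length, and the per-edge sign normalization are all assumptions. It uses dense NumPy solves for brevity, and the source node is pinned to distance zero.

```python
import numpy as np


def graph_heat_distance(n, edges, source, t=1.0):
    """Heat-method-style distance on a connected weighted graph (illustrative sketch).

    edges is a list of (i, j, length) tuples; returns an array of n distances.
    """
    # Weighted graph Laplacian with conductance 1/length on each edge.
    L = np.zeros((n, n))
    for i, j, length in edges:
        w = 1.0 / length
        L[i, i] += w
        L[j, j] += w
        L[i, j] -= w
        L[j, i] -= w

    # Step 1: one backward Euler step of heat flow from the source.
    delta = np.zeros(n)
    delta[source] = 1.0
    u = np.linalg.solve(np.eye(n) + t * L, delta)

    # Step 2: keep only the direction of decreasing heat on each edge.
    # Step 3 (right-hand side): normal equations of
    #   min sum_e (1/length_e) * (phi_j - phi_i - x_e * length_e)^2,
    # which reduce to L phi = b with the entries assembled below.
    b = np.zeros(n)
    for i, j, length in edges:
        x = -np.sign(u[j] - u[i])  # +1 if heat decreases from i to j
        b[j] += x
        b[i] -= x

    # Step 3 (solve): pin phi[source] = 0 to remove the constant null space.
    keep = [k for k in range(n) if k != source]
    phi = np.zeros(n)
    phi[keep] = np.linalg.solve(L[np.ix_(keep, keep)], b[keep])
    return phi


# A small weighted tree: 0 - 1, 1 - 2, 1 - 3, 3 - 4.
tree = [(0, 1, 1.0), (1, 2, 2.0), (1, 3, 0.5), (3, 4, 1.5)]
dist_from_0 = graph_heat_distance(5, tree, source=0)
dist_from_2 = graph_heat_distance(5, tree, source=2)
```

On a tree, heat strictly decreases along every edge leading away from the source, so the per-edge constraints are consistent and this sketch reproduces the shortest-path distances. On graphs with cycles the constraints can conflict, and the least-squares solve then yields only an approximation, so results should be checked against a reference such as Dijkstra's algorithm.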