The following research papers are about Valgrind or make significant use of it.
If you refer to Valgrind in a published work, please cite one or more of the following papers, not just the Valgrind website.
 Tracking Bad Apples: Reporting the
 Origin of Null and Undefined Value Errors.
 Michael D. Bond, Nicholas Nethercote, Stephen W. Kent, Samuel Z. Guyer and
 Kathryn S. McKinley.
 Proceedings of the ACM SIGPLAN International Conference on Object-Oriented
 Programming, Systems, Languages and Applications (OOPSLA 2007),
 Montreal, Canada, October 2007.
 This paper describes an attempt to improve the error messages produced for
 undefined value errors detected by Memcheck by tracking their origins.
 Ultimately it wasn't that useful in practice. The paper also describes a
 similar technique for improving null pointer exception messages in Java,
 which was much more successful. Please only cite this paper if you are
 specifically discussing origin tracking.
 
 Valgrind: A Framework for Heavyweight
 Dynamic Binary Instrumentation.
 Nicholas Nethercote and Julian Seward.
 Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design
 and Implementation (PLDI 2007), 
 San Diego, California, USA, June 2007.
 This paper describes how Valgrind works, and how it differs from other
 DBI frameworks such as Pin and DynamoRIO. 
 Please cite this paper when discussing Valgrind in general. However, if
 you are discussing Valgrind specifically in relation to memory errors
 (i.e. the Memcheck tool), please cite the USENIX paper below as well or
 instead.
 
 How to Shadow Every Byte of Memory
 Used by a Program.
 Nicholas Nethercote and Julian Seward.
 Proceedings of the Third International ACM SIGPLAN/SIGOPS Conference on
 Virtual Execution Environments (VEE 2007), San Diego, California, USA,
 June 2007.
 This paper describes in detail how Memcheck's shadow memory is
 implemented, and compares it to alternative approaches.
 Please cite this paper if you are discussing shadow memory
 implementations. You could also cite it when discussing Memcheck, but
 please cite the USENIX paper in preference.
 
 Building Workload Characterization Tools
 with Valgrind.
 Nicholas Nethercote, Robert Walsh and Jeremy Fitzhardinge.
 Invited tutorial, IEEE International Symposium on Workload
 Characterization (IISWC 2006), San Jose, California, USA, October 
 2006.
 These four talks cover (a) how Valgrind works, (b) three example profiling
 tools (Cachegrind, Callgrind, Massif), (c) how to build a new tool, using
 a simple example, (d) ideas for more advanced tools, and general
 tool-building advice.
 
 Using Valgrind to detect undefined value
 errors with bit-precision.
 Julian Seward and Nicholas Nethercote.
 Proceedings of the USENIX'05 Annual Technical Conference, Anaheim,
 California, USA, April 2005.
 This paper describes in detail how Memcheck's undefined value error
 detection (a.k.a. V bits) works. Please cite it if you are talking about
 memory checking with Valgrind, in particular if you are referring to
 Memcheck's undefined value error detection.
 
 Dynamic Binary Analysis and Instrumentation.
 Nicholas Nethercote.
 PhD Dissertation, University of Cambridge, November 2004.
 This dissertation describes Valgrind in some detail (some of these details
 are now out-of-date) as well as Cachegrind, Annelid and Redux; it also
 covers some underlying theory about dynamic binary analysis in general and
 what all these tools have in common. Please cite it if you are writing
 about Cachegrind, or the dynamic binary analysis theory work. If you are
 writing about Valgrind in general, please cite the PLDI 2007 paper above in
 preference. If you are writing about Annelid, please cite the SPACE 2004
 paper in preference. If you are writing about Redux, please cite the
 ENTCS paper in preference.
 
 A Tool Suite for Simulation Based
 Analysis of Memory Access Behavior.
 Josef Weidendorfer, Markus Kowarschik and Carsten Trinitis.
 Proceedings of the 4th International Conference on Computational Science
 (ICCS 2004), Krakow, Poland, June 2004.
 This paper describes Callgrind.
 
 Bounds-Checking Entire Programs Without
 Recompiling.
 Nicholas Nethercote and Jeremy Fitzhardinge.
 Informal Proceedings of the Second Workshop on Semantics, Program
 Analysis, and Computing Environments for Memory Management (SPACE 2004),
 Venice, Italy, January 2004.
 This paper describes Annelid, an experimental bounds checker. Although
 the paper is upbeat about Annelid, it really didn't work that well.
 Good bounds-checking relies too much on source-level information (such as
 what values are pointers, and what things those pointers point to) that is
 difficult to obtain when working at the binary level.
 
 Valgrind: A Program Supervision Framework. (slides)
 Nicholas Nethercote and Julian Seward.
 Electronic Notes in Theoretical Computer Science 89 No. 2, 2003.
 This paper describes Valgrind in general, but is somewhat out-of-date.
 Please cite the PLDI paper above in preference.
 
 Redux: A Dynamic Dataflow Tracer.
 Nicholas Nethercote and Alan Mycroft.
 Electronic Notes in Theoretical Computer Science 89 No. 2, 2003.
 This paper describes Redux, an experimental dynamic dataflow tracing
 tool. Redux is fun and intriguing, although wildly unscalable. The
 paper suggests multiple uses for it, none of which are likely to be
 practical. What it 
 Flayer: Exposing Application
 Internals.
 Will Drewry and Tavis Ormandy.
 Proceedings of the First USENIX Workshop on Offensive Technologies (WOOT
 '07), Boston, Massachusetts, USA, August 2007.
 This paper is about a tool that analyses and modifies program execution
 flow.
 
 Profiling floating point value ranges
 for reconfigurable implementation.
 Ashley W. Brown, Paul H. J. Kelly and Wayne Luk.
 Proceedings of the 1st HiPEAC Workshop on Reconfigurable Computing, Ghent,
 Belgium, January 2007.
 This paper is about a tool that analyses the floating point values used by
 programs, in order to determine if specialised floating point hardware
 could be used to run them.
 
 Fault Detection in Multi-Threaded C++
 Server Applications.
 Arndt Muehlenfeld and Franz Wotawa.
 Informal Proceedings of the International Workshop on Multithreading in
 Hardware and Software (TV06), Seattle, Washington, USA, August 2006.
 This paper is about some improvements to Helgrind, the data-race detector.
 
 Dynamic taint analysis for automatic
 detection, analysis, and signature generation of exploits on commodity
 software.
 James Newsome and Dawn Song.
 Proceedings of the 12th Annual Network and Distributed System Security
 Symposium (NDSS '05), February 2005.
 This paper is about a security tool built with Valgrind.