Index: libmarshal.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/lib/libmarshal.tex,v
retrieving revision 1.22
diff -u -r1.22 libmarshal.tex
--- libmarshal.tex 15 Nov 2001 23:55:12 -0000 1.22
+++ libmarshal.tex 5 Feb 2003 20:33:24 -0000
@@ -26,13 +26,16 @@
 Python modules of \file{.pyc} files. Therefore, the Python
 maintainers reserve the right to modify the marshal format in backward
 incompatible ways should the need arise. If you're serializing and
-de-serializing Python objects, use the \module{pickle} module. There
-may also be unknown security problems with
-\module{marshal}\footnote{As opposed to the known security issues in
-the \module{pickle} module!}.
+de-serializing Python objects, use the \module{pickle} module instead.
 \refstmodindex{pickle}
 \refstmodindex{shelve}
 \obindex{code}
+
+\begin{notice}[warning]
+The \module{marshal} module is not intended to be secure against
+erroneous or maliciously constructed data. Never unmarshal data
+received from an untrusted or unauthenticated source.
+\end{notice}
 
 Not all Python object types are supported; in general, only objects
 whose value is independent from a particular invocation of Python can
Index: libpickle.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/lib/libpickle.tex,v
retrieving revision 1.38
diff -u -r1.38 libpickle.tex
--- libpickle.tex 27 Nov 2002 05:26:46 -0000 1.38
+++ libpickle.tex 5 Feb 2003 20:33:24 -0000
@@ -85,18 +85,14 @@
 \module{pickle} serialization format is guaranteed to be backwards
 compatible across Python releases.
 
-\item The \module{pickle} module doesn't handle code objects, which
- the \module{marshal} module does. This avoids the possibility
- of smuggling Trojan horses into a program through the
- \module{pickle} module\footnote{This doesn't necessarily imply
- that \module{pickle} is inherently secure. See
- section~\ref{pickle-sec} for a more detailed discussion on
- \module{pickle} module security. Besides, it's possible that
- \module{pickle} will eventually support serializing code
- objects.}.
-
 \end{itemize}
 
+\begin{notice}[warning]
+The \module{pickle} module is not intended to be secure against
+erroneous or maliciously constructed data. Never unpickle data
+received from an untrusted or unauthenticated source.
+\end{notice}
+
 Note that serialization is a more primitive notion than persistence;
 although
 \module{pickle} reads and writes file objects, it does not handle the
@@ -194,9 +190,9 @@
 \end{excdesc}
 
 \begin{excdesc}{UnpicklingError}
-This exception is raised when there is a problem unpickling an object,
-such as a security violation. Note that other exceptions may also be
-raised during unpickling, including (but not necessarily limited to)
+This exception is raised when there is a problem unpickling an object.
+Note that other exceptions may also be raised during unpickling,
+including (but not necessarily limited to)
 \exception{AttributeError}, \exception{EOFError},
 \exception{ImportError}, and \exception{IndexError}.
 \end{excdesc}
@@ -206,8 +202,7 @@
 subclass to customize the behavior. However, in the \module{cPickle}
 modules these callables are factory functions and so cannot be
 subclassed. One of the common reasons to subclass is to control what
-objects can actually be unpickled. See section~\ref{pickle-sec} for
-more details on security concerns.}, \class{Pickler} and
+objects can actually be unpickled.}, \class{Pickler} and
 \class{Unpickler}:
 
 \begin{classdesc}{Pickler}{file\optional{, bin}}
@@ -579,89 +574,6 @@
 % inst_persistent_id() which appears to give unknown types a second
 % shot at producing a persistent id. Since Jim Fulton can't remember
 % why it was added or what it's for, I'm leaving it undocumented.
-
-\subsection{Security \label{pickle-sec}}
-
-Most of the security issues surrounding the \module{pickle} and
-\module{cPickle} module involve unpickling. There are no known
-security vulnerabilities
-related to pickling because you (the programmer) control the objects
-that \module{pickle} will interact with, and all it produces is a
-string.
-
-However, for unpickling, it is \strong{never} a good idea to unpickle
-an untrusted string whose origins are dubious, for example, strings
-read from a socket. This is because unpickling can create unexpected
-objects and even potentially run methods of those objects, such as
-their class constructor or destructor\footnote{A special note of
-caution is worth raising about the \refmodule{Cookie}
-module. By default, the \class{Cookie.Cookie} class is an alias for
-the \class{Cookie.SmartCookie} class, which ``helpfully'' attempts to
-unpickle any cookie data string it is passed. This is a huge security
-hole because cookie data typically comes from an untrusted source.
-You should either explicitly use the \class{Cookie.SimpleCookie} class
---- which doesn't attempt to unpickle its string --- or you should
-implement the defensive programming steps described later on in this
-section.}.
-
-You can defend against this by customizing your unpickler so that you
-can control exactly what gets unpickled and what gets called.
-Unfortunately, exactly how you do this is different depending on
-whether you're using \module{pickle} or \module{cPickle}.
-
-One common feature that both modules implement is the
-\member{__safe_for_unpickling__} attribute. Before calling a callable
-which is not a class, the unpickler will check to make sure that the
-callable has either been registered as a safe callable via the
-\refmodule[copyreg]{copy_reg} module, or that it has an
-attribute \member{__safe_for_unpickling__} with a true value. This
-prevents the unpickling environment from being tricked into doing
-evil things like call \code{os.unlink()} with an arbitrary file name.
-See section~\ref{pickle-protocol} for more details.
-
-For safely unpickling class instances, you need to control exactly
-which classes will get created. Be aware that a class's constructor
-could be called (if the pickler found a \method{__getinitargs__()}
-method) and the the class's destructor (i.e. its \method{__del__()} method)
-might get called when the object is garbage collected. Depending on
-the class, it isn't very heard to trick either method into doing bad
-things, such as removing a file. The way to
-control the classes that are safe to instantiate differs in
-\module{pickle} and \module{cPickle}\footnote{A word of caution: the
-mechanisms described here use internal attributes and methods, which
-are subject to change in future versions of Python. We intend to
-someday provide a common interface for controlling this behavior,
-which will work in either \module{pickle} or \module{cPickle}.}.
-
-In the \module{pickle} module, you need to derive a subclass from
-\class{Unpickler}, overriding the \method{load_global()}
-method. \method{load_global()} should read two lines from the pickle
-data stream where the first line will the the name of the module
-containing the class and the second line will be the name of the
-instance's class. It then look up the class, possibly importing the
-module and digging out the attribute, then it appends what it finds to
-the unpickler's stack. Later on, this class will be assigned to the
-\member{__class__} attribute of an empty class, as a way of magically
-creating an instance without calling its class's \method{__init__()}.
-You job (should you choose to accept it), would be to have
-\method{load_global()} push onto the unpickler's stack, a known safe
-version of any class you deem safe to unpickle. It is up to you to
-produce such a class. Or you could raise an error if you want to
-disallow all unpickling of instances. If this sounds like a hack,
-you're right. UTSL.
-
-Things are a little cleaner with \module{cPickle}, but not by much.
-To control what gets unpickled, you can set the unpickler's
-\member{find_global} attribute to a function or \code{None}. If it is
-\code{None} then any attempts to unpickle instances will raise an
-\exception{UnpicklingError}. If it is a function,
-then it should accept a module name and a class name, and return the
-corresponding class object. It is responsible for looking up the
-class, again performing any necessary imports, and it may raise an
-error to prevent instances of the class from being unpickled.
-
-The moral of the story is that you should be really careful about the
-source of the strings your application unpickles.
 
 \subsection{Example \label{pickle-example}}