cortesi
aldo@corte.si

pathoc: break all the Python webservers!

27 September 2012

A few months ago, I announced pathod, a pathological HTTP daemon. The project started as a testing tool to let me craft standards-violating HTTP responses while working on mitmproxy. It soon became a free-standing project, and has turned out to be incredibly useful in security testing, exploit delivery and general creative mischief. In the last release, I added pathoc - pathod's malicious client-side twin. It does for HTTP requests what pathod does for HTTP responses, and uses the same hyper-terse specification language.

In this post, I show how pathoc can be used as a very simple fuzzer, by finding issues in a number of major pure-Python webservers. None of the tested servers failed catastrophically - they all caught the unexpected exception and continued serving requests. Nonetheless, I think it's reasonable to say that we've triggered a bug if a) the server returns a 500 Internal Server Error response or terminates the connection abnormally, and b) we see a traceback in our logs. In fact, by this definition, I found bugs in every pure-Python server I tested.

All of the problems I list below are simple failures of validation - what they have in common is that somewhere in the project, code is called with input that it doesn't expect and can't handle. This matters - in fact, I'd argue that the majority of security problems fall into this category. It's interesting to ponder why this type of issue is so ubiquitous in Python servers. I have no doubt that part of the answer lies in Python's use of exceptions - errors that would be explicit in other languages can be implicit in Python, and code that seems clean and intuitive might in fact be buggy. I think this is especially relevant right now, given the recent flurry of discussion surrounding the Go language and its error handling. It's pretty instructive to read Russ Cox's recent riposte to this post criticizing Go's explicit approach, while looking at the bugs below. I love Python and I think it's a fine language, but I also think the designers of Go probably made the right choice.
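
As a rough illustration of the implicit-versus-explicit point, here is a minimal Python 3 sketch of parsing a Content-Length header value. The helper name parse_content_length is hypothetical and is not taken from any of the servers tested; it only shows one way the failure case can be made visible to the caller instead of surfacing as an unplanned ValueError.

```python
def parse_content_length(raw):
    """Return the header value as a non-negative int, or None if it is invalid.

    Hypothetical helper for illustration only; not code from any tested server.
    """
    if raw is None:
        # Header absent: treat as an empty body.
        return 0
    raw = raw.strip()
    # str.isdigit() accepts some non-ASCII digits that int() rejects,
    # so require plain ASCII digits as well.
    if not (raw.isascii() and raw.isdigit()):
        return None
    return int(raw)


# The implicit version looks just as clean but raises on bad input:
#     length = int(headers.get("Content-Length", 0))
```

A caller that receives None can answer with a 400 Bad Request deliberately, rather than relying on a catch-all handler further up the stack.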

Basic fuzzing with pathoc

My methodology for these tests was very simple indeed. I launched each server in turn, and used pathoc to fire corrupted GET requests at the daemon until I saw an error. I then looked at the logs, and boiled the distinct cases down to a minimal pathoc specification by hand. This exercises a rather shallow set of features in the server software - mostly parsing of the HTTP lead-in and request headers. It's possible to give software a much, much deeper workout with pathoc, but I'll leave that for a future post.
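
For readers who want to see the client side of this loop without pathoc, the following is a minimal Python 3 sketch using only the standard library. It is not how pathoc works internally; send_raw and looks_suspicious are hypothetical names, and the check covers only what a client can observe (a 500 status or no usable response). Confirming a bug still means finding the traceback in the server's logs.

```python
import socket


def send_raw(host, port, payload, timeout=1.0):
    """Send raw bytes and return whatever arrives before close or timeout."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # server closed the connection
                chunks.append(data)
        except (socket.timeout, ConnectionError):
            # A hang or a reset still leaves us with the bytes read so far.
            pass
    return b"".join(chunks)


def looks_suspicious(response):
    """Flag responses worth a look in the server logs (illustrative heuristic)."""
    if not response:
        return True  # nothing came back at all
    status_line = response.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    # A malformed status line or a 500 both deserve attention.
    return len(parts) < 2 or parts[1] == b"500"
```

The one-second default timeout mirrors the reason for bounding each request: a daemon that terminates improperly should not stall the whole run.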

My pathoc fuzzing command looked something like this:

pathoc -n 1000 -p 8080 -t 1 localhost 'get:/:b@10:ir,"\x00"'

The most important flags here are -n, which tells pathoc to make 1000 consecutive requests, and -t, which tells pathoc to time out after one second (necessary to prevent hangs when daemons terminate improperly). The request specification itself breaks down as follows:

get Issue a GET request
/ ... to the path /
b@10 ... with a body consisting of 10 random bytes
ir,"\x00" ... and inject a NULL byte at a random location.

It's that last clause - the random injection - that makes the difference between simply crafting requests and basic fuzzing. Every time a new request is issued, the injection occurs at a different location. I varied the injected character between a NULL byte, a carriage return and a random alphabet letter. Each exposed different errors in different servers. For a complete description of the specification language, see the online docs.
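
The effect of the random injection can be sketched in a few lines of Python 3. This is an illustration of the idea only, not pathoc's implementation; inject_at_random and the BASE request bytes are assumptions made for the example, and the real bytes pathoc sends may differ.

```python
import random


def inject_at_random(message: bytes, payload: bytes, rng: random.Random) -> bytes:
    """Return message with payload inserted at a randomly chosen offset."""
    # randint is inclusive at both ends, so the payload may land at the very
    # start, the very end, or anywhere in between.
    offset = rng.randint(0, len(message))
    return message[:offset] + payload + message[offset:]


# Assumed example request: a GET with a 10-byte body.
BASE = b"GET / HTTP/1.1\r\nContent-Length: 10\r\n\r\n" + b"0123456789"
```

Calling inject_at_random(BASE, b"\x00", random.Random()) repeatedly yields a differently corrupted request each time, which is what turns a fixed request template into a crude fuzzer.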

Results

For each bug, I've given a traceback and a minimal pathoc call to trigger the issue. The tracebacks have been edited lightly to shorten file paths and remove irrelevances like timestamps.

CherryPy

pathoc -p 8080 localhost 'get:/:b@10:h"Content-Length"="x"'
ENGINE ValueError("invalid literal for int() with base 10: 'x'",)
Traceback (most recent call last):
 File "cherrypy/wsgiserver/wsgiserver2.py", line 1292, in communicate
 req.parse_request()
 File "cherrypy/wsgiserver/wsgiserver2.py", line 591, in parse_request
 success = self.read_request_headers()
 File "cherrypy/wsgiserver/wsgiserver2.py", line 711, in read_request_headers
 if mrbs and int(self.inheaders.get("Content-Length", 0)) > mrbs:
ValueError: invalid literal for int() with base 10: 'x'

pathoc -p 8080 localhost 'get:/:i4,"\r"'
ENGINE TypeError("argument of type 'NoneType' is not iterable",)
Traceback (most recent call last):
 File "cherrypy/wsgiserver/wsgiserver2.py", line 1292, in communicate
 req.parse_request()
 File "cherrypy/wsgiserver/wsgiserver2.py", line 580, in parse_request
 success = self.read_request_line()
 File "cherrypy/wsgiserver/wsgiserver2.py", line 644, in read_request_line
 if NUMBER_SIGN in path:
TypeError: argument of type 'NoneType' is not iterable

Tornado

pathoc -p 8080 localhost 'get:/:b@10:h"Content-Length"="x"'
[E 120927 11:42:26 iostream:307] Uncaught exception, closing connection.
 Traceback (most recent call last):
 File "tornado/iostream.py", line 304, in wrapper
 callback(*args)
 File "tornado/httpserver.py", line 254, in _on_headers
 content_length = int(content_length)
 ValueError: invalid literal for int() with base 10: 'x'
[E 120927 11:42:26 ioloop:435] Exception in callback <tornado.stack_context._StackContextWrapper object at 0x1012e28e8>
 Traceback (most recent call last):
 File "tornado/ioloop.py", line 421, in _run_callback
 callback()
 File "tornado/iostream.py", line 304, in wrapper
 callback(*args)
 File "tornado/httpserver.py", line 254, in _on_headers
 content_length = int(content_length)
 ValueError: invalid literal for int() with base 10: 'x'

pathoc -p 8080 localhost 'get:/:h"h\r\n"="x"'
[E iostream:307] Uncaught exception, closing connection.
 Traceback (most recent call last):
 File "tornado/iostream.py", line 304, in wrapper
 callback(*args)
 File "tornado/httpserver.py", line 236, in _on_headers
 headers = httputil.HTTPHeaders.parse(data[eol:])
 File "tornado/httputil.py", line 127, in parse
 h.parse_line(line)
 File "tornado/httputil.py", line 113, in parse_line
 name, value = line.split(":", 1)
 ValueError: need more than 1 value to unpack
[E ioloop:435] Exception in callback <tornado.stack_context._StackContextWrapper object at 0x1012bd7e0>
 Traceback (most recent call last):
 File "tornado/ioloop.py", line 421, in _run_callback
 callback()
 File "tornado/iostream.py", line 304, in wrapper
 callback(*args)
 File "tornado/httpserver.py", line 236, in _on_headers
 headers = httputil.HTTPHeaders.parse(data[eol:])
 File "tornado/httputil.py", line 127, in parse
 h.parse_line(line)
 File "tornado/httputil.py", line 113, in parse_line
 name, value = line.split(":", 1)
 ValueError: need more than 1 value to unpack
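
The second traceback comes down to unpacking the result of split(":", 1) on a line that presumably contains no colon. A minimal Python 3 sketch of a more defensive alternative follows; parse_header_line is a hypothetical helper written for this example and is not Tornado's code or its eventual fix.

```python
def parse_header_line(line):
    """Return (name, value) for a 'Name: value' line, or None if malformed.

    Hypothetical helper for illustration only.
    """
    # partition() always returns three parts, so there is nothing to unpack
    # incorrectly; an empty separator means the colon was missing.
    name, sep, value = line.partition(":")
    if not sep or not name.strip():
        return None
    return name.strip(), value.strip()


# By contrast, "h".split(":", 1) returns ["h"], and unpacking that list into
# two names raises ValueError (the message wording differs between Python 2
# and Python 3).
```

Returning None lets the caller reject the request explicitly instead of letting the exception travel up to the I/O loop.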

Twisted

pathoc -p 8080 localhost 'get:/:b@10:h"Content-Length"="x"'
[HTTPChannel,4,127.0.0.1] Unhandled Error
 Traceback (most recent call last):
 File "twisted/python/log.py", line 84, in callWithLogger
 return callWithContext({"system": lp}, func, *args, **kw)
 File "twisted/python/log.py", line 69, in callWithContext
 return context.call({ILogContext: newCtx}, func, *args, **kw)
 File "twisted/python/context.py", line 118, in callWithContext
 return self.currentContext().callWithContext(ctx, func, *args, **kw)
 File "twisted/python/context.py", line 81, in callWithContext
 return func(*args,**kw)
 --- <exception caught here> ---
 File "twisted/internet/selectreactor.py", line 150, in _doReadOrWrite
 why = getattr(selectable, method)()
 File "twisted/internet/tcp.py", line 199, in doRead
 rval = self.protocol.dataReceived(data)
 File "twisted/protocols/basic.py", line 564, in dataReceived
 why = self.lineReceived(line)
 File "twisted/web/http.py", line 1558, in lineReceived
 self.headerReceived(self.__header)
 File "twisted/web/http.py", line 1580, in headerReceived
 self.length = int(data)
 exceptions.ValueError: invalid literal for int() with base 10: 'x'

SimpleHTTP

pathoc -p 8080 localhost 'get:"/\0"'
Exception happened during processing of request from ('127.0.0.1', 54029)
Traceback (most recent call last):
 File "lib/python2.7/SocketServer.py", line 284, in _handle_request_noblock
 self.process_request(request, client_address)
 File "lib/python2.7/SocketServer.py", line 310, in process_request
 self.finish_request(request, client_address)
 File "lib/python2.7/SocketServer.py", line 323, in finish_request
 self.RequestHandlerClass(request, client_address, self)
 File "lib/python2.7/SocketServer.py", line 638, in __init__
 self.handle()
 File "python2.7/BaseHTTPServer.py", line 340, in handle
 self.handle_one_request()
 File "lib/python2.7/BaseHTTPServer.py", line 328, in handle_one_request
 method()
 File "lib/python2.7/SimpleHTTPServer.py", line 44, in do_GET
 f = self.send_head()
 File "lib/python2.7/SimpleHTTPServer.py", line 68, in send_head
 if os.path.isdir(path):
 File "lib/python2.7/genericpath.py", line 41, in isdir
 st = os.stat(s)
TypeError: must be encoded string without NULL bytes, not str
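
Here the NULL byte travels all the way from the request path to os.stat before anything objects. A minimal Python 3 sketch of rejecting such paths before they reach the filesystem is shown below; clean_request_path is a hypothetical helper, not part of SimpleHTTPServer, and it deliberately ignores other path checks such as directory traversal.

```python
from urllib.parse import unquote


def clean_request_path(raw_path):
    """Return the decoded path without its query string, or None if unsafe.

    Hypothetical helper for illustration only.
    """
    # Drop the query string, then decode percent-escapes such as %00.
    path = unquote(raw_path.split("?", 1)[0])
    # Filesystem calls cannot accept embedded NULL bytes, so refuse early.
    if "\x00" in path:
        return None
    return path
```

Checking after decoding matters: a literal NULL byte and the escaped form %00 both end up as the same character in the decoded path.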

Waitress

pathoc -p 8080 localhost 'get:/:i16," "'
ERROR:waitress:uncaptured python exception, closing channel
<waitress.channel.HTTPChannel connected 127.0.0.1:62330 at 0x1007ca310>
(
 <type 'exceptions.IndexError'>:list index out of range
 [lib/python2.7/asyncore.py|read|83]
 [lib/python2.7/asyncore.py|handle_read_event|444]
 [lib/python2.7/site-packages/waitress/channel.py|handle_read|169]
 [lib/python2.7/site-packages/waitress/channel.py|received|186]
 [lib/python2.7/site-packages/waitress/parser.py|received|99]
 [lib/python2.7/site-packages/waitress/parser.py|parse_header|158]
 [lib/python2.7/site-packages/waitress/parser.py|get_header_lines|247]
)

Edit: The first version of this post had examples that were due to the test WSGI application, not waitress. I've replaced them with the traceback above, which has been reformatted for clarity.

Werkzeug

pathoc -p 8080 localhost 'get:/:h"Host"="n\r\0"'
Traceback (most recent call last):
 File "flask/app.py", line 1518, in __call__
 return self.wsgi_app(environ, start_response)
 File "flask/app.py", line 1507, in wsgi_app
 return response(environ, start_response)
 File "/usr/local/lib/python2.7/site-packages/werkzeug/wrappers.py", line 1082, in __call__
 app_iter, status, headers = self.get_wsgi_response(environ)
 File "werkzeug/wrappers.py", line 1070, in get_wsgi_response
 headers = self.get_wsgi_headers(environ)
 File "werkzeug/wrappers.py", line 986, in get_wsgi_headers
 headers['Location'] = location
 File "werkzeug/datastructures.py", line 1132, in __setitem__
 self.set(key, value)
 File "werkzeug/datastructures.py", line 1097, in set
 self._validate_value(_value)
 File "werkzeug/datastructures.py", line 1065, in _validate_value
 raise ValueError('Detected newline in header value. This is '
ValueError: Detected newline in header value. This is a potential security problem
