Commit 2cbf8f0

Revert "Publish websocket blog post (#93)" (#100)
This reverts commit 85d59aa.
1 parent 85d59aa commit 2cbf8f0

1 file changed

+23
-64
lines changed

_posts/2021-09-27-tornado-websockets.md renamed to _drafts/2021-06-13-tornado-websockets.md

Lines changed: 23 additions & 64 deletions
Original file line number | Diff line number | Diff line change
@@ -4,16 +4,20 @@ excerpt: "WebSockets with the Tornado web framework is a simple, robust way to
44
handle streaming data. I walk through a minimal example and discuss why these
55
tools are good for the job."
66
tags:
7-
- python
87
- streaming
98
- tornado
109
- websocket
1110
header:
1211
overlay_image: /assets/images/cool-backgrounds/cool-background8.png
1312
caption: 'Photo credit: [coolbackgrounds.io](https://coolbackgrounds.io/)'
14-
last_modified_at: 2021-09-27
13+
last_modified_at: 2021-06-13
14+
search: false
1515
---
1616

17+
{% if page.noindex == true %}
18+
<meta name="robots" content="noindex">
19+
{% endif %}
20+
1721
A lot of data science and machine learning practice assumes a static dataset,
1822
maybe with some MLOps tooling for rerunning a model pipeline with the freshest
1923
version of the dataset.
@@ -31,20 +35,20 @@ requests with REST endpoints). Of course, Tornado has pretty good support for
3135
WebSockets as well.
3236

3337
In this blog post I'll give a minimal example of using Tornado and WebSockets
34-
to handle streaming data. The toy example I have is one app (`server.py`)
35-
writing samples of a Bernoulli to a WebSocket, and another app (`client.py`)
38+
to handle streaming data. The toy example I have is one app (`transmitter.py`)
39+
writing samples of a Bernoulli to a WebSocket, and another app (`receiver.py`)
3640
listening to the WebSocket and keeping track of the posterior distribution for
3741
a [Beta-Binomial conjugate model](https://eigenfoo.xyz/bayesian-bandits/).
3842
After walking through the code, I'll discuss these tools, and why they're good
3943
choices for working with streaming data.
4044

41-
For another tutorial on this same topic, you can check out [`proft`'s blog
45+
For another good tutorial on this same topic, you can check out [`proft`'s blog
4246
post](https://en.proft.me/2014/05/16/realtime-web-application-tornado-and-websocket/).
4347

44-
## Server
48+
## Transmitter
4549

46-
- When `WebSocketServer` is registered to a REST endpoint (in `main`), it keeps
47-
track of any processes who are listening to that endpoint, and pushes
50+
- When `WebSocketHandler` is registered to a REST endpoint (on line 44), it
51+
keeps track of any processes who are listening to that endpoint, and pushes
4852
messages to them when `send_message` is called.
4953
* Note that `clients` is a class variable, so `send_message` is a class
5054
method.
@@ -56,20 +60,12 @@ post](https://en.proft.me/2014/05/16/realtime-web-application-tornado-and-websoc
5660
case. For example, you could watch a file for any modifications using
5761
[`watchdog`](https://pythonhosted.org/watchdog/), and dump the changes into
5862
the WebSocket.
59-
- The [`websocket_ping_interval` and `websocket_ping_timeout` arguments to
60-
`tornado.Application`](https://www.tornadoweb.org/en/stable/web.html?highlight=websocket_ping#tornado.web.Application.settings)
61-
configure periodic pings of WebSocket connections, keeping connections alive
62-
and allowing dropped connections to be detected and closed.
63-
- It's also worth noting that there's a
64-
[`tornado.websocket.WebSocketHandler.websocket_max_message_size`](https://www.tornadoweb.org/en/stable/websocket.html?highlight=websocket_max_message_size#tornado.websocket.WebSocketHandler)
65-
attribute. While this is set to a generous 10 MiB, it's important that the
66-
WebSocket messages don't exceed this limit!
6763

68-
<script src="https://gist.github.com/eigenfoo/22f46166fa6924d684d68ca06e08b055.js"></script>
64+
<script src="https://gist.github.com/eigenfoo/cb07fe6f026d544b013b29143e125a38.js"></script>
6965

70-
## Client
66+
## Receiver
7167

72-
- `WebSocketClient` is a class that:
68+
- `WebSocketReceiver` is a class that:
7369
1. Can be `start`ed and `stop`ped to connect/disconnect to the WebSocket and
7470
start/stop listening to it in a separate thread
7571
2. Can process every message (`on_message`) it hears from the WebSocket: in
@@ -78,39 +74,17 @@ post](https://en.proft.me/2014/05/16/realtime-web-application-tornado-and-websoc
7874
but this processing could theoretically be anything. For example, you
7975
could do some further processing of the message and then dump that into a
8076
separate WebSocket for other apps (or even users!) to subscribe to.
81-
- To connect to the WebSocket, we need to use a WebSocket library: thankfully
82-
Tornado has a built-in WebSocket functionality (`tornado.websocket`), but
83-
we're also free to use other libraries such as the creatively named
84-
[`websockets`](https://github.com/aaugustin/websockets) or
77+
- To connect to the WebSocket, we need to use a WebSocket client, such as the
78+
creatively named
8579
[`websocket-client`](https://github.com/websocket-client/websocket-client).
86-
- Note that we run `on_message` on the same thread as we run
87-
`connect_and_read`. This isn't a problem so long as `on_message` is fast
88-
enough, but a potentially wiser choice would be to offload `connect_and_read`
89-
to a separate thread by instantiating a
90-
[`concurrent.futures.ThreadPoolExecutor`](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor)
91-
and calling
92-
[`tornado.ioloop.IOLoop.run_in_executor`](https://www.tornadoweb.org/en/stable/ioloop.html#tornado.ioloop.IOLoop.run_in_executor),
93-
so as not to block the thread where the `on_message` processing happens.
94-
- The `io_loop` instantiated in `main` (as well as in `server.py`) is
95-
important: it's how Tornado schedules tasks (a.k.a. _callbacks_) for delayed
80+
- Note that we run `read` in a separate thread, so as not to block the main
81+
thread (where the `on_message` processing happens).
82+
- The `io_loop` instantiated on line 50 (as well as in `transmitter.py`) is
83+
important - it's how Tornado schedules tasks (a.k.a. _callbacks_) for delayed
9684
(a.k.a. _asynchronous_) execution. To add a callback, we simply call
9785
`io_loop.add_callback()`.
98-
- The [`ping_interval` and `ping_timeout` arguments to
99-
`websocket_connect`](https://www.tornadoweb.org/en/stable/websocket.html?highlight=ping_#tornado.websocket.websocket_connect)
100-
configure periodic pings of the WebSocket connection, keeping connections
101-
alive and allowing dropped connections to be detected and closed.
102-
- The `callback=self.maybe_retry_connection` is [run on a future
103-
`WebSocketClientConnection`](https://github.com/tornadoweb/tornado/blob/1db5b45918da8303d2c6958ee03dbbd5dc2709e9/tornado/websocket.py#L1654-L1655).
104-
Here, we simply get the `future.result()` (i.e. the WebSocket client
105-
connection itself) — I don't actually do anything with the `self.connection`,
106-
but you could if you wanted. In the event of an exception while doing that,
107-
we assume there's a problem with the WebSocket connection and retry
108-
`connect_and_read` after 3 seconds. This all has the effect of recovering
109-
gracefully if the WebSocket is dropped or `server.py` experiences a brief
110-
outage for whatever reason (both of which are probably inevitable for
111-
long-running apps using WebSockets).
112-
113-
<script src="https://gist.github.com/eigenfoo/341f6c6c578d34120bccc4229e434377.js"></script>
86+
87+
<script src="https://gist.github.com/eigenfoo/a693b67167c775f7fe67329f3797595d.js"></script>
11488

11589
## Why Tornado?
11690

@@ -153,21 +127,6 @@ SSE)](https://www.smashingmagazine.com/2018/02/sse-websockets-data-flow-http2/):
153127
it seems to be a cleaner protocol for unidirectional data flow, which is really
154128
all that we need.
155129

156-
Additionally, [Armin
157-
Ronacher](https://lucumr.pocoo.org/2012/9/24/websockets-101/) has a much
158-
starker view of WebSockets, seeing no value in using WebSockets over TCP/IP
159-
sockets for this application:
160-
161-
> Websockets make you sad. [...] Websockets are complex, way more complex than I
162-
> anticipated. I can understand that they work that way but I definitely don't
163-
> see a value in using websockets instead of regular TCP connections if all you
164-
> want is to exchange data between different endpoints and neither is a browser.
165-
166-
My thought after reading these criticisms is that perhaps WebSockets aren't the
167-
ideal technology for handling streaming data (from a maintainability or
168-
architectural point of view), but that doesn't mean that they aren't good
169-
scalable technologies when they do work.
170-
171130
---
172131

173132
[^1]: There is [technically a difference](https://sqlstream.com/real-time-vs-streaming-a-short-explanation/) between "real-time" and "streaming": "real-time" refers to data that comes in as it is created, whereas "streaming" refers to a system that processes data continuously. You stream your TV show from Netflix, but since the show was created long before you watched it, you aren't viewing it in real-time.
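
The post's toy example has the receiving app track the posterior of a Beta-Binomial conjugate model as Bernoulli samples arrive over the WebSocket. The gists themselves are not reproduced in this diff, so the following is only a minimal sketch of what that per-message update could look like. It assumes a uniform Beta(1, 1) prior and messages that arrive as the strings `"0"` or `"1"`. The `BetaPosterior` class and the message format are hypothetical and not taken from the author's code.

```python
from dataclasses import dataclass


@dataclass
class BetaPosterior:
    """Beta posterior over a Bernoulli success probability.

    Hypothetical helper for illustration; the prior Beta(1, 1) is an assumption.
    """

    alpha: float = 1.0  # prior pseudo-count of successes
    beta: float = 1.0   # prior pseudo-count of failures

    def update(self, sample: int) -> None:
        """Conjugate update: a success increments alpha, a failure increments beta."""
        if sample not in (0, 1):
            raise ValueError(f"expected a Bernoulli sample (0 or 1), got {sample!r}")
        self.alpha += sample
        self.beta += 1 - sample

    @property
    def mean(self) -> float:
        """Posterior mean of the success probability."""
        return self.alpha / (self.alpha + self.beta)


# Stand-in for messages heard on the WebSocket (assumed to be "0"/"1" strings).
posterior = BetaPosterior()
for message in ["1", "0", "1", "1"]:
    posterior.update(int(message))

print(posterior.alpha, posterior.beta, posterior.mean)
```

In a receiver like the one described above, the body of the loop is what an `on_message` handler would run for each message. Because the update is constant-time, it is cheap enough to run once per message.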
