---
title: Some Python asyncio disambiguation
author: Chris Hodapp
date: March 9, 2018
tags:
- technobabble
---
# TODO: Generators? Is it accurate that prior to all this, coroutines
# were still available, but by themselves they offered no way to
# perform anything in the background?
Recently I needed to work a little more in-depth with Python 3's
[[https://docs.python.org/3/library/asyncio.html][asyncio]]. On the one hand, some people (like me) might scoff at this
because it's just green threads and cooperative threading is a model
that's fresh out of the '90s, and Python /still/ has the [[https://wiki.python.org/moin/GlobalInterpreterLock][GIL]] - and
because Elixir, Erlang, Haskell, [[https://github.com/clojure/core.async/][Clojure]] (also [[http://blog.paralleluniverse.co/2013/05/02/quasar-pulsar/][this]]), [[http://docs.paralleluniverse.co/quasar/][Java/Kotlin]], and
Go all handle async and M:N threading fine, and have for years. The
Python folks have their own set of complaints, like
[[http://lucumr.pocoo.org/2016/10/30/i-dont-understand-asyncio/][I don't understand Python's Asyncio]] and
[[http://jordanorelli.com/post/31533769172/why-i-went-from-python-to-go-and-not-nodejs][Why I went from Python to Go (and not node.js)]].
At least it is in good company [[https://nullprogram.com/blog/2018/05/31/#threads][with Emacs still]].
On the other hand, it's still a useful enough paradigm that it's in
the works for [[https://doc.rust-lang.org/nightly/unstable-book/language-features/generators.html][Rust]] (sort of... it had green threads which were removed
in favor of a lighter approach) and broadly the [[http://cr.openjdk.java.net/~rpressler/loom/Loom-Proposal.html][JVM]] (sort
of... they're trying to do [[https://en.wikipedia.org/wiki/Fiber_(computer_science)][fibers]], not green threads). [[https://github.com/libuv/libuv][libuv]] brings
something very similar to various languages, including C, and C
already has an asyncio imitator with [[https://github.com/AndreLouisCaron/libgreen][libgreen]]. Speaking of C, did
anyone know that GLib has some decent support here via things like
[[https://developer.gnome.org/gio/stable/GTask.html][GTask]], [[https://developer.gnome.org/glib/stable/glib-Thread-Pools.html][GThreadPool]], and [[https://developer.gnome.org/glib/stable/glib-Asynchronous-Queues.html][GAsyncQueue]]? I didn't until recently. But I
digress...
asyncio is still preferable to manually writing code in
[[https://en.wikipedia.org/wiki/Continuation-passing_style][continuation-passing-style]] (as that's all callbacks are, and last time
I had to write that many callbacks, I hated it enough that I [[https://haskellembedded.github.io/posts/2016-09-23-introducing-ion.html][added
features to my EDSL]] to avoid it), it's still preferable to a lot of
manual arithmetic on timer values to try to schedule things, and it's
still preferable to doing blocking I/O all over the place and trying
to escape it with other processes. Coroutines are also preferable to
yet another object-oriented train-wreck when it comes to handling
things like pipelines. While Python's had coroutines for quite a while
now, asyncio perhaps makes them a little more obvious. [[http://www.dabeaz.com/coroutines/Coroutines.pdf][David
Beazley's slides]] are excellent for explaining its earlier coroutine
support.
I found the [[https://pymotw.com/3/concurrency.html][Concurrency with Processes, Threads, and Coroutines]]
tutorials to be an excellent overview of Python's asyncio, as well as
most ways of handling concurrency in Python, and I highly recommend
them.
However, I still had a few stumbling blocks in understanding, and
below I give some notes I wrote to check my understanding. I put
together a table to try to classify what method to use in different
circumstances. As I use it here, calling "now" means turning control
over to some other code and waiting for it, whereas calling "whenever"
means retaining control but queuing up some code to be run in the
background asynchronously (as much as possible).
|-----------+-----------+-----------------------+-----------------------------------------------|
| Call from | Call to | When/where | How |
|-----------+-----------+-----------------------+-----------------------------------------------|
| Either | Function | Now, same thread | Normal function call |
| Function | Coroutine | Now, same thread | ~.run_*~ in event loop |
| Coroutine | Coroutine | Now, same thread | ~await~ |
| Either | Function | Whenever, same thread | Event loop ~.call_*()~ |
| Either | Coroutine | Whenever, same thread | Event loop ~.create_task()~ |
| | | | ~asyncio.ensure_future()~ |
| Either | Function | Now, another thread | ~.run_in_executor()~ on ~ThreadPoolExecutor~ |
| Either | Function | Now, another process | ~.run_in_executor()~ on ~ProcessPoolExecutor~ |
|-----------+-----------+-----------------------+-----------------------------------------------|
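To make the same-thread rows of the table concrete, here is a minimal sketch. The names ~plain~, ~inner~, ~outer~ and ~run_demo~ are made up for illustration, and it assumes Python 3.7+ for ~asyncio.get_running_loop()~.

```python
import asyncio


def plain(x):
    # Ordinary function: "Now, same thread" is just a normal call.
    return x + 1


async def inner(x):
    await asyncio.sleep(0)  # yield to the event loop once
    return x * 2


async def outer(log):
    loop = asyncio.get_running_loop()

    # Coroutine -> coroutine, now: await
    log.append(("await", await inner(2)))

    # Either -> function, whenever: one of the loop's .call_*() methods
    loop.call_soon(lambda: log.append(("call_soon", plain(1))))

    # Either -> coroutine, whenever: .create_task() (or asyncio.ensure_future())
    task = loop.create_task(inner(5))

    # We still have control here; neither the callback nor the task has run yet.
    log.append(("scheduled", None))

    # Awaiting the task hands control back to the loop until it finishes.
    log.append(("task", await task))
    return log


def run_demo():
    # Function -> coroutine, now: .run_until_complete() on an event loop
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(outer([]))
    finally:
        loop.close()
```

The "scheduled" entry lands in the log before the ~call_soon~ callback's entry, which is the whole point of "whenever": queuing work does not give up control.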
* Futures & Coroutines
The documentation was also sometimes vague on the relation between
coroutines and futures. My summary on what I figured out is below.
** Python already had generator-based coroutines.
Python now has a language feature it refers to as "coroutines" in
asyncio (and in calls like ~asyncio.iscoroutine()~), but Python 2.5
already supported a similar-but-not-entirely-the-same form of
coroutine, and even earlier versions had a limited form via generators. See [[https://www.python.org/dev/peps/pep-0342/][PEP
342]] and [[http://www.dabeaz.com/coroutines/Coroutines.pdf][Beazley's slides]].
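For reference, a generator-based coroutine in the PEP 342 style looks roughly like the sketch below. The ~running_average~ example is my own illustration, not something taken from the PEP or the slides.

```python
def running_average():
    """Generator-based coroutine: receives values via send(), yields the mean so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        # `yield` both hands the current average out and receives the next value.
        value = yield average
        total += value
        count += 1
        average = total / count


def demo():
    avg = running_average()
    next(avg)  # "prime" the coroutine: run it up to the first yield
    return [avg.send(v) for v in (10, 20, 60)]
```

Note that nothing here runs in the background: the caller drives the coroutine explicitly with ~send()~, and no event loop is involved.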
** Coroutines and Futures are *mostly* independent.
It just happens that both allow you to call things asynchronously.
However, you can use coroutines/asyncio without ever touching a
Future. Likewise, you can use a Future without ever touching a
coroutine or asyncio. Note that a Future's ~.result()~ call isn't a
coroutine.
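A small sketch of both halves of that claim, with hypothetical helper names (~square~, ~add~, and so on). The first function uses a Future with no asyncio at all; the second uses coroutines without ever naming a Future, though the event loop may still wrap things in one internally.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def future_without_asyncio():
    # A Future with no coroutine or event loop anywhere in sight.
    with ThreadPoolExecutor(max_workers=1) as pool:
        fut = pool.submit(square, 7)
        # .result() is an ordinary blocking method call, not something to await.
        return fut.result()


async def add(a, b):
    await asyncio.sleep(0)
    return a + b


async def coroutines_only():
    # Coroutines awaiting coroutines; no Future appears in this code.
    first = await add(1, 2)
    second = await add(3, 4)
    return first + second


def coroutines_without_explicit_future():
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coroutines_only())
    finally:
        loop.close()
```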
** They can still encapsulate each other.
A coroutine can encapsulate a Future simply by using ~await~ on it.
A Future can encapsulate a coroutine with [[https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future][asyncio.ensure\_future()]] or
the event loop's [[https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.AbstractEventLoop.create_task][.create\_task()]].
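Both directions can be sketched in a few lines. The names ~compute~, ~main~ and ~run_demo~ are mine, and this assumes Python 3.7+ for ~asyncio.get_running_loop()~.

```python
import asyncio


async def compute():
    await asyncio.sleep(0)
    return 42


async def main():
    loop = asyncio.get_running_loop()

    # Coroutine encapsulating a Future: create a bare Future, arrange for
    # something else to fill it in later, and simply await it.
    fut = loop.create_future()
    loop.call_soon(fut.set_result, "from future")
    from_future = await fut

    # Future encapsulating a coroutine: ensure_future() wraps the coroutine
    # in a Task, which is itself a kind of asyncio.Future.
    task = asyncio.ensure_future(compute())
    is_future = isinstance(task, asyncio.Future)
    from_task = await task

    return from_future, is_future, from_task


def run_demo():
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(main())
    finally:
        loop.close()
```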
** Futures can implement asynchrony differently.
The ability to make a Future from a coroutine was mentioned above;
that's [[https://docs.python.org/3/library/asyncio-task.html#task][asyncio.Task]], a subclass of [[https://docs.python.org/3/library/asyncio-task.html#future][asyncio.Future]], but it's not
the only way to make a Future.
[[https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Future][concurrent.futures.Future]] provides other mostly-compatible ways. In the same module,
[[https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor][ThreadPoolExecutor]] provides Futures based on separate threads, and
[[https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor][ProcessPoolExecutor]] provides Futures based on separate processes.
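Here is a rough sketch of thread-backed Futures meeting asyncio, using a hypothetical ~blocking_work~ function as a stand-in for real blocking code. I stick to ~ThreadPoolExecutor~ here; as far as I can tell ~ProcessPoolExecutor~ can be swapped in, provided the function and its arguments can be pickled.

```python
import asyncio
import concurrent.futures
import threading


def blocking_work(n):
    # Stand-in for blocking I/O; also reports which thread ran it.
    return n * 2, threading.get_ident()


async def main(pool):
    loop = asyncio.get_running_loop()

    # run_in_executor() runs the function in the pool and hands back
    # something awaitable from the event loop's side.
    doubled, worker_id = await loop.run_in_executor(pool, blocking_work, 21)

    # submit() returns a concurrent.futures.Future, which is not awaitable
    # by itself; asyncio.wrap_future() bridges it into asyncio.
    cf = pool.submit(blocking_work, 5)
    wrapped, _ = await asyncio.wrap_future(cf, loop=loop)

    ran_elsewhere = worker_id != threading.get_ident()
    return doubled, wrapped, ran_elsewhere, isinstance(cf, concurrent.futures.Future)


def run_demo():
    loop = asyncio.new_event_loop()
    try:
        with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
            return loop.run_until_complete(main(pool))
    finally:
        loop.close()
```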
** Futures are always paired with some running context.
That is, a Future is already "started" - running, or scheduled to run,
or already ran, or something along those lines, and this is why it has
semantics for things like cancellation.
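A coroutine by itself is not: it does nothing until something schedules or awaits it. The closest analogue is [[https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.Handle][asyncio.Handle]],
which likewise exists only once something has been scheduled to run, though the event loop returns it for callbacks queued with ~.call_*()~; a scheduled coroutine is wrapped in an ~asyncio.Task~ instead.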
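The difference shows up if you create a coroutine object and look at what has happened before and after handing it to the event loop. This is a minimal sketch with made-up names (~worker~, ~main~, ~run_demo~):

```python
import asyncio


async def worker(log):
    log.append("started")
    await asyncio.sleep(10)
    log.append("finished")


async def main():
    log = []

    coro = worker(log)       # just a coroutine object; none of its body has run
    before = list(log)

    task = asyncio.ensure_future(coro)  # now it is paired with the running loop
    await asyncio.sleep(0)              # give the loop a chance to start it
    after = list(log)

    # Because the Task is "started", cancelling it is meaningful.
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass

    return before, after, task.cancelled(), log


def run_demo():
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(main())
    finally:
        loop.close()
```

The worker never reaches "finished", since it was cancelled while parked in its ~sleep~.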
* Other Event Loops
[[https://pypi.python.org/pypi/Quamash][Quamash]] implements an asyncio event loop inside of Qt, and I used this
on a project. I ran into many issues with this combination. Qt's
juggling of multiple event loops seemed to cause many problems here,
and I still have some unsolved issues in which calls to
~run_until_complete~ cause coroutines to die early with an exception
because the event loop appears to have died. This came up regularly
for me because of how often I would want a Qt slot to queue a task in
the background, and it seems this is an acknowledged [[https://github.com/harvimt/quamash/issues/33][issue]].
There is also [[https://github.com/MagicStack/uvloop][uvloop]]. I presently have no need for extra performance
(nor could I really use it alongside Qt), but it's helpful to know
about.
* Other References
There are a few pieces of "official" documentation that can be good
references as well:
- [[https://www.python.org/dev/peps/pep-0492/][PEP 492 - Coroutines with async and await syntax]]
- [[https://www.python.org/dev/peps/pep-0525/][PEP 525 - Asynchronous Generators]]
- [[https://www.python.org/dev/peps/pep-3156/][PEP 3156 - Asynchronous IO Support Rebooted: the "asyncio" Module]]
[[https://www.python.org/dev/peps/pep-0342/][PEP 342]] and [[https://www.python.org/dev/peps/pep-0380/][PEP 380]] are relevant too.