---
title: Some Python asyncio disambiguation
author: Chris Hodapp
date: March 9, 2018
tags: technobabble
---
# TODO: Generators? Is it accurate that prior to all this, coroutines
# were still available, but by themselves they offered no way to
# perform anything in the background?
Recently I needed to work a little more in-depth with Python 3's
[[https://docs.python.org/3/library/asyncio.html][asyncio]]. On the one hand, some people (including me) might scoff at
this because it's just green threads and cooperative threading is a
model that's fresh out of the '90s, and Python /still/ has the [[https://wiki.python.org/moin/GlobalInterpreterLock][GIL]] -
and because Elixir and Erlang and Haskell and [[http://blog.paralleluniverse.co/2013/05/02/quasar-pulsar/][Clojure]] and [[http://docs.paralleluniverse.co/quasar/][Java/Kotlin]]
have handled async and M:N threading fine. However, it's still a
useful enough paradigm that it's already in C via [[https://github.com/libuv/libuv][libuv]], and it's in
the works for [[https://doc.rust-lang.org/nightly/unstable-book/language-features/generators.html][Rust]] (sort of... it had green threads which were removed
in favor of a lighter approach) and the [[http://cr.openjdk.java.net/~rpressler/loom/Loom-Proposal.html][JVM]] (sort of... they're trying
to do [[https://en.wikipedia.org/wiki/Fiber_(computer_science)][fibers]], not green threads). The Python folks have their own set
of complaints, like [[http://lucumr.pocoo.org/2016/10/30/i-dont-understand-asyncio/][I don't understand Python's Asyncio]].

On the other hand, asyncio is still preferable to manually writing
code in [[https://en.wikipedia.org/wiki/Continuation-passing_style][continuation-passing-style]] (as that's all callbacks are, and
last time I had to write that many callbacks, I hated it enough that I
[[https://haskellembedded.github.io/posts/2016-09-23-introducing-ion.html][added features to my EDSL]] to avoid it), it's still preferable to a lot
of manual arithmetic on timer values to try to schedule things, and
it's still preferable to doing blocking I/O all over the place and
trying to escape it with other processes.

I found the [[https://pymotw.com/3/concurrency.html][Concurrency with Processes, Threads, and Coroutines]]
tutorials to be approachable and thorough, and I highly recommend
them.

However, I still had a few stumbling blocks in understanding, and
below I give some notes I wrote to check my understanding. I put
together a table to try to classify what method to use in different
circumstances. As I use it here, calling "now" means turning control
over to some other code, whereas calling "whenever" means retaining
control but queuing up some code to be run in the background
asynchronously (as much as possible).
|-----------+-----------+-----------------------+-----------------------------------------------|
| Call from | Call to | When/where | How |
|-----------+-----------+-----------------------+-----------------------------------------------|
| Either | Function | Now, same thread | Normal function call |
| Function | Coroutine | Now, same thread | ~.run_*~ in event loop |
| Coroutine | Coroutine | Now, same thread | ~await~ |
| Either | Function | Whenever, same thread | Event loop ~.call_*()~ |
| Either | Coroutine | Whenever, same thread | Event loop ~.create_task()~ |
| | | | ~asyncio.ensure_future()~ |
| Either | Function | Now, another thread | ~.run_in_executor()~ on ~ThreadPoolExecutor~ |
| Either | Function | Now, another process | ~.run_in_executor()~ on ~ProcessPoolExecutor~ |
|-----------+-----------+-----------------------+-----------------------------------------------|
# TODO: How do I make Pandoc render this table better? It's hardly
# usable right now because you can't see where a column starts and
# ends
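
As a rough illustration of the table, here is a minimal sketch that
exercises several of its rows in one script. The names
~plain_function~, ~child~ and ~parent~ are made up for this example,
and the script creates its own loop with ~asyncio.new_event_loop()~
rather than assuming one already exists.

#+BEGIN_SRC python
import asyncio
from concurrent.futures import ThreadPoolExecutor

log = []

def plain_function(tag):
    # An ordinary (non-coroutine) function.
    log.append(tag)
    return tag

async def child():
    await asyncio.sleep(0)  # give control back to the loop once
    log.append("child")
    return "child result"

async def parent(loop, pool):
    # Coroutine -> coroutine, now: await
    result = await child()
    # Either -> function, whenever: event loop .call_soon()
    loop.call_soon(plain_function, "call_soon")
    # Either -> coroutine, whenever: event loop .create_task()
    task = loop.create_task(child())
    # Either -> function, another thread: .run_in_executor()
    threaded = await loop.run_in_executor(pool, plain_function, "thread")
    await task
    return result, threaded

# Function -> coroutine, now: .run_until_complete() on the event loop
loop = asyncio.new_event_loop()
try:
    with ThreadPoolExecutor(max_workers=1) as pool:
        outcome = loop.run_until_complete(parent(loop, pool))
finally:
    loop.close()
#+END_SRC

The relative order of "thread" and "call_soon" in ~log~ is not fixed,
since one of them is appended from another thread.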
* Futures & Coroutines
The documentation was also sometimes vague on the relation between
coroutines and futures. My summary on what I figured out is below.
** Coroutines and Futures are *mostly* independent.
It just happens that both allow you to call things asynchronously.
However, you can use coroutines/asyncio without ever touching a
Future. Likewise, you can use a Future without ever touching a
coroutine or asyncio. Note that a Future's ~.result()~ is a plain
method, not a coroutine.
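
A small sketch of that independence, under the assumption that a
~ThreadPoolExecutor~ is an acceptable source for the Future; ~double~
and ~main~ are hypothetical names. The first half uses no asyncio at
all, and the second half never mentions a Future explicitly (although
the event loop does wrap ~main()~ in a Task internally).

#+BEGIN_SRC python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# A Future with no coroutine and no asyncio involved.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(pow, 2, 10)
    value = future.result()  # plain blocking method call, not awaited

# Coroutines with no explicit Future anywhere in the code.
async def double(x):
    return 2 * x

async def main():
    return await double(21)

loop = asyncio.new_event_loop()
try:
    doubled = loop.run_until_complete(main())
finally:
    loop.close()
#+END_SRC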
** They can still encapsulate each other.
A coroutine can encapsulate a Future simply by using ~await~ on it.
A Future can encapsulate a coroutine with [[https://docs.python.org/3/library/asyncio-task.html#asyncio.ensure_future][asyncio.ensure\_future()]] or
the event loop's [[https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.AbstractEventLoop.create_task][.create\_task()]].
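
Both directions can be sketched in a few lines. The names ~compute~
and ~main~ are invented for the example, and the bare Future is
created with the loop's ~.create_future()~ purely for demonstration.

#+BEGIN_SRC python
import asyncio

async def compute():
    await asyncio.sleep(0)
    return "from coroutine"

async def main(loop):
    # A coroutine encapsulating a Future: just await it.
    future = loop.create_future()
    loop.call_soon(future.set_result, "from future")
    first = await future

    # A Future encapsulating a coroutine: ensure_future() returns a
    # Task, which is itself an asyncio.Future.
    task = asyncio.ensure_future(compute())
    is_future = isinstance(task, asyncio.Future)
    second = await task
    return first, second, is_future

loop = asyncio.new_event_loop()
try:
    result = loop.run_until_complete(main(loop))
finally:
    loop.close()
#+END_SRC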
** Futures can implement asynchrony differently
The ability to make a Future from a coroutine was mentioned above;
that's [[https://docs.python.org/3/library/asyncio-task.html#task][asyncio.Task]], a subclass of [[https://docs.python.org/3/library/asyncio-task.html#future][asyncio.Future]], but it's not
the only way to make a Future.

The ~concurrent.futures~ module provides [[https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Future][concurrent.futures.Future]], a
separate but mostly-compatible kind of Future. The module's
[[https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor][ThreadPoolExecutor]] provides Futures based on separate threads, and its
[[https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor][ProcessPoolExecutor]] provides Futures based on separate processes.
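
Here is a hedged sketch of thread-backed Futures, and of two ways to
get from them back into a coroutine: ~asyncio.wrap_future()~ and the
loop's ~.run_in_executor()~. The function ~thread_name~ is a made-up
helper. I use only ~ThreadPoolExecutor~ here; swapping in
~ProcessPoolExecutor~ should look much the same, assuming the function
and its arguments can be pickled.

#+BEGIN_SRC python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

def thread_name():
    # Report which thread actually ran this function.
    return threading.current_thread().name

caller_name = threading.current_thread().name

async def main(loop, pool):
    # A concurrent.futures.Future cannot be awaited directly;
    # asyncio.wrap_future() turns it into an asyncio.Future.
    cf_future = pool.submit(thread_name)
    via_wrap = await asyncio.wrap_future(cf_future, loop=loop)
    # run_in_executor() submits and wraps in a single step.
    via_executor = await loop.run_in_executor(pool, thread_name)
    return via_wrap, via_executor

loop = asyncio.new_event_loop()
try:
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Plain use, with no asyncio involved at all.
        worker_name = pool.submit(thread_name).result()
        names = loop.run_until_complete(main(loop, pool))
finally:
    loop.close()
#+END_SRC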
** Futures are always paired with some running context.
That is, a Future is already "started" - running, or scheduled to run,
or already ran, or something along those lines, and this is why it has
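semantics for things like cancellation.

A coroutine by itself is not: calling a coroutine function only creates
a coroutine object, and nothing runs until it is awaited or scheduled.
The closest analogue is [[https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.Handle][asyncio.Handle]], which the event loop returns
only once a callback has been scheduled with one of its ~.call_*()~
methods; a coroutine scheduled with ~.create_task()~ gets a Task instead.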
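
The difference can be sketched as follows. The names ~work~ and ~ran~
are hypothetical, and in this sketch the ~asyncio.Handle~ comes from
scheduling a plain callback with ~.call_soon()~.

#+BEGIN_SRC python
import asyncio

ran = []

async def work():
    ran.append("work")

async def main(loop):
    coro = work()                  # only creates a coroutine object
    before = list(ran)             # nothing has run yet
    task = loop.create_task(coro)  # now paired with a running context
    await task
    after = list(ran)

    # call_soon() hands back a Handle that can cancel the callback.
    handle = loop.call_soon(ran.append, "callback")
    handle.cancel()
    await asyncio.sleep(0)         # let the loop run one iteration
    return before, after, list(ran), isinstance(handle, asyncio.Handle)

loop = asyncio.new_event_loop()
try:
    result = loop.run_until_complete(main(loop))
finally:
    loop.close()
#+END_SRC

The cancelled callback never runs, so "callback" never shows up in
~ran~.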
* Other Event Loops
[[https://pypi.python.org/pypi/Quamash][Quamash]] implements an asyncio event loop inside of Qt, and I used this
on a project. I ran into many issues with this combination. Qt's
juggling of multiple event loops seemed to cause many problems here,
and I still have some unsolved issues in which calls to
~run_until_complete~ cause coroutines to die early with an exception
because the event loop appears to have died. This came up regularly
for me because of how often I would want a Qt slot to queue a task in
the background, and it seems this is an acknowledged [[https://github.com/harvimt/quamash/issues/33][issue]].

There is also [[https://github.com/MagicStack/uvloop][uvloop]]. I presently have no need for extra performance
(nor could I really use it alongside Qt), but it's helpful to know
about.
* Other References
There are a couple of pieces of "official" documentation that can be good
references as well:
- [[https://www.python.org/dev/peps/pep-0492/][PEP 492 - Coroutines with async and await syntax]]
- [[https://www.python.org/dev/peps/pep-0525/][PEP 525 - Asynchronous Generators]]
- [[https://www.python.org/dev/peps/pep-3156/][PEP 3156 - Asynchronous IO Support Rebooted: the "asyncio" Module]]
[[https://www.python.org/dev/peps/pep-0342/][PEP 342]] and [[https://www.python.org/dev/peps/pep-0380/][PEP 380]] are relevant too.