#+TITLE: Dataflow paradigm (working title)
#+AUTHOR: Chris Hodapp
#+DATE: December 12, 2017
#+TAGS: technobabble

There is a sort of parallel between the declarative nature of
computational graphs in TensorFlow and functional programming
(possibly function-level programming: think of the J language and how
central rank is to its computations).
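
A rough sketch of what I mean by the function-level side, in plain
Python rather than J.  The helpers =compose= and =each= are names I
made up for illustration, and =each= is only a crude stand-in for
what rank does in J.  The point is that the pipeline is built purely
by combining functions, and the data shows up only at the very end:

#+BEGIN_SRC python
from functools import reduce


def compose(*fns):
    """Return a function that applies fns left to right."""
    return lambda x: reduce(lambda acc, f: f(acc), fns, x)


def each(fn):
    """Lift a function on items to a function on lists.

    Hypothetical helper: a very loose analogue of applying a verb
    at a lower rank in J.
    """
    return lambda xs: [fn(x) for x in xs]


def double(x):
    return 2 * x


# The whole computation is described without naming any data...
pipeline = compose(each(double), sum)

# ...and only runs once data is handed to it.
total = pipeline([1, 2, 3])
#+END_SRC

Like a computational graph, =pipeline= is a description of work to be
done, separate from the data it is eventually applied to.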

Apache Spark and TensorFlow are similar in a lot of ways.  The key
difference I see is that Spark works internally with types of data
suited to databases, records, tables, and relational data in general,
while TensorFlow works with, well, tensors (arrays of arbitrary
dimension).

The interesting part to me with both of these is how they've moved
"bulk" computations into first-class objects (ish) and permitted some
level of introspection into them before they run, as they run, and
after they run.  As I noted in Notes - Paper, 2016-11-13, "One
interesting (to me) facet is how the computation process has been
split out and instrumented enough to allow some meaningful
introspection with it.  It hasn't precisely made it a first-class
construct, but still, this feature pervades all of Spark's major
abstractions (RDD, DataFrame, Dataset)."
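
To make the idea concrete, here is a toy sketch in plain Python.  It
is not how Spark or TensorFlow are implemented, and every name in it
(=Node=, =const=, =describe=, =run=) is made up for illustration.  It
only shows the shape of the thing: the computation is built as an
object first, can be inspected before it runs, and can be
instrumented while it runs.

#+BEGIN_SRC python
class Node:
    """One step in a deferred computation; building it runs nothing."""

    def __init__(self, op, inputs=(), fn=None, value=None):
        self.op = op                # human-readable operation name
        self.inputs = tuple(inputs)
        self.fn = fn                # applied to the evaluated inputs
        self.value = value          # only used by constant nodes

    def __add__(self, other):
        return Node("add", (self, other), fn=lambda a, b: a + b)

    def __mul__(self, other):
        return Node("mul", (self, other), fn=lambda a, b: a * b)


def const(value):
    return Node("const", value=value)


def describe(node, depth=0):
    """Before running: return the plan as a list of indented lines."""
    if node.op == "const":
        label = "const({!r})".format(node.value)
    else:
        label = node.op
    lines = ["  " * depth + label]
    for child in node.inputs:
        lines.extend(describe(child, depth + 1))
    return lines


def run(node, trace=None):
    """Evaluate the graph, optionally recording each step as it runs."""
    if node.op == "const":
        return node.value
    args = [run(child, trace) for child in node.inputs]
    result = node.fn(*args)
    if trace is not None:
        trace.append((node.op, args, result))
    return result


# Building the graph computes nothing yet.
graph = (const(2) + const(3)) * const(4)

plan = describe(graph)      # introspection before the run
trace = []
result = run(graph, trace)  # trace is filled in during the run
#+END_SRC

Here =plan= plays roughly the role of a query plan or graph
visualization, and =trace= stands in for the kind of information you
can examine after the run.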

# Show Tensorboard example here
# Screenshots may be a good idea too

Spark does this with database-style, relational data.  TensorFlow does
it with numerical calculations.  Node-RED does it with irregular,
asynchronous data.