#+TITLE: Dataflow paradigm (working title)
#+AUTHOR: Chris Hodapp
#+DATE: December 12, 2017
#+TAGS: technobabble

There is a sort of parallel between the declarative nature of
computational graphs in TensorFlow and functional programming
(possibly function-level: think of the J language and how important
rank is to its computations).

Apache Spark and TensorFlow are very similar in a lot of ways. The
key difference I see is that Spark internally handles types of data
that are more suited to databases, records, tables, and generally
relational data, while TensorFlow handles, well, tensors
(arbitrary-dimensional arrays).

The interesting part to me with both of these is how they've moved
"bulk" computations into first-class objects (ish) and permitted some
level of introspection into them before they run, as they run, and
after they run. As I noted in Notes - Paper, 2016-11-13, "One
interesting (to me) facet is how the computation process has been
split out and instrumented enough to allow some meaningful
introspection with it. It hasn't precisely made it a first-class
construct, but still, this feature pervades all of Spark's major
abstractions (RDD, DataFrame, Dataset)."

# Show Tensorboard example here
# Screenshots may be a good idea too

Spark does this with relational, database-like data. TensorFlow does
it with numerical calculations. Node-RED does it with irregular,
asynchronous data.
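
As a rough sketch of what "bulk computation as a first-class object
(ish)" can mean, here is a toy deferred graph in plain Python. It is
not TensorFlow's or Spark's API; the names Node, const, op, describe,
and run are made up for illustration. The point is only that the graph
exists as data, so it can be inspected before it runs and traced as
it runs.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Node:
    """One deferred operation: a name, a function, and its input nodes.

    Hypothetical type for illustration, not part of any real library.
    """
    name: str
    fn: Callable[..., float]
    inputs: Tuple["Node", ...] = ()


def const(name, value):
    # Leaf node: yields a fixed value, but only when the graph is run.
    return Node(name, lambda: value)


def op(name, fn, *inputs):
    # Interior node: building it computes nothing, it only records structure.
    return Node(name, fn, inputs)


def describe(node):
    """Inspect the graph before running: node names in dependency order."""
    seen, order = set(), []

    def visit(n):
        if id(n) in seen:
            return
        seen.add(id(n))
        for i in n.inputs:
            visit(i)
        order.append(n.name)

    visit(node)
    return order


def run(node, trace=None):
    """Evaluate the graph; if trace is a dict, record each node's result.

    No memoization: a node shared by two consumers is evaluated twice.
    """
    args = [run(i, trace) for i in node.inputs]
    result = node.fn(*args)
    if trace is not None:
        trace[node.name] = result
    return result


# Build the graph for (a + b) * a. Nothing is evaluated yet.
a = const("a", 2.0)
b = const("b", 3.0)
total = op("sum", lambda x, y: x + y, a, b)
scaled = op("scaled", lambda s, x: s * x, total, a)

print(describe(scaled))    # introspection before the run
trace = {}
print(run(scaled, trace))  # the run itself
print(trace)               # per-node results, available after the run
```

Real systems attach far more to each node (types, shapes, placement,
timing), but the split is the same: one step builds a description of
the computation, and a separate step executes it.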