Add draft post on RetinaNet, and stub for dataflow stuff
parent 92c4efac7d
commit e588dce485
31  drafts/2017-12-12-dataflow.org  Normal file
@@ -0,0 +1,31 @@
#+TITLE: Dataflow paradigm (working title)
#+AUTHOR: Chris Hodapp
#+DATE: December 12, 2017
#+TAGS: technobabble

There is a sort of parallel between the declarative nature of
computational graphs in TensorFlow and functional programming
(possibly function-level programming - think of the J language and
how central rank is to its computations).

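To make this concrete, here is a minimal sketch of that declarative,
build-then-run style, written against the TensorFlow 1.x API that was
current at the time (the names and shapes are arbitrary):

#+BEGIN_SRC python
import tensorflow as tf

# These lines only *describe* a computation; nothing runs yet.
a = tf.placeholder(tf.float32, shape=[None, 3], name="a")
b = tf.reduce_sum(a * 2.0, axis=1, name="b")

# Evaluation is a separate step, done against concrete data:
with tf.Session() as sess:
    print(sess.run(b, feed_dict={a: [[1.0, 2.0, 3.0]]}))  # -> [12.]
#+END_SRC
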
Apache Spark and TensorFlow are very similar in a lot of ways. The
key difference I see is that Spark internally handles types of data
that are more suited to databases, records, tables, and generally
relational data, while TensorFlow is, well, tensors
(arbitrary-dimensional arrays).

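As a rough illustration of that difference in data model (a sketch
only - it assumes local PySpark and TensorFlow 1.x installations, and
the data is made up):

#+BEGIN_SRC python
from pyspark.sql import SparkSession
import tensorflow as tf

spark = SparkSession.builder.master("local[*]").getOrCreate()

# Spark's native vocabulary is records and columns:
df = spark.createDataFrame(
    [("cat", 3), ("dog", 5), ("cat", 1)], ["label", "count"])
df.groupBy("label").sum("count").show()

# TensorFlow's native vocabulary is tensors:
x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y = tf.matmul(x, x)
with tf.Session() as sess:
    print(sess.run(y))
#+END_SRC
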
The interesting part to me with both of these is how they've moved
"bulk" computations into first-class objects (ish) and permitted some
level of introspection into them before they run, as they run, and
after they run. As I noted in Notes - Paper, 2016-11-13, "One
interesting (to me) facet is how the computation process has been
split out and instrumented enough to allow some meaningful
introspection with it. It hasn't precisely made it a first-class
construct, but still, this feature pervades all of Spark's major
abstractions (RDD, DataFrame, Dataset)."

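On the Spark side, the kind of introspection meant here looks roughly
like the following sketch (assuming a local PySpark installation; the
computation itself is trivial and only for illustration):

#+BEGIN_SRC python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.master("local[*]").getOrCreate()

# Nothing is computed yet; this only describes a computation:
df = spark.range(0, 1000).withColumn("squared", col("id") * col("id"))

# Inspect the query plan before anything runs:
df.explain()

# The underlying RDD lineage can be inspected as well (this may come
# back as bytes, depending on the Spark/Python versions):
print(df.rdd.toDebugString())

# Only now does the computation actually execute:
print(df.selectExpr("sum(squared)").collect())
#+END_SRC
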
# Show Tensorboard example here
# Screenshots may be a good idea too

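A starting point for that TensorBoard example might be something like
the sketch below (TensorFlow 1.x API; the graph and the "logs"
directory name are placeholders):

#+BEGIN_SRC python
import tensorflow as tf

# A tiny throwaway graph, just so there is something to visualize:
x = tf.placeholder(tf.float32, shape=[None, 4], name="x")
w = tf.Variable(tf.random_normal([4, 1]), name="w")
loss = tf.reduce_mean(tf.square(tf.matmul(x, w)), name="loss")

# Dump the graph definition where TensorBoard can find it:
writer = tf.summary.FileWriter("logs", tf.get_default_graph())
writer.close()

# Then, from a shell: tensorboard --logdir logs
#+END_SRC
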
Spark does this with a database. TensorFlow does it with numerical
calculations. Node-RED does it with irregular, asynchronous data.

72  drafts/2017-12-13-retinanet.org  Normal file
@@ -0,0 +1,72 @@
#+TITLE: Explaining RetinaNet
#+AUTHOR: Chris Hodapp
#+DATE: December 13, 2017
#+TAGS: technobabble

A paper came out in the past few months, [[https://arxiv.org/abs/1708.02002][Focal Loss for Dense Object
Detection]], from one of Facebook's teams. The goal of this post is to
explain this work a bit as I work through the paper, and to look at
one particular [[https://github.com/fizyr/keras-retinanet][implementation in Keras]].

"Object detection" as it is used here refers to machine learning
|
||||
models that can not just identify a single object in an image, but can
|
||||
identify and *localize* multiple objects, like in the below photo
|
||||
taken from [[https://research.googleblog.com/2017/06/supercharge-your-computer-vision-models.html][Supercharge your Computer Vision models with the TensorFlow
|
||||
Object Detection API]]:
|
||||
|
||||
# TODO:
# Define mAP

#+CAPTION: TensorFlow object detection example 2.
#+ATTR_HTML: :width 100% :height 100%
[[../images/2017-12-13-objdet.jpg]]

The paper discusses many of the two-stage approaches, like R-CNN and
its variants, which work in two steps (a schematic sketch follows the
list):

1. One model proposes a sparse set of locations in the image that
   probably contain something. Ideally, this set contains all objects
   in the image, but filters out the majority of negative locations
   (i.e. only background, not foreground).
2. Another model, typically a convolutional neural network,
   classifies each location in that sparse set as either being
   foreground (with some specific object class) or as being
   background.

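The sketch below is purely schematic - the proposal and
classification steps are random stand-ins, not a real R-CNN - but it
shows the shape of the two-stage pipeline:

#+BEGIN_SRC python
import numpy as np

def propose_regions(image, num_proposals=100):
    """Stage 1: a sparse set of candidate boxes (x1, y1, x2, y2)."""
    h, w = image.shape[:2]
    xs = np.sort(np.random.randint(0, w, size=(num_proposals, 2)), axis=1)
    ys = np.sort(np.random.randint(0, h, size=(num_proposals, 2)), axis=1)
    return np.stack([xs[:, 0], ys[:, 0], xs[:, 1], ys[:, 1]], axis=1)

def classify_region(image, box, num_classes=20):
    """Stage 2: score one candidate as background or an object class."""
    scores = np.random.rand(num_classes + 1)  # index 0 = background
    return scores / scores.sum()

image = np.zeros((480, 640, 3))
detections = []
for box in propose_regions(image):
    scores = classify_region(image, box)
    label = scores.argmax()
    if label != 0:                            # keep only non-background
        detections.append((box, label, scores[label]))
#+END_SRC
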
Additionally, it discusses some existing one-stage approaches like
[[https://pjreddie.com/darknet/yolo/][YOLO]] and [[https://arxiv.org/abs/1512.02325][SSD]]. In essence, these run only the second step - but
instead of starting from a sparse set of locations that are probably
something of interest, they start from a dense set of locations that
blankets the entire image with a grid of many locations, over many
sizes, and over many aspect ratios, regardless of whether they
contain an object.

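That dense set is easy to picture in code. The sketch below generates
one such grid of candidate boxes; the stride, sizes, and aspect
ratios here are made up for illustration:

#+BEGIN_SRC python
import numpy as np

def dense_anchors(img_h, img_w, stride=32,
                  sizes=(64, 128, 256), ratios=(0.5, 1.0, 2.0)):
    boxes = []
    for cy in range(stride // 2, img_h, stride):   # grid of centers
        for cx in range(stride // 2, img_w, stride):
            for size in sizes:                     # over many sizes...
                for ratio in ratios:               # ...and aspect ratios
                    h = size * np.sqrt(ratio)
                    w = size / np.sqrt(ratio)
                    boxes.append((cx - w / 2, cy - h / 2,
                                  cx + w / 2, cy + h / 2))
    return np.array(boxes)

anchors = dense_anchors(480, 640)
print(anchors.shape)   # (2700, 4): a 15 x 20 grid, 9 boxes per location
#+END_SRC
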
This is simpler and faster - but not nearly as accurate.

Broadly, the process of training these models requires minimizing
some kind of loss function based on what the model misclassifies when
it is run on some training data. It's preferable to be able to
compute a loss for each individual instance, and then sum all of
these losses to produce an overall loss.

This leads to a problem in one-stage detectors: the dense set of
locations being classified usually contains a small number of
locations that actually have objects (positives), and a much larger
number of locations that are just background and can be very easily
classified as such (easy negatives). However, the loss function still
sums over all of them - and even if the loss is relatively low for
each of the easy negatives, their cumulative loss can drown out the
loss from objects that are being misclassified.

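Some made-up (but plausibly-scaled) numbers show how lopsided this
sum can get:

#+BEGIN_SRC python
num_easy_negatives = 100000  # background boxes already classified well
num_hard_positives = 100     # actual objects still being misclassified

loss_easy = 0.01             # cross-entropy of a confident, correct guess
loss_hard = 2.3              # cross-entropy of a bad guess (~ -log(0.1))

print(num_easy_negatives * loss_easy)  # 1000.0 from easy negatives
print(num_hard_positives * loss_hard)  #  230.0 from hard positives
#+END_SRC

The summed loss is dominated by locations the model already handles
well, even though each one contributes almost nothing individually.
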
The training process is trying to minimize this loss, and so it is
mostly nudging the model to improve in the area least in need of it
(its ability to classify background areas that it already classifies
well) and neglecting the area most in need of it (its ability to
classify the "difficult" objects that it is misclassifying).

# TODO: What else can I say about why loss should be additive?
# Quote DL text? ML text?

This, in a nutshell, is the *class imbalance* issue that the paper
gives as the limiting factor for the accuracy of one-stage detectors.

# TODO: Visualize this. Can I?
BIN  images/2017-12-13-objdet.jpg  Normal file
Binary file not shown (new image, 256 KiB).