Finally called my modularity post done
parent eaf4f0444a
commit 56a09d4593
@ -1,381 +0,0 @@
---
title: Modularity & Abstraction (working title)
author: Chris Hodapp
date: April 20, 2017
tags:
- technobabble
- rambling
draft: true
---

# Why don't I turn this into a paper for arXiv too? It can still be
# posted to the blog (just also make it exportable to LaTeX perhaps)

_Modularity_ and _abstraction_ feature prominently wherever computers
are involved. This is meant very broadly: it applies to designing
software, using software, integrating software, and to a lot of
hardware as well. It applies elsewhere, and almost certainly
originated elsewhere first; however, it appears especially crucial
around software.

Definitions, though, are a bit vague (including anything in this
post). My goal in this post isn't to try to (re)define them, but to
explain their essence and expand on a few theses:

- Modularity arises naturally in a wide array of places.
- Modularity and abstraction are intrinsically connected.
- Both are for the benefit of people. This usually doesn't need to be
  stated, but to echo Paul Graham and probably others: to the
  computer, it is all the same.
- More specifically, both are there to manage *complexity* by
  assigning meaningful information and boundaries which allow people
  to match a problem to what they can actually think about.

# - Whether a given modularization makes sense depends strongly on
# meaning and relevance of *information* inside and outside of
# modules, and broad context matters to those.

* Why?

People generally agree that "modularity" is good. The idea that
something complex can be designed and understood in terms of smaller,
simpler pieces comes naturally to anyone who has built something out
of smaller pieces or taken something apart. (This isn't to say that
reductionism is the best way to understand everything, but that's
another matter.) It runs very deep in the Unix philosophy, which ESR
gives a good overview of in [[http://www.catb.org/~esr/writings/taoup/html/ch01s06.html][The Art of Unix Programming]] - or, listen
to it from [[https://youtu.be/tc4ROCJYbm0?t%3D248][Kernighan himself]] at Bell Labs in 1982.

Tim Berners-Lee gives some practical limitations in [[https://www.w3.org/DesignIssues/Principles.html][Principles of
Design]] and in [[https://www.w3.org/DesignIssues/Modularity.html][Modularity]]: "Modular design hinges on the simplicity and
abstract nature of the interface definition between the modules. A
design in which the insides of each module need to know all about each
other is not a modular design but an arbitrary partitioning of the
bits... It is not only necessary to make sure your own system is
designed to be made of modular parts. It is also necessary to realize
that your own system, no matter how big and wonderful it seems now,
should always be designed to be a part of another larger system." Les
Hatton in [[http://www.leshatton.org/TAIC2008-29-08-2008.html][The role of empiricism in improving the reliability of
future software]] even did an interesting derivation tying the defect
density in software to how it is broken into pieces. The 1972 paper
[[https://www.cs.virginia.edu/~eos/cs651/papers/parnas72.pdf][On the Criteria to be Used in Decomposing Systems into Modules]] cites a
1970 textbook on why modularity is important in systems programming,
but also notes that nothing is said on how to divide a system into
modules.

"Abstraction" doesn't have quite the same consensus. In software, it's
generally understood that decoupled or loosely-coupled is better than
tightly-coupled, but at the same time, "abstraction" can have the
connotation of something that gets in the way, adds overhead, and
confuses things. Dijkstra, in one of the few instances of not being
snarky, allegedly said, "Being abstract is something profoundly
different from being vague. The purpose of abstraction is not to be
vague, but to create a new semantic level in which one can be
absolutely precise." Joel Spolsky, in one of the few instances of me
actually caring what he said, also has a blog post from 2002 on the
[[https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/][Law of Leaky Abstractions]] ("All non-trivial abstractions, to some
degree, are leaky.") The [[https://en.wikipedia.org/wiki/Principle_of_least_privilege][principle of least privilege]] is likewise a
thing. So, abstraction too has its practical and theoretical
limitations.

* How They Relate

I bring these up together because: *abstractions* are the boundaries
between *modules*, and the communication channels (APIs, languages,
interfaces, protocols) through which they talk. It need not
necessarily be a standardized interface or a well-documented boundary,
though that helps.

Available abstractions vary. They vary by, for instance:
- ...what language you choose. Consider, for instance, that a language
  like Haskell contains various abstractions done largely within the
  type system that cannot be expressed in many other languages.
  Languages like Python, Ruby, or JavaScript might have various
  abstractions meaningful only in the context of dynamic typing. Some
  languages more readily permit the creation of new abstractions, and
  this might lead to a broader range of abstractions implemented in
  libraries.
- ...the operating system and its standard library. What is a
  process? What is a thread? What is a dynamic library? What is a
  filesystem? What is a file? What is a block device? What is a
  socket? What is a virtual machine? What is a bus? What is a
  commandline?
- ...the time period. How many of the abstractions named above were
  around or viable in 1970, 1980, 1990, 2000? In the opposite
  direction, when did you last use that lovely standardized protocol,
  [[https://en.wikipedia.org/wiki/Common_Gateway_Interface][CGI]], to let your web application and your web server communicate,
  use [[https://en.wikipedia.org/wiki/PHIGS][PHIGS]] to render graphics, or access a large multiuser system
  via hard-wired terminals?

As such, the possible ways to modularize things vary. A certain way
of modularizing may not even appear possible or worthwhile until the
same thing has been done other ways hundreds or thousands of times.

Other terms are related too. "Loosely-coupled" (or loose coupling)
and "tightly-coupled" refer to the sort of abstractions sitting
between modules, or whether or not there even are separate modules.
"Decoupling" involves changing the relationship between modules
(sometimes, creating them in the first place), typically splitting
things into two more sensible pieces that a more sensible abstraction
separates. "Factoring out" is really a form of decoupling in which
smaller parts of something are turned into a module which the original
thing then interfaces with (one canonical example is taking some bits
of code, often very similar or identical in many places, and
moving them into a single function). To say one has "abstracted over"
some details implies that a module is handling those details, that the
details shouldn't matter, and what does matter is the abstraction one
is using.

One of Rich Hickey's favorite topics is *composition*, and with good
reason (and you should check out [[http://www.infoq.com/presentations/Simple-Made-Easy/][Simple Made Easy]] regardless). This
relates as well: to *compose* things together effectively into bigger
parts requires that they support some common abstraction.

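As a minimal sketch of that last point (my own illustration in Python, not anything taken from Hickey's talk): the three stage functions below are hypothetical names made up for this example, and the only thing they share is Python's iterator protocol. Because each one just consumes and produces "something iterable", they can be stacked into a bigger part without knowing anything about each other.

```python
from typing import Iterable, Iterator


def read_numbers(lines: Iterable[str]) -> Iterator[int]:
    """Parse each non-blank line as an integer."""
    for line in lines:
        line = line.strip()
        if line:
            yield int(line)


def only_even(numbers: Iterable[int]) -> Iterator[int]:
    """Keep only the even numbers."""
    return (n for n in numbers if n % 2 == 0)


def squared(numbers: Iterable[int]) -> Iterator[int]:
    """Square every number."""
    return (n * n for n in numbers)


# Each stage only assumes "something I can iterate over", so the
# stages can be composed without any of them knowing about the others.
raw = ["1", "2", "", "3", "4"]
result = list(squared(only_even(read_numbers(raw))))
print(result)  # [4, 16]
```

Swap a stage out or add another, and the rest stay untouched, as long as everything still speaks in iterables.
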
In the same area, [[https://clojurefun.wordpress.com/2012/08/17/composition-over-convention/][Composition over convention]] is a good read on how
/frameworks/ run counter to modularity: they aren't built to behave
like modules of a larger system.

# -----

Modularity has a very pragmatic reason behind it: When something is a
module unto itself, presumably it is relying on specific abstractions,
and it is possible to freely change this module's internal details
(provided that it still respects the same abstractions), to move this
module to other contexts (anywhere that provides the same
abstractions), and to replace it with other modules (anything that
respects the same abstractions).

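Here is a rough sketch of what "respects the same abstractions" can look like in code. Everything in it is hypothetical and named only for illustration: KeyValueStore stands in for the abstraction, DictStore and LogStore are two unrelated implementations of it, and remember_greeting is client code that knows only the abstraction. I'm assuming Python 3.8 or later for typing.Protocol.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class KeyValueStore(Protocol):
    """The abstraction: what callers may rely on, and nothing more."""

    def get(self, key: str) -> Optional[str]: ...
    def put(self, key: str, value: str) -> None: ...


class DictStore:
    """One implementation: a plain in-memory dictionary."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


class LogStore:
    """A different implementation: an append-only list of writes."""

    def __init__(self) -> None:
        self._log: List[Tuple[str, str]] = []

    def get(self, key: str) -> Optional[str]:
        # Scan newest-first so that later writes win.
        for k, v in reversed(self._log):
            if k == key:
                return v
        return None

    def put(self, key: str, value: str) -> None:
        self._log.append((key, value))


def remember_greeting(store: KeyValueStore) -> Optional[str]:
    """Client code: written against the abstraction only."""
    store.put("greeting", "hello")
    store.put("greeting", "hello again")
    return store.get("greeting")


# Either module can be dropped in; the client cannot tell the difference.
print(remember_greeting(DictStore()))  # hello again
print(remember_greeting(LogStore()))   # hello again
```

The internals differ completely, yet the client needs no changes when one store replaces the other.
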
Modularity also has a more abstract reason: When something is a module
unto itself, the way it is designed and implemented usually presents
more insight into the fundamentals of the problem it is solving. It
contains fewer incidental details, and more essential details.

# -------

* Information

I referred earlier to the abstractions themselves as both boundaries
and communication channels. Another common view is that abstractions
are *contracts* with a communicated and agreed purpose, and I think
this is a useful definition too: it conveys the notion that there are
multiple parties involved and that they are free to behave as needed
provided that they fulfill some obligation.

Some definitions refer directly to information, like the [[https://en.wikipedia.org/wiki/Abstraction_principle_(computer_programming)][abstraction
principle]], which aims to reduce duplication of information. This fits
with [[https://en.wikipedia.org/wiki/Don%2527t_repeat_yourself][don't repeat yourself]], so that "a modification of any single
element of a system does not require a change in other logically
unrelated elements".

# ----- FIXME
Consider the information this module deals in, in essence.

What is the most general form this information could be expressed in,
without being so general as to encompass other things that are
irrelevant or so low-level as to needlessly constrain the possible
contexts?

(Aristotle's theory of definitions?)

* Less-Conventional Examples

One thing I've watched with some interest is when new abstractions
emerge (or, perhaps, old ones become more widespread) to solve
problems that I wasn't even aware existed.

[[https://circleci.com/blog/it-really-is-the-future/][It really is the future]] talks about a lot of more recent forms of
modularity from the land of devops, most of which were completely
unheard-of in, say, 2010. [[https://www.functionalgeekery.com/episode-75-eric-b-merritt/][Functional Geekery episode 75]] talks about
many similar things.

[[https://jupyter.org/][Jupyter Notebook]] is one of my favorites here. It provides a notebook
interface (similar to something like Maple or Mathematica) which:

- allows the notebook to use various different programming languages
  underneath,
- decouples where the notebook is used and where it is running, due to
  being implemented as a web application accessed through the browser,
- decouples the presentation of a stored notebook from Jupyter itself
  by using a [[https://nbformat.readthedocs.io/en/latest/][JSON-based file format]] which can be rendered without
  Jupyter (like GitHub does if you commit a .ipynb file).

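To make the last bullet concrete, here is a small sketch that reads a notebook with nothing but Python's standard json module, no Jupyter involved. The notebook content is a hand-written stand-in (an assumption for illustration, not a file from anywhere), trimmed down to the top-level keys and per-cell fields that nbformat version 4 uses; code_cells is a hypothetical helper name.

```python
import json

# A hand-written stand-in for the contents of an .ipynb file (nbformat 4).
notebook_text = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title\n"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 1,
            "outputs": [],
            "source": ["x = 1 + 1\n", "print(x)"],
        },
    ],
})


def code_cells(text: str) -> list:
    """Return the source of every code cell, one string per cell."""
    notebook = json.loads(text)
    sources = []
    for cell in notebook["cells"]:
        if cell["cell_type"] == "code":
            source = cell["source"]
            # The format allows either one string or a list of lines.
            sources.append(source if isinstance(source, str) else "".join(source))
    return sources


print(code_cells(notebook_text))  # ['x = 1 + 1\nprint(x)']
```

Any tool that can parse JSON can do the same, which is the whole point of the decoupling.
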
I love notebook interfaces already because they simplify experimenting
by handling a lot of things I'd otherwise have to do manually - like
saving results and keeping them lined up with the exact code that
produced them. Jupyter adds some other use-cases I find marvelous -
for instance, I can let the interpreter run on my workstation which
has all of the computing power, but I can access it across the
Internet from my laptop.

[[https://zeppelin.apache.org/][Apache Zeppelin]] does similar things with different languages; I've
just used it much less.

Another favorite of mine is [[https://nixos.org/nix/][Nix]]. One excellent article, [[http://blog.ezyang.com/2014/08/the-fundamental-problem-of-programming-language-package-management/][The
fundamental problem of programming language package management]],
doesn't ever mention Nix but explains very well the problems it sets
out to solve. To be able to combine nearly all of the
programming-language-specific package managers into a single module is
a very lofty goal, but Nix appears to do a decent job of it (among
other things).

The [[https://www.lua.org/][Lua]] programming language is noteworthy here. It's written in
clean C with minimal dependencies, so it runs nearly anywhere that a C
or C++ compiler targets. It's purposely very easy both to *embed*
(i.e. to put inside of a program and use as an extension language,
such as for plugins or scripting) and to *extend* (i.e. to connect
with libraries to allow their functionality to be used from Lua). [[https://www.gnu.org/software/guile/][GNU
Guile]] has many of the same properties, I'm told.

We ordinarily think of object systems as something living in the
programming language. However, the object system is sometimes made a
module that is outside of the programming language, and languages just
interact with it. [[https://en.wikipedia.org/wiki/GObject][GObject]], [[https://en.wikipedia.org/wiki/Component_Object_Model][COM]], and [[https://en.wikipedia.org/wiki/XPCOM][XPCOM]] do this, and to some
extent, so does [[https://en.wikipedia.org/wiki/Meta-object_System][Qt & MOC]] - and there are probably hundreds of others,
particularly if you allow dead ones created during the object-oriented
hype of the '90s. This seems to happen in systems where the object
hierarchy is in effect "bigger" than the language.

[[https://zeromq.org/][ZeroMQ]] is another example: a set of cross-language abstractions for
communication patterns in a distributed system. I know it's likely
not unique, but it is one of the better-known ones and the first I
thought of, and I think their [[http://zguide.zeromq.org/page:all][guide]] is excellent.

Interestingly, the same iMatix behind ZeroMQ also created [[https://github.com/imatix/gsl][GSL]] and
explained its value in [[https://imatix-legacy.github.io/mop/introduction.html][Model-Oriented Programming]], in which
abstraction features heavily. I've not used GSL, and am skeptical of
its stated usefulness, but it looks like it is meant to help create
compile-time abstractions that likewise sit outside of any particular
programming language.

# TODO: Expand on this.

[[https://web.hypothes.is/][hypothes.is]] is a curious one that I find fascinating. They're trying
to factor out annotation and commenting from something that is handled
on a per-webpage basis and turn it into its own module, and I really
like what I've seen. However, it does not seem to have caught on
much.

The Unix tradition lives on in certain modern tools. [[https://stedolan.github.io/jq/][jq]] has proven
very useful anytime I've had to mess with JSON data. [[http://www.dest-unreach.org/socat/][socat]] and [[http://netcat.sourceforge.net/][netcat]]
have saved me numerous times. I'm sure certain people love the fact
that [[https://neovim.io/][Neovim]] is designed to be seamlessly embedded and to be extended
with plugins. [[https://suckless.org/philosophy][suckless]] perhaps takes it too far, but gets an honorary
mention...

# ???

# Also, TCP/IP and the entire notion of packet-switched networks.
# And the entire OSI 7-layer model.

# Also, caches - of all types. (CPU, disk...)

# One key is how the above let you *reason* about things without
# knowing their specifics.

People know that I love Emacs, but I also do believe many of the
complaints on how large it is. Even though it is basically its own
operating system, /within this/ it has considerable modularity. The
same applies somewhat to Blender, I suppose.

Consider [[https://research.google.com/pubs/pub43146.html][Machine Learning: The High Interest Credit Card of Technical Debt]],
a paper that anyone working around machine learning should read and
re-read regularly. Large parts of the paper are about ways in which
machine learning conflicts with proper modularity and abstraction.
(However, [[https://colah.github.io/posts/2015-09-NN-Types-FP/][Neural Networks, Types, and Functional Programming]] is still
a good post and shows some sorts of abstraction that still exist
at least in neural networks.)

Even DOS had useful abstractions. Things like
DriveSpace/DoubleSpace/Stacker worked well enough because most
software that needed files relied on DOS's normal abstractions to
access them - so it did not matter to them that the underlying
filesystem was actually compressed, or was actually a RAM disk, or was
on some obscure SCSI interface. Likewise, for the silliness known as
[[https://en.wikipedia.org/wiki/Expanded_memory][EMS]], applications that accessed memory through the EMS abstraction
could disregard whether it was a "real" EMS board providing access to
that memory, or an expanded memory manager providing indirect access
to some other memory or even to a hard disk pretending to be memory.

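A loose modern analogue of the DriveSpace situation, sketched in Python rather than DOS (so this is an analogy of my own, not how any of those products worked): count_words is a hypothetical bit of "application" code that relies only on the ordinary text-file abstraction, and it neither knows nor cares that one of the files handed to it is gzip-compressed underneath.

```python
import gzip
import io
from typing import TextIO


def count_words(f: TextIO) -> int:
    """Application code: relies only on the 'text file' abstraction."""
    return sum(len(line.split()) for line in f)


text = "even DOS had\nuseful abstractions\n"

# A plain, uncompressed "file" held in memory.
plain = io.StringIO(text)

# The same text, gzip-compressed underneath, opened in text mode.
compressed_bytes = gzip.compress(text.encode("utf-8"))
compressed = gzip.open(io.BytesIO(compressed_bytes), "rt", encoding="utf-8")

print(count_words(plain), count_words(compressed))  # 5 5
compressed.close()
```

The compression lives entirely below the abstraction, which is why the application code did not have to change.
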
Even more abstractly: emulators work because so much software
respected the abstraction of some specific CPU and hardware platform.

Submitted without further comment:
https://github.com/stevemao/left-pad/issues/4

* Fragments

- Abstracting over...
  - Multiple applications
  - Multiple users
  - Multiple CPUs
  - Multiple hosts

- [[Notes - Paper, 2016-11-13]]
- Tanenbaum vs. Linus war & microkernels
- TBL: "The choice of language is a common design choice. The low
  power end of the scale is typically simpler to design, implement and
  use, but the high power end of the scale has all the attraction of
  being an open-ended hook into which anything can be placed: a door
  to uses bounded only by the imagination of the programmer. Computer
  Science in the 1960s to 80s spent a lot of effort making languages
  which were as powerful as possible. Nowadays we have to appreciate
  the reasons for picking not the most powerful solution but the least
  powerful. The reason for this is that the less powerful the
  language, the more you can do with the data stored in that
  language. If you write it in a simple declarative from, anyone can
  write a program to analyze it in many ways." (Languages are a kind
  of abstraction - one that influences how a module is written, and
  what contexts it is useful in.)
- "Self" paper & structural reification?
  - I'm still not sure how this relates, but it may perhaps relate to
    how *not* to make things modular (structural reification is a sort
    of check on the scope of objects/classes)
- What by Rich Hickey?
  - Simple Made Easy?
  - The Value of Values?
- SICP: [[https://mitpress.mit.edu/sites/default/files/sicp/full-text/book/book-Z-H-19.html#%25_chap_3][Modularity, Objects, and State]]
- [[https://www.cs.utexas.edu/~wcook/Drafts/2009/essay.pdf][On Understanding Data Abstraction, Revisited]]
- http://www.catb.org/~esr/writings/taoup/html/apb.html#Baldwin-Clark -
  Carliss Baldwin and Kim Clark. Design Rules, Vol 1: The Power of
  Modularity. 2000. MIT Press. ISBN 0-262-024667.
- Brooks, No Silver Bullet?

- https://en.wikipedia.org/wiki/Essential_complexity

- https://twitter.com/fchollet/status/962074070513631232

- [[https://mitpress.mit.edu/sites/default/files/sicp/full-text/book/book-Z-H-9.html#%25_chap_1][From SICP chapter 1 intro]]: "The acts of the mind, wherein it exerts
  its power over simple ideas, are chiefly these three: 1. Combining
  several simple ideas into one compound one, and thus all complex
  ideas are made. 2. The second is bringing two ideas, whether simple
  or complex, together, and setting them by one another so as to take
  a view of them at once, without uniting them into one, by which it
  gets all its ideas of relations. 3. The third is separating them
  from all other ideas that accompany them in their real existence:
  this is called abstraction, and thus all its general ideas are
  made." -John Locke, An Essay Concerning Human Understanding (1690)
- One point I have ignored (maybe): You clearly separate the 'inside'
  of a module (its implementation) from the 'outside' (that is - its
  boundaries, the abstractions that it interfaces with or that it
  implements) so that the 'inside' can change more or less freely
  without having any effect on the outside.
- Abstractions as a way of reducing the work required to add
  functionality (changes can be made just in the relevant modules, and
  other modules do not need to change to conform)
- What is more key? Communication, information content, contracts,
  details?
- [[https://en.wikipedia.org/wiki/Don%2527t_repeat_yourself][Don't repeat yourself]]
- [[https://simplyphilosophy.org/study/aristotles-definitions/][Aristotle & theory of definitions]]
  - this isn't right. I need to find the quote in the Durant book
    (which will probably have an actual source) that pertains to how
    specific and how general a definition must be

- [[https://en.wikipedia.org/wiki/SOLID][SOLID]]
- [[https://en.wikipedia.org/wiki/Cross-cutting_concern][Cross-cutting concerns]] and [[https://en.wikipedia.org/wiki/Aspect-oriented_programming][Aspect-oriented programming]]
- [[https://en.wikipedia.org/wiki/Separation_of_concerns][Separation of Concerns]]
- [[https://en.wikipedia.org/wiki/Abstraction_principle_(computer_programming)][Abstraction principle]]
- [[https://en.wikipedia.org/wiki/Don%2527t_repeat_yourself][Don't repeat yourself]]
333
content/posts/2020-07-17-modularity.org
Normal file
@ -0,0 +1,333 @@
---
title: "Modularity & Abstraction"
author: Chris Hodapp
date: "2020-07-16"
tags:
- technobabble
- rambling
---

# Why don't I turn this into a paper for arXiv too? It can still be
# posted to the blog (just also make it exportable to LaTeX perhaps)

/(This is a sort of rambling post that I started in 2017 April.)/

*Modularity* and *abstraction* feature prominently wherever computers
are involved. This is meant very broadly: it applies to designing
software, using software, integrating software, and to a lot of
hardware as well. It applies elsewhere, and almost certainly
originated elsewhere first; however, it appears especially crucial
around software.

Definitions, though, are a bit vague (including anything in this
post). My goal in this post isn't to try to (re)define them, but to
explain their essence and expand on a few theses:

- Modularity arises naturally in a wide array of places.
- Modularity and abstraction are intrinsically connected.
- Both are for the benefit of people. This usually doesn't need to be
  stated, but to echo Paul Graham and probably others: to the
  computer, it is all the same.
- More specifically, both are there to manage *complexity* by
  assigning meaningful information and boundaries which allow people
  to match a problem to what they can actually think about.

# - Whether a given modularization makes sense depends strongly on
# meaning and relevance of *information* inside and outside of
# modules, and broad context matters to those.

* What Are They?

People generally agree that "modularity" is good. The idea that
something complex can be designed and understood in terms of smaller,
simpler pieces comes naturally to anyone who has built something out
of smaller pieces or taken something apart. (This isn't to say that
reductionism is the best way to understand everything, but that's
another matter.) It runs very deep in the Unix philosophy, which ESR
gives a good overview of in [[http://www.catb.org/~esr/writings/taoup/html/ch01s06.html][The Art of Unix Programming]] - or, listen
to it from [[https://youtu.be/tc4ROCJYbm0?t%3D248][Kernighan himself]] at Bell Labs in 1982.

Tim Berners-Lee gives some practical limitations in [[https://www.w3.org/DesignIssues/Principles.html][Principles of
Design]] and in [[https://www.w3.org/DesignIssues/Modularity.html][Modularity]]: "Modular design hinges on the simplicity and
abstract nature of the interface definition between the modules. A
design in which the insides of each module need to know all about each
other is not a modular design but an arbitrary partitioning of the
bits... It is not only necessary to make sure your own system is
designed to be made of modular parts. It is also necessary to realize
that your own system, no matter how big and wonderful it seems now,
should always be designed to be a part of another larger system." Les
Hatton in [[http://www.leshatton.org/TAIC2008-29-08-2008.html][The role of empiricism in improving the reliability of
future software]] even did an interesting derivation tying the defect
density in software to how it is broken into pieces. The 1972 paper
[[https://www.cs.virginia.edu/~eos/cs651/papers/parnas72.pdf][On the Criteria to be Used in Decomposing Systems into Modules]] cites a
1970 textbook on why modularity is important in systems programming,
but also notes that nothing is said on how to divide a system into
modules.

"Abstraction" doesn't have quite the same consensus. In software, it's
generally understood that decoupled or loosely-coupled is better than
tightly-coupled, but at the same time, "abstraction" can have the
connotation of something that gets in the way, adds overhead, and
confuses things. Dijkstra, in one of the few instances of not being
snarky, allegedly said, "Being abstract is something profoundly
different from being vague. The purpose of abstraction is not to be
vague, but to create a new semantic level in which one can be
absolutely precise." Joel Spolsky, in one of the few instances of me
actually caring what he said, also has a blog post from 2002 on the
[[https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/][Law of Leaky Abstractions]] ("All non-trivial abstractions, to some
degree, are leaky.") The [[https://en.wikipedia.org/wiki/Principle_of_least_privilege][principle of least privilege]] is likewise a
thing. So, abstraction too has its practical and theoretical
limitations.

* How They Relate
|
||||||
|
|
||||||
|
I bring these up together because: *abstractions* are the boundaries
|
||||||
|
between *modules*, and the communication channels (APIs, languages,
|
||||||
|
interfaces, protocols) through which they talk. It need not
|
||||||
|
necessarily be a standardized interface or a well-documented boundary,
|
||||||
|
though that helps.
|
||||||
|
|
||||||
|
Available abstractions vary. They vary by, for instance:
|
||||||
|
- ...what language you choose. Consider, for instance, that a language
|
||||||
|
like Haskell contains various abstractions done largely within the
|
||||||
|
type system that cannot be expressed in many other languages.
|
||||||
|
Languages like Python, Ruby, or JavaScript might have various
|
||||||
|
abstractions meaningful only in the context of dynamic typing. Some
|
||||||
|
languages more readily permit the creation of new abstractions, and
|
||||||
|
this might lead to a broader range of abstractions implemented in
|
||||||
|
libraries.
|
||||||
|
- ...the operating system and its standard library. What is a
|
||||||
|
process? What is a thread? What is a dynamic library? What is a
|
||||||
|
filesystem? What is a file? What is a block device? What is a
|
||||||
|
socket? What is a virtual machine? What is a bus? What is a
|
||||||
|
commandline?
|
||||||
|
- ...all other kinds of libraries a language might use, and entire
|
||||||
|
frameworks that cross language boundaries. Consider something like
|
||||||
|
Apache Spark, which deals in abstractions that may be accessed from
|
||||||
|
various languages.
|
||||||
|
- ...the time period. How many of the abstractions named above were
|
||||||
|
around or viable in 1970, 1980, 1990, 2000? In the opposite
|
||||||
|
direction, when did you last use that lovely standardized protocol,
|
||||||
|
[[https://en.wikipedia.org/wiki/Common_Gateway_Interface][CGI]], to let your web application and your web server communicate,
|
||||||
|
use [[https://en.wikipedia.org/wiki/PHIGS][PHIGS]] to render graphics, or access a large multiuser system
|
||||||
|
via hard-wired terminals?
|
||||||
|
|
||||||
|
As such: Possible ways to modularize things vary. It may make no
|
||||||
|
sense that certain ways of modularization even can or should exist
|
||||||
|
until it's been done other ways dozens or hundreds or maybe thousands
|
||||||
|
of times.
|
||||||
|
|
||||||
|
Other terms are related too. "Loosely-coupled" (or loose coupling)
|
||||||
|
and "tightly-coupled" refer to the sort of abstractions sitting
|
||||||
|
between modules, or whether or not there even are separate modules.
|
||||||
|
"Decoupling" involves changing the relationship between modules
|
||||||
|
(sometimes, creating them in the first place), typically splitting
|
||||||
|
things into two more sensible pieces that a more sensible abstraction
|
||||||
|
separates. "Factoring out" is really a form of decoupling in which
|
||||||
|
smaller parts of something are turned into a module which the original
|
||||||
|
thing then interfaces with (one canonical example is taking some bits
|
||||||
|
of code, often ones that are very similar or identical in many places, and
|
||||||
|
moving them into a single function). To say one has "abstracted over"
|
||||||
|
some details implies that a module is handling those details, that the
|
||||||
|
details shouldn't matter, and what does matter is the abstraction one
|
||||||
|
is using.
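
As a minimal sketch of "factoring out" (in Python; every name and string below is invented for illustration), imagine two functions that each used to trim and capitalize names inline. The shared bits move into one function, and the callers then depend only on what that function promises:

```python
def normalize(name: str) -> str:
    """Hypothetical shared step: collapse whitespace and unify capitalization."""
    return " ".join(name.split()).title()


def greeting(name: str) -> str:
    # Previously this function did its own trimming and capitalizing inline.
    return f"Hello, {normalize(name)}!"


def mailing_label(name: str, city: str) -> str:
    # So did this one; now both interface with the same factored-out piece.
    return f"{normalize(name)} ({normalize(city)})"


print(greeting("  ada   lovelace "))
print(mailing_label("alan  turing", " london"))
```

The callers have "abstracted over" how normalization happens: changing the rule means touching one place, not every call site.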
|
||||||
|
|
||||||
|
One of Rich Hickey's favorite topics is *composition*, and with good
|
||||||
|
reason (and you should check out [[http://www.infoq.com/presentations/Simple-Made-Easy/][Simple Made Easy]] regardless). This
|
||||||
|
relates as well: to *compose* things together effectively into bigger
|
||||||
|
parts requires that they support some common abstraction.
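
A small sketch of what "some common abstraction" can look like, assuming the simplest one I could think of: every step is a function from text to text. The helper names here are made up for the example and don't come from any library.

```python
from functools import reduce
from typing import Callable

Step = Callable[[str], str]  # the common abstraction: text in, text out


def compose(*steps: Step) -> Step:
    """Chain steps left to right into one bigger step of the same shape."""
    return lambda text: reduce(lambda acc, step: step(acc), steps, text)


def strip_comments(text: str) -> str:
    """Drop lines that start with '#'."""
    return "\n".join(l for l in text.splitlines() if not l.startswith("#"))


def drop_blank_lines(text: str) -> str:
    """Drop lines that are empty or only whitespace."""
    return "\n".join(l for l in text.splitlines() if l.strip())


clean = compose(strip_comments, drop_blank_lines, str.upper)
print(clean("# note\nhello\n\nworld"))
```

Because the result of =compose= is itself a =Step=, it can be composed again into something bigger, which is really the whole point.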
|
||||||
|
|
||||||
|
In the same area, [[https://clojurefun.wordpress.com/2012/08/17/composition-over-convention/][Composition over convention]] is a good read on how
|
||||||
|
/frameworks/ run counter to modularity: they aren't built to behave
|
||||||
|
like modules of a larger system.
|
||||||
|
|
||||||
|
The contrasting terms *interface* and *implementation* are commonly
|
||||||
|
seen in software, with "implementation" loosely referring to what is
|
||||||
|
inside a module, and "interface" referring to its "outside" boundaries
|
||||||
|
and thus to the abstractions it supports. You'll commonly hear advice
|
||||||
|
about separating interface from implementation, and some semi-related
|
||||||
|
things:
|
||||||
|
|
||||||
|
- [[https://en.wikipedia.org/wiki/SOLID][SOLID]]
|
||||||
|
- [[https://en.wikipedia.org/wiki/Cross-cutting_concern][Cross-cutting concerns]] and [[https://en.wikipedia.org/wiki/Aspect-oriented_programming][Aspect-oriented programming]]
|
||||||
|
- [[https://en.wikipedia.org/wiki/Separation_of_concerns][Separation of Concerns]]
|
||||||
|
- [[https://en.wikipedia.org/wiki/Information_hiding][Information hiding]] and [[https://en.wikipedia.org/wiki/Encapsulation_(computer_programming)][encapsulation]]
|
||||||
|
|
||||||
|
* Why?
|
||||||
|
|
||||||
|
Modularity has a very pragmatic reason behind it: When something is a module
|
||||||
|
unto itself, presumably it is relying on specific abstractions, and it
|
||||||
|
is possible to freely change this module's internal details (provided
|
||||||
|
that it still respects the same abstractions), to move this module to
|
||||||
|
other contexts (anywhere that provides the same abstractions), and to
|
||||||
|
replace it with other modules (anything that respects the same
|
||||||
|
abstractions).
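
Here is a rough sketch of that in Python. The =KeyValueStore= abstraction, both implementations, and the =count_visit= caller are all hypothetical; the only point is that the caller keeps working when the module behind the abstraction is swapped.

```python
from typing import Optional, Protocol


class KeyValueStore(Protocol):
    """The abstraction: the only thing callers are allowed to rely on."""

    def get(self, key: str) -> Optional[str]: ...

    def put(self, key: str, value: str) -> None: ...


class DictStore:
    """One implementation: a plain in-memory dict."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


class LogStore:
    """Another implementation: an append-only log where the newest entry wins."""

    def __init__(self) -> None:
        self._log = []

    def get(self, key: str) -> Optional[str]:
        for k, v in reversed(self._log):
            if k == key:
                return v
        return None

    def put(self, key: str, value: str) -> None:
        self._log.append((key, value))


def count_visit(store: KeyValueStore, user: str) -> int:
    """Caller code: knows nothing about which implementation it was given."""
    visits = int(store.get(user) or "0") + 1
    store.put(user, str(visits))
    return visits


for store in (DictStore(), LogStore()):
    count_visit(store, "chris")
    print(type(store).__name__, count_visit(store, "chris"))
```

Either store can have its internals rewritten, or be replaced outright, without =count_visit= changing at all.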
|
||||||
|
|
||||||
|
It also has a more abstract reason: When something is a module unto
|
||||||
|
itself, the way it is designed and implemented usually presents more
|
||||||
|
insight into the fundamentals of the problem it is solving. It
|
||||||
|
contains fewer incidental details, and more essential details.
|
||||||
|
|
||||||
|
That's all very practical for people. It reduces the amount of
|
||||||
|
information that they must handle, and it permits them to *reason*
|
||||||
|
about the behavior of systems that are unknown or even completely
|
||||||
|
hypothetical.
|
||||||
|
|
||||||
|
It can also be seen as serving as a *contract* which reduces the
|
||||||
|
amount of communication and often the amount of disagreement. I think
|
||||||
|
this is a useful definition too: it conveys the notion that there are
|
||||||
|
multiple parties involved, that they have already agreed on some
|
||||||
|
specific obligations, and that they are free to behave as needed
|
||||||
|
provided that they fulfill those obligations.
|
||||||
|
|
||||||
|
[[https://en.wikipedia.org/wiki/Separation_of_concerns][Separation of Concerns]] gets at this same idea and expresses it in
|
||||||
|
terms of "concerns" rather than contracts.
|
||||||
|
|
||||||
|
I referred earlier to the abstractions themselves as both boundaries
|
||||||
|
and communications channels, and invoking "communications" raises the
|
||||||
|
related question of what *information* is being communicated. (For
|
||||||
|
whatever reason, Wikipedia defines a [[https://en.wikipedia.org/wiki/Concern_(computer_science)][concern]] in terms
|
||||||
|
of... information).
|
||||||
|
|
||||||
|
Some definitions refer directly to information, like the
|
||||||
|
[[https://en.wikipedia.org/wiki/Abstraction_principle_(computer_programming)][abstraction principle]], which aims to reduce duplication of information
|
||||||
|
(this fits with [[https://en.wikipedia.org/wiki/Don%27t_repeat_yourself][don't repeat yourself]]) so that "a modification of any
|
||||||
|
single element of a system does not require a change in other
|
||||||
|
logically unrelated elements". [[https://en.wikipedia.org/wiki/Encapsulation_(computer_programming)][Encapsulation]] likewise refers to it
|
||||||
|
via [[https://en.wikipedia.org/wiki/Information_hiding][information hiding]]. Alan Perlis in his [[http://www.cs.yale.edu/homes/perlis-alan/quotes.html][epigrams]] had #20:
|
||||||
|
"Wherever there is modularity there is the potential for
|
||||||
|
misunderstanding: Hiding information implies a need to check
|
||||||
|
communication."
|
||||||
|
|
||||||
|
* Examples
|
||||||
|
|
||||||
|
Network stacks, in particular via the OSI 7-layer model, are a good
|
||||||
|
example of all of this. Higher-level protocols can work in a way that
|
||||||
|
disregards lower-level details (most of the time - issues of
|
||||||
|
bandwidth and latency do sometimes matter). Lower-level protocols can
|
||||||
|
advance and be replaced without much concern for their higher-level
|
||||||
|
use.
|
||||||
|
|
||||||
|
Even the early innovation of packet-switching is a great instance of
|
||||||
|
abstracting network and routing details away from communications.
|
||||||
|
|
||||||
|
Disk caches, and memory caches, and most other kinds of caches, work
|
||||||
|
because they still implement the same underlying abstraction (albeit
|
||||||
|
with some minor leakage).
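
A toy illustration of that, assuming the abstraction is nothing more than "a function from int to int". The =slow_square= function and its call log are stand-ins I made up; =functools.lru_cache= is the standard-library cache.

```python
import functools

call_log = []  # records every time the underlying "slow" work really happens


def slow_square(n: int) -> int:
    """Stand-in for something expensive (disk read, network fetch, ...)."""
    call_log.append(n)
    return n * n


# Same abstraction (int in, int out), now with a cache sitting in front of it.
fast_square = functools.lru_cache(maxsize=None)(slow_square)


def sum_of_squares(square, values):
    """Caller that depends only on the 'function from int to int' abstraction."""
    return sum(square(v) for v in values)


print(sum_of_squares(slow_square, [3, 3, 3]), len(call_log))
call_log.clear()
print(sum_of_squares(fast_square, [3, 3, 3]), len(call_log))
```

The answers are identical either way; only the amount of underlying work differs. That difference (plus things like stale entries when the underlying data changes) is where the leakage shows up.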
|
||||||
|
|
||||||
|
Even DOS had useful abstractions. Things like
|
||||||
|
[[https://en.wikipedia.org/wiki/DriveSpace][DriveSpace/DoubleSpace]]/Stacker worked well enough because most
|
||||||
|
software that needed files relied on DOS's normal abstractions to
|
||||||
|
access them - so it did not matter to them that the underlying
|
||||||
|
filesystem was actually compressed, or was actually a RAM disk, or was
|
||||||
|
on some obscure SCSI interface. Likewise, for the silliness known as
|
||||||
|
[[https://en.wikipedia.org/wiki/Expanded_memory][EMS]], applications that accessed memory through the EMS abstraction
|
||||||
|
could disregard whether it was a "real" EMS board providing access to
|
||||||
|
that memory, or whether it was an expanded memory manager providing
|
||||||
|
indirect access to some other memory or even to a hard disk pretending
|
||||||
|
to be memory.
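
The same trick is easy to show in a modern setting. This is only an analogy to the DOS case, using Python's standard =io= and =gzip= modules: code written against the ordinary file abstraction neither knows nor cares that one of its inputs is compressed underneath.

```python
import gzip
import io


def count_lines(stream) -> int:
    """Relies only on the 'iterable of lines' part of the file abstraction."""
    return sum(1 for _ in stream)


text = b"alpha\nbeta\ngamma\n"

# Plain bytes held in memory.
plain = io.BytesIO(text)

# The same bytes, gzip-compressed, then wrapped so reads decompress on the fly.
compressed = gzip.GzipFile(fileobj=io.BytesIO(gzip.compress(text)), mode="rb")

print(count_lines(plain), count_lines(compressed))
```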
|
||||||
|
|
||||||
|
** Less-Conventional Examples
|
||||||
|
|
||||||
|
One thing I've watched with some interest is when new abstractions
|
||||||
|
emerge (or, perhaps, old ones become more widespread) to solve
|
||||||
|
problems that I wasn't even aware existed.
|
||||||
|
|
||||||
|
[[https://circleci.com/blog/it-really-is-the-future/][It really is the future]] talks about a lot of more recent forms of
|
||||||
|
modularity from the land of devops, most of which were completely
|
||||||
|
unheard-of in, say, 2010. [[https://www.functionalgeekery.com/episode-75-eric-b-merritt/][Functional Geekery episode 75]] talks about
|
||||||
|
many similar things.
|
||||||
|
|
||||||
|
[[https://jupyter.org/][Jupyter Notebook]] is one of my favorites here. It provides a notebook
|
||||||
|
interface (similar to something like Maple or Mathematica) which:
|
||||||
|
|
||||||
|
- allows the notebook to use various different programming languages
|
||||||
|
underneath,
|
||||||
|
- decouples where the notebook is used and where it is running, due to
|
||||||
|
being implemented as a web application accessed through the browser,
|
||||||
|
- decouples the presentation of a stored notebook from Jupyter itself
|
||||||
|
by using a [[https://nbformat.readthedocs.io/en/latest/][JSON-based file format]] which can be rendered without
|
||||||
|
Jupyter (like GitHub does if you commit a .ipynb file).
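
To make the last point concrete, here is a rough sketch of reading a notebook with nothing but a JSON parser. The tiny notebook is hand-written for the example and follows my understanding of the nbformat 4 layout (a top-level =cells= list whose entries carry =cell_type= and =source=); the =render= function is hypothetical and far cruder than what GitHub actually does.

```python
import json

# A tiny hand-written notebook in the nbformat 4 layout (illustrative only).
raw = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Demo\n", "Some notes."]},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "source": "print(1 + 1)", "outputs": []},
    ],
})


def render(notebook_json: str) -> str:
    """Render cells as plain text without importing anything from Jupyter."""
    parts = []
    for cell in json.loads(notebook_json)["cells"]:
        source = cell["source"]
        # The format allows 'source' to be one string or a list of strings.
        text = source if isinstance(source, str) else "".join(source)
        prefix = ">>> " if cell["cell_type"] == "code" else ""
        parts.append(prefix + text)
    return "\n\n".join(parts)


print(render(raw))
```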
|
||||||
|
|
||||||
|
I love notebook interfaces already because they simplify experimenting
|
||||||
|
by handling a lot of things I'd otherwise have to do manually - like
|
||||||
|
saving results and keeping them lined up with the exact code that
|
||||||
|
produced them. Jupyter adds some other use-cases I find marvelous -
|
||||||
|
for instance, I can let the interpreter run on my workstation which
|
||||||
|
has all of the computing power, but I can access it across the
|
||||||
|
Internet from my laptop.
|
||||||
|
|
||||||
|
[[https://zeppelin.apache.org/][Apache Zeppelin]] does similar things with different languages; I've
|
||||||
|
just used it much less.
|
||||||
|
|
||||||
|
Another favorite of mine is [[https://nixos.org/nix/][Nix]] (likewise its cousin [[https://guix.gnu.org/][Guix]]).
|
||||||
|
One excellent article,
|
||||||
|
[[http://blog.ezyang.com/2014/08/the-fundamental-problem-of-programming-language-package-management/][The fundamental problem of programming language package management]],
|
||||||
|
doesn't ever mention Nix but explains very well the problems it sets
|
||||||
|
out to solve. To be able to combine nearly all of the
|
||||||
|
programming-language specific package managers into a single module is
|
||||||
|
a very lofty goal, but Nix appears to do a decent job of it (among
|
||||||
|
other things).
|
||||||
|
|
||||||
|
The [[https://www.lua.org/][Lua]] programming language is noteworthy here. It's written in
|
||||||
|
clean C with minimal dependencies, so it runs nearly anywhere that a C
|
||||||
|
or C++ compiler targets. It's purposely very easy both to *embed*
|
||||||
|
(i.e. to put inside of a program and use as an extension language,
|
||||||
|
such as for plugins or scripting) and to *extend* (i.e. to connect
|
||||||
|
with libraries to allow their functionality to be used from Lua). [[https://www.gnu.org/software/guile/][GNU
|
||||||
|
Guile]] has many of the same properties, I'm told.
|
||||||
|
|
||||||
|
We ordinarily think of object systems as something living in the
|
||||||
|
programming language. However, the object system is sometimes made a
|
||||||
|
module that is outside of the programming language, and languages just
|
||||||
|
interact with it. [[https://en.wikipedia.org/wiki/GObject][GObject]], [[https://en.wikipedia.org/wiki/Component_Object_Model][COM]], and [[https://en.wikipedia.org/wiki/XPCOM][XPCOM]] do this, and to some
|
||||||
|
extent, so does [[https://en.wikipedia.org/wiki/Meta-object_System][Qt & MOC]] - and there are probably hundreds of others,
|
||||||
|
particularly if you allow dead ones created during the object-oriented
|
||||||
|
hype of the '90s. This seems to happen in systems where the object
|
||||||
|
hierarchy is in effect "bigger" than the language.
|
||||||
|
|
||||||
|
[[https://zeromq.org/][ZeroMQ]] is another example: a set of cross-language abstractions for
|
||||||
|
communication patterns in a distributed system. I know it's likely
|
||||||
|
not unique, but it is one of the better-known ones and the first I thought
|
||||||
|
of, and I think their [[http://zguide.zeromq.org/page:all][guide]] is excellent.
|
||||||
|
|
||||||
|
Interestingly, the same iMatix behind ZeroMQ also created [[https://github.com/imatix/gsl][GSL]] and
|
||||||
|
explained its value in [[https://imatix-legacy.github.io/mop/introduction.html][Model-Oriented Programming]], for which
|
||||||
|
abstraction features heavily. I've not used GSL, and am skeptical of
|
||||||
|
its stated usefulness, but it looks like it is meant to help create
|
||||||
|
compile-time abstractions that likewise sit outside of any particular
|
||||||
|
programming language.
|
||||||
|
|
||||||
|
# TODO: Expand on this.
|
||||||
|
|
||||||
|
[[https://web.hypothes.is/][hypothes.is]] is a curious one that I find fascinating. They're trying
|
||||||
|
to factor out annotation and commenting from something that is handled
|
||||||
|
on a per-webpage basis and turn it into its own module, and I really
|
||||||
|
like what I've seen. However, it does not seem to have caught on
|
||||||
|
much.
|
||||||
|
|
||||||
|
The Unix tradition lives on in certain modern tools. [[https://stedolan.github.io/jq/][jq]] has proven
|
||||||
|
very useful anytime I've had to mess with JSON data. [[http://www.dest-unreach.org/socat/][socat]] and [[http://netcat.sourceforge.net/][netcat]]
|
||||||
|
have saved me numerous times. I'm sure certain people love the fact
|
||||||
|
that [[https://neovim.io/][Neovim]] is designed to be seamlessly embedded and to be extended with
|
||||||
|
plugins. [[https://suckless.org/philosophy][suckless]] perhaps takes it too far, but gets an honorary
|
||||||
|
mention...
|
||||||
|
|
||||||
|
People know that I love Emacs, but I also do believe many of the
|
||||||
|
complaints about how large it is. Even though it is basically its own
|
||||||
|
operating system, /within this/ it has considerable modularity. The
|
||||||
|
same applies somewhat to Blender, I suppose.
|
||||||
|
|
||||||
|
Consider [[https://research.google.com/pubs/pub43146.html][Machine Learning: The High Interest Credit Card of Technical Debt]],
|
||||||
|
a paper that anyone working around machine learning should read and
|
||||||
|
re-read regularly. Large parts of the paper are about ways in which
|
||||||
|
machine learning conflicts with proper modularity and abstraction.
|
||||||
|
(However, [[https://colah.github.io/posts/2015-09-NN-Types-FP/][Neural Networks, Types, and Functional Programming]] is still
|
||||||
|
a good post and shows some sorts of abstraction that still exist
|
||||||
|
at least in neural networks.)
|
||||||
|
|
||||||
|
Even more abstractly: emulators work because so much software
|
||||||
|
respected the abstraction of some specific CPU and hardware platform.
|
||||||
|
|
||||||
|
Submitted without further comment:
|
||||||
|
https://github.com/stevemao/left-pad/issues/4
|
||||||