Various updates to modularity draft

Chris Hodapp 2017-12-13 21:16:04 -05:00
parent 60cc97f219
commit a14630bfda


@@ -3,63 +3,118 @@
#+DATE: April 20, 2017
#+TAGS: technobabble
_Modularity_ and _abstraction_ feature prominently wherever computers
are involved. This is meant very broadly: it applies to designing
software, using software, integrating software, and to a lot of
hardware as well. Both ideas apply elsewhere too, and almost
certainly originated elsewhere first; however, they appear to be
particularly crucial anywhere software is involved.
They're generally accepted as desirable, but a bit ill-understood at
times. It's common to find people who treat "abstraction" as
something that always stands in their way, adds overhead, and
confuses things. At the same time, it's common to find people who
treat modularity as being present anytime something is broken into
pieces. Definitions, though, are a bit vague (including anything in
this post). My goal in this post isn't to try to (re)define them, but
to explain a bit of their essence, and to expand on a few theses:
"Being abstract is something profoundly different from being vague.
The purpose of abstraction is not to be vague, but to create a new
semantic level in which one can be absolutely precise." E. W. Dijkstra
- Modularity arises naturally in a wide array of places.
- Modularity and abstraction are intrinsically connected.
- Whether a given modularization makes sense depends strongly on the
  meaning and relevance of *information* inside and outside of
  modules, and broad context matters to both.
"Modular design hinges on the simplicity and abstract nature of the
interface definition between the modules. A design in which the
insides of each module need to know all about each other is not a
modular design but an arbitrary partitioning of the bits." (Tim
Berners-Lee in [[https://www.w3.org/DesignIssues/Principles.html][Principles of Design]].)
* Why?
"Its is not only necessary to make sure your own system is designed to
be made of modular parts. It is also necessary to realize that your
own system, no matter how big and wonderful it seems now, should
always be designed to be a part of another larger system." (Same)
People generally agree that "modularity" is good. The idea that
something complex can be designed and understood in terms of smaller,
simpler pieces comes naturally to anyone who has built something out
of smaller pieces or taken something apart. It runs very deep in the
Unix philosophy, of which ESR gives a good overview in [[http://www.catb.org/~esr/writings/taoup/html/ch01s06.html][The Art of Unix
Programming]] (or listen to [[https://youtu.be/tc4ROCJYbm0?t=248][Kernighan himself]] at Bell Labs in 1982).
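To make that idea of small pieces composed through one shared
interface concrete, here is a minimal sketch of my own (not anything
from ESR or Kernighan): driving the classic ~sort | uniq -c~ pipeline
from Python, where the only thing the two programs agree on is a
stream of bytes. It assumes a Unix-like system with the usual
coreutils on the PATH.
#+BEGIN_SRC python
import subprocess

words = "pipe\nfilter\npipe\nstream\nfilter\npipe\n"

# Stage 1: sort standard input.
sort = subprocess.Popen(["sort"], stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE, text=True)
# Stage 2: count runs of identical lines, reading straight from stage 1.
uniq = subprocess.Popen(["uniq", "-c"], stdin=sort.stdout,
                        stdout=subprocess.PIPE, text=True)

sort.stdin.write(words)
sort.stdin.close()    # end-of-input, so the pipeline can drain
sort.stdout.close()   # hand the read end over to uniq entirely

output, _ = uniq.communicate()
sort.wait()
print(output)
# Expected, roughly:
#   2 filter
#   3 pipe
#   1 stream
#+END_SRC
Neither program knows anything about the other; the byte stream is
the whole interface.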
Tim Berners-Lee gives some practical limitations in [[https://www.w3.org/DesignIssues/Principles.html][Principles of
Design]] and in [[https://www.w3.org/DesignIssues/Modularity.html][Modularity]]: "Modular design hinges on the simplicity and
abstract nature of the interface definition between the modules. A
design in which the insides of each module need to know all about each
other is not a modular design but an arbitrary partitioning of the
bits... It is not only necessary to make sure your own system is
designed to be made of modular parts. It is also necessary to realize
that your own system, no matter how big and wonderful it seems now,
should always be designed to be a part of another larger system." Les
Hatton in [[http://www.leshatton.org/TAIC2008-29-08-2008.html][The role of empiricism in improving the reliability of
future software]] even did an interesting derivation tying the defect
density in software to how it is broken into pieces.
"Abstraction" doesn't have quite the same consensus. In software, it's
generally understood that decoupled or loosely-coupled is better than
tightly-coupled, but at the same time, "abstraction" can have the
connotation of something that gets in the way, adds overhead, and
confuses things. Dijkstra, in one of the few instances of not being
snarky, allegedly said, "Being abstract is something profoundly
different from being vague. The purpose of abstraction is not to be
vague, but to create a new semantic level in which one can be
absolutely precise." Joel Spolsky, in one of few instances of me
actually caring what he said, also has a blog post from 2002 on the
[[https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/][Law of Leaky Abstractions]]. The [[https://en.wikipedia.org/wiki/Principle_of_least_privilege][principle of least privilege]] is
likewise a thing. So, abstraction too has its practical and
theoretical limitations.
* How They Relate
I bring these up together because *abstractions* are the boundaries
between *modules*, and the communication channels (APIs, languages,
interfaces, protocols) through which they talk. The boundary need not
be a standardized interface or a well-documented one, though that
helps.
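As a toy illustration (my own, with hypothetical names throughout -
nothing here comes from an existing library), the ~Store~ protocol
below is the entire boundary between two "modules": one side provides
storage, the other only ever talks through the protocol, so either
side can be swapped out freely.
#+BEGIN_SRC python
from typing import Protocol

class Store(Protocol):
    """The abstraction: the only thing the two sides agree on."""
    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> str: ...

class MemoryStore:
    """One module behind the boundary: keeps everything in a dict."""
    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> str:
        return self._data[key]

def remember_greeting(store: Store, name: str) -> str:
    """The other module: talks to storage only through the interface."""
    store.put("greeting", f"hello, {name}")
    return store.get("greeting")

print(remember_greeting(MemoryStore(), "world"))  # -> hello, world
#+END_SRC
A file-backed or network-backed store could replace ~MemoryStore~
without ~remember_greeting~ changing at all; that interchangeability
is what the boundary buys.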
Available abstractions vary. They vary by, for instance:
- ...what language you choose. Consider, for instance, that a language
like Haskell contains various abstractions done largely within the
type system that cannot be expressed in many other languages.
Languages like Python, Ruby, or JavaScript might have various
abstractions meaningful only in the context of dynamic typing. Some
languages more readily permit the creation of new abstractions, and
this might lead to a broader range of abstractions implemented in
libraries.
- ...the operating system and its standard library. What is a
process? What is a thread? What is a dynamic library? What is a
filesystem? What is a file? What is a block device? What is a
socket? What is a virtual machine? What is a bus? What is a
commandline? (One small illustration of the "what is a file?"
question is sketched just after this list.)
- ...the time period. How many of the abstractions named above were
around or viable in 1970, 1980, 1990, 2000? In the opposite
direction, when did you last use that lovely standardized protocol,
[[https://en.wikipedia.org/wiki/Common_Gateway_Interface][CGI]], to let your web application and your web server communicate,
use [[https://en.wikipedia.org/wiki/PHIGS][PHIGS]] to render graphics, or access a large multiuser system
via hard-wired terminals?
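As one small illustration of the operating-system and
standard-library point above (a sketch of my own, not something from
any of the sources linked here): in Python, a disk file, an in-memory
buffer, and one end of a socket all honor the same "file-like"
abstraction, so a single function can consume any of them without
knowing which one it was handed.
#+BEGIN_SRC python
import io
import socket
import tempfile

def count_lines(stream) -> int:
    """Works on anything that can be iterated over line by line."""
    return sum(1 for _ in stream)

# 1. A real file on disk.
with tempfile.TemporaryFile("w+") as f:
    f.write("one\ntwo\nthree\n")
    f.seek(0)
    print(count_lines(f))                     # -> 3

# 2. An in-memory buffer pretending to be a file.
print(count_lines(io.StringIO("a\nb\n")))     # -> 2

# 3. One end of a socket wrapped up as a file.
left, right = socket.socketpair()
left.sendall(b"hello\nworld\n")
left.close()                                  # EOF, so iteration terminates
with right.makefile("r") as stream:
    print(count_lines(stream))                # -> 2
#+END_SRC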
As such, the possible ways to modularize things vary too. A
particular way of modularizing something may not even make sense, or
be possible, until the same thing has been done other ways hundreds
or thousands of times.
Other terms are related too. "Loosely-coupled" (or loose coupling)
and "tightly-coupled" refer to the sort of abstractions sitting
between modules, or to whether there even are separate modules.
"Decoupling" involves changing the relationship between modules
(sometimes, creating them in the first place), typically by moving
things to a more sensible abstraction. "Factoring out" is really a
form of decoupling in which smaller parts of something are turned
into a module which the original thing then interfaces with (one
canonical example, sketched below, is taking some bits of code, often
ones that are very similar or identical in many places, and moving
them into a single function). To say one has "abstracted over" some
details implies that a module is handling those details, that the
details shouldn't matter, and that what does matter is the
abstraction one is using.
# -----
Consider the information this module deals in, in essence.
What is the most general form this information could be expressed in,
without being so general as to encompass other things that are
irrelevant or so low-level as to needlessly constrain the possible
contexts?
(Aristotle's theory of definitions?)
# -----
In a practical sense: Where someone "factors out" something that
occurs in similar or identical form in multiple places (incidentally,
@@ -75,14 +130,6 @@ module (from what was factored out) and some number of abstractions
application itself is a module of a different sort. (Witness that
sometimes another application will implement the same plugin API.)
One reason behind this is more practical in nature: When something is
a module unto itself, presumably it is relying on specific
abstractions, and it is possible to move this module to other contexts
@@ -94,14 +141,68 @@ itself, the way it is designed and implemented often presents more
insight into the fundamentals of the problem it is solving. It
contains fewer incidental details, and more essential details.
* Other fluff
# -------
I was around to see what was normal for software made on Windows
3.1, Windows 95, and the like. My take is that most of these pieces
of software were sufficiently GUI-oriented that they tried to remove
most modularity from the user's perspective. Things like scripting
and automation existed almost solely as afterthoughts, since most
interaction was designed explicitly around the GUI.
* Less-Conventional Examples
One thing I've watched with some interest is when new abstractions
emerge (or, perhaps, old ones become more widespread) to solve
problems that I wasn't even aware existed.
[[https://circleci.com/blog/it-really-is-the-future/][It really is the future]] talks about a lot of more recent forms of
modularity, most of which are beyond me and were completely unheard-of
in, say, 2010. [[https://www.functionalgeekery.com/episode-75-eric-b-merritt/][Functional Geekery episode 75]] talks about many similar
things.
[[https://jupyter.org/][Jupyter Notebook]] is one of my favorites here. It provides a notebook
interface (similar to something like Maple or Mathematica) which:
- allows the notebook to use various different programming languages
underneath,
- decouples where the notebook is used and where it is running, due to
being implemented as a web application accessed through the browser,
- decouples the presentation of a stored notebook from Jupyter itself
by using a [[https://nbformat.readthedocs.io/en/latest/][JSON-based file format]] which can be rendered without
Jupyter (like GitHub does if you commit a .ipynb file; see the sketch
just after this list).
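As a small sketch of what that last point buys: any program with an
ordinary JSON parser can pull the cells out of a notebook, no Jupyter
required. The miniature notebook below is a simplified stand-in I
wrote for illustration; real .ipynb files carry more metadata, but
the ~cells~ / ~cell_type~ / ~source~ fields are the ones the nbformat
documentation describes.
#+BEGIN_SRC python
import json

# A simplified stand-in for the contents of a .ipynb file; a real file
# on disk would be read the same way with json.load(open(path)).
raw = """{
  "nbformat": 4, "nbformat_minor": 5, "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["Some prose."]},
    {"cell_type": "code", "execution_count": null, "metadata": {},
     "outputs": [], "source": ["x = 40 + 2", "print(x)"]}
  ]
}"""

notebook = json.loads(raw)
for cell in notebook["cells"]:
    if cell["cell_type"] == "code":
        # nbformat stores source as a list of line fragments (or one
        # string); print each code line we find.
        lines = cell["source"]
        if isinstance(lines, str):
            lines = [lines]
        for line in lines:
            print(line.rstrip("\n"))
#+END_SRC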
I love notebook interfaces already because they simplify experimenting
by handling a lot of things I'd otherwise have to do manually - like
saving results and keeping them lined up with the exact code that
produced them. Jupyter adds some other use-cases I find marvelous - for
instance, I can let the interpreter run on my much faster workstation,
but I can access it across the Internet from my much slower laptop.
[[https://zeppelin.apache.org/][Apache Zeppelin]] does similar things with different languages; I just
use it less.
Another favorite of mine is [[https://nixos.org/nix/][Nix]]. One excellent article, [[http://blog.ezyang.com/2014/08/the-fundamental-problem-of-programming-language-package-management/][The
fundamental problem of programming language package management]],
doesn't ever mention Nix but does a great job explaining the sorts of
problems it exists to solve. To be able to combine nearly all of the
programming-language-specific package managers into a single module is
a very lofty goal, but Nix appears to do a decent job of it.
The [[https://www.lua.org/][Lua]] programming language is noteworthy here. It's written in
clean C with minimal dependencies, so it runs nearly anyplace with a
C or C++ compiler. It's purposely very easy both to *embed* (i.e. to
put inside of a program and use as an extension language, such as for
plugins or scripting) and to *extend* (i.e. to connect with libraries
to allow their functionality to be used from Lua). [[https://www.gnu.org/software/guile/][GNU Guile]] has many
of the same properties.
[[https://web.hypothes.is/][hypothes.is]] is a curious one that I find fascinating. In effect,
they're trying to factor out annotation and commenting from something
that is handled on a per-webpage basis, and I really like what I've
seen.
The Unix tradition lives on in certain modern tools. [[https://stedolan.github.io/jq/][jq]] has proven
very useful anytime I've had to mess with JSON data. [[http://www.dest-unreach.org/socat/][socat]] and [[http://netcat.sourceforge.net/][netcat]]
have saved me numerous times. I'm sure certain people love the fact
that [[https://neovim.io/][Neovim]] is designed to be seamlessly embedded and extended with
plugins. [[https://suckless.org/philosophy][suckless]] perhaps takes it too far, but gets an honorary
mention...
# ???
People know that I love Emacs, but I also do believe many of the
complaints about how large it is. On the one hand, it is basically its
@@ -118,80 +219,39 @@ underneath, and this makes me wonder why it needs explicit support for
- Multiple CPUs
- Multiple hosts
- Nix, Guix
- [[Notes - Distributed stuff notes (from turtl)]]
- [[Notes - Paper, 2016-11-13]]
- See notes on functional geekery #75
- Jupyter
- Any Plan 9 papers? (Will have to dig deep in the archives)
- http://plan9.bell-labs.com/sys/doc/
- Tanenbaum vs. Linus war & microkernels
- Conjecture: A module is most useful when available in the most
general or most accessible context (e.g. Linux commandline tool
vs. a Wordpress plugin or an Emacs package) - the TBL quote on
least-power sort of corroborates this, but stands separate in some
other ways too.
- "most general" might not be right here.
- Other conjecture attempt: A module's power is related to how many
other modules it can communicate with, without requiring substantial
adaptation. An abstraction's power is related to the modularity it
accommodates.
- Another conjecture attempt: An abstraction's power isn't related to
how broad it is, but to how well it connects things. A needlessly
simplistic abstraction requires a lot of other adaptation to be
useful. A needlessly specific one excludes a lot of potential
modules.
- hypothes.is is a sort of module unto itself here too, trying to
remove commenting and annotation from existing, very siloed
solutions.
- TBL: "The choice of language is a common design choice. The low
power end of the scale is typically simpler to design, implement and
use, but the high power end of the scale has all the attraction of
being an open-ended hook into which anything can be placed: a door
to uses bounded only by the imagination of the programmer. Computer
Science in the 1960s to 80s spent a lot of effort making languages
which were as powerful as possible. Nowadays we have to appreciate
the reasons for picking not the most powerful solution but the least
powerful. The reason for this is that the less powerful the
language, the more you can do with the data stored in that
language. If you write it in a simple declarative form, anyone can
write a program to analyze it in many ways."
- "Self" paper & structural reification?
- I'm still not sure how this relates, but it may perhaps relate to
how *not* to make things modular (structural reification is a sort
of check on the scope of objects/classes)
- What by Rich Hickey?
- Simple Made Easy?
- The Value of Values?
- SICP: [[https://mitpress.mit.edu/sicp/full-text/sicp/book/node50.html][Modularity, Objects, and State]]
- "On Understanding Data Abstraction, Revisited"
- Frameworks Don't Compose ([composition][])
- "On the Criteria to be Used in Decomposing System into Modules" (Barnas)
- suckless, and their tools & methodology
- https://suckless.org/philosophy
- even though they can take things waaaay too far...
- Containers?
- http://www.catb.org/~esr/writings/taoup/html/apb.html#Baldwin-Clark -
Carliss Baldwin and Kim Clark. Design Rules, Vol 1: The Power of
Modularity. 2000. MIT Press. ISBN 0-262-024667.
- https://colah.github.io/posts/2015-09-NN-Types-FP/ - Was this the
one that talked about 'modularity' in deep learning?
- NodeRED might be interesting here, but first I need a clear idea of
what it factored out into a separate component.
- Lua is notable here for the effort spent in making it easy to both
embed (e.g. as a scripting or extension language) and extend
(e.g. with other C libraries). Guile may be similar.
- NeoVim is also an interesting case here as it is designed to be
embedded, though I'm not sure what this means yet.
- Find the link to Les Hatton's slides (cyclomatic complexity?) on the
empirical effects of too many / too large modules
- Examples of more 'modern' tools:
- socat
- jq (the JSON processor)
- Nix and some related tools (which take related functionality that
is present in numerous PL-specific package managers)
- Jupyter
* Link-pile:
- [[http://www.catb.org/~esr/writings/taoup/html/][The Art of Unix Programming (Eric S. Raymond)]]
- [[https://circleci.com/blog/its-the-future/][It's the Future]]
- [[https://circleci.com/blog/it-really-is-the-future/][It really is the future]]
- [[https://www.youtube.com/watch?v=tc4ROCJYbm0][AT&T Archives: The UNIX Operating System]]
- [[http://blog.ezyang.com/2014/08/the-fundamental-problem-of-programming-language-package-management/][The fundamental problem of programming language package management]]
- [[https://clojurefun.wordpress.com/2012/08/17/composition-over-convention/][Frameworks Don't Compose]]
- Brooks, No Silver Bullet?
- https://www.reddit.com/r/programming/comments/4bjss2/an_11_line_npm_package_called_leftpad_with_only/
- https://www.functionalgeekery.com/episode-75-eric-b-merritt/
- https://www.w3.org/DesignIssues/
- https://www.w3.org/DesignIssues/Modularity.html
- http://www.w3.org/DesignIssues/Principles.html
- http://www.freecode.com/articles/editorial-the-two-edged-sword
- https://en.wikipedia.org/wiki/Essential_complexity