Summarized some writing from around a year ago
This commit is contained in:
parent
5a4c925911
commit
246bc9a252
239
posts/2015-06-23-stupidity-catalogue-genericstruct.md
Normal file
239
posts/2015-06-23-stupidity-catalogue-genericstruct.md
Normal file
@ -0,0 +1,239 @@
|
||||
---
|
||||
title: Catalogue of My Stupidity: My Haskell 'GenericStruct' Nonsense
|
||||
author: Chris Hodapp
|
||||
date: June 23, 2015
|
||||
tags: stupidity, Technobabble
|
||||
---
|
||||
|
||||
*(A note: I took these notes during my time at Urbanalta, intending
|
||||
them to be a private reference to myself on how to learn from some
|
||||
mistakes. I've tried to scrub the proprietary bits out and leave the
|
||||
general things behind. I do reference some other notes that probably
|
||||
will still stay private.)*
|
||||
|
||||
# Background
|
||||
|
||||
Some background on this: This is some notes on a small Haskell module
|
||||
I did at Urbanalta which I called `GenericStruct`. Most of this post
|
||||
is very Haskell-heavy and perhaps more suited to [HaskellEmbedded][]
|
||||
as it's a very niche usage even within Haskell.
|
||||
|
||||
I talk about this much more extensively in my handwritten work notes,
|
||||
circa 2015-05-05 to 2015-05-20, and in a source file
|
||||
`GenericStruct.hs`. Neither of those are online (and trust me that
|
||||
you don't want to try to understand my scratch notes anyway), but a
|
||||
cleaner summary is in the [Appendix](#appendix).
|
||||
|
||||
The short version is that I needed a way to express the format a
|
||||
packed data structure, similar to a C struct in some ways, but without
|
||||
any padding between fields, and more explicit about the exact size in
|
||||
bits of fields. I wanted this format to also be able to carry some
|
||||
documentation with it because it was meant to be able to express data
|
||||
formats for Bluetooth Low Energy, and so I had a need to present this
|
||||
format in a human-readable way and possibly a more general
|
||||
machine-readable way (such as JSON). This was a similar design goal
|
||||
to my Python creation, AnnotatedStruct, from nearly 2 years ago, but
|
||||
here I wanted the benefits of static typing for when accessing these
|
||||
data structures.
|
||||
|
||||
What complicated matters somewhat is that these data structures,
|
||||
rather than being used directly in Haskell to store things, were to be
|
||||
used with [Ivory][] to model the proper C code for reading and
|
||||
writing.
|
||||
|
||||
What I eventually came up with used Haskell records with
|
||||
specially-crafted data types inside them, and then [GHC.Generics][] to
|
||||
iterate over these data types and inject into them some context
|
||||
information (a field accessor in Haskell by itself cannot have any
|
||||
information about 'where' in the record it is, whether in absolute
|
||||
terms or relative to any other field). Context information here meant
|
||||
things like a bit offset, a size, and a type representation.
|
||||
|
||||
This was a little complicated to implement, but overall, not
|
||||
particularly daunting. The [GHC.Generics][] examples included generic
|
||||
serialization which is a very similar problem in many ways, and I
|
||||
followed from this example and a JSON example.
|
||||
|
||||
(Another note from some prior work: Do not attempt to do record access
|
||||
with Base.Data.Data. It can get some meta-information, like the
|
||||
constructor itself, but only in a sufficiently generic way that you
|
||||
may call nothing specific to a record on it.)
|
||||
|
||||
# Problems
|
||||
|
||||
The problem that I ran into fairly quickly (but not quick enough) is
|
||||
that what I had created had no good way to let me nest data formats
|
||||
inside each other. For instance, I had a 16-bit value which I used in
|
||||
several places, and that 16-bit value was treated in many places as 16
|
||||
individual bitfields with each bit representing a specific Boolean
|
||||
parameter unto itself. In other places, treating it as simply a
|
||||
single 16-bit integer was more meaningful - and this was related to it
|
||||
being used in several places, such that operations like copying from
|
||||
one place to another became meaningful.
|
||||
|
||||
I had no good way to express this. I could not define that format in
|
||||
one place, and then put it inside each struct that used it. I thought
|
||||
initially that implementing this would be a matter of just making
|
||||
certain structures recursive, and I was partly right, but I ran into
|
||||
such complication in the type system that I felt like it was not worth
|
||||
it to proceed further.
|
||||
|
||||
What I wrote yesterday when I ran into these serious snags was:
|
||||
|
||||
- At this stage of complexity, I sort of wish I'd opted for
|
||||
[Template Haskell][] instead. It would have absorbed the change
|
||||
much better. [GHC.Generics][] required me to sort of bend the type
|
||||
system. The problem there is that it had only so far to bend, while
|
||||
with Template Haskell the whole Haskell language (more or less)
|
||||
would be at my disposal, not just some slightly-pliable parts of its
|
||||
type system. (Perhaps this is why Ivory does what it does.)
|
||||
- Idris may have handled this better too by virtue of its dependent
|
||||
types, and for similar reasons.
|
||||
- `johnw` (Freenode IRC `#haskell` denizen &
|
||||
[Galois Inc.](https://galois.com/) employee) mentioned a
|
||||
possibly-viable approach based around an applicative expression of
|
||||
data formats and not requiring things like Template Haskell or
|
||||
possibly Generics. (See IRC logs from 2015-06-04; this was via PM.)
|
||||
- Another person mentioned that this sounded like a job for
|
||||
[Lenses][lens], particularly,
|
||||
their
|
||||
[Iso](https://hackage.haskell.org/package/lens/docs/Control-Lens-Iso.html)
|
||||
(isomorphism) type which had different 'projections' of data.
|
||||
|
||||
# What I did right
|
||||
|
||||
I properly implemented a nice structure with GHC.Generics over top of
|
||||
Haskell records, and kept it fairly compact and strictly-typed. I
|
||||
started making use of it right away, and this meant errors would
|
||||
readily show up (generally as type errors at compile time) as I made
|
||||
changes.
|
||||
|
||||
I kept the code clean and well-documented, and this helped me out
|
||||
substantially with writing the code, understanding it, and then
|
||||
understanding that much of it shouldn't have been written.
|
||||
|
||||
I think that overall that it was a good idea for me to treat the data
|
||||
format as a specification that could be turned into C code, into a
|
||||
JSON description, and into (eventually) a human-readable description.
|
||||
|
||||
# What I did wrong, and should have done instead
|
||||
|
||||
Overall: I tried very hard to solve the very unique, very specific
|
||||
problem. This blocked my view of the real, more general problem.
|
||||
Despite active attempts to discern that general problem, I was fixated
|
||||
on specifics. Despite my preachy guideline elsewhere in my notes
|
||||
that, "Your problem is not a unique snowflake - someone else has
|
||||
studied it," I assumed my problem was a relatively unique snowflake.
|
||||
|
||||
When I expressed the problem to other people, a number of them told me
|
||||
that this sounded like a job for the [Lens][lens] library in Haskell.
|
||||
On top of this, I had used Lenses before. While I had not used them
|
||||
enough to know for certain that Lens was the best solution here, I had
|
||||
used them enough to know that they were a likely first place to look.
|
||||
But, I ignored this experience, and I ignored what other people told
|
||||
me.
|
||||
|
||||
Why I ignored this, I suspect, is because I was focusing too much on
|
||||
the specifics of the issue. This led me to believe that this problem
|
||||
was sufficiently unique and different that Lenses were not an approach
|
||||
I should even look at.
|
||||
|
||||
Lenses might not be the proper approach, but I am almost certain that
|
||||
examining them would have helped me.
|
||||
|
||||
I missed something very crucial: That I would need to nest data
|
||||
formats and share definitions between them. This should have been
|
||||
obvious to me: this is a functional language, and composition (which
|
||||
is what this is) is essential to abstraction and reuse.
|
||||
|
||||
I assumed that my solution would have to be tied to Haskell records.
|
||||
This was not a given. Further, I knew of three methods which created
|
||||
similar structures but did not rely on records: Lenses, Ivory structs,
|
||||
and Ivory bitdata. Records were an irrelevancy (even to the specifics
|
||||
I was fixated on) and I tightly coupled my solution to them. Records
|
||||
are not meant to compose, while some other structures are.
|
||||
|
||||
# Short, general summary
|
||||
|
||||
*(i.e. the part where I get really preachy about vague things)*
|
||||
|
||||
- Foresee what else your problem may need to encompass. Perhaps it
|
||||
only looks like a unique problem because you put too much weight on
|
||||
the specifics, and you've missed the ways it resembles existing,
|
||||
well-studied problems - perhaps even ones you are familiar with.
|
||||
|
||||
- Perhaps you haven't missed anything notable. Still, knowing what
|
||||
else it may need to encompass makes for better solutions, and may
|
||||
prevent you from making design decisions early on which
|
||||
fundamentally limit it in ways that matter later.
|
||||
|
||||
- Unsurprisingly, ekmett probably solved your problem already. (Or
|
||||
perhaps acowley did with [Vinyl][], or perhaps [compdata][] solves
|
||||
it...)
|
||||
|
||||
# Appendix {#appendix}
|
||||
|
||||
My aim was to solve a few problems:
|
||||
|
||||
- Outputting a concrete representation of an entire type (for the
|
||||
sake of inserting into JSON specs, for instance),
|
||||
- Creating a correspondence between native Haskell types and Ivory's
|
||||
specific types (which ended up not being so necessary),
|
||||
- Packing and unpacking a struct value to and from memory
|
||||
automatically (via Ivory),
|
||||
- Unpacking and packing individual fields of a struct (also via
|
||||
Ivory),
|
||||
- Doing the above with the benefit of strict, static typing (i.e. not
|
||||
relying on strings to access a field),
|
||||
- Handling all of this with (in Ivory) an in-memory representation
|
||||
with no padding or alignment concerns,
|
||||
- Having a single specification of a type, including human-readable
|
||||
descriptions.
|
||||
|
||||
What I saw as the largest problem is that accessors for Haskell
|
||||
records have no accessible information on which field they access, or
|
||||
where that field is relative to anything else in the data
|
||||
structure. Thus, if I access a field of a record, I can have no
|
||||
information there about 'where' in the record it is unless I put that
|
||||
information there somehow. The pieces of information that I seemed to
|
||||
need in the field were the field's overall index in the record, and
|
||||
the field's overall memory offset.
|
||||
|
||||
A simpler form of generics, Data.Data, allowed me to solve the first
|
||||
problem easily and produce a list of something like (TypeRep, name,
|
||||
size, position). However, I ran into problems when trying to find a
|
||||
way to insert context into the record somehow. The central issue is
|
||||
that Data.Data provides no way to do anything other than generic
|
||||
operations on a field, and those generic operations are fairly
|
||||
limited. I could find no way to make something like a typeclass and
|
||||
use typeclass methods to update those fields.
|
||||
|
||||
GHC.Generics, on the other hand, made this fairly trivial. I could
|
||||
solve the first problem (albeit in a more complicated way), and what I
|
||||
eventually turned the other problems into was the need to take the
|
||||
generic record type itself (in this case, in the form of a Proxy to
|
||||
it), and given certain constraints on it, to create a generic
|
||||
constructor for this type.
|
||||
|
||||
This proved to be fairly easy. Most of GHC.Generics will as readily
|
||||
traverse a Proxy of a type as a value of the type, given some changes
|
||||
which mostly amount to a lot of use of fmap. The 'to' function in
|
||||
GHC.Generics (after I had cleared up some conceptual confusion) simply
|
||||
took the representation and removed the Proxy (or pushed it
|
||||
elsewhere), until it hit a certain innermost point in which one
|
||||
created an abstract representation of the constructor call itself, but
|
||||
this time with the proper data.
|
||||
|
||||
Most of the rest was just modifying the above to allow me to propagate
|
||||
context information such as index and memory offset, and dealing with
|
||||
the confusion of type families (which I ended up needing much less of
|
||||
than I initially thought).
|
||||
|
||||
|
||||
[GHC.Generics]: https://hackage.haskell.org/package/base/docs/GHC-Generics.html
|
||||
[Ivory]: https://hackage.haskell.org/package/ivory
|
||||
[Vinyl]: https://hackage.haskell.org/package/vinyl
|
||||
[compdata]: https://hackage.haskell.org/package/compdata
|
||||
[lens]: https://hackage.haskell.org/package/lens
|
||||
[Template Haskell]: https://wiki.haskell.org/Template_Haskell
|
||||
[HaskellEmbedded]: https://haskellembedded.github.io/
|
||||
Loading…
x
Reference in New Issue
Block a user