The Top Down View

Mapper & Mapping

Mapper supplies the most front-facing interface to the DSL: #define and #map. It keeps a registry of defined mappings, each keyed by a name symbol; #map runs a named mapping over a set of records.

Mapping, as the name suggests, represents the actual defined mapping. It holds information about the class of the original record parser and the target record. It also gives us #process_record, which instantiates a new target record and uses the mapping to populate it.
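
As a rough sketch of the arrangement described above (not KriKri's actual code: the `Sketch` module, `ToyMapper`, `ToyMapping`, and the `step` method are hypothetical stand-ins), a registry keyed by name symbol plus a `#process_record` that instantiates and populates a target might look like this:

```ruby
require 'ostruct'

module Sketch
  # Hypothetical stand-in for Mapping: holds the target class and the
  # steps that populate a new target from a source record.
  class ToyMapping
    def initialize(target_class)
      @target_class = target_class
      @steps = []
    end

    # Register a step; each step receives (target, record).
    def step(&block)
      @steps << block
      self
    end

    # Instantiate a new target record and use the mapping to populate it.
    def process_record(record)
      target = @target_class.new
      @steps.each { |s| s.call(target, record) }
      target
    end
  end

  # Hypothetical stand-in for Mapper: a registry of mappings by name symbol.
  module ToyMapper
    @registry = {}

    class << self
      def define(name, target_class = OpenStruct, &block)
        mapping = ToyMapping.new(target_class)
        mapping.instance_eval(&block) if block
        @registry[name] = mapping
      end

      def map(name, records)
        mapping = @registry.fetch(name)
        records.map { |rec| mapping.process_record(rec) }
      end
    end
  end
end

Sketch::ToyMapper.define(:toy) do
  step { |target, record| target.title = record[:name].upcase }
end

results = Sketch::ToyMapper.map(:toy, [{ name: 'moomin' }, { name: 'snork' }])
results.map(&:title) # => ["MOOMIN", "SNORK"]
```

The block given to `define` is evaluated against the mapping itself, which is what lets the mapping body read like a small language.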

By including MappingDSL, Mapping provides access to the language implementation used to specify the mappings.

The DSL

The DSL provides an interface to make "Declarations" through #method_missing. There are two main kinds of declarations:

Simple PropertyDeclarations; and
ChildDeclarations

All undefined method calls on a Mapping are treated as property declarations. The method name is the property name, the arguments are passed to the Declaration, and the resulting declarations are collected in the properties hash.
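
The collecting behaviour can be illustrated with a small sketch. `ToyDsl` and `ToyDeclaration` are hypothetical names, and the sketch ignores child declarations entirely; it only shows undefined method calls being turned into entries in a `properties` hash:

```ruby
# Hypothetical declaration object: it just remembers what it was given.
class ToyDeclaration
  attr_reader :name, :args, :block

  def initialize(name, *args, &block)
    @name = name
    @args = args
    @block = block
  end
end

# Hypothetical miniature of the declaration-collecting DSL.
class ToyDsl
  attr_reader :properties

  def initialize
    @properties = {}
  end

  # Any undefined method becomes a property declaration keyed by its name.
  def method_missing(name, *args, &block)
    @properties[name] = ToyDeclaration.new(name, *args, &block)
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

dsl = ToyDsl.new
dsl.instance_eval do
  title 'blah'
  creator 'Tove Jansson'
end

dsl.properties.keys # => [:title, :creator]
```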

A declaration is an object with a #to_proc method that gives us a closure accepting a target and a record. The record is the source data (the parsed OriginalRecord), and the target is the ActiveModel/ActiveTriples-like object the closure gets to mutate. The Declaration's job is to hold the information needed to set the right values on the right properties when the mapping is processed. It also promises to call #call, passing the record, on any callable value it tries to set.
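
A minimal sketch of that contract, assuming a value is resolved with `#call` only when it responds to it (`ToyPropertyDeclaration` is a hypothetical simplification; the real `PropertyDeclaration#to_proc` does more):

```ruby
require 'ostruct'

# Hypothetical simplified declaration.
class ToyPropertyDeclaration
  def initialize(name, value)
    @name = name
    @value = value
  end

  # Returns a closure over the name and value; nothing is set until the
  # closure is called with a target and a record.
  def to_proc
    name = @name
    value = @value
    lambda do |target, record|
      resolved = value.respond_to?(:call) ? value.call(record) : value
      target.public_send("#{name}=", resolved)
    end
  end
end

target = OpenStruct.new
record = { 'dc:title' => 'Comet in Moominland' }

# A plain value is set as-is.
ToyPropertyDeclaration.new(:title, 'blah').to_proc.call(target, record)

# A callable value is handed the record at processing time.
from_record = ->(rec) { rec['dc:title'] }
ToyPropertyDeclaration.new(:description, from_record).to_proc.call(target, record)

target.title       # => "blah"
target.description # => "Comet in Moominland"
```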

The rest of the language is provided through the ParserMethods mixin, which provides access to an OriginalRecord as a "parsed" tree. It gives us #record and #header methods, which return a RecordProxy. This object holds onto the messages we want to send to the record when we process the mapping, and it exposes a #call method that replays that "call chain" on the record. A declaration eventually invokes #call.
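
The record-now, replay-later idea can be sketched as follows. `ToyRecordProxy` is a hypothetical name, and returning a fresh proxy from each call is an assumption made for the sketch, not a claim about how RecordProxy stores its chain:

```ruby
# Hypothetical proxy that records messages now and replays them later.
class ToyRecordProxy
  def initialize(call_chain = [])
    @call_chain = call_chain
  end

  # Record the message and return a proxy so calls can be chained.
  def method_missing(name, *args, &block)
    self.class.new(@call_chain + [{ name: name, args: args, block: block }])
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end

  # Replay the stored chain, in order, against a real object.
  def call(record)
    @call_chain.reduce(record) do |receiver, message|
      receiver.public_send(message[:name], *message[:args], &message[:block])
    end
  end
end

# Definition time: nothing is evaluated, the messages are only stored.
proxy = ToyRecordProxy.new.strip.split(',').map { |s| s.strip.capitalize }

# Processing time: the chain is replayed on an actual value.
proxy.call(' moomin, snork ') # => ["Moomin", "Snork"]
```

A plain String stands in for the parsed record here so the sketch runs without any parser.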

Example

Example PropertyDeclaration

# creates a `Mapping` object with name `:mapping`.
define :mapping do
  # `title` defines a `PropertyDeclaration` and stores it in the mapping's `properties` hash.
  #   In this case the declaration accepts a simple value ("blah"). When it is called (`declaration.to_proc.call(target, record)`),
  #   it promises to set the target's title property to "blah".
  title "blah"
 
  # This example is more complex. `description` is still a property declaration, but it reacts differently to the value passed.
  #   `record` returns a `RecordProxy`. At mapping definition time, that proxy accepts the messages (`one`, `two`, `three`) and their 
  #   arguments and blocks, storing them in its call chain.
  #
  # The RecordProxy is then passed to the property declaration, which now promises to give it a parsed record at processing time. The 
  #   proxy sends the stored call chain, in order, along to the parsed record and returns a value or values that the declaration will
  #   set to the target's `description` property.
  description record.one.two(arg).three { |v| v.capitalize }
end

ValueArray

The value processing work happens here, at the ValueArray. This is the interface the parsed record exposes for manipulating its values. The technique used is sometimes called "method chaining": each method call returns an instance of the same class, allowing repeated calls to effectively change the state of the base object.
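
For readers unfamiliar with the technique, here is a small sketch of method chaining. `ToyValues` and its method names are hypothetical and are not ValueArray's API:

```ruby
# Hypothetical chainable wrapper around a list of values.
class ToyValues
  def initialize(values)
    @values = values
  end

  # Each chainable method returns an instance of the same class.
  def select_long(min)
    self.class.new(@values.select { |v| v.length >= min })
  end

  def upcase_all
    self.class.new(@values.map(&:upcase))
  end

  # Unwrap the underlying values at the end of a chain.
  def values
    @values.dup
  end
end

ToyValues.new(%w[a moomin snork]).select_long(2).upcase_all.values
# => ["MOOMIN", "SNORK"]
```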

Possible Deep Dives

Declarations

What is going on in `#to_proc`? This is some of the scarier code in the DSL:

- https://github.com/dpla/KriKri/blob/3ce75bb56b558b30c10508564d3ddf93f94b1a83/lib/krikri/mapping_dsl/property_declaration.rb#L19-L41
- https://github.com/dpla/KriKri/blob/3ce75bb56b558b30c10508564d3ddf93f94b1a83/lib/krikri/mapping_dsl/child_declaration.rb#L16-L40

ValueArray

We said that method chaining works by changing the state of the underlying object. In reality, we return new instances. This is good, since it's more static (i.e. functional), but the instances share some member objects! This leads to some subtlety about how ValueArray works in practice:

- https://github.com/dpla/KriKri/blob/3ce75bb56b558b30c10508564d3ddf93f94b1a83/lib/krikri/parser.rb#L184-L202
- Look at how @bindings is handled throughout.
- See also the recent DSL backtrack PR: https://github.com/dpla/KriKri/pull/244/files
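
To make the sharing concern concrete, here is a hypothetical illustration (`ToyChain` and its `bind` method are invented for the sketch and do not reproduce ValueArray's code). Each call returns a new instance, yet every instance holds the same bindings Hash:

```ruby
# Hypothetical illustration of new instances sharing a member object.
class ToyChain
  attr_reader :values, :bindings

  def initialize(values, bindings = {})
    @values = values
    @bindings = bindings # the same Hash object is handed to every new instance
  end

  # Returns a new instance, but records the current values in the shared Hash.
  def bind(name)
    @bindings[name] = @values
    self.class.new(@values, @bindings)
  end

  def map(&block)
    self.class.new(@values.map(&block), @bindings)
  end
end

base = ToyChain.new(%w[a b])
derived = base.bind(:letters).map(&:upcase)

derived.equal?(base)                   # => false (a distinct instance)
derived.bindings.equal?(base.bindings) # => true  (one shared Hash)
base.bindings.key?(:letters)           # => true  (the older instance sees the binding)
```

So the chain looks immutable from the outside while state still leaks between instances through the shared object.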

RecordProxy

#method_missing can be used to bad effect. We deploy some tricks to make sure RecordProxy breaks at definition time (rather than on each processed record):

- https://github.com/dpla/KriKri/blob/3ce75bb56b558b30c10508564d3ddf93f94b1a83/lib/krikri/mapping_dsl/parser_methods.rb#L117-L138
- Are there other places we could surface errors closer to when we write a mapping?
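
One way to get definition-time failures, sketched under the assumption that every message in the chain is sent to instances of a single known class (`StrictProxy` is a hypothetical name, and the linked code may use different tricks): check each message against that class before storing it.

```ruby
# Hypothetical proxy that validates messages against a known class when the
# mapping is written, instead of failing once per processed record.
class StrictProxy
  def initialize(klass, call_chain = [])
    @klass = klass
    @call_chain = call_chain
  end

  # Store the message only if the known class defines it; otherwise fall
  # through to the default behaviour, which raises NoMethodError right away.
  def method_missing(name, *args, &block)
    return super unless respond_to_missing?(name)

    self.class.new(@klass, @call_chain + [{ name: name, args: args, block: block }])
  end

  def respond_to_missing?(name, include_private = false)
    @klass.method_defined?(name) || super
  end

  # Replay the stored chain, in order, against a real object.
  def call(record)
    @call_chain.reduce(record) do |receiver, message|
      receiver.public_send(message[:name], *message[:args], &message[:block])
    end
  end
end

proxy = StrictProxy.new(String).strip.upcase
proxy.call('  moomin ') # => "MOOMIN"

begin
  StrictProxy.new(String).stripp # the typo fails here, before any record exists
rescue NoMethodError => e
  e.class # => NoMethodError
end
```

The check is only sound because each method in the chain returns the same class, which is the property method chaining on ValueArray is described as having.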
