#206 new
bahuvrihi

Schema Cleanup

Reported by bahuvrihi | May 3rd, 2009 @ 09:13 AM

The build method on schema can be cleaned up significantly.

  • consider extracting errors into a variable
  • methods to instantiate a node/join
  • long debug where the node/join metadata is printed

In the future (possibly):

  • checking of join signatures

Comments and changes to this ticket

  • bahuvrihi

    bahuvrihi July 5th, 2009 @ 11:59 PM

    A schema is a hash of (identifier, spec) pairs used to instantiate resources. Each spec is an array consisting of a type, a key, zero or more arrays of references to other specs, and finally an argument hash.

    [type, key, [refs], ... , {argh}]
    

    The type and key are used to look up an initializer within an environment and are both required. The type is additionally used to organize specs in some circumstances. The reference arrays and argument hash are used to initialize a resource via the instantiate method:

    Resource.instantiate([refs], ... , {argh}, app)
    

    The app is provided by the schema. The reference arrays and argument hash are optional, but if reference arrays are provided, an argument hash must also be provided. A missing argument hash is inferred to be an empty hash. A spec with zero reference arrays and no arguments therefore results in the following call:

    Resource.instantiate({argh}, app)
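
The decomposition of a spec described above could be sketched roughly as follows. The helper name split_spec is hypothetical and not part of the ticket. Treating a nil final element as an empty argument hash is an assumption based on the bare `-` in the YAML example, and the ARGV form is not handled here.

```ruby
# Hypothetical helper: split a spec array into the parts described above.
# Returns [type, key, refs, argh], where refs is an array of reference arrays.
def split_spec(spec)
  type, key, *rest = spec
  if type.nil? || key.nil?
    raise ArgumentError, "spec requires a type and a key: #{spec.inspect}"
  end

  # The final element is the argument hash. When nothing follows the key
  # it is inferred to be empty. Assumption: nil also means empty.
  argh = rest.empty? ? {} : rest.pop
  argh = {} if argh.nil?

  unless argh.is_a?(Hash)
    raise ArgumentError, "expected an argument hash: #{argh.inspect}"
  end
  unless rest.all? { |ref| ref.is_a?(Array) }
    raise ArgumentError, "references must be arrays: #{rest.inspect}"
  end

  [type, key, rest, argh]
end

split_spec(['task', 'load'])
# => ['task', 'load', [], {}]
split_spec(['join', 'sync', [0], [1], {'splat' => true}])
# => ['join', 'sync', [[0], [1]], {'splat' => true}]
```

A spec such as ['task', 'load', [0]] raises in this sketch, matching the rule that reference arrays require an argument hash.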
    

    Resources should have a to_args method that transforms an instance back into an argument array amenable to instantiate. A spec may therefore be viewed as a list of arguments to instantiate, plus a type and key that are handled by the schema.

    resource.to_args # => [[refs], ... , {argh}]
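
As a minimal sketch of the round trip between instantiate and to_args, assuming a resource with two reference arrays: the class ExampleJoin and its attribute names are hypothetical, chosen only for illustration.

```ruby
# Hypothetical resource; the class name and attributes are illustrative only.
class ExampleJoin
  attr_reader :inputs, :outputs, :config, :app

  # Mirrors the signature Resource.instantiate([refs], ... , {argh}, app).
  def self.instantiate(inputs, outputs, argh, app)
    new(inputs, outputs, argh, app)
  end

  def initialize(inputs, outputs, config, app)
    @inputs = inputs
    @outputs = outputs
    @config = config
    @app = app
  end

  # Returns the reference arrays followed by the argument hash. The app is
  # left out because the schema supplies it.
  def to_args
    [inputs, outputs, config]
  end
end

app = Object.new  # stand-in for the application
join = ExampleJoin.instantiate([0], [1], {'splat' => true}, app)
copy = ExampleJoin.instantiate(*join.to_args, app)

copy.to_args == join.to_args  # => true

# Prepending the type and key yields a complete spec.
spec = ['join', 'sync', *join.to_args]
```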
    

    Example:

    in:
    - task
    - load
    out:
    - task
    - dump
    sequence:
    - join
    - join
    - - in
    - - out
    -
    data:
    - queue
    - enq
    - - in
    - - hello world
    

    ARGV

    Schemas allow an argument vector to be provided instead of an argument hash. In that case, the arguments are passed to Resource.parse! instead of Resource.instantiate. This mode is specifically designed for handling command line arguments, and parse! should be designed accordingly. The to_args method should always return the argument hash form, not the argument vector form.
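
One way the two modes might be dispatched is sketched below. Everything here is an assumption made for illustration: the ExampleTask class, the parse!(argv, app) signature, the "--key value" option style, and the build_resource helper that picks an entry point from the class of the arguments. The ticket does not say how the schema tells an argument vector apart from a reference array.

```ruby
# Hypothetical resource showing both entry points.
class ExampleTask
  attr_reader :config, :app

  def self.instantiate(argh, app)
    new(argh, app)
  end

  # Consumes leading "--key value" pairs from argv, removing them in place
  # (hence the bang).
  def self.parse!(argv, app)
    config = {}
    while argv.first.to_s.start_with?('--')
      key = argv.shift.sub(/\A--/, '')
      config[key] = argv.shift
    end
    new(config, app)
  end

  def initialize(config, app)
    @config = config
    @app = app
  end

  # Always the argument hash form, never the argument vector form.
  def to_args
    [config]
  end
end

# Hypothetical dispatch: a hash goes to instantiate, an array to parse!.
def build_resource(klass, args, app)
  case args
  when Hash  then klass.instantiate(args, app)
  when Array then klass.parse!(args, app)
  else raise ArgumentError, "expected a hash or an argument vector: #{args.inspect}"
  end
end

app = Object.new  # stand-in for the application
from_hash = build_resource(ExampleTask, {'key' => 'value'}, app)
from_argv = build_resource(ExampleTask, ['--key', 'value'], app)
from_argv.to_args == from_hash.to_args  # => true
```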

    Hash format

    Specs can be specified in a hash format that translates into the standard array format. In the hash format, the type and key values are keyed by 'type' and 'key' respectively. Arguments are identified by their index in the args array.

    Likewise, a schema may be presented in an array format, where the index of each spec functions as its identifier.

    For instance, these denote the same resource:

    - - join
      - sync
      - [0]
      - [1]
      - splat: true
    
    0:
      type: join
      key: sync
      0: [0]
      1: [1]
      2:
        splat: true
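
The translation between the two formats could look something like the sketch below. The helper names (spec_to_array, schema_to_hash, normalize_schema) are hypothetical, and the sketch assumes argument indexes are integer keys, as they would be when the YAML above is loaded.

```ruby
# Hypothetical converters between the hash and array formats.

# {'type' => t, 'key' => k, 0 => a0, 1 => a1, ...} becomes [t, k, a0, a1, ...]
def spec_to_array(spec)
  return spec if spec.is_a?(Array)

  indexes = spec.keys.select { |k| k.is_a?(Integer) }.sort
  [spec.fetch('type'), spec.fetch('key'), *indexes.map { |i| spec[i] }]
end

# An array schema becomes a hash keyed by the index of each spec.
def schema_to_hash(schema)
  return schema if schema.is_a?(Hash)

  schema.each_with_index.map { |spec, index| [index, spec] }.to_h
end

# Reduce any schema to a hash of array-format specs.
def normalize_schema(schema)
  schema_to_hash(schema).transform_values { |spec| spec_to_array(spec) }
end

array_form = [['join', 'sync', [0], [1], {'splat' => true}]]
hash_form = {
  0 => {'type' => 'join', 'key' => 'sync', 0 => [0], 1 => [1], 2 => {'splat' => true}}
}

normalize_schema(array_form) == normalize_schema(hash_form)  # => true
```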
    

    Merges

    Schemas may be merged as long as there are no identifier conflicts. In this manner, parts of a schema may be defined in separate files and then merged.
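
A merge that refuses identifier conflicts could be sketched with the block form of Hash#merge. The merge_schemas helper and the IdentifierConflict error class are hypothetical names. The sketch treats any repeated identifier as a conflict, even when both specs are identical.

```ruby
# Hypothetical error raised when two schemas define the same identifier.
class IdentifierConflict < StandardError; end

# Merge any number of hash-format schemas into one new hash.
def merge_schemas(*schemas)
  schemas.reduce({}) do |merged, schema|
    # The block only runs for keys present on both sides.
    merged.merge(schema) do |identifier, _old_spec, _new_spec|
      raise IdentifierConflict, "identifier defined more than once: #{identifier.inspect}"
    end
  end
end

tasks = {'in' => ['task', 'load'], 'out' => ['task', 'dump']}
joins = {'sequence' => ['join', 'join', ['in'], ['out'], {}]}

merge_schemas(tasks, joins).keys  # => ['in', 'out', 'sequence']
```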

    Strings

    Specs that are strings are expected to be a URI referring to a valid YAML schema. The URI will be opened with open (from open-uri), and the result should load via YAML into a valid, non-string spec. String specs are only opened once; if the result is not a valid spec, an error is raised.
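
Resolution of string specs might be sketched as below. The resolve_spec helper is hypothetical, and the opener is passed in as a callable so the sketch runs without network access. In current Ruby the open-uri entry point is URI.open rather than the bare open mentioned above. Treating "valid spec" as "loads into an array or hash" is an assumption.

```ruby
require 'yaml'

# Hypothetical resolver. The opener is any callable that takes a location
# and returns the document text. With open-uri it could be:
#   ->(uri) { URI.open(uri).read }
def resolve_spec(spec, opener)
  return spec unless spec.is_a?(String)

  loaded = YAML.safe_load(opener.call(spec))

  # The location is opened only once: a result that is still a string,
  # or is not a spec at all, is an error rather than another lookup.
  unless loaded.is_a?(Array) || loaded.is_a?(Hash)
    raise ArgumentError, "#{spec} did not load into a valid spec: #{loaded.inspect}"
  end

  loaded
end

# Stub opener standing in for a remote document.
documents = {'http://example.com/load.yml' => "- task\n- load\n"}
opener = ->(uri) { documents.fetch(uri) }

resolve_spec('http://example.com/load.yml', opener)  # => ['task', 'load']
resolve_spec(['task', 'dump'], opener)               # => ['task', 'dump']
```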

  • bahuvrihi

    bahuvrihi July 7th, 2009 @ 11:15 PM

    List Model

    Kind of nice in that it clearly delineates the workflow from the queue, which are distinct.

      • Obvious method for merging schemas, referencing external resources, etc.
      • No mixup between resources like data (referenced by inputs in the queue, for instance) and workflow elements.
      • Control still in the hands of the resource for references.
      • Method to NOT persist workflow resources, but just build them.
      • Obvious way to rename.
      • Leave it to the user to ensure the workflow is correctly ordered.
      • Structured and consistent. Should parse easily.
      • Can detect name collisions.
      • Can detect disorder in spec sequence.
      • Can detect missing refs, with name and type (at least to the correct array index, or the full name in JSON).

    From JSON:

    _id:
    _rev:
    workflow:
    -                     # inline resource schema, amenable to rename
      _id:
      _rev:
      _var: 0
      _type: task
      _class: load
    -                     # _var caches resource, optional
      _id:
      _rev:
      _var: 1
      _type: task
      _class: dump
    -                     # refs are inferred as array values
      _id:
      _rev:
      _type: join
      _class: join
      inputs: [0]
      outputs: [1]
    - uri                 # opens, loads as JSON...
                          # any fields (_var, etc)
                          # are accepted directly
    
    - _id:                # manually specified refs
      _rev:
      _type: type
      _class: arbitrary
      _refs: [refs]
      refs: [0]
      key: value
    queue:
    -                     # relative names, allows queues to be
      _id:                # distinct resources, and reused
      _rev:
      task: 0
      data: [a, b, c]
    - uri
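
The detection of name collisions, disorder, and missing refs mentioned for the list model could be sketched against workflow entries shaped like those above. The check_workflow helper is hypothetical. It assumes refs are the array values of keys that do not start with an underscore, unless _refs names the keys explicitly, and it skips uri string entries.

```ruby
# Hypothetical validator for a workflow list in the form above.
# Returns an array of error messages, empty when nothing is wrong.
def check_workflow(workflow)
  errors = []
  entries = workflow.each_with_index.select { |entry, _index| entry.is_a?(Hash) }

  # Pass 1: record where each _var is defined, flagging collisions.
  defined_at = {}
  entries.each do |entry, index|
    var = entry['_var']
    next if var.nil?

    if defined_at.key?(var)
      errors << "workflow[#{index}]: _var #{var.inspect} already used by workflow[#{defined_at[var]}]"
    else
      defined_at[var] = index
    end
  end

  # Pass 2: check every reference against the recorded definitions.
  entries.each do |entry, index|
    ref_keys = entry['_refs'] || entry.keys.select do |key|
      !key.to_s.start_with?('_') && entry[key].is_a?(Array)
    end

    ref_keys.each do |key|
      Array(entry[key]).each do |ref|
        if !defined_at.key?(ref)
          errors << "workflow[#{index}].#{key}: missing ref #{ref.inspect}"
        elsif defined_at[ref] > index
          errors << "workflow[#{index}].#{key}: ref #{ref.inspect} is defined later, at workflow[#{defined_at[ref]}]"
        end
      end
    end
  end

  errors
end

workflow = [
  {'_var' => 0, '_type' => 'task', '_class' => 'load'},
  {'_var' => 1, '_type' => 'task', '_class' => 'dump'},
  {'_type' => 'join', '_class' => 'join', 'inputs' => [0], 'outputs' => [1]}
]
check_workflow(workflow)  # => []
```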
    

    From parse:

    workflow:
    - - 0       # -- load
      - task
      - load
    - - 1       # -- dump
      - task
      - dump
    - -         # --[0][1]
      - join
      - join
      - - 0
      - - 1
    - -         # --/debugger
      - middleware
      - debugger
    - - 2       # --. type arbitrary -[0]- arg
      - type
      - arbitrary
      - - 0
      - arg
    queue:      # --@ 0 a b c
    - - 0
      - a
      - b
      - c
    

    Entries map like:

    -
      _var:
      _type:
      _class:
      config:
    
    # schema
    var = _class.instantiate({:config => {}}, app)
    
    -
      task: 0
      inputs: [a, b, c]
    
    # queue
    app.enq(cache[0], inputs)
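
Putting the two mappings together, a build step might look roughly like the sketch below. ExampleApp, ExampleResource, build_schema, and the classes lookup table are all hypothetical stand-ins. The sketch follows the entry mapping shown here, where queue entries carry their data under inputs.

```ruby
# Stand-ins for illustration only; none of these classes come from the ticket.
class ExampleApp
  attr_reader :queue

  def initialize
    @queue = []
  end

  def enq(task, inputs)
    @queue << [task, inputs]
  end
end

class ExampleResource
  attr_reader :config, :app

  def self.instantiate(argh, app)
    new(argh[:config], app)
  end

  def initialize(config, app)
    @config = config
    @app = app
  end
end

# Instantiate each workflow entry, cache the ones that declare a _var,
# then enqueue each queue entry against its cached task.
def build_schema(schema, classes, app)
  cache = {}

  schema['workflow'].each do |entry|
    klass = classes.fetch([entry['_type'], entry['_class']])
    resource = klass.instantiate({:config => entry['config'] || {}}, app)
    cache[entry['_var']] = resource unless entry['_var'].nil?
  end

  schema['queue'].each do |entry|
    app.enq(cache.fetch(entry['task']), entry['inputs'])
  end

  cache
end

schema = {
  'workflow' => [{'_var' => 0, '_type' => 'task', '_class' => 'load', 'config' => {}}],
  'queue' => [{'task' => 0, 'inputs' => ['a', 'b', 'c']}]
}
classes = {['task', 'load'] => ExampleResource}
app = ExampleApp.new
cache = build_schema(schema, classes, app)

app.queue == [[cache[0], ['a', 'b', 'c']]]  # => true
```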
    

A framework for making configurable, file-based tasks and workflows.
