Dev notes - Hydra camp

Questions and practicalities

(temporarily moved to google docs)

Uncategorized notes

Reindex everything into Solr from Fedora!
- ActiveFedora::Base.reindex_everything
opaquenamespaces: a community registry / namespace for RDF properties. Probably best practice to try to put locally-required properties there or somewhere similar. This was started by Karen and Tom at U Oregon.
Page numbers: can use this as literal sequence and put marked pagenumbers in page label property.
Book.where(title: "hat") # note: this returns an array-like collection, not a single Book object (the way find does); call .first to get one record
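A plain-Ruby sketch of the find / where difference noted above. Book and BookStore here are hypothetical stand-ins, not ActiveFedora classes; they only mimic the return shapes (one object vs. a collection).

```ruby
# Stand-in model: NOT ActiveFedora, it only mimics the return shapes.
Book = Struct.new(:id, :title)

class BookStore
  def initialize(books)
    @books = books
  end

  # Like find: a single object looked up by id.
  def find(id)
    @books.find { |book| book.id == id }
  end

  # Like where: always a collection, even when only one record matches.
  def where(conditions)
    @books.select do |book|
      conditions.all? { |field, value| book[field] == value }
    end
  end
end

store = BookStore.new([Book.new("b1", "hat"), Book.new("b2", "coat")])

matches = store.where(title: "hat")
puts matches.length          # a collection, so count or iterate it
puts matches.first.title     # take .first to get a single book
puts store.find("b1").title  # find hands back the book directly
```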
re: IDs. consider using the default id as an internal identifier and another field as a local id for human use.
one challenge of hosting on amazon is moving large files around. sufia uses ffmpeg to convert them to playable proxies. it also generates derivatives for anything that can be handled by imagemagick or openoffice / libreoffice (e.g. a thumbnail from the first page of the doc)
Currently recommending different heads for different types of collections. use a shared gem for the data models. Then create a separate admin head for managing them all in one place.
DCE also recommends keeping number of servers as low as possible until metrics indicate you should make changes (and can measure those changes on desired vectors)
Dspace, libguides, digital library publishing (drupal, node apps), archive-it – NYU harvests into hydra (ichabod) using Internet Archive API. metadata of record is in another system, but can be supplemented in the admin backend. Also have batch-loaded enrichment data.
- EAD 'kitchen sink' fixture! https://github.com/NYULibraries/findingaids/blob/development/spec/fixtures/examples/EAD_Tracer.xml
- the findingaids app they have is similar to stanford's arclight.
- http://summit2015.lodlam.net/about/
fedora 4 implements w3c standard for access controls
REST vs. CRUD
- GET, POST, PUT/PATCH, DELETE (ActiveFedora, HTTP)
- Read, Create, Update, Delete - ActiveRecord (RDBMS)
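The verb pairing above can be written out as a small lookup. This is only an illustrative sketch; REST_TO_CRUD and crud_for are made-up names, not part of ActiveFedora or Rails.

```ruby
# HTTP verbs (REST) paired with the CRUD operation each one usually performs.
REST_TO_CRUD = {
  "GET"    => :read,
  "POST"   => :create,
  "PUT"    => :update,
  "PATCH"  => :update,
  "DELETE" => :delete
}.freeze

# Look up the CRUD operation for a verb, ignoring case.
def crud_for(http_verb)
  REST_TO_CRUD.fetch(http_verb.to_s.upcase) do
    raise ArgumentError, "no CRUD equivalent noted for #{http_verb}"
  end
end

puts crud_for("patch")
puts crud_for(:get)
```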
ORM - object relational mapper - the code that interfaces with the actual database
LDP is a way to use REST to talk about objects with containment relationships
Gemfile.lock shows the expansion of the gemfile
If developing on core, replace hydra gem with the gemspec contents of same if you want to mess with changing versions of those dependencies. (or if you want to manage / change these versions manually)
look at other people's .gitignore files for rails and sufia projects
What should our ID be? sufia uses NOIDs. note: fedora has a concept of different minters so this may be a factor here as well. NOID translates to fedora as pair paths, but fedora doesn't actually store it that way. so why did sufia do it this way?
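A rough sketch of what "pair paths" means for a NOID. The rule assumed here (first four two-character pairs, then the full id) and the sample id are assumptions for illustration; check the sufia / active_fedora-noid source for the actual behavior.

```ruby
# Expand an id into a pair-path: up to four two-character segments,
# followed by the full id as the final path segment.
# ASSUMPTION: the four-pair depth is illustrative, not confirmed from sufia.
def pair_path(id)
  (id.scan(/..?/).first(4) + [id]).join("/")
end

puts pair_path("xs55mc046") # hypothetical NOID
```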
characterization object contains xml. you could take each of those values and store them as properties.
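One way to picture pulling characterization values out into properties, using REXML. The XML below is a made-up, simplified stand-in (real characterization output is richer and namespaced), and the properties hash stands in for whatever properties the model would define.

```ruby
require "rexml/document"

# HYPOTHETICAL characterization XML, simplified for illustration.
xml = <<~XML
  <characterization>
    <format>image/tiff</format>
    <width>4000</width>
    <height>3000</height>
  </characterization>
XML

doc = REXML::Document.new(xml)

# Copy each child element's text into a hash keyed by element name.
properties = {}
doc.root.each_element do |element|
  properties[element.name.to_sym] = element.text
end

properties.each { |name, value| puts "#{name}: #{value}" }
```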

Deployment

travis-ci.org/curationexperts/alexandria-v2/builds
https://travis-ci.org/projecthydra/sufia
capistrano: deployment manager. redundant with a ci workflow?
bambu - stanford's environment management solution.
PRODUCTION setup
- tomcat, solr, fedora replace hydra-jetty / jetty wrapper
- postgres as opposed to sqlite
- WATCH RAILS VERSIONS between multithreading and databases
staging
- may be as much like production as possible. may have less CPU power, less memory, smaller HD. May be an exact clone. Also take into account how much time / effort this may require.
ditch testunit and install rspec. spec directory:
- spec
  - fixtures
    - pbcore
      - artesia
        joyce_chen
        image_1.xml
      - mars (filemaker database)
        audio_1.xml
        image_1.xml
look at fixtures vs. factories
- fixtures have to be maintained.
- factories behave the way you tell them to behave; sometimes you need to put in real data.
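A plain-Ruby sketch of the fixtures vs. factories trade-off. Record, load_fixture and build_record are hypothetical names, and the field values are invented; in a real project the fixture would live in a file under spec/fixtures and the factory would usually come from a factory library.

```ruby
require "json"

Record = Struct.new(:title, :media_type, :source)

# Fixture: fixed data (inlined here instead of read from disk) that has to be
# updated by hand whenever the model changes.
FIXTURE_JSON = '{"title": "Image 1", "media_type": "image", "source": "artesia"}'

def load_fixture(json)
  data = JSON.parse(json)
  Record.new(data["title"], data["media_type"], data["source"])
end

# Factory: builds a record from defaults and overrides only what a test
# cares about, so it behaves exactly the way you tell it to.
def build_record(overrides = {})
  defaults = { title: "Untitled", media_type: "image", source: "mars" }
  attrs = defaults.merge(overrides)
  Record.new(attrs[:title], attrs[:media_type], attrs[:source])
end

from_fixture = load_fixture(FIXTURE_JSON)
from_factory = build_record(media_type: "audio")

puts from_fixture.title
puts from_factory.media_type
```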
Sandi Metz RailsConf 2013 presentation video
github.com/afred/openvault - look at factories here.
github.com/projecthydra-labs/hydradam
github.com/WGBH/pbucore
- huge xslt stylesheet to convert XML into RDF-XML, which they will then use to load into fedora4.
Amazon Ops products: elastic beanstalk, opsworks

Testing

rspec only? selenium with rspec?
or capybara with cucumber?

Example fedora instances:

scholarsphere
dl.tufts.edu - tufts digital library - put a hydra head on top of existing fedora repo. awesome transcription / TEI w/ embedded timecode / audio player
levysheetmusic - changes / customizations to interface
hullhistorycentre.org.uk - hull city archives - example of EADs. (nice search box page!)
hydra.hull.ac.uk - has a backend with workflow stuff. would likely be happy to give a short demo. (also note interesting icons)
alexandria digital research
spotlight (stanford) - library.stanford.edu/projects/spotlight - for exhibit building. - note: blacklight gallery gem gives you different views of results lists.
another gem: date slider
digital.case.edu (built on worthwhile, rdf-driven) - open seadragon + iiif-compliant server for amazing image viewing. view metadata / different formats.
dl.tufts.edu - MIRA (management of institutional repository assets). more workflow-type, controlled deposit.
http://demo.curationexperts.com/
WGBH - digitize on-demand. Metadata is published and there's a button.
HydraDAM (replaced Artesia at WGBH)