Current summary of this group's work
Members of the group are not engaged in a coordinated effort to produce anything in particular, on any specific timeline. Individuals are working in various directions and advising / consulting one another.
Area: Linked Data Fragments
- purpose: Caching remote data for local use
- Current plan is to use Linked Data Fragments (LDF)
- and, within LDF, triple pattern fragments: http://linkeddatafragments.org/in-depth/#tpf
- lightweight querying, not the whole SPARQL interface.
- Work is begun in ActiveTriples: https://github.com/ActiveTriples/linked-data-fragments/
 - You send it a subject and it gives you back everything about that subject. Caching happens in the back end. The implementation does this with switchable back ends, currently working best with Marmotta.
- "A Hydra Ontology endpoint that caches resources from remote dataset and returns: { <subject> ?predicate ?object . }" (2015-05-28)
Area: Indexing / Solr
- If an entry changes, it will need to be re-indexed in every Solr document containing that entry.
- Therefore, we must use atomic updates.
- Therefore, all fields in Solr must be stored.
- Therefore, we need a way to exclude full-text fields from document retrieval, or retrieval will be unacceptably slow.
- This does not currently exist in Solr, but the group is in contact with a Solr developer who is willing to help.
- https://issues.apache.org/jira/browse/SOLR-3191
- Note some discussion of how to use this once it's ready on 2015-12-10
- Note this is not a concern if you don't have any full-text data
- Related: 'side-car indexer' described as 'Option 3' on 2015-04-29.
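For illustration, here is the shape of a Solr atomic update request body (the `set` modifier replaces a single field's value in place). The document id and field name are assumptions for the example, not from any real schema:

```python
import json

def atomic_update(doc_id, field, new_value):
    """Build the JSON body for POST /solr/<core>/update.

    The {"set": value} modifier tells Solr to update just this field;
    Solr rebuilds the rest of the document from its stored fields,
    which is why atomic updates require all fields to be stored.
    """
    return json.dumps([{
        "id": doc_id,
        field: {"set": new_value},
    }])

# Hypothetical example: a linked-data label changed upstream.
body = atomic_update("record-1", "subject_label_ssim", "Portland (Or.)")
print(body)
```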
Area: Making assertions about external triples
- Members of this group are very early in explorations of this work.
- Want to make a local correction or add additional info / context to a triple owned externally
- Implies that we need the external resource to be versioned, which necessitates caching.
- Also want to record author of the assertion
- Can be done with either named graphs or reification.
- Also discussed on 2015-08-06 as 'meta-authority records'
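The two approaches mentioned above can be sketched with plain tuples (no RDF library); all URIs here are illustrative, not from any real dataset:

```python
# Namespaces used in the sketch.
EX = "http://example.org/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCT = "http://purl.org/dc/terms/"

# The externally owned triple we want to annotate/correct.
s, p, o = (EX + "book1", DCT + "subject", EX + "someConcept")

# Option 1: RDF reification -- mint a statement node, describe it,
# then attach local metadata (here, who made the assertion).
stmt = EX + "assertion/1"
reified = [
    (stmt, RDF + "type", RDF + "Statement"),
    (stmt, RDF + "subject", s),
    (stmt, RDF + "predicate", p),
    (stmt, RDF + "object", o),
    (stmt, DCT + "creator", EX + "people/someAuthor"),
]

# Option 2: named graphs -- put the triple in its own graph (one
# N-Quads line) and make assertions about the graph instead.
graph = EX + "graph/1"
quad = "<%s> <%s> <%s> <%s> ." % (s, p, o, graph)
print(quad)
```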
Area: Generally sharing projects and helping one another
- Working around usage limits from LC (2015-01-07)
- How to link the same concept in two vocabularies (2015-01-07)
- Metadata enrichment initiative: https://github.com/boston-library/mei (2015-12-10, 2015-11-12, 2015-07-23)
- Minting Linked Data URIs / Hosting Linked Data Vocabulary: https://github.com/OregonDigital/ControlledVocabularyManager (2015-11-12)
Personal notes from 2016-01-07 meeting
* Linked data coding work is happening in ActiveTriples.
* Goal of this group: a local cache of linked data in use by the application.
* LDF spec: the implementation is triple pattern fragments, i.e. lightweight querying, not the whole SPARQL interface.
* This WG has a similar implementation: you send it a subject and it gives you back everything about that subject. Caching happens in the back end, with switchable back ends; the lead back end is Marmotta.
* Other Hydra community members' endpoint work (Hydra ontology).
Forthcoming:
Also look at the descriptive metadata group.
* creation of properties / ontologies for vocabularies
* creation of custom vocabularies.
ControlledVocabularyManager: a repo that powers OpaqueNamespace; see notes from the last two meetings. (https://github.com/OregonDigital/ControlledVocabularyManager)
Sanderson has custom proof-of-concept code using Marmotta.
No one in this group currently has much bandwidth for this work; mostly sharing ideas.
Corey offers to read a grant proposal.
Other standing item:
Use case: everything as a stored field in Solr so we can do atomic updates.
OCR text complicates this because it is so large that you don't want it returned every time you request the document. There is a patch to exclude a field from retrieval, but it is currently broken.
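Until field exclusion lands, one workaround is to whitelist the fields you actually want with Solr's `fl` parameter so the stored OCR text is never returned. A minimal sketch; the core name and field names are assumptions:

```python
from urllib.parse import urlencode

def select_url(query, fields):
    """Build a Solr select URL that returns only the listed fields,
    leaving large stored fields (e.g. OCR full text) out of the response."""
    params = {"q": query, "fl": ",".join(fields)}
    return "http://localhost:8983/solr/core/select?" + urlencode(params)

# Request only the small descriptive fields, not the OCR text.
url = select_url("title:maps", ["id", "title_tesim", "subject_ssim"])
print(url)
```

The limitation, as discussed above, is that `fl` requires naming every wanted field; the SOLR-3191 work would allow excluding a field instead.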