Current summary of this group's work
Members of the group are not engaged in a coordinated effort to produce anything in particular, on any specific timeline. Individuals are working in various directions and advising / consulting one another.
Area: Linked Data Fragments
- purpose: Caching remote data for local use
- Current plan is to use Linked Data Fragments (LDF)
- and, within LDF, triple pattern fragments: http://linkeddatafragments.org/in-depth/#tpf
- lightweight querying, not the whole SPARQL interface.
- Work is begun in ActiveTriples: https://github.com/ActiveTriples/linked-data-fragments/
 - You send it a subject and it gives you back everything about that subject. Caching happens in the back end. The implementation does this with switchable back ends, currently working best with Marmotta.
- "A Hydra Ontology endpoint that caches resources from remote dataset and returns: { <subject> ?predicate ?object . }" (2015-05-28)
Area: Indexing / Solr
- If an entry changes, it will need to be re-indexed in every Solr document containing that entry.
- Therefore, we must use atomic updates.
- Therefore, all fields in Solr must be stored.
- Therefore, we need a way to exclude full-text fields from document retrieval, or retrieval will be unacceptably slow.
- This does not currently exist in Solr, but the group is in contact with a Solr developer who is willing to help.
- https://issues.apache.org/jira/browse/SOLR-3191
- Note some discussion of how to use this once it's ready on 2015-12-10
- Note this is not a concern if you don't have any full-text data
- Related: 'side-car indexer' described as 'Option 3' on 2015-04-29.
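For illustration, here is the shape of a Solr atomic update request body (the `set` modifier replaces a single field's value in place). The document id and field name are assumptions for the example, not from any real schema:

```python
import json

def atomic_update(doc_id, field, new_value):
    """Build the JSON body for POST /solr/<core>/update.

    The {"set": value} modifier tells Solr to update just this field;
    Solr rebuilds the rest of the document from its stored fields,
    which is why atomic updates require all fields to be stored.
    """
    return json.dumps([{
        "id": doc_id,
        field: {"set": new_value},
    }])

# Hypothetical example: a linked-data label changed upstream.
body = atomic_update("record-1", "subject_label_ssim", "Portland (Or.)")
print(body)
```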
Area: Making assertions about external triples
- Members of this group are very early in explorations of this work.
- Want to make a local correction or add additional info / context to a triple owned externally
- Implies that we need the external resource to be versioned, which necessitates caching.
- Also want to record author of the assertion
- Can be done with either named graphs or reification.
- Also discussed on 2015-08-06 as 'meta-authority records'
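The two approaches mentioned above can be sketched with plain tuples (no RDF library); all URIs here are illustrative, not from any real dataset:

```python
# Namespaces used in the sketch.
EX = "http://example.org/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCT = "http://purl.org/dc/terms/"

# The externally owned triple we want to annotate/correct.
s, p, o = (EX + "book1", DCT + "subject", EX + "someConcept")

# Option 1: RDF reification -- mint a statement node, describe it,
# then attach local metadata (here, who made the assertion).
stmt = EX + "assertion/1"
reified = [
    (stmt, RDF + "type", RDF + "Statement"),
    (stmt, RDF + "subject", s),
    (stmt, RDF + "predicate", p),
    (stmt, RDF + "object", o),
    (stmt, DCT + "creator", EX + "people/someAuthor"),
]

# Option 2: named graphs -- put the triple in its own graph (one
# N-Quads line) and make assertions about the graph instead.
graph = EX + "graph/1"
quad = "<%s> <%s> <%s> <%s> ." % (s, p, o, graph)
print(quad)
```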
Area: Generally sharing projects and helping one another
- Working around usage limits from LC (2015-01-07)
- How to link the same concept in two vocabularies (2015-01-07)
- Metadata enrichment initiative: https://github.com/boston-library/mei (2015-12-10, 2015-11-12, 2015-07-23)
- Minting Linked Data URIs / Hosting Linked Data Vocabulary: https://github.com/OregonDigital/ControlledVocabularyManager (2015-11-12)
Personal notes from 2016-01-07 meeting
* Linked data coding work is happening in ActiveTriples.
* Goal of this group: a local cache of linked data in use by the application.
* LDF spec: the implementation is triple pattern fragments, i.e. lightweight querying, not the whole SPARQL interface.
* This WG has a similar implementation: you send it a subject and it gives you back everything about that subject. Caching happens in the back end, with switchable back ends; the lead back end is Marmotta.
* Other Hydra community members' endpoint work (Hydra ontology).
Forthcoming:
Also look at the descriptive metadata group.
* creation of properties / ontologies for vocabularies
* creation of custom vocabularies.
ControlledVocabularyManager: a repo that powers OpaqueNamespace; see notes from the last two meetings. (https://github.com/OregonDigital/ControlledVocabularyManager)
Sanderson has custom proof-of-concept code using Marmotta.
No one in this group currently has much bandwidth for this work; mostly sharing ideas.
Corey offers to read a grant proposal.
Other standing item:
Use case: everything as a stored field in Solr so we can do atomic updates.
OCR text complicates this because it is so large that you don't want it returned every time you request the document. There is a patch to exclude a field from retrieval, but it is currently broken.
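Until field exclusion lands, one workaround is to whitelist the fields you actually want with Solr's `fl` parameter so the stored OCR text is never returned. A minimal sketch; the core name and field names are assumptions:

```python
from urllib.parse import urlencode

def select_url(query, fields):
    """Build a Solr select URL that returns only the listed fields,
    leaving large stored fields (e.g. OCR full text) out of the response."""
    params = {"q": query, "fl": ",".join(fields)}
    return "http://localhost:8983/solr/core/select?" + urlencode(params)

# Request only the small descriptive fields, not the OCR text.
url = select_url("title:maps", ["id", "title_tesim", "subject_ssim"])
print(url)
```

The limitation, as discussed above, is that `fl` requires naming every wanted field; the SOLR-3191 work would allow excluding a field instead.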