Preparing RDF Data for Loading into EDG

If you already have data in RDF format, this section of the guide will help you load it into EDG. 

EDG separates ontologies (asset collections that define schema using classes, properties and shapes) from other types of asset collections, which contain “instance data” described by the ontologies. RDF files you load into EDG should follow this separation. It is especially important that files with instance data do not contain schema definitions. If they do, you would need to load them into EDG ontologies and then, in EDG, move the instance data into another type of collection. Depending on the size and complexity of your data, it may be simpler to process the files before loading to enforce this separation.
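
One way to process files before loading is sketched below. This is a minimal illustration in Python, not an EDG feature. It assumes the triples have already been parsed (for example, with an RDF library) into (subject, predicate, object) tuples of prefixed names. The SCHEMA_TYPES set and the function name split_schema_and_instances are hypothetical choices made for this example: a resource counts as schema if it is typed with one of the listed classes, and everything else is treated as instance data.

```python
RDF_TYPE = "rdf:type"

# Assumption: a subject typed with any of these is a schema definition.
SCHEMA_TYPES = {
    "owl:Class", "rdfs:Class", "rdf:Property",
    "owl:ObjectProperty", "owl:DatatypeProperty",
    "sh:NodeShape", "sh:PropertyShape",
}

def split_schema_and_instances(triples):
    """Return (schema, instances), preserving the input order of triples."""
    schema_subjects = {
        s for s, p, o in triples if p == RDF_TYPE and o in SCHEMA_TYPES
    }
    schema = [t for t in triples if t[0] in schema_subjects]
    instances = [t for t in triples if t[0] not in schema_subjects]
    return schema, instances

triples = [
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:Person", "rdfs:label", '"Person"'),
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", '"Alice"'),
]
schema, instances = split_schema_and_instances(triples)
print(len(schema), "schema triples;", len(instances), "instance triples")
```

The schema list would then go into a file intended for an ontology, and the instances list into a file intended for another type of asset collection. Real data may need a broader set of schema types than the one assumed here.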

Prior to loading RDF data, you will need to decide how to organize it in EDG: how many asset collections you will have, what each one will contain, and so on.

You may decide to follow the partitioning approach implemented in your files, i.e., create a separate EDG asset collection for each file. Or you may want to combine some of the files into a single asset collection in EDG. If you do the former and your files reference each other using owl:imports statements, these statements will no longer work, since the base URI of an asset collection in EDG will be different from the base URI of the file you are loading. After you create an asset collection for each file, use Settings>External Graph URI to help EDG transform these owl:imports statements. This must be done prior to loading data.

If you decide to combine multiple files into a single asset collection in EDG, then prior to loading, remove any owl:imports statements that cross-reference the files you are combining.
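
The removal step can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not part of EDG: triples are represented as (subject, predicate, object) tuples, the graph URIs are made up, and drop_internal_imports is a hypothetical helper name. Only imports that point at one of the files being combined are dropped; imports of other graphs are left alone.

```python
OWL_IMPORTS = "owl:imports"

def drop_internal_imports(triples, combined_graphs):
    """Remove owl:imports triples whose target is one of the combined graphs."""
    return [
        (s, p, o)
        for s, p, o in triples
        if not (p == OWL_IMPORTS and o in combined_graphs)
    ]

# Hypothetical base URIs of the two files being merged into one collection.
combined = {"http://example.org/customers", "http://example.org/orders"}

triples = [
    ("http://example.org/orders", "owl:imports", "http://example.org/customers"),
    ("http://example.org/orders", "owl:imports", "http://example.org/core"),
    ("http://example.org/orders", "rdfs:label", '"Orders"'),
]
cleaned = drop_internal_imports(triples, combined)
print(len(triples) - len(cleaned), "cross-referencing import(s) removed")
```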

You may also keep some files as files in the EDG workspace, without physically loading their data into an asset collection. This may be an option if you will never change their content in EDG. In other words, for EDG purposes, their content is read-only. In this case, you can simply upload them into the EDG workspace in a project of their own. Then, create an asset collection in EDG and use Settings>Includes to include these files by reference. Note that in this case, you will not be able to use Search the EDG to find resources in these files, nor will they be found by the global lookup. These features only work for data that is physically part of EDG asset collections.

You also need to make sure that the files you are using do not contain owl:imports statements that refer to graphs outside of EDG’s workspace. All imports must refer only to EDG asset collections or to files in the workspace. External imports will not be resolved, since dynamically loading data from an external server may time out and present a security risk.
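
A quick pre-upload check for external imports could look like the sketch below. It is a rough text scan written for illustration, not a Turtle parser and not an EDG tool. It assumes the file is in Turtle, that the predicate is written as the prefixed name owl:imports, and that import targets are full IRIs in angle brackets. The function name external_imports, the sample data, and the set of known graph URIs are all hypothetical.

```python
import re

# "owl:imports" followed by one or more comma-separated <IRI> objects.
IMPORTS_PATTERN = re.compile(r"owl:imports((?:\s*,?\s*<[^>]*>)+)")

def external_imports(turtle_text, known_graphs):
    """Return sorted import targets in the text that are not in known_graphs."""
    targets = set()
    for match in IMPORTS_PATTERN.finditer(turtle_text):
        targets.update(re.findall(r"<([^>]*)>", match.group(1)))
    return sorted(targets - set(known_graphs))

sample = """
<http://example.org/products> a owl:Ontology ;
    owl:imports <http://example.org/core> , <http://vendor.example.com/units> .
"""
# Graph URIs you expect to exist in the workspace (made up for this example).
known = {"http://example.org/core"}
print(external_imports(sample, known))
```

Any URI the function reports is an import to remove, or a graph to bring into the workspace first. For files in other serializations, or files that spell out the full owl:imports IRI, parsing with an RDF library would be more reliable than this pattern.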

The TriG import option makes it possible to create and load data into multiple asset collections by importing a single file. More information is available in the Importing TriG section of the User Guide.

The rest of this section provides instructions specific to the type of asset collection you will be loading your RDF data into.

If you have an RDF file that contains definitions for classes and properties, it should be loaded into an ontology in EDG. As described earlier in this section, other asset collections are expected to contain data, not schema. One exception is Enumerations, which are treated as ontologies.

As described in the Importing RDF section of the User Guide, EDG expects RDF that is imported into a Taxonomy to conform to the W3C SKOS standard. That section describes specific details about the classes and properties that EDG expects to see, such as skos:ConceptScheme and skos:hasTopConcept.

There are generally no special requirements on the classes, properties or instances that must be present in an RDF file to be imported into an EDG Ontology. The only exception is the presence of rdfs:subClassOf statements for classes, which are required in order for a class to appear in the Class Hierarchy panel. If the RDF file you are loading contains some classes that do not have a “parent” class, EDG can generate statements connecting them to either owl:Thing or rdfs:Resource.
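
To see which classes in a file lack a parent before you load it, something like the following sketch may help. It only illustrates the idea of connecting parentless classes to a root. It does not reproduce how EDG generates these statements. Triples are assumed to be (subject, predicate, object) tuples of prefixed names, and parent_statements_for_roots is a hypothetical helper name.

```python
CLASS_TYPES = {"owl:Class", "rdfs:Class"}

def parent_statements_for_roots(triples, root="owl:Thing"):
    """Return rdfs:subClassOf triples linking each parentless class to root."""
    classes = {s for s, p, o in triples if p == "rdf:type" and o in CLASS_TYPES}
    with_parent = {s for s, p, o in triples if p == "rdfs:subClassOf"}
    return [(c, "rdfs:subClassOf", root) for c in sorted(classes - with_parent)]

triples = [
    ("ex:Animal", "rdf:type", "owl:Class"),
    ("ex:Dog", "rdf:type", "owl:Class"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Animal"),
]
for statement in parent_statements_for_roots(triples):
    print(statement)
```

Passing root="rdfs:Resource" produces statements against the other root mentioned above.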

EDG Crosswalks, by default, must contain RDF triples of the following form:

example:Vocabulary1_Resource1 skos:closeMatch example:Vocabulary2_Resource1 .

You can use a different property to connect resources being crosswalked. To do so, please select an alternative property on the Manage tab of a crosswalk.

Valuable data for use in EDG asset collections may not be available in RDF. The Importing spreadsheet data section of the User Guide describes how to import spreadsheets with a choice of patterns. If your spreadsheet does not fit one of these patterns, you will need to transform it to align with a pattern. Alternatively, you can develop a custom importer. There are several options for developing custom importers, including, for example, writing a server-side script using Active Data Shapes (ADS). You can also create an ontology from spreadsheet data, using the header row as properties and the worksheet name as a class.
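
The header-row idea can be illustrated with a short Python sketch. It is not the EDG importer and not an ADS script, and the triples EDG actually produces may differ. It assumes the worksheet has been exported as CSV text, uses a made-up ex: prefix, and derives local names by replacing non-word characters with underscores. The function names local_name and ontology_from_sheet are hypothetical.

```python
import csv
import io
import re

def local_name(label):
    """Turn a label such as 'Unit Price' into 'Unit_Price'."""
    return re.sub(r"\W+", "_", label.strip()).strip("_")

def ontology_from_sheet(sheet_name, csv_text, prefix="ex:"):
    """Map the worksheet name to a class and each header cell to a property."""
    header = next(csv.reader(io.StringIO(csv_text)))
    cls = prefix + local_name(sheet_name)
    triples = [(cls, "rdf:type", "owl:Class")]
    for column in header:
        prop = prefix + local_name(column)
        triples.append((prop, "rdf:type", "rdf:Property"))
        triples.append((prop, "rdfs:domain", cls))
    return triples

sheet = "Product Name,Unit Price\nWidget,9.99\n"
for triple in ontology_from_sheet("Product", sheet):
    print(triple)
```

Only the header row is read here. The data rows would become instances of the class in a separate step and, following the separation described earlier in this section, would belong in a different asset collection than the ontology.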

If you are unsure which choices for developing custom importers would let you best take advantage of the data available to you, contact your EDG support representative.