Search the EDG

Search the EDG searches across all EDG resources that an EDG administrator and/or the manager of a specific collection has elected to include in the index. This choice can be made when a collection is created, or later on the Manage tab of the collection.

Submitting a search from the home page opens a page with a summary of the search results. Your search keywords are matched against text in any of the fields that have been configured for searching. If fields contain text tagged with different languages, the search is performed across all values, irrespective of language. On the search results page, you can:

  • refine or completely change the search terms and re-execute the search,
  • filter results via facets,
  • click on any of the results to see more information about it,
  • make comments about any of the found resources and see comments made by other users,
  • access interactive visual views and diagrams for a found resource; what is available depends on the type of the item.

Note that the information presented in the search summary and detail views, along with the properties and facets, is no longer configurable via an administrative UI in EDG version 6.0. Unless the results are configured to include more information, the summary and detail views contain only the labels of the found items and the asset collection they belong to. To customize these screens, see the developer guide to create your own extension, or contact a TopQuadrant representative for a custom extension. All properties of the collection are indexed for search, as are applicable facets.

Search results are sorted by score, then alphabetically. The score is calculated from the number of matches to the term within the document. Lucene also offers query boosting per field.

For example, searching for *id^3 finds anything that ends in “id” and assigns those matches a weight of 3, boosting them in the results.
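The ordering described above (score first, then alphabetical) can be pictured with a small Python sketch. It is only an illustration of the sort order, not EDG's or Lucene's implementation; the Hit type, the labels, and the score values are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical search hit: a display label and a relevance score."""
    label: str
    score: float


def sort_hits(hits):
    # Highest score first; equal scores fall back to alphabetical order
    # of the label (compared case-insensitively).
    return sorted(hits, key=lambda h: (-h.score, h.label.lower()))


# Made-up scores: the two "... ID" hits stand for boosted matches.
hits = [Hit("Customer ID", 3.0), Hit("Account", 1.0), Hit("Branch ID", 3.0)]
ordered = sort_hits(hits)
```

In this sketch the two boosted hits come first, in alphabetical order, followed by the lower-scoring hit.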

Lucene integration and query language

Search the EDG uses Apache Lucene for indexing text. The index is (re)built on system startup and then at intervals set by a configuration setting. To rebuild the cache on demand, use the Cached Graphs administration page. In addition to using plain search keywords, users can combine them with Lucene operators to form richer search queries. By default, a wildcard is applied before and after the search term. For example, searching for customer actually searches for *customer*.
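As a rough illustration of the default wildcard behavior, the following Python sketch wraps a plain keyword in leading and trailing wildcards. The function name is hypothetical, and the rule that terms already containing a wildcard are left unchanged is an assumption made for the sketch; this page does not state how EDG treats such terms.

```python
def with_default_wildcards(term: str) -> str:
    """Wrap a plain keyword as *keyword* (hypothetical helper)."""
    term = term.strip()
    # Assumption: terms that already contain a wildcard are left as typed.
    if not term or "*" in term or "?" in term:
        return term
    return f"*{term}*"


wrapped = with_default_wildcards("customer")
untouched = with_default_wildcards("Ken*")
```

Here the plain keyword customer becomes *customer*, matching the example above.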

Example Lucene Operators

  • Wildcards (* ?): “?” performs a single-character wildcard search and “*” performs a multiple-character wildcard search. For example, te?t matches “test”, “text”, etc., and Ken* matches all values that start with “Ken”. Similarly, *product* finds anything containing the word “product”, and *name finds anything that ends in “name”.

  • Fuzzy (~): matches similar spellings of the word. For example, john~ matches “john” and “jean”. The similarity threshold is 0.5 by default; you can set it to any number between 0 and 1, for example john~0.8.
  • Prohibit (-term or NOT term): excludes matches that contain the term following “-” or “NOT”. For example, Ken* -Kentucky matches all values that start with “Ken” but excludes anything that matches “Kentucky”.
  • Modifiers (AND OR): the AND operator matches items that contain both terms, while the OR operator matches items that contain either term.
  • Range queries: the TO operator matches items within a range of values, written in curly brackets. For example, {Finland TO Germany}. To limit the range to a particular field (for example, only a label), use <property name>:{<search query>}. For example, skos_prefLabel:{Finland TO Germany}.
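The wildcard and prohibit examples above can be approximated in Python with the standard fnmatch module, whose “?” and “*” have the same single-character and multiple-character meaning. This is only a sketch of the matching semantics over a made-up list of values, not how Lucene evaluates queries (Lucene matches indexed terms, and case handling depends on the analyzer). The helper names are hypothetical.

```python
from fnmatch import fnmatchcase

# Made-up values standing in for indexed labels.
values = ["Kentucky", "Kenya", "Token", "test", "text", "username"]


def matching(pattern, exclude=None):
    """Values that match `pattern`, minus those matching `exclude`."""
    return [
        v for v in values
        if fnmatchcase(v, pattern)
        and not (exclude and fnmatchcase(v, exclude))
    ]


single_char = matching("te?t")              # like te?t
prefix = matching("Ken*")                   # like Ken*
suffix = matching("*name")                  # like *name
prohibited = matching("Ken*", "Kentucky")   # like Ken* -Kentucky


def field_range(field, low, high):
    """Build the field-scoped range query text shown above."""
    return f"{field}:{{{low} TO {high}}}"


range_query = field_range("skos_prefLabel", "Finland", "Germany")
```

The last helper only assembles the query string in the <property name>:{<search query>} form; it does not evaluate the range.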

EDG administrators can disable indexing via the Disable Lucene Indexing parameter. Disabling the index reverts search to SPARQL queries, which causes poor performance for large or numerous collections. The full range of Lucene syntax is available only when the index is active; otherwise, only partial value matching is supported. Note that this also disables the text index search used for tabular editors.