Search the EDG searches across all EDG resources that an EDG administrator and/or the manager of a specific collection has elected to include in the index. This decision can be made when a collection is created, and it can also be made later on the Manage tab of the collection or through the Governance Model. An entire subject area or business area can be included in Search the EDG by checking the corresponding box.

Submitting a search from the home page opens a page with the search results summary. Your search keywords will be matched against text in any of the properties of the collections that are managed by Search the EDG. You may have fields that contain text tagged with different languages. Note that the search is performed across all values, irrespective of language. On the search results page, you can:

  • refine or completely change the search terms and re-execute the search,
  • filter results via facets,
  • click on any of the results to see more information about it,
  • make comments about any of the found resources and see comments made by other users,
  • endorse any of the found resources and see endorsements made by others,
  • access interactive visual views and diagrams for a found resource; what is available depends on the type of the item,
  • access results and facets via APIs (see below),
  • search using Lucene query expressions.


All properties of the collection will be indexed for search, as well as applicable facets. Search results are sorted by score, then alphabetically. The facet results list shows the top 10 facets for the results. The score is calculated based on the number of matches to the term within the document. Lucene also offers query boosting per field (be sure to check Use Advanced Syntax).

For example, searching for *id^3 will find anything that ends in "id" and assign it a weight of 3, boosting those results.
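
The ordering rule described above (by score, then alphabetically) can be sketched in a few lines of Python. This is only an illustration of the sort order, not EDG's implementation: the Hit class, its label and score fields, and the case-insensitive tie-break are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    label: str    # hypothetical display label of a search result
    score: float  # hypothetical relevance score of the result


def order_hits(hits):
    # Highest score first; equal scores fall back to alphabetical order
    # by label (case-insensitive, which is an assumption of this sketch).
    return sorted(hits, key=lambda h: (-h.score, h.label.lower()))


hits = [Hit("Customer ID", 2.0), Hit("Account ID", 2.0), Hit("Order", 3.5)]
ordered = [h.label for h in order_hits(hits)]
print(ordered)
```

A boosted term such as *id^3 would raise the score of the matching results, which moves them earlier in this ordering.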

 


Lucene integration and query language

Search the EDG uses Apache Lucene for indexing text. The index is built on system startup if it does not already exist. The indexer uses Graph Listeners, so the index is updated in near real time as data changes. To rebuild the cache on demand, use the Cached Graphs administration page. In addition to simply using search keywords, users can combine them with Lucene operators to form richer search queries. By default, a wildcard is applied before and after the search term. For example, searching for customer actually searches for *customer*.
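
The default wildcard behavior can be illustrated with a small Python sketch. It only mimics the pattern semantics using the standard library's fnmatch module; it does not call Lucene, and the function name and sample values are hypothetical.

```python
from fnmatch import fnmatchcase


def with_default_wildcards(term: str) -> str:
    # Mirror the documented default: a wildcard before and after the term.
    return f"*{term}*"


pattern = with_default_wildcards("customer")

# Hypothetical property values, used only to show what the pattern matches.
values = ["customer", "customer_id", "new customer record", "client"]
matches = [v for v in values if fnmatchcase(v, pattern)]
print(pattern, matches)
```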

To use Lucene query expressions, check the Use Advanced Syntax box.

Example Lucene Operators

  • Wildcards (* ?): “?” performs a single-character wildcard search and “*” performs a multiple-character wildcard search. For example, te?t matches “test”, “text”, etc., and Ken* matches all values that start with “Ken”. *product* will find anything with the word “product” in it. *name will find anything that ends in “name”.
  • Fuzzy (~): matches similar spellings of the word. For example, john~ will match “john” and “jean”. The similarity threshold is set to 0.5 by default. You can adjust it using any number between 0 and 1. For example, john~0.8.
  • Prohibit (-term or NOT term): excludes matches that contain the term after the “-” or “NOT” operator. For example, Ken* -Kentucky matches all values that start with “Ken” but excludes anything that matches “Kentucky”.
  • Modifiers (AND OR): using the AND operator will match items that contain both terms, while the OR operator matches items that contain either of the terms.
  • Range queries: using the TO operator will match items within a range of values. The range is written in curly brackets. For example, {Finland TO Germany}. If you want to limit this to a particular field (for example, only a label), use <property name>:{<search query>}. For example, skos_prefLabel:{Finland TO Germany}.
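
When queries are assembled programmatically, the operators above can be composed with small string helpers. The following Python sketch is only an illustration: the helper names are hypothetical, and the functions build query text without validating it against Lucene.

```python
def fuzzy(term, similarity=None):
    # john~ uses the default threshold; john~0.8 sets it explicitly.
    if similarity is None:
        return f"{term}~"
    if not 0 <= similarity <= 1:
        raise ValueError("similarity must be between 0 and 1")
    return f"{term}~{similarity}"


def prohibit(query, excluded):
    # Exclude matches containing the given term, e.g. Ken* -Kentucky.
    return f"{query} -{excluded}"


def field_range(field, low, high):
    # Range query limited to one property, written in curly brackets.
    return f"{field}:{{{low} TO {high}}}"


def boost(term, weight):
    # Boosted term, e.g. *id^3.
    return f"{term}^{weight}"


examples = [
    fuzzy("john", 0.8),
    prohibit("Ken*", "Kentucky"),
    field_range("skos_prefLabel", "Finland", "Germany"),
    boost("*id", 3),
]
for query in examples:
    print(query)
```

Each resulting string is meant to be entered with Use Advanced Syntax checked.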

Note: To search over special characters (such as /, ? and -), enable the WhitespaceAnalyzer option on the EDG Configuration Parameters page. You will need to rebuild the search index after making the switch. This can be done on the Search the EDG index page. Changing this configuration parameter also applies to the editor page search panels in EDG, so the text indices will need to be rebuilt as well.

Search results via API

Search results and facets can be accessed via API. Results are returned as JSON.

Service Syntax

http://…/search/results

http://…/search/facets

Arguments

  • term: search term or phrase (uses query syntax); optional
  • limit: number of results returned; optional
  • offset: number for offset; optional
  • withFacets: true or false; optional
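
As a minimal sketch of calling the service, the Python below builds a request URL from the arguments listed above. It makes several assumptions: the arguments are passed as URL query parameters on a GET request, BASE_URL is a placeholder standing in for the elided part of the service address (replace it with your own server's address), and the function names are hypothetical. The shape of the returned JSON is not described here, so the sketch simply parses it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder for the elided part of the service URL; not a real address.
BASE_URL = "http://example.invalid/placeholder"


def build_search_url(base_url, endpoint="results", term=None,
                     limit=None, offset=None, with_facets=None):
    # endpoint is "results" or "facets", matching the two services above.
    params = {}
    if term is not None:
        params["term"] = term
    if limit is not None:
        params["limit"] = limit
    if offset is not None:
        params["offset"] = offset
    if with_facets is not None:
        params["withFacets"] = "true" if with_facets else "false"
    url = f"{base_url.rstrip('/')}/search/{endpoint}"
    # All arguments are optional, so the query string may be omitted.
    return f"{url}?{urlencode(params)}" if params else url


def fetch_json(url):
    # Assumes a plain GET is sufficient; authentication is not handled.
    with urlopen(url) as response:
        return json.load(response)


results_url = build_search_url(BASE_URL, term="customer", limit=10,
                               offset=0, with_facets=True)
facets_url = build_search_url(BASE_URL, endpoint="facets")
print(results_url)
print(facets_url)
```

Calling fetch_json(results_url) against a real server would return the parsed JSON response. Query text containing Lucene operators is percent-encoded by urlencode, so it can be passed as the term unchanged.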