Search / Multilingual support for contact, links and overview description (indexing and aggregations) #6588
If the term field ends with Object, it is considered multilingual and the facet is created on the field indexed for the UI language.
An index field used in an aggregation may be multilingual. E.g. when the field is based on a multilingual thesaurus, the field key can be used in the facet for translation (e.g. th_httpinspireeceuropaeutheme-theme_tree.key), or for a codelist field. In such cases, the translations are loaded client side. But that was not supported for fields not related to a thesaurus or a codelist, e.g. OrgForResource or tag.

For the organisation field, the changes required are:

* Index the organisation name as a multilingual field (only the main language was indexed before)
* Change the index field type to object (with each language field typed as keyword, to allow aggregation)
* Translate the search response recursively (only first-level multilingual fields were translated, e.g. resourceTitleObject) - Based on geoadmin/geocat@68d60be
* Add a language replacer for aggregations based on the language strategy

```js
"tag": {
  terms: {
    field: "tag.${aggLang}",
OrgForResource: {
  terms: {
    field: "OrgForResourceObject.${aggLang}",
    // field: "OrgForResourceObject.default",
    // field: "OrgForResourceObject.langfre",
```

Based on the strategy, aggLang is:

* if forcedLanguage or searchInThatLanguage, then the forced language
* if searchInDetectedLanguage, then the detected language
* if searchInUILanguage or searchInAllLanguages, then the search UI language

Note: We can't use `OrgForResourceObject.*` to create an aggregation combining all values.

Index changes:

* organisation fields are now multilingual fields, e.g. `OrgForResource` > `OrgForResourceObject`

```js
"OrgForResourceObject": {
  "default": "FPS Finance - General Administration of Patrimonial Documentation (GAPD)",
  "langeng": "FPS Finance - General Administration of Patrimonial Documentation (GAPD)",
  "langfre": "SPF Finances - Administration Générale de la Documentation Patrimoniale (AGDP)",
  "langdut": "FOD Financien - Algemene Administratie van de Patrimoniumdocumentatie (AAPD)",
  "langger": "FOD Finanzen - Generalverwaltung Vermögensdokumentation (GVVD)"
},
"custodianOrgForResourceObject": {
  "default": "FPS Finance - General Administration of Patrimonial Documentation (GAPD)",
  "langeng": "FPS Finance - General Administration of Patrimonial Documentation (GAPD)",
  "langfre": "SPF Finances - Administration Générale de la Documentation Patrimoniale (AGDP)",
  "langdut": "FOD Financien - Algemene Administratie van de Patrimoniumdocumentatie (AAPD)",
  "langger": "FOD Finanzen - Generalverwaltung Vermögensdokumentation (GVVD)"
},
"contactForResource": [
  {
    "organisationObject": {
      "default": "FPS Finance - General Administration of Patrimonial Documentation (GAPD)",
      "langeng": "FPS Finance - General Administration of Patrimonial Documentation (GAPD)",
      "langfre": "SPF Finances - Administration Générale de la Documentation Patrimoniale (AGDP)",
      "langdut": "FOD Financien - Algemene Administratie van de Patrimoniumdocumentatie (AAPD)",
      "langger": "FOD Finanzen - Generalverwaltung Vermögensdokumentation (GVVD)"
    },
    "role": "custodian",
```

It does not require changes on the client app, as multilingual fields are converted to simple fields based on the UI language. The `OrgForResourceObject` field is translated, and a field `OrgForResource` containing the value in the UI language, or the default value, is created. So the record view now also supports organisation names encoded using multilingual encoding.

Inspired by some work done by @fgravin and @cmangeat for geocat.ch, which chooses the facet in the UI language.
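To illustrate the conversion described above (each `...Object` field gets a sibling simple field holding the UI-language value, falling back to `default`), here is a minimal sketch. It is not the code from this pull request: the function name `translateMultilingualFields`, the assumption that language keys are `lang` plus a three-letter code, and the sample values are all hypothetical.

```javascript
// Hypothetical helper, not GeoNetwork's actual implementation.
// Walks a search hit recursively; for every key ending in "Object" whose
// value is a plain object, adds a simple field (key without the suffix)
// holding the UI-language value, or the "default" value as a fallback.
function translateMultilingualFields(node, uiLang) {
  if (Array.isArray(node)) {
    return node.map((item) => translateMultilingualFields(item, uiLang));
  }
  if (node === null || typeof node !== "object") {
    return node; // strings, numbers, booleans are returned unchanged
  }
  const result = {};
  for (const [key, value] of Object.entries(node)) {
    result[key] = translateMultilingualFields(value, uiLang);
    const isMultilingual =
      key.endsWith("Object") &&
      value !== null &&
      typeof value === "object" &&
      !Array.isArray(value);
    if (isMultilingual) {
      const simpleKey = key.slice(0, -"Object".length);
      // Assumption: language keys look like "langfre", "langeng", ...
      const translated = value["lang" + uiLang] ?? value.default;
      if (translated !== undefined) {
        result[simpleKey] = translated;
      }
    }
  }
  return result;
}

// Usage with shortened, made-up values:
const hit = {
  OrgForResourceObject: { default: "FPS Finance", langfre: "SPF Finances" },
  contactForResource: [
    {
      organisationObject: { default: "FPS Finance", langfre: "SPF Finances" },
      role: "custodian",
    },
  ],
};
const translatedHit = translateMultilingualFields(hit, "fre");
console.log(translatedHit.OrgForResource);
console.log(translatedHit.contactForResource[0].organisation);
```

Because the walk is recursive, nested fields such as `contactForResource[].organisationObject` are handled the same way as first-level ones.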
<element>additionalDocumentation</element>
<element>specification</element>
<element>reportReference</element>
</doc>
</xsl:variable>

<xsl:template name="collect-documents">
<xsl:variable name="root" select="."/>
<xsl:param name="forIndexing" select="false()" as="xs:boolean"/>
Can you add a code comment to explain how this parameter is used?
Added description 80fbd5f
- select="substring-before(., $valueSeparator)"/>
+ select="if ($valueSeparator != '')
+         then substring-before(., $valueSeparator)
+         else substring(., 1, 2)"/>
I guess the value 2 is for the iso2lang code?
Added description 80fbd5f
- select="substring-after(., $valueSeparator)"/>
+ select="if ($valueSeparator != '')
+         then substring-after(., $valueSeparator)
+         else substring(., 4)"/>
What does the value 4 mean? If it is not obvious, can you add a code comment to explain it?
Added description 80fbd5f
It is a special case for onlinesrc-add.xsl, which can add URLs containing the separator '#'.
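For readers less familiar with XPath, the two `select` expressions reviewed above can be mirrored as follows. This is only a sketch of the logic: the function name `splitLangValue` and the sample inputs are hypothetical, and it assumes (as the reviewer guesses) that the first two characters are a two-letter language code followed by a single separator character.

```javascript
// Hypothetical JavaScript mirror of the XPath expressions in the diff.
// With a non-empty separator: split on its first occurrence, like
// substring-before / substring-after (both yield "" when it is absent).
// With an empty separator: fixed positions, like substring(., 1, 2)
// for the language code and substring(., 4) for the value.
function splitLangValue(text, valueSeparator) {
  if (valueSeparator !== "") {
    const index = text.indexOf(valueSeparator);
    if (index < 0) {
      return { lang: "", value: "" };
    }
    return {
      lang: text.slice(0, index),
      value: text.slice(index + valueSeparator.length),
    };
  }
  // XPath positions are 1-based: characters 1-2 are the code,
  // character 3 is skipped, the value starts at character 4.
  return { lang: text.slice(0, 2), value: text.slice(3) };
}

// Made-up inputs for illustration:
console.log(splitLangValue("fr|Rapport qualité", "|"));
console.log(splitLangValue("en#http://example.org/doc#section", ""));
```

With the empty separator, the value is taken by position, so a '#' inside the URL does not affect how the string is split.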
- select="substring-after(., $valueSeparator)"/>
+ select="if ($valueSeparator != '')
+         then substring-after(., $valueSeparator)
+         else substring(., 4)"/>
Same as the previous comment.
Testing the pull request, the Organisation facet list does not seem to be translated. I've created a metadata record in English and French, using these values for the organisation name:
I get the following in the French UI: But doing the following steps in the French UI only displays the French organisation, and none of the others that are not from multilingual metadata:
You have to test with the various language options and find which one is best for your catalogue content. Copied some info from that PR description in the code to explain how the
I see the results vary depending on that selection. That might make sense, but it's a bit confusing if you have multilingual and non-multilingual metadata and select to search in the UI language with the metadata alternate language (for example, French). In this case you only get the organisations with a value in French in the multilingual metadata, but not those in the non-multilingual metadata (if defined in a language other than French). It's not a bug, but I'm not sure if it might be better to index the multilingual values for non-multilingual metadata with the primary language value. In any case, this can be done in a separate pull request, if required. Independently of that, the case I described causes different results (I see it's in both cases selected the value
It really depends on which UI languages you provide and what the content of your catalogue is. If you have the UI language list set to English and French and also a mix of metadata not all translated (or a minority translated), then you would configure the facet to be

    OrgForResource: {
      terms: {
        field: "OrgForResourceObject.default",

If all (or most) are translated, I would use:

    languageStrategy: "searchInDetectedLanguage",
    languageWhitelist: ["eng", "fre"],
    facetConfig: {
      OrgForResource: {
        terms: {
          field: "OrgForResourceObject.${aggLang}",

If you propose the UI only in English and have a mix of non-English monolingual metadata harvested from various places:

    languageStrategy: "searchInAllLanguages",
    facetConfig: {
      OrgForResource: {
        terms: {
          field: "OrgForResourceObject.default",

It really depends on the content and the languages of the targeted audience.
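As a rough illustration of how the `${aggLang}` placeholder in these configurations could be resolved from the language strategy listed in the PR description, here is a minimal sketch. It is not the code from this pull request: the function names, the shape of the `context` object, and the assumption that `aggLang` takes the form `lang` plus a three-letter code (as in `OrgForResourceObject.langfre`) are hypothetical.

```javascript
// Hypothetical sketch of the language replacer; names and shapes are assumed.
// Strategy names are the ones listed in the PR description.
function resolveAggLang(strategy, context) {
  switch (strategy) {
    case "forcedLanguage":
    case "searchInThatLanguage":
      return "lang" + context.forcedLanguage;
    case "searchInDetectedLanguage":
      return "lang" + context.detectedLanguage;
    case "searchInUILanguage":
    case "searchInAllLanguages":
      return "lang" + context.uiLanguage;
    default:
      throw new Error("Unknown language strategy: " + strategy);
  }
}

// Replaces every "${aggLang}" occurrence in a facet configuration.
// The config is serialised first so nested fields are covered too.
function applyAggLang(facetConfig, aggLang) {
  const json = JSON.stringify(facetConfig);
  return JSON.parse(json.split("${aggLang}").join(aggLang));
}

// Usage: the double-quoted strings keep "${aggLang}" as literal text.
const facetConfig = {
  tag: { terms: { field: "tag.${aggLang}" } },
  OrgForResource: { terms: { field: "OrgForResourceObject.${aggLang}" } },
};
const aggLang = resolveAggLang("searchInUILanguage", {
  forcedLanguage: "eng",
  detectedLanguage: "dut",
  uiLanguage: "fre",
});
console.log(applyAggLang(facetConfig, aggLang).OrgForResource.terms.field);
```

A configuration that points at `OrgForResourceObject.default` contains no placeholder, so it passes through unchanged whatever the strategy.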
Thanks for the clarifications.
Multilingual contact organisation support (in aggregation and record view)
Multilingual links support
In ISO19139, name and description can be multilingual. ISO19115-3 adds the possibility to also provide one URL per language.
With the migration to Elasticsearch, only the main language was indexed and used in the UI. As for the organisation name, the record view now displays link details (URL, name and description) using the UI language if available.
Multilingual overview support
An overview can have a description. Display it based on UI language.
ISO19115-3 / Editor / Fix update and indexing of links (not in distribution)
eg. data quality report and legend can now be updated properly in multilingual records.