Product Release 12 July 2024

Performance in Large Repositories


Dear Coreon Users and Repository Managers,
we have been working on a significant performance improvement for navigating and filtering large repositories.
It has now been deployed.
Best, your Coreon Team

Figure: Large repository statistics, July 2024

Performance in Large Repositories

Until now, Coreon repositories have typically ranged from 2,000 to 25,000 concepts (i.e. nodes in the graph). To be future-ready and to support deployments with far more concepts (half a million or more), it was time to improve the data indexing. The goals were to:

  1. Allow fast filtering and navigating into branches of a repository.
  2. Reduce server load when running such requests.

Technically, an additional index on the database has been added.
As a user you will notice the following improvements:

  • Faster loading of the concept map in large repositories.
    This is because the “root” nodes are now computed much faster.
  • Work-in-a-branch: when navigating and querying the repository within a branch, the overall response time is much shorter.
  • Filtering into branches: applying filter operators such as “is below concept” is now also much faster.
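To illustrate why an additional index helps with both root computation and branch navigation, here is a minimal sketch. It is not Coreon's actual implementation: the edge-list representation, the function name build_index, and the sample concepts are all assumptions made for illustration. The idea is that the parent-to-children mapping is built once, so later requests can look up roots and children directly instead of scanning every edge.

```python
from collections import defaultdict

def build_index(concepts, edges):
    """Build a parent -> children index and derive the root concepts.

    Hypothetical helper for illustration only; `edges` is assumed to be
    a list of (parent, child) pairs describing the concept graph.
    """
    children = defaultdict(list)
    has_parent = set()
    for parent, child in edges:
        children[parent].append(child)
        has_parent.add(child)
    # A root is any concept that never appears as a child.
    roots = [c for c in concepts if c not in has_parent]
    return children, roots

# Made-up sample data.
concepts = ["animal", "dog", "cat", "poodle", "tool"]
edges = [("animal", "dog"), ("animal", "cat"), ("dog", "poodle")]

children, roots = build_index(concepts, edges)
print(roots)            # ['animal', 'tool']
print(children["dog"])  # ['poodle']
```

With such an index in place, finding the roots or the direct children of a concept no longer requires a pass over the whole graph per request, which is the kind of saving that matters once a repository holds hundreds of thousands of concepts.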

Miscellaneous

A few “minor” improvements, based on user feedback, also made it into this July 2024 release:

  • Filter operator “below concept”: The changes above have a welcome side effect. Background: the filter operator “is below concept” and similar ones (“not below” etc.) test whether a concept lies below a given concept, which helps to query subbranches of a repository below a “starting” point.
    Until now, the given concept that acts as the starting point was also included in the result set. From now on, the operator is semantically more precise: the given concept itself is no longer included, i.e. it does not pass the filter.
  • Plugins and Annotation Jobs – Term Recognition: The Coreon API’s recognize_terms call had a weakness when querying German text for plural allomorphs. This has been fixed.
  • Excel export: The layout of the generated sheet has been improved for better readability.
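The changed semantics of “is below concept” can be sketched as follows. This is an illustration under stated assumptions, not the Coreon API: the function below_concept, the dictionary-based graph, and the sample concepts are hypothetical, and the graph is assumed to be acyclic. The point is that the traversal starts from the children of the given concept, so the concept itself never ends up in the result.

```python
from collections import deque

def below_concept(children, start):
    """Return all concepts strictly below `start` (start itself excluded).

    Hypothetical helper; `children` maps a concept to its direct children
    and the graph is assumed to contain no cycles.
    """
    found = set()
    # Seed the queue with the children, not with `start` itself.
    queue = deque(children.get(start, []))
    while queue:
        node = queue.popleft()
        if node in found:
            continue  # already reached via another parent
        found.add(node)
        queue.extend(children.get(node, []))
    return found

# Made-up sample data.
children = {"animal": ["dog", "cat"], "dog": ["poodle"]}

below = below_concept(children, "animal")
print(sorted(below))      # ['cat', 'dog', 'poodle']
print("animal" in below)  # False
```

Under the previous behaviour described above, the result would also have contained the starting concept; seeding the queue with its children is one simple way to express the new, stricter meaning.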
      Michael Wetzel

      Michael has deep knowledge of multilingual problem solving and long-term experience in product management. He is an expert in language technologies and solutions such as globalisation, documentation, and content management systems, as well as text mining, enterprise search, and multilingual classifications and nomenclatures. For years, Michael was the product manager of TRADOS MultiTerm. He is an active contributor to the ISO TC37/SC3 and DIN NA 105 standards.