KorAP/Kustvakt

COSMAS II - Queries with #REG-Feature > lead to an empty result set #636

notesjor posted on GitHub

Kustvakt version: current version (2023-07-18)

Describe the bug: Apparently, queries in "COSMAS II" syntax are not translated correctly into the KorAP system. The problem occurs especially with queries that use the RegEx feature of COSMAS II. Example: #REG(*\*innen$) for a query on opposite-gendered forms.

To Reproduce: Send this search request with a valid Bearer token to the KorAP API: https://korap.ids-mannheim.de/api/v1.0/search?context=sentence&cutoff=false&ql=cosmas2&q=%23REG%28%2A%5C%2Ainnen%24%29&page=1
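For anyone reproducing this from a script, here is a minimal sketch of how the request above could be assembled in Python. The parameter names and values are taken from the URL in this report; the helper names `build_search_url` and `build_request`, and the use of an `Authorization: Bearer ...` header, are assumptions for illustration. The sketch only builds the request and does not send it.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Endpoint as given in the report above.
BASE_URL = "https://korap.ids-mannheim.de/api/v1.0/search"


def build_search_url(query: str) -> str:
    """Percent-encode the COSMAS II query and append the report's parameters."""
    params = {
        "context": "sentence",
        "cutoff": "false",
        "ql": "cosmas2",  # query language: COSMAS II
        "q": query,
        "page": 1,
    }
    # quote (not quote_plus) with safe="" encodes '#', '(', ')', '*', '\' and '$'.
    return f"{BASE_URL}?{urlencode(params, quote_via=quote, safe='')}"


def build_request(query: str, token: str) -> Request:
    """Hypothetical helper: attach the Bearer token (header name is an assumption)."""
    return Request(build_search_url(query), headers={"Authorization": f"Bearer {token}"})


url = build_search_url(r"#REG(*\*innen$)")
print(url)
```

Decoding the `q` parameter this way shows that the query actually sent is `#REG(*\*innen$)`, i.e. the regular expression begins with `*`.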

Expected behavior: The same results as in COSMAS II.

Desktop (please complete the following information):

  • API

Smartphone (please complete the following information):

  • API

This has already been reported here. Closed as duplicate.

posted by Akron over 1 year ago

Maybe I don't understand the regex. In my opinion, a pattern that starts with a quantifier should definitely fail. What I can see is using ".*\*innen$", which works, but unfortunately can be a known "killer query" in Lucene. Because of the tokenization in DeReKo, however, this query won't match any tokens.
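The two points in this comment can be illustrated with a small sketch. It uses Python's `re` module as a stand-in, which is neither Lucene's nor COSMAS II's regex engine, so it only shows the general idea. The token split `["Lehrer", "*", "innen"]` is a made-up assumption about how a tokenizer might break up such a form; the actual DeReKo tokenization is not described in this thread.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def matching_tokens(pattern: str, tokens: list[str]) -> list[str]:
    """Apply the pattern to each token separately, as a token-based index would."""
    regex = re.compile(pattern)
    return [tok for tok in tokens if regex.search(tok)]


# A pattern that starts with a quantifier has nothing to repeat and is rejected.
print(compiles(r"*\*innen$"))
# Prefixing '.' gives the quantifier something to repeat.
print(compiles(r".*\*innen$"))

# Hypothetical tokenization (assumption): the star is split off as its own token.
split_tokens = ["Lehrer", "*", "innen"]
unsplit_tokens = ["Lehrer*innen"]
print(matching_tokens(r".*\*innen$", split_tokens))
print(matching_tokens(r".*\*innen$", unsplit_tokens))
```

Under that assumed split, no single token contains the literal `*innen`, so a per-token regex has nothing to match even though the pattern itself is valid.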

posted by Akron over 1 year ago
