KorAP/Kustvakt

POLIQARP - Parsing Error in RegEx #637

notesjor posted on GitHub

If you use the escape character '\' in a Poliqarp regex, you receive an error.

Query:

like: "*\*innen$"

HTTP-Request:

https://korap.ids-mannheim.de/api/v1.0/search?context=sentence&cutoff=false&ql=poliqarp&q=%22%2A%5C%2Ainnen%24%22&page=1
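Because Markdown rendering can swallow backslashes and asterisks, the query string is easiest to verify from the request itself. The following is a minimal sketch using only Python's standard library; the parameter names and values are copied from the URL above, while the variable names are illustrative. It decodes the `q` parameter and rebuilds the same query string.

```python
from urllib.parse import unquote, urlencode

# Percent-encoded value of the q parameter, copied from the request above.
encoded_q = "%22%2A%5C%2Ainnen%24%22"

# Decoding shows the query exactly as it was sent, including the
# backslash and asterisks that Markdown rendering may drop.
decoded_q = unquote(encoded_q)
print(decoded_q)

# Rebuild the query string from the decoded value. The parameters are
# taken from the request URL; urlencode percent-encodes each value.
params = {
    "context": "sentence",
    "cutoff": "false",
    "ql": "poliqarp",
    "q": decoded_q,
    "page": "1",
}
query_string = urlencode(params)
print(query_string)
```

The decoded value is the quoted pattern with a leading asterisk, a backslash, and a second asterisk, which agrees with the query echoed back in the error message.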

Result:

{
  "meta": {
    "startPage": 1,
    "snippets": true,
    "context": "sentence",
    "tokens": false,
    "cutOff": false,
    "timeout": 90000
  },
  "@context": "http://korap.ids-mannheim.de/ns/koral/0.3/context.jsonld",
  "errors": [
    [ 302, "Failing to parse at symbol: '\\'", 2 ],
    [ 302, "Could not parse query >>> \"*\\*innen$\" <<<." ]
  ]
}


I answered here:

Maybe I don't understand the regex. In my opinion, a pattern that starts with a quantifier should definitely fail. What does work is ".*\*innen$", but unfortunately this can be a known "killer query" in Lucene. Because of the tokenization in DeReKo, however, this query won't match any tokens.
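The point about a leading quantifier can be illustrated with a small sketch. It uses Python's `re` module purely as a stand-in: this is not the Poliqarp parser or the Lucene regex engine used by KorAP, whose syntax and behavior may differ. The helper name `compiles` and the sample token are hypothetical.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Pattern from the issue: it begins with a quantifier that has nothing to repeat.
leading_quantifier = r"*\*innen$"

# Variant from the reply: ".*" gives the quantifier something to repeat,
# and "\*" matches a literal asterisk.
with_dot = r".*\*innen$"

print(compiles(leading_quantifier))
print(compiles(with_dot))

# Hypothetical sample token, only to show what the second pattern accepts.
print(bool(re.match(with_dot, "Lehrer*innen")))
```

In Python the first pattern is rejected at compile time, while the second compiles and matches a string that ends in a literal "*innen". Whether such a string ever exists as a single token depends on the corpus tokenization, as noted above.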

P.S. GitHub Markdown sometimes kills symbols like backslash.

posted by Akron over 1 year ago
