JSON-NLP is the encoding format that allows us to provide advanced annotations for language data that go beyond any other JSON-based annotation format or standard that we know of. In particular, JSON-NLP allows us to encode discontinuities and invisible elements in language, including elements missing due to ellipsis, gapping, sluicing, topic drop, or pro-drop, as well as simple speech errors and typos. More on that in a future post.
More importantly, the Demo now shows the detailed clause-level semantic annotation of our NLP technologies. We offer annotation of temporal and aspectual properties of predicates at the clause level. This annotation includes a full analysis of Tense, Voice, Mood, and Aspect, as well as all scope relations between clauses and elementary predicates, quantifiers, operators, and other elements relevant for a detailed semantic analysis.
The detailed feature annotation of clause-level predicates, and in particular the analysis of temporal logic relations (that is, the sequencing of sub-events), enables us to perform very precise analysis of temporal relations and duration, as well as of the composition of quantifiers and temporal interpretation.
Imagine that we read the following in a news article:
The Chancellor of Germany met the president of the United States of America after a visit to Moscow.
The sequence of events in temporal logic is:
1. Chancellor of Germany visited Moscow
2. Chancellor of Germany met President of USA
Natural language expressions do not necessarily present sub-events according to the underlying temporal sequence.
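The reordering in the example above can be illustrated with a minimal sketch. It is not our production analysis: the function names precedence and order_events and the two-word connective table are hypothetical, chosen only for illustration. The sketch maps a temporal connective to an (earlier, later) pair and then sorts the events topologically using Python's standard graphlib module.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def precedence(main_event, connective, subordinate_event):
    """Return an (earlier, later) pair for a toy set of temporal connectives.

    Hypothetical helper: real text needs far more than a two-word lookup.
    """
    if connective == "after":
        # "A after B": B happened first, although A is mentioned first.
        return (subordinate_event, main_event)
    if connective == "before":
        # "A before B": A happened first.
        return (main_event, subordinate_event)
    raise ValueError(f"unsupported connective: {connective}")


def order_events(pairs):
    """Sort events so that every earlier event precedes its later event."""
    sorter = TopologicalSorter()
    for earlier, later in pairs:
        # add(node, predecessor): the later event depends on the earlier one.
        sorter.add(later, earlier)
    return list(sorter.static_order())


meet = "Chancellor of Germany met President of USA"
visit = "Chancellor of Germany visited Moscow"

# Surface order in the sentence: meeting first, visit second.
timeline = order_events([precedence(meet, "after", visit)])
for position, event in enumerate(timeline, start=1):
    print(f"{position}. {event}")
```

The sentence mentions the meeting first, but the sorted timeline places the visit before it, matching the sequence listed above.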
In addition, the interpretation of tense is essential for semantic analysis or triple extraction. In the first of the following sentences, the embedded clause with a past tense predicate expresses an assertion that the Chancellor of Germany visited Moscow. This is not the case for the identical predicate in the second sentence:
1. Reuters reported that the Chancellor of Germany visited Moscow.
2. Reuters will report that the Chancellor of Germany visited Moscow.
In the first sentence the embedded past tense predicate is in the scope of the matrix clause Reuters reported, which is also in past tense. This implies factivity and truth of the report event and of the embedded visit event in its scope.
In the second sentence the matrix event report is in future tense, thus we cannot reason that the embedded predicate is true.
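The contrast between the two sentences can be written down as a toy rule. The sketch below is an assumption-laden simplification, not our analysis pipeline: the Clause class, its fields, and the asserted_events function are hypothetical, and the only rule encoded is the one described above, namely that a past tense clause counts as reported to have happened only if every clause it is embedded under is also in past tense.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clause:
    """Hypothetical, minimal clause record: text, tense, optional embedded clause."""
    text: str
    tense: str  # "past", "present", or "future"
    embedded: Optional["Clause"] = None


def asserted_events(clause: Clause, scope_is_past: bool = True) -> List[str]:
    """Collect clauses that the toy rule treats as reported to have happened.

    A clause qualifies only if it is in past tense and every clause above it
    (its scope) is in past tense as well.
    """
    qualifies = scope_is_past and clause.tense == "past"
    events = [clause.text] if qualifies else []
    if clause.embedded is not None:
        # The embedded clause inherits the status of its matrix clause.
        events += asserted_events(clause.embedded, qualifies)
    return events


visit = Clause("the Chancellor of Germany visited Moscow", "past")
sentence_1 = Clause("Reuters reported", "past", embedded=visit)
sentence_2 = Clause("Reuters will report", "future", embedded=visit)

print(asserted_events(sentence_1))
print(asserted_events(sentence_2))
```

For the first sentence the rule returns both the report and the visit; for the second it returns nothing, because the future tense matrix clause blocks any conclusion about the embedded visit.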
Our scope analytics over sentences and clauses, together with the detailed analytics of clause- and predicate-level tense, aspect, mood, voice, etc., enable us not only to sequence events and sub-events reported in text and speech, but also to reason about implications of truth, factivity, and assertions reported or made in a text.
Needless to say, we can provide these analytics for basically any language that customers want to process, given that we can provide the underlying NLP components. For languages for which we do not yet have part-of-speech taggers, syntactic parsers, lexical analyzers, etc., we can develop them in a very short time.
Please contact the Semiring Team for details. Send us your suggestions and comments! We love to stay in touch with our readers and customers.