1
Custom rules on subject verb agreement
David Ling
2
Contents
Subject verb agreement: singular + plural_verb (example: A cat go to school)
Subject verb agreement: plural + singular_verb (example: Cats goes to school)
POSTag and OpenNLP Chunker
Comparison with Grammarly
3
Subject verb agreement: singular + plural_verb
Pattern, matched on POS tags:
  Singular noun (tagged so by both the chunker and POSTag)
  Skip one optional adverb
  Plural verb (tagged so by both the chunker and POSTag)
Matches: cat(NN) go(VBP)
3 exceptions added:
  Does….singular noun
  Modal verb…singular noun
  And singular noun
Result: successful
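The pattern and its three exceptions can be sketched in a few lines of Python. This is only an illustration over hand-tagged (word, tag) pairs, not LanguageTool's rule syntax or API; the function name is hypothetical, and reading "Does…" and "Modal verb…" as "any earlier does or modal in the sentence" is an assumption.

```python
def find_singular_plural_verb(tagged):
    """Return indices of plural verbs (VBP) that follow a singular noun (NN).

    tagged: list of (word, Penn Treebank tag) pairs (hypothetical input format).
    """
    hits = []
    for i, (_, tag) in enumerate(tagged):
        if tag != "NN":
            continue
        before = tagged[:i]
        # Exceptions 1 and 2 (assumed reading): an earlier "does" or modal verb.
        if any(w.lower() == "does" or t == "MD" for w, t in before):
            continue
        # Exception 3: "and" directly in front of the singular noun.
        if before and before[-1][0].lower() == "and":
            continue
        j = i + 1
        if j < len(tagged) and tagged[j][1] == "RB":  # skip one optional adverb
            j += 1
        if j < len(tagged) and tagged[j][1] == "VBP":
            hits.append(j)
    return hits
```

Calling it on the tagged form of "A cat go to school" flags the verb, while "Does the cat go to school" is left alone because of the first exception.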
4
Subject verb agreement: plural + singular_verb
Anti-pattern: no match when a preposition is in front, for example: One of the cats
Pattern:
  A word in the sentence
  Plural noun
  Skip one optional adverb
  Singular verb
Result: successful, but fails on adjective + plural noun.
Why? Because of the disambiguation rules for POSTags.
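A rough Python sketch of this rule with its anti-pattern follows. It works on hand-tagged (word, tag) pairs rather than LanguageTool's own data structures; the function name and the set of tags skipped while looking back for a preposition are assumptions made for the example.

```python
# Tags that may sit between a preposition and the noun (assumed for this sketch).
NP_MODIFIERS = {"DT", "JJ", "CD", "PRP$"}

def find_plural_singular_verb(tagged):
    """Return indices of singular verbs (VBZ) that follow a plural noun (NNS)."""
    hits = []
    for i, (_, tag) in enumerate(tagged):
        if tag != "NNS":
            continue
        # Anti-pattern: walk left over modifiers; a preposition there means
        # the plural noun is probably not the subject ("One of the cats").
        k = i - 1
        while k >= 0 and tagged[k][1] in NP_MODIFIERS:
            k -= 1
        if k >= 0 and tagged[k][1] == "IN":
            continue
        j = i + 1
        if j < len(tagged) and tagged[j][1] == "RB":  # skip one optional adverb
            j += 1
        if j < len(tagged) and tagged[j][1] == "VBZ":
            hits.append(j)
    return hits
```

With these tags, "Cats goes to school" is flagged and "One of the cats goes" is not.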
5
Neither of them is correct 100% of the time (e.g. cats: VBZ)
Why fail?
In LanguageTool, you can obtain the part of speech from POSTag and from the Chunker.
POSTag: uses customizable rules with the embedded POS dictionary
  Example rule: for NNS/VBZ, delete NNS if another NNS follows
  It causes(NNS/VBZ) somethings(NNS). becomes It causes(VBZ) somethings(NNS).
  Cats(NNS/VBZ) goes(NNS/VBZ) becomes Cats(VBZ) goes(NNS/VBZ)
Chunker: a parser in the OpenNLP library, statistically trained
  Groups tokens into chunks and gives the part of speech
  B: beginning of the chunk
  I: intermediate of the chunk
  E: end of the chunk
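The example disambiguation rule can be reproduced with a small Python sketch. It is not LanguageTool's disambiguator; the (word, set of tags) input format and the function name are assumptions, and only the one rule quoted on the slide is implemented.

```python
def disambiguate(readings):
    """Apply one rule: for an NNS/VBZ token, delete NNS if an NNS reading follows.

    readings: list of (word, set of candidate tags). Returns a new list.
    """
    result = []
    for i, (word, tags) in enumerate(readings):
        tags = set(tags)  # copy so the input is not modified
        next_has_nns = i + 1 < len(readings) and "NNS" in readings[i + 1][1]
        if tags == {"NNS", "VBZ"} and next_has_nns:
            tags.discard("NNS")
        result.append((word, tags))
    return result
```

On "Cats goes", where both words start as NNS/VBZ, the rule strips NNS from "Cats" because "goes" still carries an NNS reading, which is how the plural noun ends up tagged as a verb.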
6
Because POSTag wrongly interprets the POS of “cats” as VBZ (singular verb),
the sequence doesn’t match the rule pattern. So I made another rule: add VBZ to the anti-pattern and use the POS from the chunker instead.
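One way to picture the fallback is the sketch below, where each token carries both the POSTag readings and the chunker's tag. The Token type, its field names, and the exact fallback condition are assumptions for illustration; the slide does not spell out how the second rule is written.

```python
from typing import NamedTuple, Set

class Token(NamedTuple):
    word: str
    postag: Set[str]  # readings left after POSTag disambiguation
    chunk_pos: str    # single tag assigned by the statistical chunker

def is_plural_noun(tok):
    """Accept NNS from POSTag, or fall back to the chunker when POSTag says only VBZ."""
    return "NNS" in tok.postag or (tok.postag == {"VBZ"} and tok.chunk_pos == "NNS")

def flags_agreement_error(noun, verb):
    """True when a plural noun is directly followed by a singular verb."""
    return is_plural_noun(noun) and "VBZ" in verb.postag
```

With `Token("cats", {"VBZ"}, "NNS")` followed by `Token("goes", {"NNS", "VBZ"}, "VBZ")`, the check fires even though POSTag alone no longer sees a plural noun.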
7
Now it can recall adjective + plural noun
I think the rule is still not sophisticated enough. To be robust, it needs more exceptions and more testing to reduce false alarms.
8
And actually, there is existing verb agreement checking in grammar.xml
The rule group contains 40 rules.
Most of them focus on the beginning of the sentence, and only allow a determiner, not a number, in front of the plural noun.
Able to recognize: “The cats”, “Cats”, but not “The two cats”, “All cats”
9
Our LanguageTool vs. Grammarly
Suggest “an”
10
Suggest “they want”
Suggest “the”
Suggest “the whole”
Suggest “to”
11
Next possible rules:
Plural or singular verb of compound subjects
  Except cases with “that” clauses: A dog [that loves its owners] deserves a reward.
Number + singular noun
  Example: There are three cats.
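The number + singular noun idea could look roughly like this in Python. It is a sketch over hand-tagged (word, tag) pairs, not a LanguageTool rule; the function name and the choice to exempt "one" and "1" are assumptions.

```python
def find_number_singular_noun(tagged):
    """Return indices of singular nouns (NN) directly after a number other than one."""
    hits = []
    for i in range(len(tagged) - 1):
        word, tag = tagged[i]
        is_plural_number = tag == "CD" and word.lower() not in {"one", "1"}
        if is_plural_number and tagged[i + 1][1] == "NN":
            hits.append(i + 1)
    return hits
```

It would flag "cat" in "There are three cat." and leave "There are three cats." untouched.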
12
Number + singular noun
13
Marking process vs. machine
Marking process:
  Notice non-fluent phrases in a paragraph
  Change/add/delete a word
  See if it is more fluent
Machine:
  Learn a good language model
  Low probability occurs at the non-fluent phrases (5-gram moving window)
  Try change/add/delete inside the window to see if the probability rises (beam search)
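The machine side can be illustrated with a toy Python sketch. To stay self-contained it scores each 5-word window with an add-one-smoothed bigram model instead of a real 5-gram language model, and it tries only single-word substitutions greedily rather than a full beam search over change/add/delete; all function names are hypothetical.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams from whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        words = s.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def window_logprob(window, unigrams, bigrams):
    """Log-probability of a window under the add-one-smoothed bigram model."""
    vocab = len(unigrams) + 1  # +1 leaves room for unseen words
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(window, window[1:])
    )

def least_fluent_window(words, unigrams, bigrams, n=5):
    """Slide an n-word window over the text; return (start, window) with the lowest score."""
    n = min(n, len(words))
    starts = range(len(words) - n + 1)
    best = min(starts, key=lambda i: window_logprob(words[i:i + n], unigrams, bigrams))
    return best, words[best:best + n]

def best_substitution(window, candidates, unigrams, bigrams):
    """Try replacing each word with each candidate; keep the highest-scoring window."""
    best, best_score = list(window), window_logprob(window, unigrams, bigrams)
    for i in range(len(window)):
        for cand in candidates:
            trial = window[:i] + [cand] + window[i + 1:]
            score = window_logprob(trial, unigrams, bigrams)
            if score > best_score:
                best, best_score = trial, score
    return best
```

A real system would swap in a properly trained language model and keep several candidate edits alive at each step, which is where beam search comes in.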