【Coref】【论文笔记】多轮处理的指代消解

其他的很多方法使用简单的决策来判定两个表述是否属于共指，由于低准确率的特征数往往高于高准确率的特征数，因此这将导致问题。
因此本文设计了一种模型，根据准确率的高低，由高到低堆叠了多个sieve，每一sieve的输入是上一轮的实体类的输出。此外，在一个实体类中还可以共享属性（性别、数量）。
这些sieve保证了高准确率的强特征起到更加有用的作用，保证了每一次的决策都使用了所有的信息（信息共享）
句法信息用于识别表述的中心词以及确定句子中表述的顺序

Exact Match

This model links two mentions only if they con-
tain exactly the same extent text, including modifiers and determiners, e.g., the Shahab 3 ground-ground missile. As expected, this model is extremely pre- cise, with a pairwise precision over 96%.

如果两个表述的修饰语和限定词相同，说明共指

如果以下条件被任意满足一个，那么两个表述共指

Appositive-同位语

the two nominal mentions are in an appositive construction, e.g., [Israel’s Deputy De- fense Minister], [Ephraim Sneh] , said . . . We use the same syntactic rules to detect appositions as Haghighi and Klein (2009).

两个名词性表述处在一个同位语结构中。我们参考了2009那篇论文的规则

Predicate nominative-谓语主格

the two mentions (nominal or pronominal) are in a copulative subject-object re- lation, e.g., [The New York-based College Board] is [a nonprofit organization that administers the SATs and promotes higher education] (Poon and Domin- gos, 2008).

两个表述（名词性/代词性）处在一个交错的主客体关系中。如：xxxx is xxx

Role appositive-角色同位语

the candidate antecedent is headed by a noun and appears as a modifier in an NP whose head is the current mention, e.g., [[ac- tress] Rebecca Schaeffer]. This feature is inspired by Haghighi and Klein (2009), who triggered it only if the mention is labeled as a person by the NER. We constrain this heuristic more in our work: we allow this feature to match only if: (a) the mention is labeled as a person, (b) the antecedent is animate (we detail animacy detection in Pass 7), and (c) the antecedent’s gender is not neutral.

如 [[演员]小明]。
要求：

1. 表述被标记为PERSON
2. 先行词是animate
3. 先行词的性别不是中性

Relative pronoun-关系代词-没有实现

the mention is a relative pronoun that modifies the head of the antecedent NP, e.g., [the finance street [which] has already formed in the Waitan district].

这个表述是一个关系代词，它修饰了先行词的中心词

Acronym-首字母缩略词

both mentions are tagged as NNP and one of them is an acronym of the other, e.g., [Agence France Presse] . . . [AFP]. We use a simple acronym detection algorithm, which marks a mention as an acronym of another if its text equals the sequence of upper case characters in the other mention. We will adopt better solutions for acronym detection in future work (Schwartz, 2003).
两个表述都被标记为NNP
一个是另一个的缩写
论文里就用的Agence France Presse = AFP
Demonym-区域居民称谓词

one of the mentions is a demonym of the other, e.g., [Israel] . . . [Israeli]. For demonym detection we use a static list of countries and their gentilic forms from Wikipedia.3

一个表述是另一个表述的～，比如美国 = 美国人

这个sieve的目的在于：简单的仅考虑中心词相同容易导致问题，比如【北京大学】和【清华大学】不是相同的

Cluster head match

the mention head word matches any head word in the antecedent cluster. Note that this feature is actually more relaxed than naive head matching between mention and an- tecedent candidate because it is satisfied when the mention’s head matches the head of any entity in the candidate’s cluster. We constrain this feature by en- forcing a conjunction with the features below.

这个表述的先行词匹配在候选集合中任意的中心词。

所有在表述类中的非停用词词语集合，包含在候选先行词的非停用词词语集合

表述的修饰语都包含在候选先行词的修饰语中，这个主要针对的是两个表述之间的关系，而不是两个cluster（上一个特征）

sieve3：

- cluster head match
- word inclusion
- compatible modifiers only 
- not i-within-i

Mention Detection Sieve
Discourse Processing Sieve
Exact String Match Sieve
Relaxed String Match Sieve
Precise Constructs Sieve (e.g., appositives)
6-8. Strict Head Matching Sieves A-C
Proper Head Word Match Sieve
Alias Sieve
Relaxed Head Matching Sieve
Lexical Chain Sieve
Pronouns Sieve

Blog of YQ