Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing
Global Reasoning over Database Structures for Text-to-SQL Parsing
Grammar-based Neural Text-to-SQL Generation

What are terminal and nonterminal symbols?

Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing

https://github.com/benbogin/spider-schema-gnn

In this paper, we present an encoder-decoder semantic parser, where the structure of the DB schema is encoded with
a graph neural network, and this representation is later used at both encoding and decoding time.

Problem definition:

Given: a training set $\{(x^{(k)}, y^{(k)}, S^{(k)})\}_{k=1}^N$, where $x^{(k)}$ is a question, $y^{(k)}$ is its SQL query, and $S^{(k)}$ is the schema
Goal: map an unseen question-schema pair $(x, S)$ to its correct SQL query
A schema $S$ consists of:
1. a set of DB tables: $\mathcal{T}$
2. a set of columns: $\mathcal{C}_t$ for each $t \in \mathcal{T}$
3. a set of foreign key-primary key column pairs $\mathcal{F}$, with $(c_f, c_p) \in \mathcal{F}$
4. Tables and columns are collectively called schema items, denoted $\mathcal{V}=\mathcal{T} \cup \{\mathcal{C}_t\}_{t \in \mathcal{T}}$

Other related notes:

https://blog.csdn.net/weixin_48167662/article/details/106654904

A Neural Semantic Parser for SQL

Define a similarity score $s_{link}(v, x_i)$ between a word $x_i$ and a schema item $v$ (where $v$ has type $\tau$).

Then:

$p_{link}(v|x_i)=\frac{\exp(s_{link}(v, x_i))}{\sum_{v' \in \mathcal{V}_{\tau} \cup \{\emptyset\}} \exp(s_{link}(v', x_i))}$

where $\mathcal{V}_{\tau}$ is the set of all schema items of type $\tau$, and $s_{link}(\emptyset, \cdot)=0$ for the null item $\emptyset$.
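The link probability is a softmax over the schema items of one type plus a null item whose score is fixed to 0. A minimal NumPy sketch, assuming the scores for one type are already stored in a matrix (the function name and array layout are illustrative, not taken from the paper's code):

```python
import numpy as np

def link_probabilities(s_link):
    """s_link: (num_items_of_type, num_words) raw scores s_link(v, x_i).

    Returns (p_link, p_null): p_link has the same shape as s_link,
    p_null has shape (num_words,) and is the mass assigned to the null item.
    """
    num_words = s_link.shape[1]
    # Append a row of zeros for the null item, since s_link(null, .) = 0.
    scores = np.vstack([s_link, np.zeros((1, num_words))])
    # Subtract the column-wise max for numerical stability.
    scores = scores - scores.max(axis=0, keepdims=True)
    exp = np.exp(scores)
    probs = exp / exp.sum(axis=0, keepdims=True)
    return probs[:-1], probs[-1]
```

Because of the null item, a word that matches no schema item can put most of its probability mass outside the schema.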

Encoder

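The input at each position is the concatenation of two parts: $[\omega_{x_i};l_i]$

the word embedding $\omega_{x_i}$
$l_i=\sum_{\tau} \sum_{v \in \mathcal{V}_\tau} p_{link}(v|x_i) \cdot r_v$, where $r_v$ is a learned embedding of schema item $v$, based on its type and its neighbors

In this way, $p_{link}(v|x_i)$ augments the semantics of each word $x_i$ with the schema items it is linked to.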
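The sum over types and items is a probability-weighted average of schema-item embeddings for every word. A small sketch, assuming the per-type link probabilities and embeddings are kept in dictionaries keyed by type (this layout and the function names are assumptions made for illustration):

```python
import numpy as np

def schema_link_vectors(p_link_by_type, r_by_type):
    """p_link_by_type: dict type -> (num_items, num_words) link probabilities.
    r_by_type: dict type -> (num_items, d) schema-item embeddings r_v.

    Returns l with shape (num_words, d), where l[i] = sum_tau sum_v p_link(v|x_i) * r_v.
    """
    l = None
    for tau, p in p_link_by_type.items():
        contrib = p.T @ r_by_type[tau]  # (num_words, d)
        l = contrib if l is None else l + contrib
    return l

def encoder_inputs(word_emb, l):
    """Concatenate word embeddings (num_words, d_w) with l (num_words, d)."""
    return np.concatenate([word_emb, l], axis=1)
```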

Decoder

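The decoder is a grammar-based LSTM decoder with attention.

Rules fall into two groups:

schema-independent rules, which generate non-terminals or SQL keywords
schema-specific rules, which generate schema items

At each decoding step $j$, the decoding LSTM takes a rule embedding $g_j$ as input and outputs $o_j$: $g_j \stackrel{\text{decoding LSTM}}{\longrightarrow} o_j$. The value of $g_j$ depends on the rule type:

schema-independent: $g_j$ is a learned global embedding
schema-specific: $g_j$ is a learned embedding $\tau(v)$ of the type of the schema item $v$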

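The full computation is as follows, where $i$ indexes question tokens and $j$ indexes decoding steps:

$a_{j,i}=\mathrm{softmax}_i(h_i^{\intercal}o_j)$
$c_j=\sum_i a_{j,i} h_i$
$s_j^{glob}=FF([o_j;c_j])$
$s_j^{loc}=S_{link}a_j$
$p_j=\mathrm{softmax}([s_j^{glob};s_j^{loc}])$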
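One decoding step can be sketched in NumPy as below. This assumes the attention weights are normalized with a softmax over question tokens, that $S_{link}$ holds one row of link scores per legal schema item, and that `ff` stands in for the feed-forward network; all names here are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def decoder_step(H, o_j, S_link, ff):
    """H: (n, d) encoder states h_i; o_j: (d,) decoder output at step j.
    S_link: (num_legal_items, n) link scores between schema items and words.
    ff: callable mapping a (2d,) vector to scores over schema-independent rules.
    Returns one distribution over [schema-independent rules; schema items].
    """
    a_j = softmax(H @ o_j)                    # attention over question tokens
    c_j = a_j @ H                             # context vector
    s_glob = ff(np.concatenate([o_j, c_j]))   # scores of schema-independent rules
    s_loc = S_link @ a_j                      # scores of schema-specific rules
    return softmax(np.concatenate([s_glob, s_loc]))
```

A schema item scores highly when the decoder currently attends to words that link strongly to it.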

Modeling Schemas with GNNs

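The model consists of the following:

the schema is converted into a graph
the graph is softly pruned, conditioned on the input question
a GNN produces a representation for each node that is aware of the global schema structure
both the encoder and the decoder use the schema representation

Each part is described in detail below.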

Schema-to-graph

To convert the schema $S$ into a graph, define the following:

graph nodes: the schema items $\mathcal{V}$
edges:
- For each column $c_t$ of table $t$, the set $\varepsilon_{\leftrightarrow}$ contains $(c_t, t)$ and $(t, c_t)$ (green edges in the figure)
- For each foreign-primary key column pair $(c_{t_1}, c_{t_2})$, define:
  - $\varepsilon_{\to}$: $(c_{t_1}, c_{t_2})$ and $(t_1, t_2)$
  - $\varepsilon_{\gets}$: $(c_{t_2}, c_{t_1})$ and $(t_2, t_1)$ (dashed lines in the figure)
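The edge definitions above can be sketched as a small pure-Python function. The input format (a dict of tables and a list of foreign-key pairs) and the edge-type labels are assumptions chosen for illustration:

```python
def schema_to_graph(tables, foreign_keys):
    """tables: dict table_name -> list of column names.
    foreign_keys: list of ((t1, c1), (t2, c2)), where column t1.c1 is a
    foreign key that references the primary key t2.c2.

    Returns (nodes, edges), where edges maps an edge type to a set of (u, v).
    """
    nodes = []
    edges = {"<->": set(), "->": set(), "<-": set()}
    for t, cols in tables.items():
        nodes.append(t)
        for c in cols:
            col = f"{t}.{c}"
            nodes.append(col)
            # Column-table membership edges in both directions.
            edges["<->"].add((col, t))
            edges["<->"].add((t, col))
    for (t1, c1), (t2, c2) in foreign_keys:
        f, p = f"{t1}.{c1}", f"{t2}.{c2}"
        # Foreign key to primary key, on columns and on their tables.
        edges["->"].update({(f, p), (t1, t2)})
        # The reverse direction.
        edges["<-"].update({(p, f), (t2, t1)})
    return nodes, edges
```

For example, a `concert.singer_id` column referencing `singer.id` adds one column-level and one table-level edge in each direction.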

Question-conditioned relevance

Define a relevance score for each schema item $v$:

$$\rho_v=\max_i p_{link}(v|x_i)$$

In the figure, relevant schema items are shown in dark orange and irrelevant items in light orange.

Neural graph representation

Each node $v$ starts from the initial representation $h_v^{(0)}=r_v \cdot \rho_v$, which is then passed through $L$ steps of the GNN. At each step,

$$a_v^{(l)}=\sum_{type \in \{\rightarrow, \leftrightarrow\}} \sum_{(u,v) \in \varepsilon_{type}}W_{type}h_u^{(l-1)} +b_{type}$$

$$h_v^{(l)}=\mathrm{GRU}(h_v^{(l-1)}, a_v^{(l)})$$

After $L$ steps, the final node representation is $\varphi_v=h_v^{(L)}$.
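The relevance gating and the message-passing steps can be sketched together in NumPy. This is a simplified illustration under stated assumptions: a hand-written GRU cell without bias terms, the bias $b_{type}$ added once per incoming edge, and hypothetical parameter containers `W`, `b`, and `P`; the released implementation may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(h, a, P):
    """One GRU update with previous state h and input a (no bias terms)."""
    z = sigmoid(P["Wz"] @ a + P["Uz"] @ h)            # update gate
    r = sigmoid(P["Wr"] @ a + P["Ur"] @ h)            # reset gate
    h_tilde = np.tanh(P["Wh"] @ a + P["Uh"] @ (r * h))
    return (1 - z) * h + z * h_tilde

def gnn_encode(r, p_link, edges, W, b, P, L=2):
    """r: (num_nodes, d) embeddings r_v; p_link: (num_nodes, num_words).
    edges: dict edge type -> list of (u, v) node-index pairs.
    W: dict edge type -> (d, d); b: dict edge type -> (d,).
    Returns phi with shape (num_nodes, d).
    """
    rho = p_link.max(axis=1)          # rho_v = max_i p_link(v | x_i)
    h = r * rho[:, None]              # h_v^(0) = r_v * rho_v
    for _ in range(L):
        a = np.zeros_like(h)
        for etype, pairs in edges.items():
            for u, v in pairs:
                a[v] += W[etype] @ h[u] + b[etype]   # message from u to v
        h = np.stack([gru_cell(h[v], a[v], P) for v in range(len(h))])
    return h
```

Because the initial state is scaled by $\rho_v$, an item that no question word links to starts near zero and sends weak messages to its neighbors, which is the soft pruning described earlier.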

Encoder

$$l_i^{\varphi}=\sum_{\tau} \sum_{v \in \mathcal{V}_{\tau}} \varphi_v \, p_{link}(v|x_i)$$

$l_i^{\varphi}$ is then concatenated with the encoder output $h_i$.

Decoder

Define the matrix $\hat{U}=[u_{i_1}, u_{i_2}, ..., u_{i_{|\hat{J}|}}] \in \mathbb{R}^{d \times |\hat{J}|}$

$\hat{a}_j=\mathrm{softmax}(\hat{U}^{\intercal}u_j)$
$s_j^{att}=\hat{a}_j S^{att}$
$p_j=\mathrm{softmax}([s_j^{glob};s_j^{loc}+s_j^{att}])$

Global Reasoning over Database Structures for Text-to-SQL Parsing

This paper is a follow-up that improves on the one above.

https://github.com/benbogin/spider-schema-gnn-global

In this paper, we propose a semantic parser that reasons over the DB structure and question to make a global decision about which DB constants should be used in a query.

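The paper uses a GNN to obtain a representation of the DB schema and uses it to select the DB constants that are likely to appear in the query.
A re-ranker is then trained on the top-K queries produced by the auto-regressive model.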

Schema-augmented Semantic Parser

The problem setup is identical to the one above: the training set is $\{(x^{(k)}, y^{(k)}, S^{(k)})\}_{k=1}^N$.

Base Model

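A standard top-down semantic parser with grammar-based decoding.

The input question ($x_1, x_2, ..., x_{|x|}$) is encoded with a BiLSTM, which gives a hidden representation $e_i$ for each token $x_i$. A second LSTM then decodes the output $y$ following the SQL grammar.

The parser decodes a DB constant whenever the previous step decoded the non-terminal Table or Column. To select the DB constant, it does the following:

compute an attention distribution over the question tokens: $\{\alpha_i\}_{i=1}^{|x|}$
compute a score for each DB constant $v$: $s_v=\sum_i \alpha_i s_{link}(v, x_i)$, where $s_{link}$ can be derived from word embeddings or from other features, such as the edit distance or the text overlap between the two inputs
apply a softmax over the scores to get a distribution over DB constants: $\mathrm{softmax}(\{s_v\}_{v \in \mathcal{V}})$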
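The three steps amount to an attention-weighted sum of link scores followed by a softmax. A minimal sketch, assuming the attention weights and link scores are already computed (the function name and array shapes are illustrative):

```python
import numpy as np

def select_db_constant(alpha, s_link):
    """alpha: (n,) attention weights over question tokens.
    s_link: (num_constants, n) local similarity scores s_link(v, x_i).
    Returns a distribution over DB constants.
    """
    s = s_link @ alpha            # s_v = sum_i alpha_i * s_link(v, x_i)
    s = s - s.max()               # numerical stability
    e = np.exp(s)
    return e / e.sum()
```

Note that each score depends only on local similarity between $v$ and the attended words, which is the limitation the global reasoning below is meant to address.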

DB schema encoding

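Following the previous paper, the DB schema is converted into a graph and a representation $h_v$ is learned for each DB constant. The $h_v$ of every node is learned end-to-end with a GCN.

To focus the GCN's capacity on the important nodes, a relevance probability $\rho_v$ is computed for each node and used to gate that node's input to the GCN, conditioned on the question. (I did not fully understand this sentence; concretely, the gate is the multiplication by $\rho_v$ in the input below.)

For each DB constant, the GCN input is: $h_v^{(0)}=\rho_v \cdot r_v$
The GCN runs for $L$ steps
final: $h_v=h_v^{(L)}$

How $\rho_v$ is computed is described in detail below.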

Global Reasoning over DB Structures

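A gating GCN predicts a relevance score for each DB constant
An encoder GCN computes a representation for each DB constant, which is then used to predict K candidate queries
A re-ranking GCN scores each candidate in turn, based on the DB constants it selects

In the figure, the dashed lines and arrows indicate that no gradient flows from the re-ranking GCN back to the decoder, because the decoder outputs SQL queries.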

Global gating

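The input graph is the same as the one described above, with one additional node: $v_{global}$

The GCN input at node $v$ is defined as $g_v^{(0)}=FF([r_v; \bar{h}_v; \rho_v])$, where $\bar{h}_v=\sum_i p_{link}(x_i|v) \cdot e_i$ is a weighted average of the question-token representations.

The node representation after the final step yields a relevance probability $\rho_v^{global}=\sigma(FF(g_v^{(L)}))$, which replaces $\rho_v$ as the gate on the encoder GCN input.

Since the gold query $y$ for each question is known, the set of DB constants it uses, $\mathcal{U}_y$, can be extracted.

A relevance loss can therefore be defined: $-\sum_{v \in \mathcal{U}_y} \log \rho_v^{global} -\sum_{v \notin \mathcal{U}_y}\log(1-\rho_v^{global})$, so the gating GCN is trained with both the relevance loss and the decoding loss.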
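The relevance loss can be sketched as a binary cross-entropy over DB constants, assuming the usual form in which the first term covers constants that appear in the gold query and the second term covers those that do not. The function name and the clipping constant `eps` are illustrative assumptions:

```python
import numpy as np

def relevance_loss(rho_global, gold_mask, eps=1e-12):
    """rho_global: (num_constants,) predicted relevance probabilities.
    gold_mask: (num_constants,) booleans, True if the constant is in U_y.
    """
    gold_mask = np.asarray(gold_mask, dtype=bool)
    rho = np.clip(np.asarray(rho_global, dtype=float), eps, 1 - eps)
    pos = np.log(rho[gold_mask]).sum()           # constants used by the gold query
    neg = np.log(1 - rho[~gold_mask]).sum()      # constants not used by it
    return -(pos + neg)
```

The loss is zero only when every gold constant gets probability 1 and every other constant gets probability 0.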

Discriminative re-ranking

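A re-ranker score is computed for each candidate $(x, S, \hat{y})$, where $\hat{y}$ is a candidate query.

The logit is $s_{\hat{y}} =\omega^{\intercal} FF([f_{\mathcal{U}_{\hat{y}}}; e^{align}])$, where:

$\omega$ is a learned parameter vector
$f_{\mathcal{U}_{\hat{y}}}$ is a representation of the set $\mathcal{U}_{\hat{y}}$ of DB constants used by $\hat{y}$
$e^{align}$ is an alignment representation between the question and the DB constants

The re-ranker is trained to minimize the re-ranking loss for the correct query $y$.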

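The input to the re-ranking GCN consists only of:

the sub-graph containing the selected DB constants $\mathcal{U}_{\hat{y}}$
the global node $v_{global}$

The input at node $v$ is therefore $f_v^{(0)}=FF([r_v; \bar{h}_v])$. After $L$ steps, the representation of the global node is taken as the set representation: $f_{\mathcal{U}_{\hat{y}}}=f^{(L)}_{v_{global}}$.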

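For every node $v \in \mathcal{V}$, define the representation:

$$\varphi_v=\begin{cases} f_v^{(L)} & \text{if } v \in \mathcal{U}_{\hat{y}} \\ r_v & \text{otherwise} \end{cases}$$

Then compute $l_i^{\varphi}=\sum_{v \in \mathcal{V}} p_{link}(v|x_i) \cdot \varphi_v$ and concatenate it with the token encoding: $e_i^{align}=[e_i;l_i^{\varphi}]$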
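The per-token alignment vectors can be sketched as follows. This assumes the GCN outputs and the raw embeddings share one dimension and that the selected constants are given as a boolean mask; the names are hypothetical and the step that pools the $e_i^{align}$ into a single $e^{align}$ is not shown.

```python
import numpy as np

def alignment_representation(e, p_link, f_L, r, selected):
    """e: (n, d_e) question-token encodings e_i.
    p_link: (num_constants, n) link probabilities p_link(v | x_i).
    f_L: (num_constants, d) re-ranking GCN outputs f_v^(L).
    r: (num_constants, d) embeddings r_v.
    selected: (num_constants,) booleans, True if v is used by the candidate.
    Returns e_align with shape (n, d_e + d).
    """
    selected = np.asarray(selected, dtype=bool)
    # phi_v is the GCN output for selected constants and r_v for the rest.
    phi = np.where(selected[:, None], f_L, r)
    l_phi = p_link.T @ phi        # l_i = sum_v p_link(v | x_i) * phi_v
    return np.concatenate([e, l_phi], axis=1)
```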

Grammar-based Neural Text-to-SQL Generation

TODO

