Appendix 3: References

Published: Tuesday, 26 April 2022
Updated: Tuesday, 26 April 2022
By sxw

In NLP.

tags: NLP

1.1.4 The Development of Natural Language Processing

  1. Article “Giving GPT-3 a Turing Test”

https://lacker.io/ai/2020/07/06/giving-gpt-3-a-turing-test.html

1.3.6 The Attention Mechanism

  1. Open-source project: text attention heatmap visualization (Text-Attention-Heatmap-Visualization)

https://github.com/jiesutd/Text-Attention-Heatmap-Visualization
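
As a rough illustration of the idea behind the project above, here is a minimal matplotlib sketch of rendering token-level attention weights as a heatmap. The tokens and the weight matrix are made-up example values, not output of the project's own code:

```python
# Minimal heatmap of attention weights; made-up example values.
import matplotlib.pyplot as plt
import numpy as np

tokens = ["the", "cat", "sat", "on", "the", "mat"]
weights = np.array([[0.70, 0.10, 0.10, 0.00, 0.00, 0.10],
                    [0.10, 0.60, 0.20, 0.00, 0.00, 0.10],
                    [0.05, 0.30, 0.40, 0.10, 0.05, 0.10]])  # 3 steps x 6 tokens

fig, ax = plt.subplots()
im = ax.imshow(weights, cmap="Reds")       # darker cell = higher attention
ax.set_xticks(range(len(tokens)))
ax.set_xticklabels(tokens)
ax.set_yticks(range(weights.shape[0]))
ax.set_yticklabels([f"step {i}" for i in range(weights.shape[0])])
fig.colorbar(im)
plt.show()
```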

1.3.8 Multimodal Learning

  1. Paper “Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books”

https://arxiv.org/abs/1506.06724

1.4.2 Choosing the Batch Size

  1. Paper “Revisiting Small Batch Training for Deep Neural Networks”

https://arxiv.org/abs/1804.07612

1.4.3 The Dataset Imbalance Problem

  1. Paper “Focal Loss for Dense Object Detection” (the loss is sketched after this list)

https://arxiv.org/abs/1708.02002

  2. Paper “A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection”

https://arxiv.org/abs/1607.07155
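
The focal loss in the first paper reshapes cross entropy so that easy, well-classified examples are down-weighted, which helps with class imbalance. A minimal PyTorch sketch for the binary case, using the paper's defaults gamma=2 and alpha=0.25:

```python
# Focal loss: FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t).
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)           # prob. of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()     # down-weights easy examples

loss = focal_loss(torch.randn(8), torch.randint(0, 2, (8,)).float())
```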

1.5.4 Pretrained Models and Data Security

  1. Paper “Extracting Training Data from Large Language Models”

https://arxiv.org/abs/2012.07805

2.1.3 Using the pip Package Manager and Python Virtual Environments

  1. Official Python documentation: virtual environments

https://docs.python.org/zh-cn/3/tutorial/venv.html
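
For readers who prefer to stay inside Python, the standard-library venv module documented above can also be driven programmatically; this minimal sketch is equivalent to running `python -m venv .venv` in a shell (`.venv` is an arbitrary directory name):

```python
# Create a virtual environment programmatically with the stdlib venv module.
import venv

venv.create(".venv", with_pip=True)  # creates ./.venv with its own pip
```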

2.1.5 Installing Common Python Libraries for Natural Language Processing

  1. Paper “PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation”

https://arxiv.org/abs/1906.11455

2.3.5 Text Normalization

  1. Open-source project BERT-KPE (preprocessing utilities)

https://github.com/thunlp/BERT-KPE/blob/master/preprocess/prepro_utils.py
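
The prepro_utils.py linked above covers BERT-style preprocessing. As a minimal illustration of text normalization in general (not the project's actual code), two common steps are Unicode NFKC normalization, which also folds full-width characters to half-width, and whitespace cleanup:

```python
# Minimal text normalization sketch: NFKC + whitespace cleanup.
import unicodedata

def normalize_text(text):
    text = unicodedata.normalize("NFKC", text)  # e.g. "ＡＢＣ" -> "ABC"
    return " ".join(text.split())               # collapse runs of whitespace

print(normalize_text("Ｈｅｌｌｏ，　Ｗｏｒｌｄ！"))  # -> "Hello, World!"
```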

2.5.1 Calling C/C++ Code via ctypes

  1. Official Python documentation: ctypes

https://docs.python.org/zh-cn/3.8/library/ctypes.html
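
A minimal ctypes sketch, calling the C standard library's abs() directly; the library name "libc.so.6" assumes Linux (macOS uses "libc.dylib", Windows "msvcrt"):

```python
# Load a shared C library and call one of its functions from Python.
import ctypes

libc = ctypes.CDLL("libc.so.6")      # Linux; other platforms differ
libc.abs.argtypes = [ctypes.c_int]   # declare the C signature: int abs(int)
libc.abs.restype = ctypes.c_int
print(libc.abs(-5))                  # -> 5
```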

3.2.1 The Advantages of PyTorch

  1. Data comparing the number of PyTorch and TensorFlow papers at several top NLP conferences in recent years

http://horace.io/pytorch-vs-tensorflow/

4.2.6 Using the Transformer Model from torch.nn

  1. Paper “Attention Is All You Need”

https://arxiv.org/abs/1706.03762
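
A minimal forward pass through torch.nn.Transformer with random tensors; shapes follow the module's default sequence-first layout of (seq_len, batch, d_model):

```python
# One forward pass through nn.Transformer with toy random inputs.
import torch
import torch.nn as nn

model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6)
src = torch.rand(10, 32, 512)   # source: 10 tokens, batch of 32
tgt = torch.rand(20, 32, 512)   # target: 20 tokens, batch of 32
out = model(src, tgt)
print(out.shape)                # torch.Size([20, 32, 512])
```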

4.3.6 Using the LogSoftmax Function

  1. The implementation of Softmax and LogSoftmax in PyTorch (C++ source)

https://github.com/pytorch/pytorch/blob/v1.6.0/aten/src/ATen/native/SoftMax.cpp
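
The reason LogSoftmax exists as a fused operation is numerical stability: computing log(softmax(x)) in two steps underflows for large logits, while log_softmax does not. A small demonstration:

```python
# Two-step log(softmax(x)) underflows; fused log_softmax stays exact.
import torch
import torch.nn.functional as F

x = torch.tensor([1000.0, 0.0, -1000.0])
print(torch.log(F.softmax(x, dim=0)))  # tensor([0., -inf, -inf]) -- underflow
print(F.log_softmax(x, dim=0))         # tensor([0., -1000., -2000.]) -- stable
```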

4.5.2 Using the Adam Optimizer

  1. Paper “Adam: A Method for Stochastic Optimization”

https://arxiv.org/abs/1412.6980

  2. Paper “On the Convergence of Adam and Beyond” (its AMSGrad variant appears in the sketch after this list)

https://arxiv.org/abs/1904.09237
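
A minimal usage sketch of torch.optim.Adam on a toy model; setting amsgrad=True switches on the AMSGrad variant proposed in the second paper:

```python
# One optimization step with Adam (AMSGrad variant) on a toy classifier.
import torch

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
                             betas=(0.9, 0.999), amsgrad=True)

x, y = torch.randn(4, 10), torch.randint(0, 2, (4,))
loss = torch.nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```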

4.5.3 Using the AdamW Optimizer

  1. Paper “Decoupled Weight Decay Regularization”

https://arxiv.org/abs/1711.05101
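
AdamW is available in PyTorch as torch.optim.AdamW; per the paper above, the weight decay is applied to the weights directly rather than mixed into the gradient. The training loop is otherwise the same as with Adam:

```python
# AdamW with decoupled weight decay; swap in for Adam in the loop above.
import torch

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
```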

4.9.2 Using TensorBoard with PyTorch

  1. Introduction to using TensorBoard in the official PyTorch documentation

https://pytorch.org/docs/master/tensorboard.html#torch-utils-tensorboard
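
A minimal sketch of the torch.utils.tensorboard API documented there: log a scalar per step, then inspect the curves with `tensorboard --logdir runs`:

```python
# Log a toy loss curve to TensorBoard.
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()                       # writes to ./runs/ by default
for step in range(100):
    writer.add_scalar("train/loss", 1.0 / (step + 1), step)
writer.close()
```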

6.3.4 Using pkuseg

  1. The tags.txt file provided by the open-source project pkuseg-python

https://github.com/lancopku/pkuseg-python/blob/master/tags.txt
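
A minimal pkuseg sketch: constructing the segmenter with postag=True also returns a part-of-speech tag for each word, drawn from the tag set listed in tags.txt (the example sentence and output follow the project's README):

```python
# Segmentation with part-of-speech tagging; tags come from tags.txt.
import pkuseg

seg = pkuseg.pkuseg(postag=True)
print(seg.cut("我爱北京天安门"))
# -> [('我', 'r'), ('爱', 'v'), ('北京', 'ns'), ('天安门', 'ns')]
```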

9.1.1 Background

  1. Paper “Generating Sequences With Recurrent Neural Networks”

https://arxiv.org/abs/1308.0850

  2. Paper “Recurrent Continuous Translation Models”

https://www.aclweb.org/anthology/D13-1176/

  3. Paper “Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation”

https://arxiv.org/abs/1406.1078

  4. Paper “Sequence to Sequence Learning with Neural Networks”

https://arxiv.org/abs/1409.3215

9.2 Implementing a Seq2seq Model with PyTorch

  1. Open-source project PyTorch-Seq2seq

https://github.com/bentrevett/pytorch-seq2seq
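
For orientation, a compact GRU encoder-decoder in the spirit of that project's tutorials; real implementations add teacher forcing, attention, and masking, and all sizes here are toy values:

```python
# Minimal encoder-decoder: encode the source, condition the decoder on it.
import torch
import torch.nn as nn

class Seq2seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, hidden=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, hidden)
        self.tgt_emb = nn.Embedding(tgt_vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        _, state = self.encoder(self.src_emb(src))           # final encoder state
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)  # decoder starts from it
        return self.out(dec_out)                             # per-step vocab logits

model = Seq2seq(src_vocab=1000, tgt_vocab=1200)
logits = model(torch.randint(0, 1000, (2, 7)), torch.randint(0, 1200, (2, 5)))
print(logits.shape)  # torch.Size([2, 5, 1200])
```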

10.1.1 First Applied in Computer Vision

  1. Article “Attention and Memory in Deep Learning and NLP”

http://www.wildml.com/2016/01/attention-and-memory-in-deep-learning-and-nlp/

  2. Paper “Recurrent Models of Visual Attention”

http://arxiv.org/abs/1406.6247

10.4.2 Work Related to Self-Attention

  1. Paper “Long Short-Term Memory-Networks for Machine Reading” (the shared self-attention computation is sketched after this list)

https://www.aclweb.org/anthology/D16-1053/

  2. Paper “A Structured Self-Attentive Sentence Embedding”

https://arxiv.org/abs/1703.03130

  3. Paper “A Deep Reinforced Model for Abstractive Summarization”

https://arxiv.org/abs/1705.04304
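
The common core of these papers is self-attention: every position of a sequence attends to every other position of the same sequence. A minimal scaled dot-product sketch with randomly initialized projection matrices:

```python
# Single-head scaled dot-product self-attention over one sequence.
import math
import torch

def self_attention(x, w_q, w_k, w_v):
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / math.sqrt(k.size(-1))  # (seq, seq)
    return torch.softmax(scores, dim=-1) @ v                  # weighted sum of values

x = torch.randn(6, 16)                       # 6 tokens, 16-dim representations
w = [torch.randn(16, 16) for _ in range(3)]  # toy Q/K/V projections
print(self_attention(x, *w).shape)           # torch.Size([6, 16])
```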

10.6 Multi-hop Attention

  1. Paper “Memory Networks”

https://arxiv.org/abs/1410.3916

  2. Paper “End-To-End Memory Networks”

https://arxiv.org/abs/1503.08895

  3. Paper “Multihop Attention Networks for Question Answer Matching”

https://dl.acm.org/doi/10.1145/3209978.3210009

10.7 Soft Attention and Hard Attention

  1. Paper “Show, Attend and Tell: Neural Image Caption Generation with Visual Attention”

https://arxiv.org/abs/1502.03044

10.8 Full Attention and Sparse Attention

  1. Paper “Generating Long Sequences with Sparse Transformers”

https://arxiv.org/abs/1904.10509

11.1 Background

  1. Paper “Attention Is All You Need”

https://arxiv.org/abs/1706.03762

  2. Paper “Convolutional Sequence to Sequence Learning”

https://arxiv.org/abs/1705.03122

11.2.1 Background

  1. Paper “Convolutional Neural Networks for Sentence Classification”

https://arxiv.org/abs/1408.5882

11.3.4 Using Positional Encoding

  1. Complete code for positional encoding

https://github.com/jalammar/jalammar.github.io/blob/master/notebookes/transformer/transformer_positional_encoding_graph.ipyn
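
A minimal NumPy sketch of the sinusoidal positional encoding that the notebook above plots, following the formulas from “Attention Is All You Need”: PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)); d_model is assumed even:

```python
# Sinusoidal positional encoding: one row per position, one column per dim.
import numpy as np

def positional_encoding(max_len, d_model):
    pos = np.arange(max_len)[:, None]             # (max_len, 1)
    i = np.arange(d_model // 2)[None, :]          # (1, d_model/2)
    angles = pos / np.power(10000, 2 * i / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)                  # even dimensions
    pe[:, 1::2] = np.cos(angles)                  # odd dimensions
    return pe

print(positional_encoding(50, 128).shape)  # (50, 128)
```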

11.4 Improvements to the Transformer

  1. Paper “Generating Long Sequences with Sparse Transformers”

https://arxiv.org/abs/1904.10509

  2. Paper “Local Self-Attention over Long Text for Efficient Document Retrieval”

https://arxiv.org/abs/2005.04908

12.1.3 The Development of Pretraining in Natural Language Processing

  1. Paper “Deep Contextualized Word Representations”

https://arxiv.org/abs/1802.05365

12.3 The GPT Model

  1. Paper “Improving Language Understanding by Generative Pre-Training”

https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf

12.3.6 GPT-2 and GPT-3

  1. Paper “Language Models are Unsupervised Multitask Learners”

https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf

  2. Paper “Language Models are Few-Shot Learners”

https://arxiv.org/abs/2005.14165

  3. Article “Giving GPT-3 a Turing Test”

https://lacker.io/ai/2020/07/06/giving-gpt-3-a-turing-test.html

12.4 The BERT Model

  1. Paper “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”

https://arxiv.org/abs/1810.04805

  2. Paper “Cloze procedure: A new tool for measuring readability” (the cloze-style masking task is sketched after this list)

https://journals.sagepub.com/doi/10.1177/107769905303000401

  3. Paper “RoBERTa: A Robustly Optimized BERT Pretraining Approach”

https://arxiv.org/abs/1907.11692

  4. Paper “ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”

https://arxiv.org/abs/1909.11942
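
The cloze procedure cited above is the basis of BERT's masked-language-model objective. A minimal sketch of that objective at inference time, assuming the Hugging Face transformers library and its bert-base-uncased checkpoint (an assumption for illustration; this appendix does not prescribe a particular BERT implementation):

```python
# Fill a masked token with a pretrained BERT masked language model.
# Assumes the Hugging Face transformers package is installed and can
# download the bert-base-uncased checkpoint.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("The capital of France is [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))  # top candidates, e.g. "paris"
```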

14.1.1 Experiment Goals and Dataset Introduction

  1. 论文“"Neural Chinese Address Parsing”

https://www.aclweb.org/anthology/N19-1346/

14.4.4 Training the Model

  1. The evaluation script conlleval.pl included in neural-chinese-address-parsing

https://github.com/leodotnet/neural-chinese-address-parsing/blob/master/conlleval.pl

15.7.2 Evaluating the Model

  1. The generation script generate.py in the open-source project GPT2-Chinese

https://github.com/Morizeyao/GPT2-Chinese/blob/master/generate.py
