1.4.4 |
GitHub open-source home pages |
OpenAI: https://github.com/openai/; Microsoft: https://github.com/microsoft; Google Research: https://github.com/google-research/; PyTorch: https://github.com/pytorch; Hugging Face: https://github.com/huggingface; Tsinghua University NLP Lab: https://github.com/thunlp; Peking University Language Computing and Machine Learning Group: https://github.com/lancopku |
Some useful open-source projects |
funNLP: https://github.com/fighting41love/funNLP; HanLP: https://github.com/hankcs/HanLP; Chinese Word Vectors (Chinese word embeddings): https://github.com/Embedding/Chinese-Word-Vectors; Chinese GPT-2: https://github.com/Morizeyao/GPT2-Chinese/; UER-py: https://github.com/dbiir/UER-py |
|
2.1.3 |
Blog |
https://es2q.com/blog/tags/installpy/ |
pip mirror provided by Tsinghua TUNA |
https://mirrors.tuna.tsinghua.edu.cn/help/pypi/ https://pypi.tuna.tsinghua.edu.cn/simple |
|
2.3.3 |
GitHub repository |
https://github.com/goto456/stopwords |
2.5.2 |
Blog |
https://es2q.com/blog/tags/py-fun/ |
3.1 |
PyTorch code repository on GitHub |
https://github.com/pytorch/pytorch |
3.2.1 |
TensorFlow machine learning framework code repository on GitHub |
https://github.com/tensorflow/tensorflow |
3.2.3 |
PaddlePaddle code repository on GitHub |
https://github.com/paddlepaddle/paddle |
3.2.4 |
CNTK code repository on GitHub |
https://github.com/Microsoft/CNTK |
3.2.1 |
URL used in the code for installing the CPU version of PyTorch |
https://download.pytorch.org/whl/torch_stable.html |
3.3.2 |
Download page for NVIDIA's official drivers (GeForce) |
https://www.nvidia.cn/geforce/drivers/ |
cuDNN download page on NVIDIA's official website |
https://developer.nvidia.com/cudnn |
|
CUDA download page |
https://developer.nvidia.com/cuda-downloads |
|
3.3.3 |
Docker Hub, provided by Docker |
https://hub.docker.com/ |
PyTorch's official page on previous versions |
https://pytorch.org/get-started/previous-versions/ |
|
3.4 |
Home page of the transformers GitHub repository |
https://github.com/huggingface/transformers |
URL used when installing Transformers from source |
https://github.com/huggingface/transformers |
|
3.5 |
Apex code repository |
https://github.com/NVIDIA/apex |
4.8.3 |
IMDB dataset |
http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz |
TREC dataset |
http://cogcomp.org/Data/QA/QC/train_5500.label http://cogcomp.org/Data/QA/QC/TREC_10.label |
|
Stanford Natural Language Inference (SNLI) dataset |
http://nlp.stanford.edu/projects/snli/snli_1.0.zip |
|
torchtext.datasets.MultiNLI dataset |
http://www.nyu.edu/projects/bowman/multinli/multinli_1.0.zip |
|
torchtext.datasets.XNLI dataset |
http://www.nyu.edu/projects/bowman/xnli/XNLI-1.0.zip |
|
Multi30k dataset |
http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/training.tar.gz http://www.quest.dcs.shef.ac.uk/wmt16_files_mmt/validation.tar.gz http://www.quest.dcs.shef.ac.uk/wmt17_files_mmt/mmt_task1_test2016.tar.gz |
|
6.3.1 |
S-MSRSeg (Natural Language Computing Group, Microsoft Research Asia) download page |
https://www.microsoft.com/en-us/download/details.aspx?id=52522 |
6.3.2 |
ICTCLAS official website |
http://ictclas.nlpir.org/ |
ICTCLAS code repository |
https://github.com/NLPIR-team/NLPIR |
|
ICTCLAS source code |
https://github.com/NLPIR-team/nlpir-analysis-cn-ictclas |
|
6.3.3 |
jieba Chinese word segmentation |
https://github.com/fxsjy/jieba |
6.3.4 |
Releases page of the pkuseg project |
https://github.com/lancopku/pkuseg-python/releases |
8.2.4 |
Pretrained Chinese word embedding weights released by Tencent AI Lab |
https://ai.tencent.com/ailab/nlp/zh/embedding.html |
8.3.2 |
Installing GloVe |
http://github.com/stanfordnlp/glove |
8.4.1 |
Home page of the Tencent AI Lab Chinese word embeddings |
https://ai.tencent.com/ailab/nlp/zh/embedding.html |
Download URL for the Tencent AI Lab Chinese word embeddings |
https://ai.tencent.com/ailab/nlp/zh/data/Tencent_AILab_ChineseEmbedding.tar.gz |
|
9.2.5 |
Open-source beam search implementation for reference |
https://github.com/budzianowski/PyTorch-Beam-Search-Decoding/ |
9.3.1 |
IWSLT 2015 dataset |
https://wit3.fbk.eu/ |
IWSLT 2015 dataset home page |
https://wit3.fbk.eu/2015-01 |
|
10.5 |
Open-source repository implementing Multihop Attention |
https://github.com/yolomeus/multihop-attention-pytorch/ |
|
|
|
12.1.1 |
Article |
https://zhuanlan.zhihu.com/p/93781241 |
ImageNet dataset |
http://www.image-net.org/ |
|
12.3.6 |
GPT-2 open-source code repository |
https://github.com/openai/gpt-2 |
GPT-3 open-source code repository |
https://github.com/openai/gpt-3 |
|
12.4.1 |
Google's BERT code repository |
https://github.com/google-research/bert |
PyTorch implementation of BERT |
https://github.com/huggingface/transformers |
|
12.5.1 |
Transformers open-source home page |
https://github.com/huggingface/transformers/ |
Transformers documentation |
https://huggingface.co/transformers/ |
|
12.6.1 |
TAL-EduBERT open-source repository |
https://github.com/tal-tech/edu-bert |
12.6.2 |
albert_zh open-source repository |
https://github.com/brightmart/albert_zh |
13.1 |
Dataset used in the paper |
https://github.com/leodotnet/neural-chinese-address-parsing |
13.5.2 |
HTML5 interface |
https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-alpha.6/css/bootstrap.min.css https://code.jquery.com/jquery-3.1.1.slim.min.js https://cdnjs.cloudflare.com/ajax/libs/tether/1.4.0/js/tether.min.js https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-alpha.6/js/bootstrap.min.js |
URL for viewing the rendered page |
http://127.0.0.1:1234 |
|
14.1.1 |
chinese-poetry dataset repository |
https://github.com/chinese-poetry/chinese-poetry |
14.1.2 |
Traditional-to-Simplified Chinese character mapping table provided by the funNLP repository |
https://github.com/fighting41love/funNLP/blob/master/data/繁简体转换词库/fanjian_suoyin.txt/fanjian_suoyin.txt |
14.5.1 |
Chinese Word Vectors (Chinese word embeddings) |
https://github.com/Embedding/Chinese-Word-Vectors |
14.7.1 |
Pretrained model |
https://github.com/Morizeyao/GPT2-Chinese https://github.com/Morizeyao/GPT2-Chinese.git |
14.7.2 |
URLs in the code |
http://arxiv.org/abs/1904.09751 https://gist.github.com/thomwolf/1a5a29f6962089e871b94cbd09daf317 |
14.8.2 |
URLs in the code |
https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-alpha.6/css/bootstrap.min.css https://code.jquery.com/jquery-3.1.1.slim.min.js https://cdnjs.cloudflare.com/ajax/libs/tether/1.4.0/js/tether.min.js https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-alpha.6/js/bootstrap.min.js |