The NLP Cypher | 02.21.21

There's a group of ML hackers attempting to recreate GPT-3 on their own.

Earlier this year, EleutherAI set data nerds buzzing when they released the paper for their Pile dataset, an 825 GB English text corpus aimed at training large-scale language models. That takes care of the data problem; now all they need is the compute.

They are building it with the Mesh TensorFlow library.


An implementation of model & data parallel GPT2 & GPT3-like models, with the ability to scale up to full GPT3 sizes…

Their discord server:

Join the EleutherAI Discord Server!

Check out the EleutherAI community on Discord – hang out with 3,168 other members and enjoy free voice and text chat.

Oh, and Hello Mars!


PyTorch | Ray and Distributed Training

If you want to stay on top of the latest distributed training with PyTorch and Ray, this is a healthy intro:

Getting Started with Distributed Machine Learning with PyTorch and Ray

Ray is a popular framework for distributed Python that can be paired with PyTorch to rapidly scale machine learning…

Transformers Interpret

“Transformers interpret allows any transformers model to be explained in just two lines. It even supports visualizations in both notebooks and as savable html files.”

So for example if you were doing sentiment analysis on the sentence below:

“I love you, I like you”

This output 👇 would tell you which words have the biggest impact on inference.

[('BOS_TOKEN', 0.0),
('I', 0.46820529249283205),
('love', 0.46061853275727177),
('you', 0.566412765400519),
(',', -0.017154456486408547),
('I', -0.053763869433472),
('like', 0.10987746237531228),
('you', 0.48221682341218103),
('EOS_TOKEN', 0.0)]
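
As a rough sketch of how to read these scores, the snippet below ranks the tokens by the magnitude of their attribution. It uses only the Python standard library and the values printed above. The helper name `rank_attributions` and the `top_k` parameter are illustrative assumptions, not part of the Transformers Interpret API.

```python
# Word attributions copied from the example output above.
word_attributions = [
    ("BOS_TOKEN", 0.0),
    ("I", 0.46820529249283205),
    ("love", 0.46061853275727177),
    ("you", 0.566412765400519),
    (",", -0.017154456486408547),
    ("I", -0.053763869433472),
    ("like", 0.10987746237531228),
    ("you", 0.48221682341218103),
    ("EOS_TOKEN", 0.0),
]


def rank_attributions(attributions, top_k=3):
    """Return the top_k (position, token, score) triples by absolute score.

    Hypothetical helper for illustration; positions are kept because the
    same token ("I", "you") can appear more than once in a sentence.
    """
    indexed = [(i, tok, score) for i, (tok, score) in enumerate(attributions)]
    return sorted(indexed, key=lambda t: abs(t[2]), reverse=True)[:top_k]


top = rank_attributions(word_attributions)
for position, token, score in top:
    print(f"{position:>2} {token:<6} {score:+.3f}")
```

Ranking by absolute value keeps strongly negative attributions in view, since a token that pushes the prediction away from the predicted class matters as much as one that pushes toward it.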

Then you visualize it with 1 line of code:



Transformers Interpret is a model explainability tool designed to work exclusively with 🤗 transformers. In line with…


“ConvLab-2 is an open-source toolkit that enables researchers to build task-oriented dialog systems with state-of-the-art models, perform an end-to-end evaluation, and diagnose the weakness of systems.”


ConvLab-2 is an open-source toolkit that enables researchers to build task-oriented dialog systems with…

Question Generation Tutorial on Udemy

The creator of the QuestGen library, Ramsri Golla, has a new course on Udemy!

Here's a description of what you'll learn in case you are interested:

  • Generate assessments like MCQs, True/False questions, etc. from any content using state-of-the-art natural language processing techniques.
  • Apply recent advancements like BERT, OpenAI GPT-2, and T5 transformers to solve real-world problems in edtech.
  • Use NLP libraries like spaCy, NLTK, AllenNLP, HuggingFace transformers, etc.
  • Use Google Colab environment to run all these algorithms.
25% Off Coupon:

Question Generation using Natural Language processing

This course focuses on using state-of-the-art Natural Language processing techniques to solve the problem of question…

MIT CS Courses

Electrical Engineering and Computer Science courses at MIT.

Electrical Engineering and Computer Science

Graduates of MIT's electrical engineering and computer science department work in diverse industries and conduct…

Wiki’s API

An article on the genesis of Wikipedia’s new API: the Wikimedia Foundation (WMF) originally had no holistic API strategy, and the piece describes how they solved that problem. The API was completed in December 2020.

The New API for Wikipedia

I recently left my job at the Wikimedia Foundation (WMF) to head up engineering at MTTR. I'm proud of the hard work my…

Source Code:


Sample client for the Wikimedia API Platform. Contribute to wikimedia/apiclient-wiki development by creating an account…

Docker Swarm Implementation

Container Orchestration With Docker Swarm

NLP Cloud is a service I have contributed to recently. It is using several interesting technologies under the hood so I…

Papers Without Code 😬

Where unreproducible papers come to live…

Papers without code – where unreproducible papers come to live

Repo Cypher 👨‍💻

A collection of recently released repos that caught our 👁

65 Million Probably Asked Questions and New Retriever Model

A new QA-pair retriever model, RePAQ, to complement Probably Asked Questions (PAQ), a resource of 65M automatically-generated QA-pairs.


This repository contains code and models to support the research paper PAQ: 65 Million Probably-Asked Questions and…

Connected Papers 📈

Fact Check Summarization

Abstractive Summarization using two methods:

1. JAENS: joint entity and summary generation

2. Summary-worthy entity classification with summarization (multi-task learning)

This approach targets the factual consistency of entities in abstractive summarization (AS), an ongoing research problem.

*runs on fairseq*


We provide the code for the paper "Entity-level Factual Consistency of Abstractive Text Summarization", by Feng Nan…

Connected Papers 📈

Emoji Transfer

Training transformers for sentiment analysis with emoji data.


This is the repository for Emoji-Based Transfer Learning for Sentiment Tasks. Datasets…

Connected Papers 📈

Relation Extraction Over Universal Graph

Distantly Supervised Relation Extraction (DS-RE) over knowledge graph and textual data.


Codes and datasets for our paper "Two Training Strategies for Improving Relation Extraction over Universal Graph" We…

Connected Papers 📈

Apache Log Generator

Automates the parsing of Apache logs by formulating it as a machine translation (MT) task.


This repository contains tools used for generating synthetic Apache logs and the tools needed to parse reference…

Connected Papers 📈


A new question answering evaluation benchmark that takes into consideration how the deployment of a QA model can affect performance. For example, QA interfaces such as speech, text, or translation can introduce unique inference errors that most evaluation benchmarks don’t consider.


All materials for the paper

Connected Papers 📈

Optimizing Inference on CPU for Transformers

An empirical analysis of the scalability and performance of Transformer-based model inference on CPUs.

Optimizing Inference Performance of Transformers on CPUs

The Transformer architecture revolutionized the field of natural language processing (NLP). Transformers-based models…

Connected Papers 📈

Exploring Transformers for NLG

A pithy introduction to the GPT, BERT, and XLNet transformers for NLG.

Connected Papers 📈

Dataset of the Week: ArtEmis

A dataset that associates human emotions with artworks and contains explanations in natural language of the rationale behind each triggered emotion.


Where is it?


ArtEmis: Affective Language for Art Stanford University 1 LIX, Ecole Polytechnique 2 King Abdullah University of…

Quantum Stat

