

The NLP Cypher | 12.06.20


Last Updated on July 24, 2023 by Editorial Team

Author(s): Ricky Costa

Originally published on Towards AI.

Landscape with a Marsh (Bril)




Hey, welcome back! Plenty of NLP to discuss this week as NeurIPS takes off today. Over the last couple of days, the usual suspects opened the research paper firehose. Have a look 👇

Carnegie Mellon University at NeurIPS 2020

Carnegie Mellon University is proud to present 88 papers at the 34th Conference on Neural Information Processing…

OpenAI at NeurIPS 2020

Live demos and discussions at our virtual booth.

Microsoft at NeurIPS 2020 – Microsoft Research

Microsoft is delighted to sponsor and attend the 34th Annual Conference on Neural Information Processing System…

Salesforce Research at NeurIPS 2020

This year marks the 34th annual conference on Neural Information Processing Systems (NeurIPS) reimagined for the first…

Super Duper NLP Repo ✌️

We recently made an awesome contribution to the Super Duper NLP Repo, adding 47 notebooks and bringing the total to 313! We added a decent selection of notebooks relating to adapters, the NeMo library, GeDi GPT-2, and PERIN for semantic parsing. Want to thank Abhilash Majumder & Eyal Gruss for their awesome contributions! 😎

Oh, and EMNLP has yet to go away: Eric Wallace et al. released their slides from the conference on the interpretability of NLP model predictions.


(1) Overview of Interpretability

(2) What Parts of An Input Led to a Prediction?

(3) What Decision Rules Led to a Prediction?

(4) Which Training Examples Caused a Prediction?

(5) Implementing Interpretations

(6) Open Problems


Jraph | DeepMind’s GNN Lib

When DeepMind isn’t busy solving age-old problems like protein folding, it’s releasing GNN libraries: they just put one out in JAX. It probably flew under everyone’s radar…

Here’s a basic script for working with graph tuples:




Jraph (pronounced giraffe) is a lightweight library for working with graph neural networks in jax. It provides a data…

Kaggle Data Science and ML 2020 Survey

Everyone’s favorite data science survey was released:


Coursera is the most popular learning resource.

A lot of data scientists work in small companies (fewer than 50 employees).

Wow, Jupyter is the go-to IDE in data science (😬).

Only 15% say transformers are the most commonly used model architecture.

AWS leads cloud, but Google comes in 2nd (that was a surprise; I would’ve guessed Azure).

Tensorboard is more popular than I thought.


State of Data Science and Machine Learning 2020

Download our executive summary for a profile of today's working data scientist and their tools

Data Flow

A blog from Google Cloud (with code snippets) discussing how to create data pipelines for your ML models. It focuses on batching, the singleton model pattern, and dealing with threading/processing. A helpful read for those deploying in the enterprise.
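Two of those patterns can be sketched in plain Python (toy model and names are hypothetical; the blog’s own snippets use Apache Beam): load the model once per process behind a lock so worker threads share it, and run inference on micro-batches instead of element by element.

```python
import threading

class ModelSingleton:
    """Load an expensive model once per process and share it across
    worker threads -- the 'singleton model' pattern."""
    _lock = threading.Lock()
    _model = None

    @classmethod
    def get(cls):
        if cls._model is None:            # fast path, no lock taken
            with cls._lock:
                if cls._model is None:    # double-checked locking
                    cls._model = cls._load_model()
        return cls._model

    @staticmethod
    def _load_model():
        # Stand-in for an expensive load (e.g. reading weights from disk).
        return {"predict": lambda xs: [len(x) for x in xs]}

def predict_batch(batch):
    """Run inference on a micro-batch rather than one element at a time."""
    model = ModelSingleton.get()
    return model["predict"](batch)

print(predict_batch(["hello", "dataflow"]))  # [5, 8]
```

The double-checked lock matters because Dataflow-style workers run many threads per process; without it, each thread could load its own copy of the model.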

ML inference in Dataflow pipelines | Google Cloud Blog

In this blog, we covered some of the patterns for running remote/local inference calls, including batching, the…

MSFP | Data Type for Efficient Inference

Microsoft invented a new data representation focused on improving latency during model inference, called… MSFP (Microsoft Floating Point).

[MSFP] enables dot product operations — the core of the matrix-matrix and matrix-vector multiplication operators critical to DNN inference — to be performed nearly as efficiently as with integer data types, but with accuracy comparable to floating point.

Apparently MS uses MSFP in Project Brainwave, their real-time production-scale DNN inference in the cloud. As models get bigger, big tech is getting smarter on how to deal with scale and inference in production.
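To build intuition for that quote, here’s a toy numpy illustration of the shared-exponent (block floating-point) idea behind formats like MSFP — not Microsoft’s actual format or bit layout, just the principle that once a block of values shares one exponent, the dot product reduces to cheap integer mantissa arithmetic.

```python
import numpy as np

def to_block_fp(x, mantissa_bits=4):
    """Encode a block with ONE shared exponent plus a small signed-integer
    mantissa per element (toy version of the block floating-point idea)."""
    shared_exp = np.floor(np.log2(np.max(np.abs(x)))) + 1
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    lo, hi = -(2 ** (mantissa_bits - 1)), 2 ** (mantissa_bits - 1) - 1
    mantissas = np.clip(np.round(x / scale), lo, hi).astype(np.int32)
    return mantissas, scale

def block_fp_dot(a, b, mantissa_bits=4):
    """Dot product done entirely on integer mantissas; the two block scales
    are applied once at the end, as in integer DNN inference kernels."""
    ma, sa = to_block_fp(a, mantissa_bits)
    mb, sb = to_block_fp(b, mantissa_bits)
    return int(np.dot(ma, mb)) * sa * sb

a = np.array([0.9, -0.5, 0.25, 0.1])
b = np.array([0.3, 0.8, -0.2, 0.7])
print(block_fp_dot(a, b), float(np.dot(a, b)))  # approximate vs. exact
```

With more mantissa bits the approximation tightens toward the float result, which mirrors the accuracy/efficiency dial the paper describes.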

A Microsoft custom data type for efficient inference – Microsoft Research

AI is taking on an increasingly important role in many Microsoft products, such as Bing and Office 365. In some cases…

Recommenders Update

When we first spoke about TensorFlow’s Recommenders library several newsletters ago, I was really excited, but TF has upped the ante by building deep learning recommender models “that can retrieve the best candidates out of millions in milliseconds.” 👀

It uses Google’s ScaNN library, released this past summer; you can check out the repo here:

The second part of their update is their leveraging of DCN (Deep & Cross Network) models.
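The heart of a DCN is the cross layer, which builds explicit feature interactions by multiplying each layer’s output back against the original input. A toy numpy version of the DCN-v2-style formula x_{l+1} = x0 ⊙ (W·x_l + b) + x_l — illustrative only, not TensorFlow Recommenders code:

```python
import numpy as np

def cross_layer(x0, xl, W, b):
    """One cross layer: x_{l+1} = x0 * (W @ xl + b) + xl.
    The elementwise product with the original input x0 adds one more
    degree of explicit feature crossing per stacked layer."""
    return x0 * (W @ xl + b) + xl

rng = np.random.default_rng(0)
d = 4
x0 = rng.normal(size=d)      # original embedding/feature vector
W = rng.normal(size=(d, d))  # learned weight matrix (random stand-in)
b = np.zeros(d)              # learned bias (zeros here)

x1 = cross_layer(x0, x0, W, b)  # first cross layer
x2 = cross_layer(x0, x1, W, b)  # stacking layers raises the cross order
print(x2.shape)
```

Note the residual `+ xl` term: with W and b at zero the layer is the identity, so stacking cross layers can only add interaction structure on top of the raw features.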

TensorFlow Recommenders: Scalable retrieval and feature interaction modelling

November 30, 2020 – Posted by Ruoxi Wang, Phil Sun, Rakesh Shivanna and Maciej Kula (Google) In September, we…

Repo Cypher 👨‍💻

A collection of repos/papers that caught our 👁


DframCy provides clean APIs to convert spaCy’s linguistic annotations, Matcher, and PhraseMatcher information to a Pandas DataFrame.


DframCy is a light-weight utility module to integrate Pandas Dataframe to spaCy's linguistic annotation and training…

Wolfram’s Model Stash

Wolfram has its own deep learning model hub. Just stumbled upon this one when I saw one of Wolfram’s tweets earlier this week. 🙈

Wolfram Neural Net Repository

The Wolfram Neural Net Repository is a public resource that hosts an expanding collection of trained and untrained…


The algorithm receives a book and discovers its main characters and the main relations between them.

Oldie but goodie.


The algorithm receives a book and it discovers main characters, main relations between characters and more powerful…


A new research paper on improving memory and latency for BERT inference, using several techniques from compression and model architecture. The authors boast of “achieving up to 2.4× and 13.4× inference latency and memory savings, respectively, with less than 1%-pt. drop in accuracy.” 👀


OCR and Deep Learning

A couple of weeks ago on LinkedIn, I posted a question regarding current OCR techniques that led to a great discussion with my connections. This week, I found this 👇. WINNING!


Long Text Classification with BERT

Looking to classify text documents with more than 250 words per doc?

Notebook (🔥)




Using BERT For Classifying Documents with Long Texts

How to fine-tune BERT for inputs longer than a few words or sentences
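One common way to get past BERT’s 512-token cap — likely close in spirit to what the post does, though this sketch is my own, not the notebook’s code — is to slide an overlapping window over the token sequence, classify each chunk, then pool the per-chunk predictions:

```python
def chunk_tokens(tokens, max_len=510, stride=255):
    """Split a long token sequence into overlapping windows so each one
    (plus [CLS]/[SEP]) fits BERT's 512-token limit. Per-window logits can
    then be averaged or max-pooled into one document-level prediction."""
    chunks, start = [], 0
    while start < len(tokens):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
        start += stride
    return chunks

doc = list(range(1000))  # stand-in for a 1000-token document
windows = chunk_tokens(doc)
print(len(windows), len(windows[0]), len(windows[-1]))  # 3 510 490
```

The 50% stride means every token appears in at least one full-context window, at the cost of roughly doubling the number of forward passes.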

Dataset of the Week: XED

What is it?

A multilingual dataset of emotion-annotated movie subtitles from OPUS, used for sentiment analysis. The task is formulated as multi-label classification.

Where is it?


This is the XED dataset. The dataset consists of emotion annotated movie subtitles from OPUS. We use Plutchik's 8 core…
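Multi-label here means each subtitle line can carry several of Plutchik’s 8 core emotions at once, so targets are binary indicator vectors rather than a single class. A tiny illustration (the emotion ordering below is my assumption, not necessarily the dataset’s):

```python
# Plutchik's 8 core emotions; the ordering here is arbitrary.
EMOTIONS = ["anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust"]

def to_multilabel(labels):
    """Turn a set of emotion labels into a binary indicator vector."""
    return [1 if e in labels else 0 for e in EMOTIONS]

print(to_multilabel({"joy", "surprise"}))  # [0, 0, 0, 0, 1, 0, 1, 0]
```

A multi-label classifier then puts an independent sigmoid over each of the 8 positions instead of one softmax over them.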

Every Sunday we do a weekly round-up of NLP news and code drops from researchers around the world.

For complete coverage, follow our Twitter: @Quantum_Stat

Quantum Stat


Published via Towards AI
