NLP News Cypher | 05.03.20
Last Updated on July 27, 2023 by Editorial Team
Author(s): Ricky Costa
Originally published on Towards AI.
NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER
NLP News Cypher | 05.03.20
Freedom
Last week, so much happened. ICLR had a great digital turnout, and there were tons of paper/code drops from the NLP community. As a result, this week's newsletter is loaded to the gills.
And just when you thought we could rest from AI conferences…
Facebook be like:
Facebook Research at ICASSP 2020
Facebook AI researchers are presenting their work virtually at the 45th International Conference on Acoustics, Speech…
ai.facebook.com
BTW, Ubuntu says hi!
Ubuntu 20.04 LTS arrives | Ubuntu
April 23rd 2020: Canonical, the publisher of Ubuntu, today announced the general availability of Ubuntu 20.04 LTS, with…
ubuntu.com
KDnuggets, we ❤ you too:
The Super Duper NLP Repo: 100 Ready-to-Run Colab Notebooks – KDnuggets
There are 2 major components of a machine learning modeling project of any kind: the data, and the algorithms (and…
www.kdnuggets.com
Oh, and meanwhile, back at the ranch: 🛸's are real:
This Week:
ICLR Highlights
Meena's Heart in a Blender
Text-2-Tabular Data
BLINK
A Mosaic
Stanford's Knowledge Graphs
Wolfman Cometh via YouTube
Dataset of the Week: HybridQA
ICLR Highlights
For the TL;DR crowd:
ICLR 2020 Roundup
Firstly, commiserations, again, that Addis Ababa didn't get 1000's of global AI researchers visiting this week but I'd…
www.linkedin.com
Knowledge Graphs Are A'Boomin
Michael Galkin dropped 🔥🔥🔥 this week. Per usual, after every major AI conference, Michael sums up the cream of the crop w/r/t graphs and NLP.
His TOC:
- Neural Reasoning for Complex QA with KGs
- KG-augmented Language Models
- KG Embeddings: Temporal and Inductive Inference
- Entity Matching with GNNs
- Bonus: KGs in Text RPGs!
Knowledge Graphs @ ICLR 2020
👋 Hello, I hope you are all doing well during the lockdown. ICLR 2020 went fully virtual, and here is a fully virtual…
medium.com
Check out this code for using Wikipedia KGs to answer open-domain QA:
AkariAsai/learning_to_retrieve_reasoning_paths
This is the official implementation of the following paper: Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard…
github.com
My fav topic from Galkin's review: KGs and RPG games 👇.
Paper:
GitHub:
rajammanabrolu/KG-A2C
Goal driven language generation using knowledge graph A2C agents. This code accompanies the paper Graph Constrained…
github.com
Meena's Heart in a Blender
A couple of months ago, Google made headlines when it unveiled "the world's best" chatbot, called Meena. Its code was never open-sourced. Well, now it's Facebook's turn, and they've open-sourced their chatbot in 3 model sizes: 90M, 2.7B, and 9.4B parameters. It's called Blender.
You can try the 90M model here with our Colab of the week:
Google Colaboratory
colab.research.google.com
Text-2-Tabular Data
Google drops BERT on the task of retrieving tabular data with natural language. The takeaway is that instead of using traditional text-2-SQL type queries (which are difficult to scale across various tables), it uses BERT to encode the tables and questions together as input. (FYI, rows, columns, and ranks get their own embeddings!) This allows for better generalization.
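To make the encoding idea concrete, here is a minimal Python sketch of how a question and a table might be flattened into one token sequence, with every token carrying a row id, a column id, and a rank id. It is an illustration only, not the TAPAS code: the names `flatten_table` and `column_ranks`, the whitespace tokenizer, the id conventions, and the toy table are all assumptions made for this example.

```python
from typing import Dict, List


def column_ranks(values: List[str]) -> List[int]:
    """Rank numeric cells within one column (1 = smallest).

    Returns 0 for every cell if the column is not fully numeric.
    """
    try:
        numbers = [float(v) for v in values]
    except ValueError:
        return [0] * len(values)
    ordered = sorted(set(numbers))
    return [ordered.index(n) + 1 for n in numbers]


def flatten_table(question: str, header: List[str], rows: List[List[str]]) -> List[Dict]:
    """Flatten question + table into tokens tagged with row, column, and rank ids.

    Assumed convention for this sketch: question tokens get row 0 / col 0,
    header tokens get row 0 and a 1-based column id, cells get 1-based ids.
    """
    tokens = []
    for word in question.split():
        tokens.append({"token": word, "row": 0, "col": 0, "rank": 0})
    for c, name in enumerate(header, start=1):
        for word in name.split():
            tokens.append({"token": word, "row": 0, "col": c, "rank": 0})
    # One list of ranks per column, indexed by row position.
    ranks = [column_ranks([row[c] for row in rows]) for c in range(len(header))]
    for r, row in enumerate(rows, start=1):
        for c, cell in enumerate(row, start=1):
            for word in cell.split():
                tokens.append(
                    {"token": word, "row": r, "col": c, "rank": ranks[c - 1][r - 1]}
                )
    return tokens


# Toy table invented for this sketch.
example = flatten_table(
    "Which city is largest?",
    ["City", "Population"],
    [["Oslo", "700000"], ["Bergen", "285000"]],
)
for tok in example:
    print(tok)
```

In a real model, each of these ids would index its own learned embedding table, and the resulting vectors would be added to the usual token and position embeddings before going into the transformer.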
How does it perform?
On the SQA dataset, it takes SOTA from 55.1 to 67.2!
With datasets WIKISQL and WIKITQ, it performs on par with the SOTA.
GitHub:
google-research/tapas
Code and checkpoints for training the transformer-based Table QA models introduced in the paper TAPAS: Weakly…
github.com
Blog:
Using Neural Networks to Find Answers in Tables
Much of the world's information is stored in the form of tables, which can be found on the web or in databases and…
ai.googleblog.com
BLINK
If you're looking for an entity linking Python library, you should check out Facebook's BLINK. It is a two-stage model: first a bi-encoder retrieves candidates by embedding the mention context and the entity descriptions, and then a cross-encoder re-ranks those candidates in the 2nd stage. The library uses the 2019/08/01 Wikipedia dump as its knowledge base, which means it takes up a hell of a lot of disk space. The codebase is easy to follow and set up.
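The retrieve-then-rerank pattern can be sketched in a few lines of plain Python. This is a toy illustration, not BLINK's API: BLINK uses BERT-based encoders, while the sketch below swaps in bag-of-words vectors, and the names `embed`, `retrieve`, `cross_score`, and `link`, as well as the three-entry knowledge base, are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Stand-in for one bi-encoder tower: a bag-of-words vector."""
    return Counter(text.lower().split())


def dot(a: Counter, b: Counter) -> float:
    """Dot product of two sparse vectors (missing keys count as 0)."""
    return float(sum(a[t] * b[t] for t in a))


def retrieve(context: str, entities: Dict[str, str], k: int) -> List[str]:
    """Stage 1: embed context and descriptions independently, keep the top-k."""
    query = embed(context)
    ranked = sorted(
        entities,
        key=lambda title: dot(query, embed(entities[title])),
        reverse=True,
    )
    return ranked[:k]


def cross_score(mention: str, context: str, title: str, description: str) -> float:
    """Stage 2 stand-in: score the (mention + context, entity) pair jointly."""
    overlap = dot(embed(context), embed(description))
    norm = math.sqrt(len(context.split()) * len(description.split())) or 1.0
    title_bonus = 1.0 if mention.lower() in title.lower() else 0.0
    return overlap / norm + title_bonus


def link(mention: str, context: str, entities: Dict[str, str], k: int = 2) -> str:
    """Retrieve k candidates cheaply, then re-rank only those with the joint scorer."""
    candidates = retrieve(context, entities, k)
    return max(candidates, key=lambda t: cross_score(mention, context, t, entities[t]))


# Toy knowledge base invented for this sketch.
kb = {
    "Paris": "capital city of France on the Seine",
    "Paris Hilton": "American media personality and socialite",
    "Texas": "state in the southern United States",
}
print(link("Paris", "she flew to Paris the capital of France", kb))
```

The design point is the split in cost: stage 1 scores every entity with a cheap dot product (entity vectors can be precomputed), while the more expensive joint scorer only ever sees the handful of candidates that survive retrieval.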
GitHub:
facebookresearch/BLINK
BLINK is an Entity Linking python library that uses Wikipedia as the target knowledge base. The process of linking…
github.com
A Mosaic
The Allen Institute is doing the most interesting work in the reading comprehension/commonsense regions of NLP. They have great demos, and I want to share COMeT, their event/commonsense knowledge graph, with you. It's pretty good: I queried "I went to the doctor's office" and the graph generated intuitive reasons as to "why" I would go to the office. You should give it a whirl.
Mosaic Knowledge Graphs
Demo of COMeT, a knowledge base construction engine that learns to produce new nodes and connections in commonsense…
mosaickg.apps.allenai.org
GitHub:
atcbosselut/comet-commonsense
To run a generation experiment (either conceptnet or atomic), follow these instructions: First clone, the repo: git…
github.com
Stanford's Knowledge Graphs
Seminar on Knowledge Graphs 😎, videos included.
CS 520
How should AI explicitly represent knowledge? Department of Computer Science, Stanford University, Spring 2020 Tuesdays…
web.stanford.edu
Wolfman Cometh via YouTube
A crisp and clear talk about the current state of NLP and future trends from everyone's favorite and 🤗's very own: Thomas Wolf. Last week they dropped an educational video…
Here are the topic/time stamps from the video:
Video:
Dataset of the Week: HybridQA
What is it?
The dataset allows for multi-hop QA over tabular data. It contains over 70K question-answer pairs based on 13,000 tables, and each table is linked to 44 passages on average.
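As a rough picture of what "multi-hop" means here, the Python sketch below answers a question in two hops: first a table lookup, then a lookup in the passage linked from the resulting cell. Everything in it is invented for illustration and is not taken from HybridQA: the table, the passages, the helper names `hop_table` and `hop_passage`, and the string-matching "reader" are all assumptions.

```python
from typing import Dict, List, Optional

# Toy table and linked passages, invented for this sketch (not HybridQA data).
table: List[Dict[str, str]] = [
    {"Player": "Ada Example", "Team": "Northfield FC"},
    {"Player": "Bo Sample", "Team": "Southport United"},
]
passages: Dict[str, str] = {
    "Northfield FC": "Northfield FC plays its home games at Riverside Park.",
    "Southport United": "Southport United plays its home games at Harbor Ground.",
}


def hop_table(rows: List[Dict[str, str]], where_col: str, where_val: str,
              select_col: str) -> Optional[str]:
    """Hop 1: find the row matching a condition and return another cell from it."""
    for row in rows:
        if row[where_col] == where_val:
            return row[select_col]
    return None


def hop_passage(linked: Dict[str, str], cell: str, marker: str) -> Optional[str]:
    """Hop 2: follow the cell's link to its passage and read the text after a marker."""
    text = linked.get(cell)
    if text is None or marker not in text:
        return None
    return text.split(marker, 1)[1].strip(" .")


# Question: "Where does Ada Example's team play its home games?"
team = hop_table(table, "Player", "Ada Example", "Team")
answer = hop_passage(passages, team, "home games at") if team else None
print(answer)
```

Neither source alone is enough: the table knows the team but not the venue, and the passage knows the venue but not the player, which is why a model has to combine both.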
Sample:
Where is it?
wenhuchen/HybridQA
This repository contains the dataset and code for the paper HybridQA: A Dataset of Multi-Hop Question Answering over…
github.com
Every Sunday we do a weekly round-up of NLP news and code drops from researchers around the world.
If you enjoyed this article, help us out and share with friends!
For complete coverage, follow our Twitter: @Quantum_Stat
Published via Towards AI