The NLP Cypher | 03.07.21
Last Updated on July 24, 2023 by Editorial Team
Author(s): Ricky Costa
Originally published on Towards AI.
NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER
The NLP Cypher | 03.07.21
The Crow's Nest
Hey, welcome back! Had a loooong weekend of busy busy, so this week's newsletter will be less wordy than usual, but we'll be back to normal next week.
Oh and by the way,
Maybe… the universe is just a giant neural network… 🤷‍♂️
At least that's the new theory out of MIT. FYI, it sounds eerily similar to Stephen Wolfram's graph approach to physics.
The only question I have is, who's running the compute? 🤷‍♂️
The Universe Might Be One Big Neural Network, Study Finds
One scientist says the universe is a giant neural net. The wild concept uses neural net theory to unify quantum and…
www.popularmechanics.com
FYI, we added 25 new notebooks to the Super Duper NLP Repo!! 👇
OpenChat
OpenChat is an awesome repo that lets you chat with top-tier dialogue models using just one line of code. Currently, it supports:
- Microsoft's DialoGPT: small, medium, large.
- Facebook's BlenderBot: small, medium, large, xlarge.
hyunwoongko/openchat
OpenChat is opensource chatting framework for generative models. You can talk with AI with only one line of code…
github.com
AI Index 2021
The comprehensive yearly report on AI is out. Its scope is more global and strategic than technical. For NLP-focused content, start on page 62. The report is 200+ pages long 🙈.
AI Index 2021
The 2021 AI Index report is one of the most comprehensive reports about artificial intelligence to date. This latest…
hai.stanford.edu
OpenAI's Reflection on its Latest Multi-Modal Models
They go deep on CLIP's neurons and their representations, and also analyze where those neurons can go wrong.
Multimodal Neurons in Artificial Neural Networks
We've discovered neurons in CLIP that respond to the same concept whether presented literally, symbolically, or…
openai.com
Mastering Python | The OverFlow
Last week I shared part II of this series; here are parts III and IV.
Level Up: Mastering statistics with Python – part 3 – Stack Overflow Blog
Welcome back! This is the third class in our Level Up series on statistics with Python. If you're just tuning in, you…
stackoverflow.blog
Level Up: Mastering statistics with Python – part 4 – Stack Overflow Blog
code-for-a-living March 2, 2021 While many introductory statistics classes teach the CLT, very few actually attempt to…
stackoverflow.blog
YAMNet | Transfer Learning for Audio
YAMNet ("Yet another Audio Mobilenet Network") is a pretrained model that predicts 521 audio events based on the AudioSet corpus.
Transfer Learning for Audio Data with YAMNet
March 02, 2021 – Posted by Luiz GUStavo Martins, Developer Advocate Transfer learning is a popular machine learning…
blog.tensorflow.org
Several Methods for Updating Neural Networks
Here are the methods discussed:
Update Model on New Data Only
Update Model on Old and New Data
Ensemble Model With Model on New Data Only
Ensemble Model With Model on Old and New Data
How to Update Neural Network Models With More Data – Machine Learning Mastery
Deep learning neural network models used for predictive modeling may need to be updated.
machinelearningmastery.com
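To make the four strategies above a bit more concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not code from the linked tutorial: the "network" is a one-feature linear model trained by gradient descent, the drifted dataset is synthetic, and `make_data`, `train`, `ensemble_predict`, and `mse` are hypothetical helper names.

```python
import random


def make_data(n, slope, intercept, seed):
    """Toy regression data: y = slope * x + intercept + small Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, slope * x + intercept + rng.gauss(0.0, 0.05)) for x in xs]


def train(data, weights=(0.0, 0.0), lr=0.1, epochs=200):
    """Full-batch gradient descent on mean squared error, starting from `weights`."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


def predict(weights, x):
    w, b = weights
    return w * x + b


def ensemble_predict(models, x):
    """Average the predictions of several models."""
    return sum(predict(m, x) for m in models) / len(models)


def mse(predict_fn, data):
    return sum((predict_fn(x) - y) ** 2 for x, y in data) / len(data)


old_data = make_data(200, slope=2.0, intercept=0.0, seed=0)
new_data = make_data(50, slope=2.5, intercept=0.3, seed=1)  # the data has drifted

base = train(old_data)  # the existing model

# 1. Update on new data only: warm start, small learning rate, few epochs.
updated_new = train(new_data, weights=base, lr=0.01, epochs=50)
# 2. Update on old and new data: same warm start, combined dataset.
updated_all = train(old_data + new_data, weights=base, lr=0.01, epochs=50)
# 3. Fresh model fit on new data only, to be ensembled with the existing model.
fresh_new = train(new_data)
# 4. Fresh model fit on old and new data, to be ensembled with the existing model.
fresh_all = train(old_data + new_data)

results = {
    "existing model": mse(lambda x: predict(base, x), new_data),
    "update, new only": mse(lambda x: predict(updated_new, x), new_data),
    "update, old + new": mse(lambda x: predict(updated_all, x), new_data),
    "ensemble, new only": mse(lambda x: ensemble_predict([base, fresh_new], x), new_data),
    "ensemble, old + new": mse(lambda x: ensemble_predict([base, fresh_all], x), new_data),
}
for name, err in results.items():
    print(f"{name:20s} MSE on new data: {err:.4f}")
```

The small learning rate and short schedule in the two "update" variants are a design choice meant to nudge the existing weights rather than overwrite them; which strategy works best will depend on how much the data has actually drifted.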
Top Data Labeling Software
In-depth analysis of 10 data labeling tools for machine learning datasets.
Data Labeling Software: Best Tools for Data Labeling in 2021 – neptune.ai
In machine learning and AI development, the aspects of data labeling are essential. You need a structured set of…
neptune.ai
Repo Cypher 👨‍💻
A collection of recently released repos that caught our 👁
Gradual Finetune
If you are just fine-tuning your model once, you may be missing out. paper
fe1ixxu/Gradual-Finetune
Gradually fine-tuning in a multi-step process can yield substantial further gains and can be applied without modifying…
github.com
Connected Papers 📈
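One way to picture a multi-step fine-tuning process is as a data schedule: each stage trains on all of the in-domain data plus a shrinking sample of out-of-domain data. The sketch below is only an assumption about how such a schedule could look, not the repo's code. The schedule sizes, `gradual_finetune_stages`, and the `fine_tune` placeholder (which just records each stage's size instead of training anything) are all hypothetical.

```python
import random


def gradual_finetune_stages(in_domain, out_of_domain, out_sizes, seed=0):
    """Yield one training set per stage: every in-domain example plus a
    random sample of out-of-domain examples whose size follows `out_sizes`."""
    rng = random.Random(seed)
    for size in out_sizes:
        stage = in_domain + rng.sample(out_of_domain, size)
        rng.shuffle(stage)
        yield stage


def fine_tune(model_state, examples):
    """Placeholder for a real training step: it only records the stage size."""
    return model_state + [len(examples)]


in_domain = [f"in-{i}" for i in range(100)]
out_of_domain = [f"out-{i}" for i in range(1000)]
schedule = [800, 400, 200, 0]  # hypothetical: out-of-domain data shrinks to zero

model_state = []
for stage_data in gradual_finetune_stages(in_domain, out_of_domain, schedule):
    model_state = fine_tune(model_state, stage_data)

print(model_state)  # [900, 500, 300, 100]
```

The last stage uses in-domain data only, so it matches ordinary one-shot fine-tuning; the earlier stages are what the "gradual" part adds.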
Forte U+007C NLP Pipeline Toolkit
A multi-purpose platform for searching documents, information extraction and language generation.
asyml/forte
Forte is a toolkit for building Natural Language Processing pipelines, featuring cross-task interaction, adaptable…
github.com
Connected Papers 📈
Meta-Curriculum Learning for Machine Translation
Improving meta-learning for domain adaptation in neural machine translation.
NLP2CT/Meta-Curriculum
Meta-Curriculum Learning for Domain Adaptation in Neural Machine Translation (AAAI 2021) Please cite as…
github.com
Connected Papers 📈
ANEA
Automatically annotates named entities
uds-lsv/anea
ANEA is a tool to automatically annotate named entities in unlabeled text based on entity lists for the use as distant…
github.com
Connected Papers 📈
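For a feel of what annotating from entity lists means, here is a minimal sketch of list-based tagging in plain Python. It is not ANEA's implementation or API: the `annotate` function, the longest-match rule, the BIO tag scheme, and the tiny entity lists are all assumptions made for illustration.

```python
def annotate(tokens, entity_lists):
    """Tag tokens with BIO labels by longest-match lookup in the entity lists."""
    # Map each entity name (as a tuple of lowercased tokens) to its label.
    lookup = {}
    for label, names in entity_lists.items():
        for name in names:
            lookup[tuple(name.lower().split())] = label
    max_len = max((len(key) for key in lookup), default=0)

    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so "New York" beats "New".
        for span in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + span])
            if key in lookup:
                label = lookup[key]
                tags[i] = f"B-{label}"
                for j in range(i + 1, i + span):
                    tags[j] = f"I-{label}"
                i += span
                matched = True
                break
        if not matched:
            i += 1
    return tags


entity_lists = {"LOC": ["New York", "Berlin"], "ORG": ["Stack Overflow"]}
tokens = "She moved from Berlin to New York to join Stack Overflow".split()
tags = annotate(tokens, entity_lists)

for token, tag in zip(tokens, tags):
    print(f"{token:10s} {tag}")
```

Labels produced this way are noisy (ambiguous names, missing entities), which is why they are typically treated as distant supervision rather than gold annotations.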
RuSentEval
Evaluation toolkit for Russian sentence embeddings.
vmkhlv/rusenteval
RuSentEval is an evaluation toolkit for sentence embeddings for Russian. In this repo you can find the data and scripts…
github.com
Connected Papers 📈
Learning Chess Blindfolded
Training language models on chess notation. 🔥🔥
shtoshni92/learning-chess-blindfolded
Chess as a testbed for evaluating language models on world state tracking. Pretrained model released via Huggingface…
github.com
Connected Papers 📈
RAGA
Using graph attention for the entity alignment task.
zhurboo/RAGA
Relation-aware Graph Attention Networks for Global Entity Alignment – zhurboo/RAGA
github.com
Connected Papers 📈
Dataset of the Week: Wikipedia-based Image Text (WIT) Dataset
What is it?
A multimodal, multilingual dataset. WIT is composed of a curated set of 37.6 million entity-rich image-text examples with 11.5 million unique images across 108 Wikipedia languages.
Where is it?
google-research-datasets/wit
Wikipedia-based Image Text (WIT) Dataset is a large multimodal multilingual dataset. WIT is composed of a curated set…
github.com
Every Sunday we do a weekly round-up of NLP news and code drops from researchers around the world.
For complete coverage, follow our Twitter: @Quantum_Stat
Published via Towards AI