The NLP Cypher | 03.07.21

Last Updated on July 24, 2023 by Editorial Team

Author(s): Ricky Costa

Originally published on Towards AI.

The Lookout - “All’s Well” | Homer

NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

The Crow’s Nest

Hey, welcome back! Had a loooong, busy weekend, so this week’s newsletter will be less wordy than usual, but we’ll be back to normal next week.

Oh and by the way,

Maybe… the universe is just a giant neural network… 🤷‍♂️

At least that’s the new theory from physicist Vitaly Vanchurin at the University of Minnesota Duluth. FYI, it sounds eerily similar to Stephen Wolfram’s graph approach to physics.

The only question I have is, who’s running the compute? 🤷‍♂️

The Universe Might Be One Big Neural Network, Study Finds

One scientist says the universe is a giant neural net. The wild concept uses neural net theory to unify quantum and…

www.popularmechanics.com

FYI, we added 25 new notebooks to the Super Duper NLP Repo!! 👇

OpenChat

OpenChat is an awesome repo that lets you chat with top-tier dialogue models using just one line of code. Currently, it supports:

  • Microsoft’s DialoGPT : small, medium, large.
  • Facebook’s BlenderBot : small, medium, large, xlarge.

hyunwoongko/openchat

OpenChat is opensource chatting framework for generative models. You can talk with AI with only one line of code…

github.com

AI Index 2021

The comprehensive yearly report on AI is out. Its scope is mostly global and strategic. For NLP-focused content, start on page 62. The report is 200+ pages long 🙈.

AI Index 2021

The 2021 AI Index report is one of the most comprehensive reports about artificial intelligence to date. This latest…

hai.stanford.edu

OpenAI’s Reflection on its Latest Multi-Modal Models

OpenAI goes deep on CLIP’s neurons and their representations, and also analyzes where the model can go wrong.

Multimodal Neurons in Artificial Neural Networks

We've discovered neurons in CLIP that respond to the same concept whether presented literally, symbolically, or…

openai.com

Mastering Python | The OverFlow

Last week I shared part II of this series; here are parts III and IV.

Level Up: Mastering statistics with Python – part 3 – Stack Overflow Blog

Welcome back! This is the third class in our Level Up series on statistics with Python. If you're just tuning in, you…

stackoverflow.blog

Level Up: Mastering statistics with Python — part 4 — Stack Overflow Blog

March 2, 2021 – While many introductory statistics classes teach the CLT, very few actually attempt to…

stackoverflow.blog

YAMNet | Transfer Learning for Audio

YAMNet (“Yet another Audio Mobilenet Network”) is a pretrained model that predicts 521 audio events based on the AudioSet corpus.

Transfer Learning for Audio Data with YAMNet

March 02, 2021 – Posted by Luiz GUStavo Martins, Developer Advocate Transfer learning is a popular machine learning…

blog.tensorflow.org

Several Methods for Updating Neural Networks

Here are the methods discussed:

  • Update Model on New Data Only
  • Update Model on Old and New Data
  • Ensemble Model With Model on New Data Only
  • Ensemble Model With Model on Old and New Data
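To make the four strategies concrete, here's a minimal sketch on a toy 1-D linear model trained with plain SGD. This is not the article's code: the toy data, the `train`, `predictor`, and `mse` helpers, and every learning rate and epoch count below are assumptions picked for illustration only.

```python
import random

rng = random.Random(0)

def make_data(n, lo, hi):
    # Toy target (assumed): y = 2x + 1 plus a little Gaussian noise.
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.05)) for x in xs]

def train(weights, data, epochs=200, lr=0.05):
    # Plain SGD on y = w*x + b, starting from the given (w, b).
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

def predictor(*models):
    # Average the predictions of one or more (w, b) models.
    return lambda x: sum(w * x + b for w, b in models) / len(models)

def mse(predict, data):
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

old_data = make_data(50, 0.0, 1.0)   # what the existing model was trained on
new_data = make_data(50, 1.0, 2.0)   # data that arrived later
test_data = make_data(50, 0.0, 2.0)

base = train((0.0, 0.0), old_data)   # the existing model

strategies = {
    # Continue training the existing model, gently, on new data only.
    "update on new only": predictor(train(base, new_data, epochs=50, lr=0.01)),
    # Continue training the existing model on old and new data together.
    "update on old + new": predictor(train(base, old_data + new_data, epochs=50, lr=0.01)),
    # Keep the existing model and average it with a fresh model fit on new data.
    "ensemble with new-only model": predictor(base, train((0.0, 0.0), new_data)),
    # Keep the existing model and average it with a fresh model fit on all data.
    "ensemble with old + new model": predictor(base, train((0.0, 0.0), old_data + new_data)),
}

results = {name: mse(p, test_data) for name, p in strategies.items()}
for name, score in results.items():
    print(f"{name}: test MSE = {score:.4f}")
```

On a toy problem like this all four land in roughly the same place. The differences only start to matter when the new data is distributed differently from the old.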

How to Update Neural Network Models With More Data – Machine Learning Mastery

Deep learning neural network models used for predictive modeling may need to be updated.

machinelearningmastery.com

Top Data Labeling Software

In-depth analysis of 10 data labeling tools for machine learning datasets.

Data Labeling Software: Best Tools for Data Labeling in 2021 – neptune.ai

In machine learning and AI development, the aspects of data labeling are essential. You need a structured set of…

neptune.ai

Repo Cypher 👨‍💻

A collection of recently released repos that caught our 👁

Gradual Finetune

If you are just fine-tuning your model once, you may be missing out. paper

fe1ixxu/Gradual-Finetune

Gradually fine-tuning in a multi-step process can yield substantial further gains and can be applied without modifying…

github.com
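Here's a rough sketch of what a multi-step schedule could look like, assuming each stage trains on all the in-domain data plus a shrinking sample of out-of-domain data. That reading, the `gradual_schedule` helper, the stage sizes, and the commented-out `fine_tune` call are all hypothetical and are not taken from the repo.

```python
import random

def gradual_schedule(in_domain, out_domain, out_sizes, seed=0):
    """Yield one training set per stage: every in-domain example plus a
    random sample of out-of-domain examples whose size follows out_sizes."""
    rng = random.Random(seed)
    for k in out_sizes:
        stage = in_domain + rng.sample(out_domain, k)
        rng.shuffle(stage)
        yield stage

# Placeholder examples; real data would be (text, label) pairs or similar.
in_domain = [f"in-{i}" for i in range(100)]
out_domain = [f"out-{i}" for i in range(1000)]
out_sizes = [1000, 500, 250, 0]  # hypothetical schedule, ends on in-domain only

stages = list(gradual_schedule(in_domain, out_domain, out_sizes))
for n, stage in enumerate(stages, start=1):
    n_out = sum(ex.startswith("out-") for ex in stage)
    print(f"stage {n}: {len(stage)} examples, {n_out} out-of-domain")
    # model = fine_tune(model, stage)  # hypothetical training call, one per stage
```

The point is that the model is fine-tuned once per stage, each time starting from the previous stage's weights, instead of a single pass on the in-domain data.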

Connected Papers 📈

Forte | NLP Pipeline Toolkit

A multi-purpose platform for searching documents, information extraction and language generation.

asyml/forte

Forte is a toolkit for building Natural Language Processing pipelines, featuring cross-task interaction, adaptable…

github.com

Connected Papers 📈

Meta-Curriculum Learning for Machine Translation

Improving the meta-learning (teacher model) of MT for adapting to new domains.

NLP2CT/Meta-Curriculum

Meta-Curriculum Learning for Domain Adaptation in Neural Machine Translation (AAAI 2021) Please cite as…

github.com

Connected Papers 📈

ANEA

Automatically annotates named entities in unlabeled text.

uds-lsv/anea

ANEA is a tool to automatically annotate named entities in unlabeled text based on entity lists for the use as distant…

github.com

Connected Papers 📈

RuSentEval

Evaluation toolkit for Russian sentence embeddings.

vmkhlv/rusenteval

RuSentEval is an evaluation toolkit for sentence embeddings for Russian. In this repo you can find the data and scripts…

github.com

Connected Papers 📈

Learning Chess Blindfolded

Training language models on chess notation. 🔥🔥

shtoshni92/learning-chess-blindfolded

Chess as a testbed for evaluating language models on world state tracking. Pretrained model released via Huggingface…

github.com

Connected Papers 📈

RAGA

Using graph attention for the entity alignment task.

zhurboo/RAGA

Relation-aware Graph Attention Networks for Global Entity Alignment – zhurboo/RAGA

github.com

Connected Papers 📈

Dataset of the Week: Wikipedia-based Image Text (WIT) Dataset

What is it?

A multimodal, multilingual dataset. WIT is composed of a curated set of 37.6 million entity-rich image-text examples with 11.5 million unique images across 108 Wikipedia languages.

Where is it?

google-research-datasets/wit

Wikipedia-based Image Text (WIT) Dataset is a large multimodal multilingual dataset. WIT is composed of a curated set…

github.com

Every Sunday we do a weekly round-up of NLP news and code drops from researchers around the world.

For complete coverage, follow our Twitter: @Quantum_Stat

Quantum Stat

Published via Towards AI
