NLP News Cypher | 05.10.20

Last Updated on July 24, 2023 by Editorial Team

Author(s): Ricky Costa

Originally published on Towards AI.

Photo by Hendrik Cornelissen on Unsplash

NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER

NLP News Cypher | 05.10.20

Traveler

And we’re back. We’ve released another update to the Big Bad NLP Database! Another 50 datasets take us past 400 total, and yet there are still so many left to go. I would like to thank all contributors: Martin Schmitt, Rachel Bawden, Devamanyu Hazarika, Panagiotis Simakis, and Andrew Thompson.

Oh, someone gifted me an award on Reddit; not sure what this means. But I have a teddy bear now (it’s the brown-looking thing below). It’s called a Hugz award 🤷‍♂️ Cheers!

Also, 🛸s continue to exist.

declassified

An awesome peep on Reddit showed how they enhanced the video quality of the 3 UFO videos released several weeks ago. I couldn’t see much of a delta in the video quality, but it’s still interesting to know their workflow. Outline below:

Happy Momma’s Day 👩‍👦‍👦!

FYI, a surprise coming this week, stay tuned!

This Week:

From RoBERTa Import Scratch

InferKit | Bringing AutoML to NLP

Papers w/ Code Has A Paper and Code

The Keras Site

TL;DR Summarization

Dataset of the Week: WikiTableQuestions

From RoBERTa Import Scratch

If you want to pre-train a SOTA model like RoBERTa from scratch, check out this codebase (it also covers fine-tuning)! The blog is easy to follow because code blocks annotate the author’s workflow, and there is a Colab too. It goes over data, tokenizers, and model handling.
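For a feel of the data-handling step in masked language model pre-training, here is a minimal sketch in plain Python of BERT-style token masking, which RoBERTa applies dynamically (a fresh mask each time a sequence is seen). It is not taken from the linked post; the function name `mask_tokens`, the toy vocabulary, and the whitespace tokenization are assumptions made for illustration.

```python
import random

MASK = "<mask>"

def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Return (inputs, labels) for masked language modelling.

    Each position is selected with probability mask_prob. A selected token is
    replaced by <mask> 80% of the time, by a random vocab token 10% of the
    time, and left unchanged 10% of the time. labels holds the original token
    at selected positions and None everywhere else.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model must predict this token
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK)
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))
            else:
                inputs.append(tok)
        else:
            inputs.append(tok)
            labels.append(None)  # position is ignored by the loss
    return inputs, labels

# Toy sentence and vocabulary (hypothetical; a real run uses a trained tokenizer)
tokens = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(tokens))

# Dynamic masking: each pass over the data draws a new mask for the same sentence
for epoch in range(2):
    inputs, labels = mask_tokens(tokens, vocab, rng=random.Random(epoch))
    print(inputs, labels)
```

In a real pipeline the tokens would be subword IDs from a trained tokenizer, and the masking would usually be handled by the training library rather than by hand.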

FastHugs: Language Modelling with Transformers and Fastai

This aims to be an end-to-end description with code of how to train a transformer language model using fastai (v2) and…

www.ntentional.com

Colab of the Week:

Google Colaboratory

colab.research.google.com

InferKit | Bringing AutoML to NLP

From the maker of TalkWithTransformer, Adam King shares his latest ML project: InferKit! What is it? For now, it allows you to do state-of-the-art text classification WITHOUT any code, and it’s super simple to use. There is no need for hyper-parameter tuning: you just drop your CSV in the browser, click train, and InferKit’s cloud architecture does the rest. When training is done, you get an email alert; follow the link and your model comes shipped with its own API endpoint 🔥🔥. I’ve already tried it and it was seamless. Soon, InferKit will also be able to do text generation!

FYI, anyone who signs up gets $25 of free credits. Live dangerously, give it a whirl.

App:

InferKit

Train state-of-the-art machine learning models to categorize your data with custom labels - no coding required. Use the…

inferkit.com

Papers w/ Code Has A Paper and Code

A great update from Papers with Code. Their database now holds over 2,500 leaderboards! In addition, they have a new extraction model, AxCell, that extracts results tables from ML research papers!

Surprise, their model is open-sourced:

paperswithcode/axcell

This repository is the official implementation of AxCell: Automatic Extraction of Results from Machine Learning Papers…

github.com

The Keras Site

There’s a new site for Keras. Not a surprise, as Mr. Chollet has recently been dropping gems on Twitter along with many Colab notebooks (I’ve included one below). The site comes with a new batch of guides and examples.

Guides:

Developer guides

Our developer guides are deep-dives into specific topics such as layer subclassing, fine-tuning, or model saving…

keras.io

Site:

Keras: the Python deep learning API

Keras is an API designed for human beings, not machines. Keras follows best practices for reducing cognitive load: it…

keras.io

Enjoy the neural network hallucinations in this Colab:

Google Colaboratory

colab.research.google.com

TL;DR Summarization

The Allen Institute’s demo for summarizing computer science research papers is here. In addition, they’ve released SCITLDR, a new dataset with 3,935 TLDRs of author-written summaries. 😎

FYI, on SCITLDR, their model outperforms BART! For best results, you can feed it the abstract, intro, and conclusion of each paper in your test set. 👇
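A minimal sketch of assembling that input, assuming each paper is already split into named sections. The `paper` dictionary, its keys, and the `build_input` helper are hypothetical; the actual SciTLDR preprocessing may differ.

```python
def build_input(paper, sections=("abstract", "introduction", "conclusion")):
    """Join the chosen sections of a paper into one input string.

    Sections that are missing or empty are skipped instead of raising an error.
    """
    parts = [paper[name].strip() for name in sections if paper.get(name, "").strip()]
    return " ".join(parts)

# Hypothetical paper with placeholder text
paper = {
    "abstract": "We study X.",
    "introduction": "X matters because Y.",
    "methods": "We do Z.",
    "conclusion": "X works.",
}

# The methods section is left out; only the three chosen sections are joined
print(build_input(paper))
```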

Demo:

SciTLDR

scitldr.apps.allenai.org

Paper:

LINK

Dataset of the Week: WikiTableQuestions

What is it?

The dataset is for the task of question answering on semi-structured HTML tables.
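To make the task concrete, here is a toy sketch in Python: a small table held as a list of rows, a natural-language question, and the lookup a system would need to perform to answer it. The table contents, the question, and the `answer_max` helper are all invented for illustration and are not drawn from WikiTableQuestions.

```python
# A made-up table (not from the dataset), stored as one dict per row
table = [
    {"Team": "Red", "Played": 10, "Points": 21},
    {"Team": "Blue", "Played": 10, "Points": 17},
    {"Team": "Green", "Played": 9, "Points": 19},
]

question = "Which team has the most points?"

def answer_max(rows, value_col, answer_col):
    """Return the answer_col value of the row with the largest value_col."""
    best = max(rows, key=lambda row: row[value_col])
    return best[answer_col]

# The hand-written lookup that answers the question above
print(question, "->", answer_max(table, "Points", "Team"))
```

Here the lookup is written by hand. A question answering system has to work it out from the question text and the table alone, which is the hard part of the task.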

Sample:

Where is it?

ppasupat/WikiTableQuestions

Version 1.0.2 (October 4, 2016) The WikiTableQuestions dataset is for the task of question answering on semi-structured…

github.com

Every Sunday we do a weekly round-up of NLP news and code drops from researchers around the world.

If you enjoyed this article, help us out and share with friends!

For complete coverage, follow our Twitter: @Quantum_Stat
