NLP News Cypher | 03.01.20
Last Updated on July 24, 2023 by Editorial Team
Author(s): Ricky Costa
Originally published on Towards AI.
NATURAL LANGUAGE PROCESSING (NLP) WEEKLY NEWSLETTER
NLP News Cypher | 03.01.20
Beyond Here Lies Nothing…
As you may have guessed, the laws that govern simple systems (e.g. Newtonian physics) allow us to observe variables independently. That is, if you observe one variable in such a simple system, no information is lost about the other variables in the system.
Observing the trajectory of Venus doesn't impact the trajectory of Earth.
However, this independence becomes difficult to observe accurately as the number of variables grows and as relationships between those variables appear. These inter-dependent variables, which usually come in clusters or systems, are the units that make up a complex system.
Various complex systems exist in present reality, or what Neo would call the Matrix, such as evolutionary processes, climate, the brain, language (NLP?), and even the stock market…
Let's take a variable, “sentiment”, and a complex system, the “stock market”. If we were to observe the state of the stock market through news headlines and attempt to classify each headline as bullish, bearish, or neutral, we would soon realize we are in the throes of complexity.
Here's an example. Let's say we have this headline:
Gold is up 6% in the pre-market as the downward pressure of the coronavirus outbreak weighs on equity stocks.
What do you think is the ground-truth sentiment for this headline? Well, gold is up, and that's good (bullish, right?). But wait, it's saying equities are down (bearish, then?). Right, but it makes both positive and negative statements in the headline (so it's neutral, right?! 🤷‍♂️). It seems the packets of information in the headline are not independent: we lose information when we reduce sentiment down to a single clause. You guessed it, natural language is hard, homies!
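To make the tension concrete, here is a minimal sketch of a naive lexicon-based scorer applied to the headline above. The word lists, the clause splitter, and the function names are all invented for illustration; this is not how our platform works, just a toy showing how clause-level labels can disagree with each other and with a whole-headline label.

```python
import re

# Hypothetical toy lexicons, made up purely for this illustration.
BULLISH = {"up", "gains", "rally", "surges"}
BEARISH = {"down", "downward", "pressure", "weighs", "outbreak"}


def split_clauses(headline):
    """Naively split a headline on ' as ', commas, and semicolons."""
    parts = re.split(r"\s+as\s+|[,;]", headline)
    return [p.strip() for p in parts if p.strip()]


def lexicon_sentiment(text):
    """Label text by counting distinct bullish vs. bearish lexicon hits."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    score = len(words & BULLISH) - len(words & BEARISH)
    if score > 0:
        return "bullish"
    if score < 0:
        return "bearish"
    return "neutral"


headline = (
    "Gold is up 6% in the pre-market as the downward pressure of the "
    "coronavirus outbreak weighs on equity stocks."
)

clauses = split_clauses(headline)
clause_labels = [lexicon_sentiment(c) for c in clauses]
headline_label = lexicon_sentiment(headline)

for clause, label in zip(clauses, clause_labels):
    print(f"{label:8s} | {clause}")
print(f"whole headline -> {headline_label}")
```

With these toy lexicons the two clauses get opposite labels, and scoring the whole headline at once simply lets the clause with more lexicon hits win, which throws away the gold signal entirely.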
Let us meditate on this until next week (cliffhanger). In the meantime, this is what our upcoming demo of a real-time platform that classifies financial news by topic and sentiment looks like 👇. We haven't finished deploying; if you want early access, please DM me on El Twitter.
How was your week?
This Week:
Machine Learning TOKYO
Decompose, that is the Question
NLP Getting Multi-Lingual
Show me TensorFlow for $100 Alex
Colab Demo for NER Task
Hey BERT… Welcome to the Matrix
Dataset of the Week: HotpotQA
Machine Learning TOKYO
Sometimes, a repo page comes along and saves the day. This bad boy holds links to lecture series from top universities such as Stanford and MIT on a variety of topics across CV, NLP, and RL!
Machine-Learning-Tokyo/AI_Curriculum
Open Deep Learning and Reinforcement Learning lectures from top Universities like Stanford University, MIT, UC…
github.com
Decompose, that is the Question
A new paper shows that decomposing a complex question into simpler sub-questions improves performance on the task of question answering. The authors train an unsupervised decomposition model using questions extracted from Common Crawl, answer the resulting sub-questions with a standard QA model, and then pass those answers downstream to help answer multi-hop questions on the HotpotQA dataset.
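As a rough sketch of the decompose-then-answer flow described above (one plausible reading, not the paper's implementation), the snippet below wires together two stand-in components. The decomposer and single-hop QA model here are hard-coded lookup tables with made-up questions and facts; every name in the block is hypothetical.

```python
from typing import Callable, List, Tuple

# Made-up data standing in for learned models (illustration only).
TOY_DECOMPOSITIONS = {
    "Which tower is taller, Tower A or Tower B?": [
        "How tall is Tower A?",
        "How tall is Tower B?",
    ]
}
TOY_FACTS = {
    "How tall is Tower A?": "120 meters",
    "How tall is Tower B?": "95 meters",
}


def toy_decompose(question: str) -> List[str]:
    """Stand-in for a decomposition model: look up sub-questions."""
    return TOY_DECOMPOSITIONS.get(question, [question])


def toy_single_hop_qa(question: str) -> str:
    """Stand-in for a single-hop QA model: look up an answer."""
    return TOY_FACTS.get(question, "unknown")


def build_augmented_input(
    question: str,
    decompose: Callable[[str], List[str]],
    single_hop_qa: Callable[[str], str],
) -> Tuple[str, List[Tuple[str, str]]]:
    """Decompose, answer each sub-question, and append the results
    to the original question as extra input for a downstream model."""
    sub_questions = decompose(question)
    pairs = [(sq, single_hop_qa(sq)) for sq in sub_questions]
    hints = " ".join(f"{sq} {sa}" for sq, sa in pairs)
    return f"{question} {hints}", pairs


hard_question = "Which tower is taller, Tower A or Tower B?"
augmented, pairs = build_augmented_input(
    hard_question, toy_decompose, toy_single_hop_qa
)
print(augmented)
```

The design point is that the downstream multi-hop model never has to do the decomposition itself; it just sees the original question with the sub-questions and their answers attached.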
Thread:
A thread written by @EthanJPerez
New! "Unsupervised Question Decomposition for Question Answering": We decompose a hard Q into several, easier Qs with…
threader.app
Paper:
NLP Getting Multi-Lingual
Mr. Abed built a sentiment analysis tool for the Arabic language using the MultiFiT model and deployed it on Heroku!
Demo:
Arabic Text Classification
Using this neural nets model (MULTIFiT), you can classify Arabic reviews or similar text as positive or negative…
arabic-nlp.herokuapp.com
Show me TensorFlow for $100 Alex
TensorFlow announced last week that they will retweet models you share on TensorBoard.dev, their ML experiment tracking platform. If you want your app to get coverage on social media, or possibly at their dev summit, here are the details:
Site:
TensorBoard.dev
A managed TensorBoard experience that lets you upload and share your ML experiment results with anyone.
tensorboard.dev
Colab Demo for NER Task
Sick of using GPUs in your Colab notebook 😢? Mr. Rush has released a Colab notebook that uses TPUs to train a transformer for named entity recognition in PyTorch! (It uses PyTorch Lightning.)
Colab:
Google Colaboratory
colab.research.google.com
Code:
huggingface/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch. …
github.com
Hey BERT… Welcome to the Matrix
There's a hidden treasure trove of treats on GitHub. Someone found a repo that holds A TON of papers on everything BERT! And I mean everything!
- Downstream task
- Generation
- Modification (multi-task, masking strategy, etc.)
- Transformer variants
- Probe
- Inside BERT
- Multi-lingual
- Other than English models
- Domain specific
- Multi-modal
- Model compression
- Misc.
tomohideshibata/BERT-related-papers
BERT-related papers. Contribute to tomohideshibata/BERT-related-papers development by creating an account on GitHub.
github.com
Dataset of the Week: HotpotQA
What is it?
“It's a question answering dataset featuring natural, multi-hop questions, with strong supervision for supporting facts to enable more explainable question answering systems.”
Sample:
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
Explore HotpotQA
hotpotqa.github.io
Where is it?
HotpotQA
HotpotQA is a question answering dataset featuring natural, multi-hop questions, with strong supervision for supporting…
hotpotqa.github.io
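To give a feel for what "strong supervision for supporting facts" means in practice, here is a small sketch that resolves supporting-fact pointers to the sentences they reference. The record below is entirely made up, and the field names (`question`, `answer`, `supporting_facts`, `context`) follow my recollection of the released JSON layout, so treat them as an assumption and check the dataset page before relying on them.

```python
# A hand-written, fictional record shaped like a HotpotQA example.
# Assumed layout: "context" is a list of [title, [sentences]] and
# "supporting_facts" is a list of [title, sentence_index].
record = {
    "question": "In which town is the institute founded by Alice located?",
    "answer": "Sampleton",
    "supporting_facts": [["Alice", 0], ["Example Institute", 1]],
    "context": [
        ["Alice", [
            "Alice founded the Example Institute.",
            "She later retired.",
        ]],
        ["Example Institute", [
            "The Example Institute is a research center.",
            "It is located in Sampleton.",
        ]],
    ],
}


def supporting_sentences(rec):
    """Return the context sentences that the supporting facts point to."""
    by_title = {title: sentences for title, sentences in rec["context"]}
    resolved = []
    for title, sent_idx in rec["supporting_facts"]:
        sentences = by_title.get(title, [])
        # Skip pointers that fall outside the paragraph.
        if 0 <= sent_idx < len(sentences):
            resolved.append(sentences[sent_idx])
    return resolved


evidence = supporting_sentences(record)
for sentence in evidence:
    print(sentence)
```

Answering the toy question requires hopping across both paragraphs, which is the multi-hop part, and the supporting facts tell you exactly which sentences justify the answer.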
Every Sunday we do a weekly round-up of NLP news and code drops from researchers around the world.
If you enjoyed this article, help us out and share it with friends or on social media!
For complete coverage, follow us on Twitter: @Quantum_Stat
Published via Towards AI