
LSTM for Sequence Classification
Last Updated on January 18, 2025 by Editorial Team
Author(s): Sarvesh Khetan
Originally published on Towards AI.
Table of Contents:
- Single Layer Architecture
  1. LSTM Architecture
  2. Learning in LSTM
  3. How LSTM solves issues with RNN
  4. Issues with LSTM
  5. PyTorch Code
- Stacked Layer Architecture
  1. Architecture Diagram
  2. PyTorch Code

Single Layer Architecture

LSTM Architecture
This is similar to the RNN architecture that we saw previously, except that the RNN unit is now replaced with an LSTM unit.

Now let's discuss what goes on inside the LSTM unit. The following video explains it clearly.
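For reference, the computation inside an LSTM unit is commonly written as below. This is the standard textbook formulation and not necessarily the exact notation of the video; the symbols are assumptions chosen to match the dE / dWf notation used later in this post. Here [h_{t-1}, x_t] is the previous hidden state concatenated with the current input, σ is the sigmoid function, and ⊙ is element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{c}_t &= \tanh\left(W_c [h_{t-1}, x_t] + b_c\right) && \text{(candidate cell state)} \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

The cell state c_t is updated additively, which is the property that matters when we look at gradients later on.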

Learning in LSTM
Since we are using the LSTM to solve a classification task, we can use the cross-entropy loss to train the network, as shown below
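For a single sequence whose true class is y, the cross-entropy loss is E = -log p_y, where p_y is the probability the network assigns to the correct class. A minimal pure-Python sketch, assuming the network ends in a vector of raw class scores (logits) that is passed through a softmax; the function names and the example scores are illustrative and do not come from the original post:

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the correct class index."""
    probs = softmax(logits)
    return -math.log(probs[target])

# Hypothetical scores for a 3-class problem where class 0 is the correct one.
confident_right = cross_entropy([4.0, 0.5, 0.1], target=0)
confident_wrong = cross_entropy([0.1, 0.5, 4.0], target=0)

# The loss is small when the correct class gets a high score, large otherwise.
print(confident_right < confident_wrong)
```

Over a dataset, the per-sequence losses are typically averaged to give the quantity that the optimizer minimizes.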

This optimization problem can be solved with any optimizer, e.g. gradient descent, Adam, AdaGrad, … (in its stochastic or mini-batch version). Below, let's try solving it with gradient descent
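In practice the derivatives are rarely computed by hand, because an autograd framework does it for us. The sketch below shows what gradient-descent training of an LSTM classifier might look like in PyTorch. The `LSTMClassifier` class, the layer sizes, the learning rate, and the random data are all assumptions made for illustration, not the author's setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class LSTMClassifier(nn.Module):
    """Hypothetical single-layer LSTM followed by a linear classification head."""
    def __init__(self, input_size=10, hidden_size=10, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # h_n has shape (num_layers, batch, hidden_size); use the final hidden state.
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])

model = LSTMClassifier()
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # plain gradient descent

# Made-up mini-batch: 8 sequences, 5 time steps, 10 features per step.
x = torch.randn(8, 5, 10)
y = torch.randint(0, 2, (8,))

losses = []
for step in range(20):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # forward pass and cross-entropy loss
    loss.backward()              # backpropagation through time
    optimizer.step()             # W <- W - lr * dE/dW for every weight matrix
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Swapping `torch.optim.SGD` for `torch.optim.Adam` or `torch.optim.Adagrad` changes only the update rule; the rest of the loop stays the same.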

To calculate these derivatives, we use the computation graph of this LSTM architecture, which is shown below



The derivatives with respect to the other weight matrices are calculated similarly.

How LSTM solves issues with RNN
We previously discussed the issues with RNNs, namely vanishing and exploding gradients. To see whether the LSTM faces similar issues, let's consider the derivative dE / dWf from above.
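The usual argument can be sketched as follows. It assumes the standard formulations, a vanilla RNN with h_t = tanh(W_hh h_{t-1} + W_xh x_t) and an LSTM with cell-state update c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t; this notation is an assumption and may differ from the original derivation. By the chain rule, dE / dWf contains products of step-to-step derivatives that carry the error back through time:

```latex
\begin{aligned}
\text{RNN:}\quad & \frac{\partial h_t}{\partial h_{t-1}} = \operatorname{diag}\left(1 - h_t^2\right) W_{hh}
&&\Rightarrow\quad \frac{\partial h_T}{\partial h_k} = \prod_{t=k+1}^{T} \operatorname{diag}\left(1 - h_t^2\right) W_{hh} \\
\text{LSTM:}\quad & \frac{\partial c_t}{\partial c_{t-1}} \approx \operatorname{diag}(f_t)
&&\Rightarrow\quad \frac{\partial c_T}{\partial c_k} \approx \prod_{t=k+1}^{T} \operatorname{diag}(f_t)
\end{aligned}
```

Here h_t^2 is an element-wise square, and the LSTM line keeps only the direct path through the cell state, ignoring the indirect dependence of the gates on c_{t-1} via h_{t-1}. In the RNN, the same matrix W_hh is multiplied in at every step together with a tanh derivative that is at most 1, so the product tends to shrink toward zero or blow up as T - k grows. In the LSTM, the factor along the cell-state path is the forget gate, which the network can learn to keep close to 1, so the gradient can survive many time steps. Gradients can still explode through the other paths, which is why gradient clipping is commonly used alongside LSTMs.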



Issues with LSTM
LSTMs mitigate the vanishing gradient problem, but they are computationally heavy: every unit computes three gates and a candidate cell state at each time step, and the time steps have to be processed one after another, so training takes a long time. We therefore wanted something that trains much faster while giving results at least as good as LSTMs, which perform very well.

PyTorch Code
import torch.nn as nn

# Create a single LSTM cell: 10 input features, hidden state of size 10
lstm_cell = nn.LSTMCell(input_size=10, hidden_size=10)
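`nn.LSTMCell` processes one time step at a time, so for a whole sequence it has to be applied in a loop, carrying the hidden state and cell state forward. A minimal sketch of how this might be used for classification; the batch size, sequence length, number of classes, and the `classifier` layer are assumptions added for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

lstm_cell = nn.LSTMCell(input_size=10, hidden_size=10)

# Made-up batch: 4 sequences, 6 time steps, 10 features per step.
x = torch.randn(4, 6, 10)

# Initial hidden state h_0 and cell state c_0, one row per sequence.
h = torch.zeros(4, 10)
c = torch.zeros(4, 10)

# Apply the same cell (shared weights) at every time step.
for t in range(x.size(1)):
    h, c = lstm_cell(x[:, t, :], (h, c))

# For classification, the final hidden state is passed to a linear layer.
classifier = nn.Linear(10, 2)  # 2 classes is an arbitrary choice
logits = classifier(h)
print(logits.shape)  # torch.Size([4, 2])
```

Only the last hidden state is used here because the task assigns one label to the whole sequence.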

Stacked Architecture

Architecture Diagram


PyTorch Code
import torch.nn as nn

# 3 LSTM layers stacked on top of each other
lstm_stack = nn.LSTM(input_size=10, hidden_size=10, num_layers=3)
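Unlike `nn.LSTMCell`, `nn.LSTM` runs over the whole sequence in one call and returns the top layer's hidden state at every step plus the final states of every layer. The sketch below shows the shapes involved and one possible way to attach a classification head. Note that `batch_first=True`, the random input, and the `classifier` layer are assumptions added here; with the default `batch_first=False`, the input would be shaped (time steps, batch, features) instead.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

lstm_stack = nn.LSTM(input_size=10, hidden_size=10, num_layers=3, batch_first=True)

# Made-up batch: (batch, time steps, features).
x = torch.randn(4, 6, 10)

output, (h_n, c_n) = lstm_stack(x)

print(output.shape)  # torch.Size([4, 6, 10]): top layer's hidden state at every step
print(h_n.shape)     # torch.Size([3, 4, 10]): final hidden state of each of the 3 layers
print(c_n.shape)     # torch.Size([3, 4, 10]): final cell state of each of the 3 layers

# The top layer's final hidden state is the last time step of `output`.
same = torch.allclose(h_n[-1], output[:, -1, :])

# Classify each sequence from the top layer's final hidden state.
classifier = nn.Linear(10, 2)  # 2 classes is an arbitrary choice
logits = classifier(h_n[-1])
print(logits.shape)  # torch.Size([4, 2])
```

Each layer receives the hidden-state sequence of the layer below as its input, which is why only the first layer's input size has to match the feature dimension of the data.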