
How to Build an End-to-End Deep Learning Portfolio Project

Last Updated on March 4, 2021 by Editorial Team

Author(s): Yash Prakash

Deep Learning

A complete guide to the steps I used to build a significant, real-world project to showcase proudly on my profile.

Photo by Octavian Dan on Unsplash

It was late December 2020 when, one evening, casually scrolling through my Twitter timeline, I caught a tweet from a famous YouTuber I followed and paused. He had tweeted about what a pain it was to go through the huge number of comments that each of his videos received, and how too often so many good comments (ones he would have really loved to reply to) get lost in the sheer volume.

Being a data science practitioner, I was intrigued by the idea of efficiently handling such a huge inflow of comments on videos. After thinking about it for a few hours, I was convinced that it really was a genuine problem.

It was then that the idea of doing a project based on that particular use case was born. I wanted to do something to make sifting through the massive number of comments easier.

In this article, I will go over how I built a full portfolio project on this idea, and how you can find a particular problem to solve and build a project around it too.

So, hanging on to this thought, my point is:

Focus on a real problem that you might be able to solve with your skills.

It is really necessary to focus on a problem that someone has actually faced: a colleague, a family member, a friend, or anyone who has chosen to share their experiences in a field that you are passionate about.

Forget about deep learning for a while. These are the nine questions that you should ask yourself first:

  1. What problem are you going to solve?
  2. Why do you need to solve it?
  3. What kind of data will the problem require, and can you access it somehow?
  4. If yes, then how and from where do you access it? Through a public API, some web scraping, Kaggle, Google datasets, GitHub, or somewhere else?
  5. Once you have the data, how do you clean it and make it usable to solve that problem?
  6. How do you go about deciding on a modelling approach?
  7. How will you know which is the good solution? Or is there a good solution? Can you define it?
  8. What tools and libraries can you use to model?

and finally,

  9. What will your results look like?

When you decide on a problem, you automatically start thinking about the next steps from the above list. It’s very natural to go down that route and start converting your idea into a full-fledged application.

Therefore, it all begins with a problem and how you will attempt to solve it.

Go ahead and jot down that list now. It is the one I use when doing my own projects too!

Once you have, let’s move on.

From here on, I will go over each of those nine points and describe how I built a whole project around them.

Let’s go!

Photo by Armand Khoury on Unsplash

1. What problem am I going to solve?

As I mentioned in the little anecdote earlier, I like to define this particular point in a single line.

It makes it simple, brief, easy to understand and hence, actionable.

So here went my answer to the question:

I want to make it easier to analyse the comments on a YouTube video.

That’s it. That’s my motivation to make this project. Now go ahead and define yours.

2. Why do I need to solve that problem?

As soon as I began to think about that tweet, I had a feeling that this problem could certainly be solved through some deep learning approach.

It would be fantastic if a deep learning technique could filter the comments and make sure the relevant ones are showcased first, as a group, so that it is easier for the user to read through them and choose to reply to them as well.

So my end goal became: I want to use my skills to help provide a solution to this problem, both because it would be an appropriate use of natural language processing and because it would challenge my ability to bring together a viable solution.

The main takeaway from this question is that you need to know why you want to solve this problem. Is there a need? Is no good solution available? And if something is available, how do you go about making yours different?

3. What kind of data will the problem require and can I access it somehow?

This question was easier to answer: I would need to access comments from the videos somehow.

Through research, I came to know about the convenient YouTube Data API that allows us to do just that: fetch comments. Now I only needed to write a script around it to build my own comments dataset.
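The article does not show the script itself, so here is a minimal sketch of what such a script could look like, using only the standard library against the YouTube Data API v3 `commentThreads` endpoint. The function names, the row keys and the `max_pages` limit are my own assumptions for illustration, and you would need to supply your own video ID and API key.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def parse_comment_threads(response):
    """Flatten one commentThreads response page into simple comment rows."""
    rows = []
    for item in response.get("items", []):
        top = item["snippet"]["topLevelComment"]
        snippet = top["snippet"]
        rows.append({
            "text": snippet["textDisplay"],
            "author": snippet["authorDisplayName"],
            "comment_id": top["id"],
            "like_count": snippet["likeCount"],
        })
    return rows


def fetch_comments(video_id, api_key, max_pages=50):
    """Page through the top-level comments of one video (hypothetical helper)."""
    rows, page_token = [], None
    for _ in range(max_pages):
        params = {
            "part": "snippet",
            "videoId": video_id,
            "maxResults": 100,  # the API returns at most 100 threads per page
            "textFormat": "plainText",
            "key": api_key,
        }
        if page_token:
            params["pageToken"] = page_token
        url = API_URL + "?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            page = json.load(resp)
        rows.extend(parse_comment_threads(page))
        page_token = page.get("nextPageToken")
        if not page_token:  # no further pages
            break
    return rows
```

Keeping the parsing separate from the network call makes it easy to check the row-building logic against a saved response without spending API quota.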

For my project, I’d also answered question no. 4 within this question itself. 😁

This step is crucial: only once you have a source of data can you move on to think about a possible way to model the problem through deep learning.

Now that you know how and where to get access to the data, go ahead and obtain it or, even better, try to build your own dataset like I did!

5. Once I have the data, how do I clean it and make it usable to solve that problem?

I’d collected all the comments I wanted to use for the modelling later. Now came the decision of how to transform the data so that I could feed it into whatever model I spun up later.

Every project requires a different set of steps and practices to transform the data. Hence, spending quality time with your data is very important.

In my case, I had a CSV file of approximately 5000 comments from a video I’d selected. I’d saved it with four columns for every comment row: the text, the author, the comment ID, and the like count.
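As a rough illustration of that four-column layout, the sketch below writes and reads such a file with the standard `csv` module. The column names and the sample row are assumptions of mine, not the author’s actual schema, and an in-memory buffer stands in for the real file.

```python
import csv
import io

# Assumed column names for the four fields described above.
FIELDS = ["text", "author", "comment_id", "like_count"]


def write_comments(rows, fileobj):
    """Write comment rows (dicts keyed by FIELDS) as CSV with a header."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def read_comments(fileobj):
    """Read the CSV back, restoring like_count to an integer."""
    rows = []
    for row in csv.DictReader(fileobj):
        row["like_count"] = int(row["like_count"])  # CSV stores everything as text
        rows.append(row)
    return rows


# Round trip through an in-memory buffer; a real run would open a file
# with newline="" and encoding="utf-8" instead.
buffer = io.StringIO()
sample = [{"text": "Loved the editing!", "author": "Sam",
           "comment_id": "abc", "like_count": 3}]
write_comments(sample, buffer)
buffer.seek(0)
loaded = read_comments(buffer)
```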

Looking closely at the comments, I found that the creator I’d fetched the comments from typically received two kinds of comments:

One, in which a commenter made sure to thank him for the video or generally applaud his film-making, editing skills, etc. These were typically very short, concise comments.

And two, in which the commenter made a remark about a particular thing he liked the most in the video and took some time to really write in depth about it. These comments often ended with a thank you message too. Typically, these comments were on the longer side of the spectrum.

These two types of comments gave me the idea of segregating them into two categories and performing different operations on each, according to the features I would later decide to include in the project.
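One simple way to make that split concrete is a word-count cutoff. The article does not say how the two groups were actually separated, so the rule and the threshold of 10 words below are purely illustrative assumptions.

```python
def split_by_length(comments, max_short_words=10):
    """Separate short 'thank you' style comments from longer, detailed ones.

    max_short_words is a hypothetical cutoff; tune it by reading your own data.
    """
    short, long_ = [], []
    for text in comments:
        if len(text.split()) <= max_short_words:
            short.append(text)
        else:
            long_.append(text)
    return short, long_


comments = [
    "Thanks, great video!",
    "The part where you explained colour grading was my favourite because "
    "it finally made the whole workflow click for me, thank you",
]
short, long_ = split_by_length(comments)
```

Reading a sample of comments on either side of the cutoff is a quick way to see whether the chosen threshold matches the two kinds of comments you observe.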

Therefore, looking back, the main takeaway from this question is:

Make sure to spend a while cleaning, slicing, and transforming your data according to your needs. Research, think and write about what features you’d want to have and go about transforming your data according to those.

Now, let’s go over the brain of the application: the modelling steps. 😄

6. How do I go about deciding on a modelling approach?

It was clear to me from the start that I’d need to experiment with various ways to implement NLP applications in my project. It was important that I spend time studying and researching exactly what I would need to achieve my desired results.

But first, I wanted to define what I wanted to do in the app.

I settled on the following:

  • Make a way to bring out the important topics talked about in the comments: cluster them together
  • Implement a semantic retrieval algorithm to query similar comments from the corpus with respect to a given searched topic

These were the two main features of my project. Later, I included two other things too:

  • Display top emotes used by people in the comments (something diverted from mainstream and a bit more fun)
  • Display the comments as some neat, pretty word clouds! (something aesthetic 🙂)
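The top-emotes feature boils down to counting symbol frequencies across comments. The project itself used the emoji library for the lookup; the sketch below swaps in a standard-library approximation that treats any character in the Unicode "Symbol, other" category as an emote. That shortcut is my assumption: it catches most single-code-point emoji, but it also matches some non-emoji symbols and splits multi-character sequences.

```python
import unicodedata
from collections import Counter


def count_emotes(comments):
    """Count emote-like characters across a list of comment strings."""
    counts = Counter()
    for text in comments:
        for char in text:
            # "So" (Symbol, other) covers most single-code-point emoji,
            # plus some symbols that are not emoji.
            if unicodedata.category(char) == "So":
                counts[char] += 1
    return counts


GRIN = "\U0001F600"   # grinning face
HEART = "\u2764"      # heavy black heart

counts = count_emotes([f"Great video {GRIN}{GRIN}", f"Thanks {HEART} {GRIN}"])
top = counts.most_common(2)
```

`Counter.most_common` then gives the ranking needed for a "top emotes" display.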

It took me weeks to study, research and settle on the things that I wanted to implement in the project.

Remember this: not everything that comes to your mind instinctively can really be included in the final application, and not everything that comes to you late in building the final product has to be left out.

There is no hard and fast rule to abide by here. It is your project. You can include and exclude things as and when you like. So spend time doing just that.

Let’s go over the next question.

7. How will I know which is the good solution? Or is there a good solution? Can I define it?

This step is almost as important as the previous one.

The features I’d decided to include in the app directly answered this question.

If I can make looking through thousands of comments and picking out some good ones really easy for the user (specifically, just a tap of a button), that would be the ideal solution, the one I’m looking for.

Making it as simple as possible from the user’s perspective is crucial to getting a good project done. Not everyone has the eyes of the developer who sits behind the computer screen for hours and days to write the project.

This is the definition of a good project for me. Accomplishing your goal by implementing the required features is as important as making sure those features are incredibly simple to understand.

The solution should be crisp, clean and easy to interpret and analyse. Also, if you can make it easy AND fun, it would be fantastic.

That was my thought as I began to search for ways to, a bit later if possible, build out a frontend for the project as well.

Now come the last two questions. These are the final stepping stones to getting a good product rolled out, so do read them through.

8. and 9. What tools and libraries can I use to model? And what will my results look like?

There were a variety of NLP libraries available for my use case. According to the features I’d settled upon, I decided to use the following:

  • for clustering comments with similar topics, I used sentence-transformers to model semantic similarity for more efficient clustering. I modelled this on the longer set of comments only.
  • for finding the top emotes, I used the emoji library and made sure to look up each emote and store their frequencies. I used this only on the shorter comments.
  • for retrieving comments from a searched query, I used sentence-transformers again but this time, on the entire set of comments.
  • Once that was built, I came across the amazing Streamlit library, which enabled me to make a beautiful frontend in mere hours of work. And that too by writing code in Python itself!
  • Once the UI was done, the next step was to deploy/serve the project in a convenient way. Since this is an open-source project I’m building, I decided to use Docker for it.
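To show the idea behind the clustering and retrieval steps, here is a small sketch that works on embedding vectors with plain cosine similarity. It is not the project’s code: the greedy threshold clustering, the function names and the toy 2-D vectors are all assumptions for illustration. In a real run, the vectors could come from something like `SentenceTransformer("all-MiniLM-L6-v2").encode(comments)`, where the model name is also just an example choice.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, comment_vecs, top_k=3):
    """Return indices of the top_k comments most similar to the query."""
    ranked = sorted(
        range(len(comment_vecs)),
        key=lambda i: cosine(query_vec, comment_vecs[i]),
        reverse=True,
    )
    return ranked[:top_k]


def greedy_clusters(comment_vecs, threshold=0.8):
    """Put each comment in the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of indices; the first one is its seed
    for i, vec in enumerate(comment_vecs):
        for cluster in clusters:
            if cosine(vec, comment_vecs[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:  # no existing cluster was close enough
            clusters.append([i])
    return clusters


# Toy 2-D vectors standing in for real sentence embeddings.
vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
nearest = retrieve([1.0, 0.0], vecs, top_k=2)
groups = greedy_clusters(vecs)
```

With real embeddings, the same two functions would return the comments closest to a searched topic and groups of comments that discuss similar things; the 0.8 threshold would need tuning on actual data.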

It is important that you make sure that someone who comes across your project on GitHub (I’m assuming you’ll be displaying it there) can easily run your app for themselves.

It is therefore necessary to include a clear set of instructions in the README for the project. Explain everything you can think of and more. Not everything that appears obvious to you is apparent to them. So do it. Make it as thorough as possible.

And that is it. We’re finally done.

If you’ve followed along all the way, congratulations! You’re one step closer to making a cool portfolio project that you can be proud of!

Go, follow the steps and DO IT!

Photo by Jackson Simmer on Unsplash

One last thing.

In case you are as excited by the idea of this project as I am and want to learn even more, I have some good news for you.

I am giving away the step-by-step guide to my whole workflow while building this project from scratch, as a FREE eBook of course!

All you have to do is sign up for it here.

Learning Data Science isn’t that hard, but follow me and let’s make it fun together. 🤠

Weeks of hard work yielded a result. Check out this whole project on GitHub. It is called Insight.

Feel free to get in touch with ideas to improve this project, if you really want to. I appreciate any feedback you might have. Also get in touch if you want to build a frontend for the app in React/Vue; it will be fun to collab!

yashprakash13/Insight

Thank you for reading and I hope you gained some good insights from this article. See you in the next one!


How to Build an End-to-End Deep Learning Portfolio Project was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
