My Top 3 Tips for Getting Kaggle Expert Rank With Your First 5 Notebooks

Last Updated on December 1, 2022 by Editorial Team

Author(s): Pere Martra

Originally published on Towards AI.

Becoming a Kaggle Expert requires work, but it is a very achievable goal. I’ll share the three tips that helped me the most to do it with only five notebooks.

Photo by sporlab on Unsplash

Kaggle is the most prestigious data science competition site. A good Kaggle profile can open many doors, and the site is one of the best places to showcase your data science problem-solving skills.

Kagglers earn medals based on how well they do in competitions. But Kaggle isn’t just about competitions. There are four categories in which you can progress.

  • Competitions. Probably the most prestigious category on Kaggle. You receive medals depending on your results in the competitions you participate in.
  • Datasets. Data scientists can publish the datasets they create. Medals are awarded based on the votes of other Kagglers.
  • Notebooks. Perhaps the most prestigious category after Competitions. Not every uploaded notebook has to be associated with an active Kaggle competition, and any notebook can get votes from the community. As with datasets, the more votes a notebook gets, the better the medal and the higher the ranking.
  • Discussion. The least prestigious of the categories. Votes are obtained for comments made on the Kaggle platform.

In each of these categories, there are five ranks: Novice, Contributor, Expert, Master, and Grandmaster.

The first two ranks are really easy to obtain. Let’s say that by filling out the profile, interacting a little with the community, and posting a job, you already reach the Contributor level. Obtaining the Expert rank, however, requires effort: it is the first rank that must be obtained by earning medals.

In this article, I will discuss how to achieve the Expert rank in the Notebook category, which is possibly the most prestigious after competitions.

What are the requirements to achieve Expert rank in the Notebook category on Kaggle?

It sounds easy: we need at least five bronze medals. To get a bronze medal, a notebook must obtain five votes from Kagglers ranked higher than Novice. That seems simple, but consider that only 1 in 20 notebooks on Kaggle receives more than 2 votes. What we are going to attempt is to receive at least 5 votes from Kagglers on 100% of our notebooks.
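The rule above can be sketched in a few lines of Python. This is only an illustration: the bronze threshold (5 votes) and the five-medal requirement come from this article, while the silver (20) and gold (50) thresholds are assumptions based on Kaggle's progression rules as I understand them, and the function names are made up for the example.

```python
# Vote thresholds for notebook medals. Bronze (5) comes from the article;
# silver (20) and gold (50) are assumptions about Kaggle's progression rules.
NOTEBOOK_MEDAL_THRESHOLDS = [("gold", 50), ("silver", 20), ("bronze", 5)]
BRONZE_MEDALS_FOR_EXPERT = 5


def notebook_medal(qualifying_votes):
    """Return the medal for a notebook, or None if it has too few votes.

    `qualifying_votes` should already exclude votes from Novice accounts.
    """
    for medal, threshold in NOTEBOOK_MEDAL_THRESHOLDS:
        if qualifying_votes >= threshold:
            return medal
    return None


def is_notebooks_expert(votes_per_notebook):
    """True when at least five notebooks hold a bronze medal or better."""
    medals = [notebook_medal(v) for v in votes_per_notebook]
    return sum(m is not None for m in medals) >= BRONZE_MEDALS_FOR_EXPERT


votes = [23, 21, 12, 25, 6]  # hypothetical vote counts for five notebooks
print([notebook_medal(v) for v in votes])
print(is_notebooks_expert(votes))
```

Note that a single notebook below the threshold is enough to miss the rank when you only have five, which is why every one of the first five notebooks needs to earn its medal.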

Image by Author

In the image, we can see the requirements for each of the categories. As you can see, I only have the check in the Notebooks category. So, I can only be considered an expert in that category.

Data scientists with more experience, or who are more dedicated to Kaggle, can earn the Expert rank in more than one category. This is usually described as Kaggle Expert × n, where n is the number of categories in which the person is considered an expert.

Competitions are the most prestigious category, followed by Notebooks and Datasets, and finally, discussions.

My first five Notebooks.

This was my first notebook. Currently, it has a silver medal, awarded for getting more than 20 votes from Kagglers ranked higher than Novice.

I intended to get the highest score possible in the MNIST competition with a model I created myself. The notebook uses some techniques that aren’t seen often. For example, instead of using Dropout layers, I used SpatialDropout. I also experimented with callback functions to push the score as high as possible. The most important thing is that the notebook explains the reason for each of the techniques used.
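To give an idea of what these two techniques look like, here is a minimal sketch in Keras. It is not the model from my notebook: the architecture, the dropout rate, and the choice of `EarlyStopping` and `ReduceLROnPlateau` as callbacks are assumptions made for illustration.

```python
import tensorflow as tf


def build_model(drop_rate=0.25):
    """Small CNN for 28x28 grayscale digits using SpatialDropout2D."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        # SpatialDropout2D drops whole feature maps instead of single
        # activations, which suits the correlated outputs of conv layers.
        tf.keras.layers.SpatialDropout2D(drop_rate),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.SpatialDropout2D(drop_rate),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])


model = build_model()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

callbacks = [
    # Stop when validation loss stops improving and keep the best weights.
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=3, restore_best_weights=True),
    # Halve the learning rate when validation loss plateaus.
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.5, patience=2, min_lr=1e-5),
]

# Training call, assuming x_train / y_train hold the MNIST images and labels:
# model.fit(x_train, y_train, validation_split=0.1, epochs=30,
#           callbacks=callbacks)
```

In a notebook, a short comment next to each layer and callback explaining why it is there is what turns code like this into something a reader can learn from.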

The second Notebook now also has a silver medal. Like the previous one, it was also part of one of Kaggle’s basic competitions. The approach on this one was entirely different. I focused 100% on the data transformation, and on explaining step by step the reason for each of the modifications.

The model used was a regression model from the scikit-learn library, and I didn’t spend much time on it. The majority of the work went into data processing and into generating graphs to understand the data and its transformations.

Moreover, I tried to make some functions customizable through variables, so that other Kagglers could play with the notebook and see how the results changed when they altered these values.
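As an illustration of this idea, the sketch below exposes one setting at the top of the cell and uses it inside a transformation function. The specific transformation (a log transform of skewed columns), the names `SKEW_THRESHOLD` and `log_transform_skewed`, and the sample data are all hypothetical; they are not taken from the original notebook.

```python
import math

# Setting a reader can change before re-running the notebook (hypothetical).
SKEW_THRESHOLD = 0.75   # columns more skewed than this get log-transformed


def skewness(values):
    """Population skewness (third standardized moment)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return 0.0
    return sum((v - mean) ** 3 for v in values) / n / var ** 1.5


def log_transform_skewed(columns, threshold=SKEW_THRESHOLD):
    """Apply log1p to each non-negative column whose skewness exceeds threshold.

    `columns` maps a column name to a list of numbers; a new dict is returned.
    """
    result = {}
    for name, values in columns.items():
        if min(values) >= 0 and abs(skewness(values)) > threshold:
            result[name] = [math.log1p(v) for v in values]
        else:
            result[name] = list(values)
    return result


data = {
    "lot_area": [1, 2, 2, 3, 100],   # strongly right-skewed
    "rooms": [4, 5, 6, 5, 4],        # roughly symmetric
}
transformed = log_transform_skewed(data)
print(transformed)
```

Changing `SKEW_THRESHOLD` and re-running the cell is all a reader has to do to see which columns get transformed, which makes the notebook easy to experiment with.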

This notebook was not part of any competition. But I used one of the best-known Kaggle Datasets: Card fraud detection.

It was just an experiment using the scikit-learn library, in which I tried to create a function capable of modifying the data by itself and applying the algorithm to obtain good performance without the participation of the data scientist.

The notebook was much better received than I expected.

This notebook was also part of one of the basic competitions. I was really surprised that it got so many votes. Right now, it has a silver medal. In it, I’m not trying to get a good score; instead, I’m building a simple logistic regression model by hand. It was intended as a simple guide to learning how to create simple models manually.
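To show what building a logistic regression by hand involves, here is a minimal sketch in plain Python trained with batch gradient descent. It is not the code from the notebook: the function names, the learning rate, the number of epochs, and the toy data are assumptions chosen to keep the example short.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one sample: p(y=1 | x) - y
            error = sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias) - yi
            for j in range(n_features):
                grad_w[j] += error * xi[j]
            grad_b += error
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict(X, weights, bias):
    """Return 1 when the predicted probability is at least 0.5, else 0."""
    return [int(sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias) >= 0.5)
            for xi in X]


# Toy features, e.g. counts of positive and negative words per tweet (made up).
X = [[3, 0], [2, 1], [4, 1], [0, 3], [1, 2], [0, 4]]
y = [1, 1, 1, 0, 0, 0]
weights, bias = train_logistic_regression(X, y)
print(predict(X, weights, bias))
```

A library call would replace all of this with two lines, but writing out the gradient step is exactly what makes such a notebook useful as a learning guide.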

The fifth notebook entered the same competition as the previous one and scored much better, yet it got far fewer votes. That surprised me at first, but later, after taking a tour of the other notebooks in the competition, I could see that this one contributed very little. It was just one more of the notebooks that used Transfer Learning to solve the competition.

It is true that the model used was not one of the most widely used, and that I tried to make the notebook a basic introduction to Transfer Learning. Fortunately, it still received a bronze medal, which enabled me to move up to the Kaggle Expert level in the Notebooks category with my first five notebooks.

The three tips to follow.

Now I have some more notebooks in Kaggle and only one of them is without a medal. I have some ideas about what the Notebooks should have to obtain a medal or get the most votes.

Tip 1: Take care of the presentation.

Is this really the first tip? Yes, it is! There are many notebooks on Kaggle. Whether someone gives you more than five seconds of their time can depend on something as simple as their first impression. If I open a notebook and see the phrase that Kaggle puts by default at the beginning, I usually leave without looking much further.

Image By Author

My presentation is not fantastic. I’m a developer, and as you know, we aren’t very good at combining colors or shapes. But I have tried using a highlight for the titles and a different font.

For this, I have used a cell that I hide in the Notebook. The cell contains HTML code that modifies the presentation.

# Hidden styling cell. The HTML object must be the last expression in the
# cell so that the notebook renders it and applies the CSS.
from IPython.display import HTML

HTML("""
<style>
/* Center plot images in their output area */
.output_png {
    display: table-cell;
    text-align: center;
    vertical-align: middle;
}

/* Headings: colored, rounded banners with centered monospace text */
h1 {
    text-align: center;
    background-color: Blue;
    padding: 20px;
    margin: 2px;
    font-family: monospace;
    color: DimGray;
    border-radius: 20px;
}

h2 {
    text-align: center;
    background-color: Red;
    padding: 20px;
    margin: 0;
    font-family: monospace;
    color: DimGray;
    border-radius: 20px;
}

h3 {
    text-align: center;
    background-color: Green;
    padding: 15px;
    margin: 0;
    font-family: monospace;
    color: DimGray;
    border-radius: 15px;
}

h4 {
    padding: 0px;
    margin: 0;
    font-family: monospace;
    color: purple;
}

/* Body text; "charcoal" is not a CSS color keyword, so a hex value is used */
body, p {
    font-family: monospace;
    font-size: 15px;
    color: #36454F;
}

div {
    font-size: 14px;
    margin: 0;
}
</style>
""")

I’m leaving you the code, but I want you to know that I’ve changed some things so that your Notebooks might look different from mine.

Tip 2: Bring something different or useful.

The notebooks that have worked best for me are the ones where I have not just tried to do something different but have explained why. I think that if someone learns something in your notebook, they are more likely to vote for it because they are grateful.

In the MNIST notebook, I showed multiple callback functions and the SpatialDropout layer. It’s not really advanced knowledge, but it is something that many people who are starting with TensorFlow may not know.

In the House Prices notebook, I explained each of the data transformation techniques, and there were many of them. I am convinced that it is one of the notebooks with the most explanations in this regard.

Perhaps the best example is the notebook that analyzes the sentiment of tweets by creating a logistic regression model. It’s not a shiny notebook, far from it, and it doesn’t get great performance. But it is almost unique: it teaches a different way of dealing with the problem than the usual one, and it teaches how to create a model from scratch.

The notebook that got the fewest votes was the one based on Transfer Learning and, in hindsight, it was the most ordinary one. Many notebooks using the same technique can be found.

Tip 3: End the Notebook with conclusions.

At the end of each notebook, I include a conclusions section, trying to close it with a summary of what has been achieved and what has been shown.

Not only that: in most of them, I encourage people to fork the notebook and give them ideas on how it can be improved, so they can try them out.

The same goes for mentions and inspirations. I always try to point out the other notebooks or articles I consulted while creating the notebook.

Is that all?

For me, these are the three most important pieces of advice I can give. There are people who are dedicated to asking for votes in the comments, and I’m sure they have success with that strategy. But I think it’s not worth it.

It is much better to put a little care into our notebooks so that the content is interesting and well explained. A notebook with brilliant code but no explanation at all may perform very well in a competition, but it will have a harder time in the Notebooks category, where you depend on the votes of other Kagglers.

The last piece of advice is to have fun with Kaggle! Don’t get hung up on medals; use Kaggle to learn.

I hope we’ll see you there! Do not hesitate to contact me through Kaggle or in the comments. I will be delighted to visit your notebooks!

Image By Author

I write about TensorFlow and machine learning regularly. Consider following me on Medium to get updates about new articles. And, of course, you are welcome to connect with me on LinkedIn.

If you like TensorFlow and want to learn some interesting techniques, check out my series: TensorFlow Beyond The Basics.


My Top 3 Tips for Getting Kaggle Expert Rank With Your First 5 Notebooks was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.
