
Digital Learning During Covid19: A Complete Analysis

Last Updated on January 7, 2023 by Editorial Team

Author(s): Abid Ali Awan

Originally published on Towards AI.

Data Analysis

Using data analysis tools to identify trends in digital learning during Covid19 and to assess how effective online platforms are for impoverished communities.

Photo by Compare Fibre on Unsplash

Introduction

The Covid19 pandemic created an imbalance in the education system, and many students do not even have access to educational tools and online learning platforms. There is an urgent need for solutions. Using the LearnPlatform dataset, we will look for common patterns and identify clusters based on demography, geography, and accessibility.

Project goals

In this project, we will use data analysis tools to identify trends in digital learning and to assess how effective it is for impoverished communities. We will compare districts and states on factors such as demography, internet access, learning product access, and finance. In the end, we will summarize our findings and point to the areas that need more attention to make education accessible to all students in the United States.

Image by Author

Data

The dataset is available on Kaggle under a noncommercial license.

Use these commands to download the dataset faster:

!pip install kaggle
!kaggle competitions download -c learnplatform-covid19-impact-on-digital-learning

Don't forget to add your Kaggle API key 👇

Image by Author

We have three types of data:

  1. Products data contains service names, companies, and educational sectors.
  2. District data contains demography, locations, and educational spending.
  3. Engagement data contains students’ engagement with different products per day.

Product

The product file products_info.csv includes information about the characteristics of the top 372 products with the most users in 2020. The categories listed in this file are part of LearnPlatform's product taxonomy.

Image by Author

Loading educational product data

As we can see, we have product names, companies, and sector data along with primary functions.
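The loading step can be sketched as follows. This is a minimal illustration rather than the original notebook code: the column names are assumed from the dataset description, and the two rows are made-up placeholders standing in for products_info.csv.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for products_info.csv.
# Column names are an assumption based on the dataset description.
sample_csv = io.StringIO(
    "LP ID,URL,Product Name,Provider/Company Name,Sector(s),Primary Essential Function\n"
    "13117,https://example.com/a,Product A,Company A,PreK-12,LC - Digital Learning Platforms\n"
    "66933,https://example.com/b,Product B,Company B,PreK-12; Higher Ed,CM - Classroom Engagement & Instruction\n"
)

# With the real data this would be pd.read_csv("products_info.csv").
products = pd.read_csv(sample_csv)
print(products.head())
```

The district file can be read the same way by pointing pd.read_csv at districts_info.csv.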

District

The district file districts_info.csv includes information about the characteristics of school districts, including data from

  • NCES (2018–19),
  • FCC (Dec 2018), and
  • Edunomics Lab.

Image by Author

Loading district data

We have different states, locales, and educational spending, and more importantly, we have demographic data.

Engagement data

The engagement data are aggregated at the school district level and each file in the folder engagement_data represents data from one school district.

Image by Author

Engagement data extraction function

Our engagement data is divided into multiple .csv files, one per district id, so we need a Python function that reads the individual files and concatenates them into one DataFrame.
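A minimal sketch of such a loader is shown here. It assumes that each file name (without the extension) is the district id, and the function name load_engagement and the demo files are hypothetical.

```python
import glob
import os
import tempfile

import pandas as pd


def load_engagement(folder):
    """Read every CSV in `folder` and stack them, tagging rows with the district id."""
    frames = []
    for path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        frame = pd.read_csv(path)
        # Assumption: the file name (without extension) is the district id.
        frame["district_id"] = os.path.splitext(os.path.basename(path))[0]
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


# Tiny demonstration with two made-up district files.
with tempfile.TemporaryDirectory() as demo_dir:
    for district in ("1000", "1039"):
        pd.DataFrame(
            {
                "time": ["2020-01-01"],
                "lp_id": [1],
                "pct_access": [0.5],
                "engagement_index": [10.0],
            }
        ).to_csv(os.path.join(demo_dir, f"{district}.csv"), index=False)
    engagement = load_engagement(demo_dir)

print(engagement)
```

In this sketch district_id is stored as text; it would need to be cast to the same type as the id column in the district data before any merge.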

Loading engagement data

It took us 19 seconds to load all files from the folder, and now we have a DataFrame containing the engagement data for all districts.

Missing data

Let’s look at the missing values per column in all three DataFrames.

Plot missing function

The function helps us plot an Altair bar chart of the missing values in each DataFrame.
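One way to write such a helper is sketched below. The names missing_counts and plot_missing are hypothetical, and the chart styling is an assumption. Altair is imported inside the plotting function so the counting step works on its own.

```python
import pandas as pd


def missing_counts(df):
    """Return a tidy frame with the number of missing values per column."""
    counts = df.isna().sum()
    return counts.rename_axis("column").reset_index(name="missing")


def plot_missing(df, title):
    """Bar chart of missing values per column (requires the altair package)."""
    import altair as alt  # imported lazily so missing_counts works without it

    return (
        alt.Chart(missing_counts(df))
        .mark_bar()
        .encode(x=alt.X("column:N", sort="-y"), y="missing:Q")
        .properties(title=title)
    )


# Made-up frame with a few gaps, for illustration only.
demo = pd.DataFrame(
    {
        "state": ["Utah", None, "Illinois"],
        "locale": ["Suburb", "City", None],
        "id": [1, 2, 3],
    }
)
summary = missing_counts(demo)
print(summary)
```

Calling plot_missing(demo, "Missing values") in a notebook would render the chart for the same frame.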

As we can observe, the missing values in our district dataset are mostly in pp_total_raw, pct_free/reduced, and county_connections_ratio. Let's see all the missing values together in the next section.

The subplot below shows the missing values in all three datasets. We will remove them in the next section, as it is hard to replace them without the actual information.

Cleaning data

We will clean the district data, as several of its columns contain ranges of values rather than single numbers. We will make use of both limits of each range.

  1. Converting the pct_black/hispanic and pct_free/reduced ranges into floats.
  2. Converting the pp_total_raw ranges into integers.
  3. Dropping county_connections_ratio because it holds the same 0.18 to 1 range across all categories.
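The three steps above can be sketched as follows. This assumes the ranges are stored as text such as "[0.2, 0.4[" and that "using both limits" means taking their mean; both are assumptions, and range_midpoint is a hypothetical helper.

```python
import pandas as pd


def range_midpoint(value):
    """Turn a text range such as '[0.2, 0.4[' into the mean of its two limits."""
    if pd.isna(value):
        return float("nan")
    low, high = str(value).strip("[]").split(",")
    return (float(low) + float(high)) / 2


# Made-up rows in the assumed text-range format.
districts = pd.DataFrame(
    {
        "pct_black/hispanic": ["[0, 0.2[", "[0.2, 0.4["],
        "pct_free/reduced": ["[0.4, 0.6[", "[0.6, 0.8["],
        "pp_total_raw": ["[14000, 16000[", "[8000, 10000["],
        "county_connections_ratio": ["[0.18, 1[", "[0.18, 1["],
    }
)

# Steps 1 and 2: ranges become floats, spending becomes an integer.
for col in ["pct_black/hispanic", "pct_free/reduced"]:
    districts[col] = districts[col].map(range_midpoint)
# astype(int) requires that rows with missing values were dropped beforehand.
districts["pp_total_raw"] = districts["pp_total_raw"].map(range_midpoint).astype(int)

# Step 3: the connection ratio carries no information, so drop it.
districts = districts.drop(columns="county_connections_ratio")
print(districts)
```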

Feature distributions visualization

In this section, we will explore the distributions of different features.

Districts per state

As we can see, the dataset contains more districts from Utah and Illinois than from any other state.

Locale distribution

Suburbs dominate with 59 percent, which may also mean that middle- to upper-class households have more access to the internet.

Educational product providers

Google provides far more educational products than any other company.

In short, online education is dominated by Google LLC.

Let’s see which products Google provides.

Educational sector distributions

Educational sectors are divided into three categories, PreK-12, Higher Education, and Corporate.

Some products are specific to a sector but others are quiteΒ general.

As we can see, PreK-12 education dominates the distribution with 54 percent.

Primary functions with main and subcategories

There are three main categories of primary functions:

  1. LC = Learning & Curriculum
  2. CM = Classroom Management
  3. SDO = School & District Operations

You can interact with the Plotly sunburst plot to explore the distribution of main categories and subcategories.
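Preparing the data for a sunburst plot might look like the sketch below. It assumes that the primary function column stores values in the form "LC - Digital Learning Platforms"; the column name and the sample rows are assumptions.

```python
import pandas as pd

# Made-up rows in the assumed "MAIN - Subcategory" format.
products = pd.DataFrame(
    {
        "Primary Essential Function": [
            "LC - Digital Learning Platforms",
            "LC - Content Creation & Curation",
            "CM - Classroom Engagement & Instruction",
            "SDO - Learning Management Systems (LMS)",
        ]
    }
)

# Split once on " - " into a main category and a subcategory.
parts = products["Primary Essential Function"].str.split(" - ", n=1, expand=True)
products["main_function"] = parts[0]
products["sub_function"] = parts[1]

# Count products per (main, sub) pair; this is the table a sunburst needs.
counts = (
    products.groupby(["main_function", "sub_function"]).size().reset_index(name="n")
)
print(counts)
```

A table like counts can then be passed to plotly.express.sunburst with path=["main_function", "sub_function"] and values="n".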

Learning & Curriculum products hold the majority share in this group, and the most common subcategory is digital learning platforms.

Essential function subcategories distribution

Here we can observe the distribution of all the subcategories.

Content creation and digital learning are leading in this market.

Merging three datasets

We will merge all three datasets on LP ID and district id. As we can see, the combined dataset has 18 columns, which will later help us with both the geographical and the time series plots.
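A minimal sketch of the merge, using made-up miniature frames. The key names lp_id, "LP ID", and district_id are assumptions about the column names, and an inner join is one possible choice.

```python
import pandas as pd

# Made-up miniature versions of the three datasets.
products = pd.DataFrame({"LP ID": [1, 2], "Product Name": ["Product A", "Product B"]})
districts = pd.DataFrame({"district_id": [1000, 1039], "state": ["Utah", "Illinois"]})
engagement = pd.DataFrame(
    {
        "time": ["2020-01-01", "2020-01-01", "2020-01-02"],
        "lp_id": [1, 2, 1],
        "district_id": [1000, 1039, 1039],
        "pct_access": [0.5, 0.2, 0.3],
    }
)

# Engagement is the fact table: attach product details, then district details.
# The key columns must share a dtype on both sides of each merge.
merged = (
    engagement
    .merge(products, left_on="lp_id", right_on="LP ID", how="inner")
    .merge(districts, on="district_id", how="inner")
)
merged["time"] = pd.to_datetime(merged["time"])
print(merged)
```

An inner join drops engagement rows whose product or district is unknown; a left join would keep them with missing values instead.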

Top educational products

Google products dominate, with Wikipedia and Netflix as the exceptions. Netflix provides kids with educational content and tutorials.

Distribution of race, reduced fee, expenditures, and Internet connection per state

We will use a Pandas function to display a table showing the mean of various features related to demography and expenditures for each state.
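Such a table can be produced with a groupby, as in this sketch. The numbers are made up and the column names are assumed to match the cleaned district data.

```python
import pandas as pd

# Made-up district rows; the real table uses the cleaned district data.
districts = pd.DataFrame(
    {
        "state": ["Utah", "Utah", "New York"],
        "pct_black/hispanic": [0.1, 0.3, 0.5],
        "pct_free/reduced": [0.3, 0.5, 0.7],
        "pp_total_raw": [9000, 11000, 21000],
    }
)

# Mean of every numeric feature per state, highest spenders first.
state_means = (
    districts.groupby("state")
    .mean(numeric_only=True)
    .sort_values("pp_total_raw", ascending=False)
)
print(state_means)
```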

Black and Hispanic

It seems that Texas has more Black/Hispanic students than any other state, followed by Florida, Michigan, and Minnesota.

Reduced fee or free education

Minnesota provides free or reduced-fee education to 70 percent of its students, followed by Michigan and Indiana.

Per-pupil total expenditure

New York spends more on education than any other state in the US. The runner-up states are New Jersey and Minnesota.

Time series distribution of students' educational product access

We will look at time series of product access based on locale and state.
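The aggregation behind plots like these can be sketched as a daily mean per group. The rows below are made up, and the column names time, locale, and pct_access are assumptions about the merged dataset.

```python
import pandas as pd

# Made-up rows from the merged dataset.
merged = pd.DataFrame(
    {
        "time": pd.to_datetime(
            ["2020-03-02", "2020-03-02", "2020-03-02", "2020-03-03"]
        ),
        "locale": ["City", "City", "Suburb", "Suburb"],
        "pct_access": [0.4, 0.6, 1.0, 0.8],
    }
)

# Mean product access per day and locale, with one column per locale.
daily = merged.groupby(["time", "locale"])["pct_access"].mean().unstack("locale")
print(daily)
```

Each column of daily is one line in the time series plot. Grouping by state instead of locale gives the state-level version.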

Based on Locale

Cities were hit hardest by Covid19, which affected students the most because they had to study from home. That is why we see a larger dip in product access in cities from April to July 2020 than in other locales. During the summer vacation, students stopped using educational products. In suburbs, students have more access to the tools than in any other locale. Overall, all locales follow a common pattern.

Based on the top five states

New York was hit hardest by Covid19, yet that did not affect students' online learning: as you can see, the graph became more consistent after the lockdown.

New York, Wisconsin, and Indiana have more students using these educational products overall.

Based on the bottom five states

Oddly, Michigan and North Carolina show almost zero product access from March until the end of August. Texas has a similar pattern but with a peak in June. Overall, these bottom states became active after September 2020.

It is strange that some states show such an odd pattern, perhaps due to a weak educational system or a lack of awareness.

Geographical analysis

This section will be looking at different states based on product access and engagement index.

We do not have data for all states, so you might see a lot of gray space on the map.

Education products access per state

New York has a higher product access score than any other state, followed by the states shown in orange: Wisconsin, Illinois, and Indiana. This might be linked to the lower Black/Hispanic population in these states. Let’s explore that in the next part.

There is a relationship between product access and a lower Black/Hispanic population, but it is not strong. New York and Wisconsin have a lower Black/Hispanic population and a higher access rate, but the other states with a lower percentage of people of color are all over the place. The system is fairly balanced towards other races.

Engagement index per state

The engagement distribution per state is quite similar to that of product access. It is logical that if students have access to these products, their usage of these platforms is likely to increase.

Focusing on people from impoverished neighborhoods

In this section, we will focus on the Black/Hispanic community and on free or reduced fees.

Correlation

There is a high correlation between product access and engagement index, which means that students who have access to these products are likely to use them on a daily basis. We can also see another high correlation between Black/Hispanic and free/reduced-fee education. This suggests the government is doing its job in helping impoverished communities get a proper education.
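A correlation matrix like the one discussed here can be computed with pandas, as in this sketch. The values are made up purely to show the call, and using the Pearson coefficient is an assumption.

```python
import pandas as pd

# Made-up values; the real matrix is computed on the merged dataset.
merged = pd.DataFrame(
    {
        "pct_access": [0.1, 0.4, 0.6, 0.9],
        "engagement_index": [10.0, 45.0, 55.0, 100.0],
        "pct_black/hispanic": [0.1, 0.3, 0.5, 0.7],
        "pct_free/reduced": [0.2, 0.3, 0.6, 0.7],
    }
)

# Pairwise Pearson correlation between all numeric columns.
corr = merged.corr(method="pearson")
print(corr.round(2))
```

The resulting square table can be passed to any heatmap function for display.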

Black/Hispanic product access over the year

The graph below shows that, over time, communities with a smaller Black and Hispanic population follow a pattern similar to some of the top states discussed earlier. Let’s compare this graph with the mixed communities.

We can see some changes, as the peak has fallen from 1.4 to 1.2. There is a disparity among the different racial groups, but it is marginal compared to other countries. We can say that Black/Hispanic communities have less access to educational products.

Code

Conclusion

The data is limited to a few states, and I did not find any concrete evidence that Black or Hispanic communities are treated unfairly. Online education is fairly balanced towards all. I did see that some states perform worst in terms of product access, which might be due to bad policymaking and a lack of awareness. We can see that Google dominates the online education industry by providing a complete ecosystem. Most of these products target PreK-12 education. Suburbs have the most students accessing these products, and we can see a clear correlation between product access and engagement index. There is also a high correlation between Black/Hispanic and free/reduced-fee education, which suggests the government is doing its best to help poor communities by providing free education.

You can follow me on LinkedIn, Twitter, and Polywork, where I post my articles on a weekly basis.


Digital Learning During Covid19: A Complete Analysis was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.
