Why do we need a Data-Centric AI Community?

Last Updated on March 2, 2022 by Editorial Team

Author(s): Fabiana Clemente

Originally published on Towards AI.

A place to discuss data quality for data science

Photo by Duy Pham on Unsplash

According to Alation’s State of Data Culture Report, 87% of employees cite poor data quality as the reason most organizations fail to adopt AI meaningfully. According to a 2020 study by McKinsey, high-quality data is crucial for the digital transformations that propel an organization past its competitors.

As machine learning algorithms and coding frameworks evolve rapidly, it’s safe to say the scarcest resource in AI is high-quality data at scale. High-quality data is the bottleneck.

Despite several findings on the importance of data in the AI industry, more than 90% of research papers in the AI domain are still model-centric. According to Andrew Ng, this is due to the difficulty in creating large datasets that can become generally recognized standards.

The fact is that the plateau Machine Learning has reached can only be overcome by improving both the quality and the quantity of data.

Thus the data-centric movement was born. The movement represents the recent shift in focus from modeling to the underlying data used to train and evaluate models.

As a next step in the movement, today we’re excited to announce the Data-Centric AI Community, a place to discuss data quality for data science.

What is Data-Centric AI and Why We Should Care

Photo by Papaioannou Kostas on Unsplash

Data-Centric AI is the approach to AI development that considers the training dataset as the centerpiece of the solution instead of the model.

Let’s take a step back and understand the hype around data-centric AI. Coined by Andrew Ng, the term emphasizes the importance of focusing on data quality over algorithms and models. Further, deeplearning.ai and Landing AI announced the first-ever data-centric competition. It not only created awareness but also inverted the traditional competition format: participants were asked to improve a dataset given a fixed model.
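To make that inverted setup concrete, here is a minimal sketch of the idea, not the competition’s actual code. The model (a 1-nearest-neighbour rule) stays fixed, and only the training labels change. The toy data and the `corrections` mapping are hypothetical and stand in for the output of a manual label review.

```python
def predict_1nn(train, x):
    """Fixed model: label of the nearest training point on one numeric feature."""
    nearest = min(train, key=lambda row: abs(row[0] - x))
    return nearest[1]


def accuracy(train, test):
    """Share of test points the fixed model labels correctly."""
    hits = sum(predict_1nn(train, x) == y for x, y in test)
    return hits / len(test)


# Hypothetical ground truth: feature < 5 is "a", otherwise "b".
test = [(1.0, "a"), (2.0, "a"), (3.5, "a"), (6.5, "b"), (8.0, "b"), (9.0, "b")]

# Training set with two mislabeled rows (index 1 and index 3).
noisy_train = [
    (1.2, "a"), (2.1, "b"), (3.4, "a"),
    (6.4, "a"), (7.9, "b"), (9.1, "b"),
]

# Data-centric step: fix the labels, leave the model untouched.
# Assumption: these corrections come from a human review of the data.
corrections = {1: "a", 3: "b"}
clean_train = [(x, corrections.get(i, y)) for i, (x, y) in enumerate(noisy_train)]

print(f"noisy labels: {accuracy(noisy_train, test):.2f}")  # noisy labels: 0.67
print(f"clean labels: {accuracy(clean_train, test):.2f}")  # clean labels: 1.00
```

The same model scores higher once two labels are fixed, which is the kind of gain a data-centric competition rewards.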

Finally, in 2021, the Data-Centric AI workshop was held to cultivate the DCAI community into a vibrant interdisciplinary field that tackles practical data problems. Several companies have adopted the approach and produced results. According to Landing AI, improvements from adopting a data-centric approach include:

  • building computer vision applications 10x faster
  • reducing the time to deploy an application by 65%
  • improving yield and accuracy by up to 40%

With these benefits already demonstrated in the industry, the DCAI Community aims to supply the missing piece of the data-centric AI movement.

The 3 Pillars of the Data-Centric AI Community

While the data-centric approach is still evolving and can span various stages of the machine learning lifecycle, we’ve identified the most significant pain points among data scientists and aim to focus on those in the DCAI Community.

We call them the 3 pillars of the DCAI community:

  • Data Profiling: Understanding the existing data is the first step to improving it. Profile your data in a few lines of code. Give pandas-profiling a try!
  • Synthetic Data: Synthetic data is artificially generated data that keeps the properties of the original data, preserving its business value while remaining privacy compliant. Give ydata-synthetic a try!
  • Data Labelling: Isn’t it one of your most significant pain points in data quality? The DCAI Community cultivates meaningful discussions around this and other topics in our Slack workspace!
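As a rough illustration of the first two pillars, here is a minimal standard-library sketch. It does not use the pandas-profiling or ydata-synthetic APIs, which do far more. The `profile` and `synthesize` helpers and the toy table are hypothetical, and sampling from a fitted normal distribution is only a naive stand-in for a real synthetic data generator.

```python
import random
import statistics


def profile(rows):
    """Per-column summary: missing and distinct counts, plus basic numeric stats."""
    report = {}
    for col in rows[0].keys():
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None]
        summary = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        # Only numeric columns get mean/min/max.
        if present and all(isinstance(v, (int, float)) for v in present):
            summary["mean"] = statistics.mean(present)
            summary["min"] = min(present)
            summary["max"] = max(present)
        report[col] = summary
    return report


def synthesize(rows, column, n, seed=0):
    """Naive synthetic sampler: draw n values from a normal fit to one numeric column."""
    present = [row[column] for row in rows if row[column] is not None]
    mu = statistics.mean(present)
    sigma = statistics.stdev(present)
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical toy table with one missing value.
rows = [
    {"age": 34, "city": "Lisbon"},
    {"age": 29, "city": "Porto"},
    {"age": None, "city": "Lisbon"},
    {"age": 41, "city": "Seattle"},
]

report = profile(rows)
synthetic_ages = synthesize(rows, "age", n=1000)

print(report["age"])
print(round(statistics.mean(synthetic_ages), 1))
```

The profile surfaces the missing `age` value before any modeling starts, and the synthetic sample keeps the column’s mean close to the original without copying any real record. A real generator would also need to preserve relationships between columns, which this sketch ignores.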

In addition, we have collated, and continue to collate, useful open-source tools and resources on the 3 pillars in the Awesome-Data-Centric-AI GitHub repository.

Endless Possibilities Together

Data-centric community (image by the author)

“What can I expect?” I hear you ask.

At the Data-Centric AI Community, we believe that together, we can actively change the paradigm towards better data. We want to bring together experts from the industry and foster meaningful conversations.

Expect a regular calendar of events and content that will help you understand this approach better and become a data-centric AI evangelist. As we partner with experts in the industry, you will get much-needed guidance directly from those who have already done what you plan to do.

Accelerating AI with improved data is at the core of what we do, and this open-source community is another step in that journey. We invite you to be part of it: together, the possibilities are endless.

Fabiana Clemente is CDO at YData.

Accelerating AI with improved data.

YData provides the first data development platform for Data Science teams.

