Data-Centric AI: Decoding the Hype

Last Updated on March 24, 2022 by Editorial Team

Author(s): Paul Dovidavicius

Originally published on Towards AI.

Data-centric versus model-centric approaches to AI

Photo by fauxels from Pexels

The impact of data on the quality of ML-based solutions was the subject of a session given by Andrew Ng (Zagatti, 2021). He argued for a larger investment in data preparation and, together with his team, demonstrated that improving the quality of the data that already exists can be as effective as collecting three times as much data.

Different data, the same model

As AI practitioners, we know the components that make up an AI solution:

"AI System = Code + Data, where code means model/algorithm"

This means we can improve a solution by improving the code, by improving the data, or both. What is the best way to strike the right balance in order to succeed?

With datasets freely available through open databases or Kaggle, for instance, it is easy to see why the model-centric approach took hold: the data is more or less well-behaved, so improving a solution meant focusing on the only element that could be tweaked and changed, the code. However, what we see in industry is a very different story. Andrew Ng expressed a viewpoint with which I completely agree: until now, the model-centric approach has had a significant impact on the tooling available to data science teams in the ML space.
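To make the "AI System = Code + Data" decomposition concrete, here is a minimal sketch that holds the code fixed and changes only the data. The 1-nearest-neighbour classifier, the toy feature values, and the flipped labels are all invented for illustration; none of them come from Andrew Ng's experiments.

```python
def predict_1nn(train_x, train_y, x):
    """Return the label of the training point closest to x (the fixed 'code')."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def accuracy(train_x, train_y, test_set):
    """Fraction of (x, label) test pairs the 1-NN model classifies correctly."""
    hits = sum(predict_1nn(train_x, train_y, x) == y for x, y in test_set)
    return hits / len(test_set)


# Hypothetical toy data: one feature, two classes.
train_x = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
noisy_y = [0, 1, 0, 1, 0, 1]   # two labels flipped by an imagined careless annotator
clean_y = [0, 0, 0, 1, 1, 1]   # the same examples after the labels are corrected

test_set = [(1.1, 0), (1.9, 0), (2.2, 0), (7.9, 1), (8.1, 1), (9.2, 1)]

noisy_acc = accuracy(train_x, noisy_y, test_set)
clean_acc = accuracy(train_x, clean_y, test_set)
print(f"same model, noisy labels: {noisy_acc:.2f}")
print(f"same model, clean labels: {clean_acc:.2f}")
```

On this toy set the unchanged model goes from 2 of 6 to 6 of 6 correct test predictions once the two flipped labels are fixed. The numbers mean nothing beyond the illustration; the point is that the data was the only thing that changed.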

Data-centric vs. model-centric

In my opinion, achieving a solid AI solution requires balancing the model-centric and data-centric perspectives; yet I am aware that the data side holds the greater share of the value. Happily, this viewpoint is not based on intuition alone: Andrew Ng and his team chose to demonstrate it with different experiments using real-world data. But first, let's clarify what it means to be model-centric and data-centric.

Image by Author

One of the examples presented during the session was the detection of defects in steel sheets: given a series of photos of different steel sheets, the aim is to build the best model for recognizing defects during manufacturing. The baseline system is a computer vision model with well-tuned hyperparameters. The goal is to achieve 90 percent accuracy. How can it be accomplished?

Starting from the baseline, reaching 90 percent looks nearly impossible. The model-centric approach sought improvement through network architecture search and state-of-the-art architectures. The data-centric approach instead identified inconsistencies and cleaned noisy labels. The findings are as follows:

Image by Author

For steel sheet defect detection, the baseline accuracy is 76.2%. The model-centric approach adds +0%, while the data-centric approach adds +16.9%, which shows the improvement delivered by working on the data (Dario, 2021).
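The arithmetic behind these figures is worth spelling out, since the reported gains are percentage points added to the baseline. The short sketch below uses only the numbers quoted above; the variable names are hypothetical.

```python
BASELINE = 76.2   # baseline accuracy, in percent
TARGET = 90.0     # accuracy goal, in percent

# Reported gains, in percentage points over the baseline.
gains = {"model-centric": 0.0, "data-centric": 16.9}

# Round to one decimal to avoid floating-point noise in the sum.
results = {name: round(BASELINE + gain, 1) for name, gain in gains.items()}

for name, acc in results.items():
    verdict = "meets" if acc >= TARGET else "misses"
    print(f"{name}: {acc}% ({verdict} the {TARGET}% target)")
```

Only the data-centric result, 76.2 + 16.9 = 93.1 percent, clears the 90 percent goal; the model-centric result stays at 76.2 percent.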

The benefits of adopting a data-centric approach are not limited to computer vision; they also apply to other areas such as natural language processing (NLP) and tabular and time-series data.

Why is it important to switch from a model-centric to a data-centric approach?

Data is extremely important in AI development, and adopting a strategy that prioritizes obtaining high-quality data is critical. After all, useful data is not only hard to come by and noisy, but also extremely expensive to get. Data for AI should be treated the same way we would care for the best materials when building a house. Choosing the right model and hyperparameters still matters for performant, generalizable results, but those results depend on the quality of the data used to train the model, and clean, de-noised datasets are becoming the fundamental differentiator. Semi-supervised learning techniques can be highly beneficial for detecting and correcting inconsistencies, and synthetic data can be used to produce and simulate more events to aid with generalization issues.
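Detecting inconsistencies does not have to start with semi-supervised learning. As a simpler stand-in, the sketch below flags examples whose annotators disagree so they can be reviewed and relabelled. The function names, the annotation format, and the sample image ids are hypothetical and chosen only for illustration.

```python
from collections import Counter


def find_inconsistent_labels(annotations):
    """Map each item id with conflicting labels to its label counts."""
    flagged = {}
    for item_id, labels in annotations.items():
        counts = Counter(labels)
        if len(counts) > 1:          # more than one distinct label: disagreement
            flagged[item_id] = dict(counts)
    return flagged


def majority_label(labels):
    """Return the most common label, or None when the top two are tied."""
    ranked = Counter(labels).most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None                  # tie: needs a human decision
    return ranked[0][0]


# Hypothetical annotations: three annotators per steel-sheet image.
annotations = {
    "sheet_001": ["defect", "defect", "defect"],
    "sheet_002": ["defect", "ok", "defect"],
    "sheet_003": ["ok", "scratch", "defect"],
}

flagged = find_inconsistent_labels(annotations)
for item_id in flagged:
    print(item_id, flagged[item_id], "->", majority_label(annotations[item_id]))
```

Here `sheet_002` can be resolved by majority vote, while `sheet_003` has no majority and would go back to a reviewer. In practice, disagreements like the second one often point to an ambiguous labelling guideline rather than a single careless annotator.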

Final Thoughts

Data is one of the most expensive assets today, thanks to the infrastructure involved, the number of people dedicated to it, and the rarity of acquiring it under optimal conditions. Data quality must be maintained and improved at every stage of AI development, each of which will require different frameworks and tools. And don't forget, this must be delivered and measured on a continual basis, making MLOps a valuable ally in achieving a suitable and successful data-centric paradigm in AI solution development.

References

Hajij, M., Zamzmi, G., Ramamurthy, K. N., & Saenz, A. G. (2021). Data-Centric AI Requires Rethinking Data Notion. arXiv preprint arXiv:2110.02491.

Zagatti, G. A., Ng, S. K., & Bressan, S. (2021). A Data Warehouse of Wi-Fi Sessions for Contact Tracing and Outbreak Investigation. In Transactions on Large-Scale Data-and Knowledge-Centered Systems XLVIII (pp. 85–104). Springer, Berlin, Heidelberg.

Rafi Karlansik (2021). Need for data-centric ML platforms. Available at: https://databricks.com/blog/2021/06/23/need-for-data-centric-ml-platforms.html

Fabiana Clemente (2019). From model-centric to data-centric. Available at: https://towardsdatascience.com/from-model-centric-to-data-centric-4beb8ef50475

Dario Radecic (2021). Data-centric vs Model-centric AI? The Answer is clear. Available at: https://towardsdatascience.com/data-centric-vs-model-centric-ai-the-answer-is-clear-4b607c58af67


Data-Centric AI: Decoding the Hype was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.
