
Uncovering K-means Clustering for Spatial Analysis

Last Updated on August 6, 2024 by Editorial Team

Author(s): Stephen Chege-Tierra Insights

Originally published on Towards AI.

Created by the author with DALL E-3

"Underrated (adjective): rated or valued too low." - Merriam-Webster

Underrated, unappreciated, and underhyped are terms used to suggest that something does not get the recognition it deserves. They are often applied to people who are very effective in their profession yet receive little public attention, although such a judgment can reflect personal bias.

For example, I think NBA player Kawhi Leonard is the most underrated and criminally underhyped player of all time. Rapper Nathan John Feuerstein, also known as NF, is also highly underrated. Neither fits the modern-day image of an athlete or a rapper.

The same could be said of some machine learning algorithms, which are not discussed with the excitement they deserve. As we enter a golden age of artificial intelligence and machine learning, some algorithms will be propped up while others risk falling by the wayside.

One such algorithm is K-means, an unsupervised algorithm that is widely used but has not reached the popularity of random forest or k-nearest neighbors. As I continue writing about and researching machine learning algorithms and their impact on the spatial sector, let us look at K-means and what it offers GIS professionals.

What is K-Means Clustering?

K-means is an unsupervised machine learning approach that divides an unlabeled dataset into clusters. The purpose of this article is to examine the principles and operation of K-means clustering as well as its applications and implications, especially for geospatial analysis.

Unsupervised machine learning is the process of having an algorithm work on unlabeled, unclassified data without human guidance. In this scenario, the machine's task is to group unsorted data according to similarities, patterns, and differences without any prior training on labeled examples.

The K in K-means is the number of clusters: the algorithm divides data points into K clusters based on how far each point is from the cluster centres (centroids). The centroids are first placed at random positions in the feature space.

To process the data, the K-means algorithm starts with an initial group of randomly selected centroids, which serve as the starting points for the clusters, and then performs iterative (repetitive) calculations to optimize the positions of the centroids.
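One common way to get the random starting centroids is to pick K distinct data points from the dataset itself. The sketch below illustrates that idea in plain JavaScript; the function name `pickInitialCentroids` and the optional `random` parameter are hypothetical, chosen only for this example, and other initialization strategies exist.

```javascript
// Pick k distinct data points to serve as the starting centroids.
// `random` is any function returning a number in [0, 1); it defaults to
// Math.random and is a parameter only so the example can be made repeatable.
function pickInitialCentroids(points, k, random = Math.random) {
  if (k < 1 || k > points.length) {
    throw new RangeError("k must be between 1 and the number of points");
  }
  const indices = points.map((_, i) => i);
  // Partial Fisher-Yates shuffle: after k swaps, the first k slots
  // hold k distinct, randomly chosen indices.
  for (let i = 0; i < k; i++) {
    const j = i + Math.floor(random() * (indices.length - i));
    [indices[i], indices[j]] = [indices[j], indices[i]];
  }
  // Copy the chosen points so later centroid updates do not modify the data.
  return indices.slice(0, k).map((i) => points[i].slice());
}

// Made-up 2D sample points, for illustration only.
const samplePoints = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]];
const seeds = pickInitialCentroids(samplePoints, 2);
console.log(seeds);
```

Because the starting centroids are random, two runs on the same data can end in different clusterings, which is why K-means is often run several times with different starting points.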

How it Works

A cluster's centroid is a set of feature values that together define the resulting group. The type of group each cluster represents can be interpreted qualitatively by examining the centroid's feature values.

Data assignment: Each cluster is defined by its centroid, the centre of the cluster in feature space. Each data point is assigned to its closest centroid, as measured by a chosen distance function.

Update of the centroids: Following the assignment of all data points, the centroids are recalculated by averaging all the data points assigned to that cluster.

Repetition: The assignment and update steps are repeated until a stopping condition is met, such as no points changing clusters, the total within-cluster distance no longer decreasing, or a maximum number of iterations being reached.
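The three steps above can be sketched in a few lines of plain JavaScript. This is a minimal illustration rather than a production implementation: it assumes squared Euclidean distance, takes the starting centroids as an argument so the result is repeatable, and stops when no point changes cluster or when a maximum number of iterations is reached. The function names and the sample points are made up for the example.

```javascript
// Squared Euclidean distance between two points of equal dimension.
function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Index of the centroid closest to the given point.
function nearestCentroid(point, centroids) {
  let best = 0;
  let bestDist = Infinity;
  centroids.forEach((c, i) => {
    const d = squaredDistance(point, c);
    if (d < bestDist) {
      bestDist = d;
      best = i;
    }
  });
  return best;
}

// Minimal k-means: assignment, update, repetition.
function kMeans(points, initialCentroids, maxIterations = 100) {
  let centroids = initialCentroids.map((c) => c.slice());
  let labels = new Array(points.length).fill(-1);

  for (let iter = 0; iter < maxIterations; iter++) {
    // Data assignment: each point goes to its closest centroid.
    const newLabels = points.map((p) => nearestCentroid(p, centroids));

    // Stopping condition: no point changed cluster.
    const changed = newLabels.some((label, i) => label !== labels[i]);
    labels = newLabels;
    if (!changed) break;

    // Update: each centroid becomes the mean of its assigned points.
    centroids = centroids.map((old, k) => {
      const members = points.filter((_, i) => labels[i] === k);
      if (members.length === 0) return old; // leave an empty cluster's centroid in place
      return old.map(
        (_, dim) => members.reduce((s, p) => s + p[dim], 0) / members.length
      );
    });
  }
  return { centroids, labels };
}

// Two visually obvious groups of made-up 2D points.
const points = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]];
const result = kMeans(points, [[1, 1], [8, 8]]);
console.log(result.labels);    // cluster index for each point
console.log(result.centroids); // final centroid coordinates
```

With these sample points the loop converges on its second pass: the first pass assigns the points and moves the centroids, and the second pass finds that no assignment changes.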

K-Means for Spatial Analysis

The iterative K-means clustering algorithm can divide geographical data into k distinct clusters. It does so by repeatedly assigning each data point to the closest centroid and recalculating each centroid as the mean of its assigned points until the centroids stabilize. This allows spatial patterns such as market segments, urban land use types, environmental zones, and public health hotspots to be identified and interpreted. For the results to be meaningful, the analysis has to account for the choice of distance metric, data scaling, and geographic constraints.

Because it scales well, K-means can handle large volumes of spatial data and suits applications at both local and global scales. GIS experts can use it to uncover hidden patterns in spatial data, which supports better decision-making across a variety of spatial analysis tasks.
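Two of these considerations, the distance metric and data scaling, can be illustrated with small helper functions. The sketch below is an assumption-laden example, not part of the Earth Engine workflow: `haversineKm` computes great-circle distance using a mean Earth radius of 6371 km, and `standardize` rescales each feature column to zero mean and unit variance so that a feature measured in large units does not dominate the distance calculation. The feature columns (elevation and a vegetation index) are made up.

```javascript
// Great-circle distance in kilometres between two [latitude, longitude]
// points given in degrees, using the haversine formula.
// Assumes a spherical Earth with a mean radius of 6371 km.
function haversineKm([lat1, lon1], [lat2, lon2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const R = 6371;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(a));
}

// Z-score standardization: rescale every column to mean 0 and
// standard deviation 1 (population standard deviation).
function standardize(rows) {
  const n = rows.length;
  const dims = rows[0].length;
  const means = [];
  const stds = [];
  for (let d = 0; d < dims; d++) {
    const mean = rows.reduce((s, r) => s + r[d], 0) / n;
    const variance = rows.reduce((s, r) => s + (r[d] - mean) ** 2, 0) / n;
    means.push(mean);
    stds.push(Math.sqrt(variance) || 1); // constant column: avoid dividing by zero
  }
  return rows.map((r) => r.map((v, d) => (v - means[d]) / stds[d]));
}

// One degree of longitude along the equator.
const oneDegreeKm = haversineKm([0, 0], [0, 1]);
console.log(oneDegreeKm);

// Hypothetical features per location: [elevation in metres, vegetation index].
const features = [[100, 0.2], [200, 0.4], [300, 0.6]];
const scaled = standardize(features);
console.log(scaled);
```

Note that standard K-means recomputes centroids as arithmetic means, which fits Euclidean distance. Swapping in a great-circle distance for the assignment step is an approximation that tends to be reasonable only over small areas; projecting coordinates to a planar system first is a common alternative.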

It can be used for:

1. Development and Urban Planning

Land Use Analysis: K-means assists city planners with resource allocation and zoning restrictions by classifying metropolitan areas according to land use types (residential, commercial, and industrial).

Smart City Initiatives: K-means supports smart city projects by grouping sensor data (for example, from sensors measuring pollution or traffic), which helps improve infrastructure and services.

2. Disaster Management

Risk assessment: By identifying high-risk locations through K-means clustering of historical disaster data, disaster preparedness and mitigation planning are aided.

Resource Allocation: When responding to a disaster, grouping the impacted areas helps to prioritize the distribution of resources and rescue efforts.

3. Public Health

Disease Outbreak Detection: Public health professionals can identify regions with high disease incidence by clustering health data. This allows for targeted interventions and effective resource distribution.

Healthcare Accessibility: By identifying underserved areas and examining the spatial distribution of healthcare services, K-means helps guide policy for improved healthcare access.

4. Real Estate

Property Valuation: Accurate property valuation and market analysis are aided by clustering property data according to features such as location, size, and amenities.

Development Planning: By using spatial clustering, real estate developers can pinpoint new trends and possible hotspots for development.

5. Transportation and Logistics

Route Optimization: By helping to cluster delivery points, K-means facilitates more effective routing and lowers transportation expenses.

Traffic Management: Cities can enhance traffic flow and better control congestion by clustering traffic data.

Snippet

Open the Google Earth Engine Code Editor and run the following script. It assumes that Dubai is a geometry you have already drawn or imported in the Code Editor.

// Import the Sentinel-2 satellite data from the European Space Agency.
var S2 = ee.ImageCollection("COPERNICUS/S2");

// Filter for Dubai. `Dubai` must already be defined as a geometry.
S2 = S2.filterBounds(Dubai);
print(S2);

// Filter for date.
S2 = S2.filterDate("2020-01-01", "2020-05-11");
print(S2);

// Take the first image in the filtered collection.
var image = ee.Image(S2.first());
print(image);

// True-color alternative:
// Map.addLayer(image, {min: 0, max: 3000, bands: "B4,B3,B2"}, "Dubai");
// False-color composite (near-infrared, red, green).
Map.addLayer(image, {min: 0, max: 3000, bands: "B8,B4,B3"}, "Dubai");

// Create the training dataset by sampling pixels from the image.
var training = image.sample({
  region: Dubai,
  scale: 20,
  numPixels: 5000
});

// Instantiate the unsupervised clustering algorithm with 5 clusters and train it.
var kmeans = ee.Clusterer.wekaKMeans(5).train(training);

// Cluster the input image using the trained clusterer.
var result = image.cluster(kmeans);

// Display the clusters with random colors.
Map.addLayer(result.randomVisualizer(), {}, 'Unsupervised K-means Classification');

// Export the clustered image to Google Drive.
Export.image.toDrive({
  image: result,
  description: 'kmeans_Dubai',
  scale: 20,
  region: Dubai
});


Conclusion

K-means clustering has a significant impact on spatial analysis: it is a flexible and effective tool for finding patterns, optimizing resources, and making defensible decisions in contexts that include business strategy, public health, and environmental monitoring as well as urban planning. Its efficiency in handling large spatial datasets makes it a valuable tool in today's data-driven decision-making.
