K-Means Clustering in the network security domain.

3 min read · Aug 12, 2021

As most of us know, K-Means is an unsupervised clustering algorithm: it groups unlabeled data points or objects into K clusters based on their similarity, rather than classifying them against known labels. It is different from the KNN algorithm, which is a supervised technique that assigns a new point to one of a fixed set of predefined classes by looking at its labeled neighbors. K-Means has no predefined groups to map the data points into; it works on the input data itself and forms the clusters from it.

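It groups the data in the following fashion: it starts from K initial centroids, assigns every data point to its nearest centroid, moves each centroid to the mean of the points assigned to it, and repeats these two steps until the assignments stop changing.

This is how K-Means labels the given data into clusters through centroids.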
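To make the centroid idea concrete, here is a minimal, dependency-free sketch of the usual assign-and-update loop. The function names, the toy points, and the choice of random data points as starting centroids are assumptions made for illustration, not something prescribed by this post.

```python
import math
import random


def nearest(point, centroids):
    """Return the index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, centroids, max_iter=100):
    """Cluster `points` around the given starting centroids.

    Returns (labels, centroids), where labels[i] is the cluster index of points[i].
    """
    centroids = list(centroids)            # work on a copy of the starting centroids
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:           # no point changed cluster: converged
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:                    # an empty cluster keeps its old centroid
                centroids[i] = mean(members)
    return labels, centroids


# Two well-separated toy groups of 2-D points (illustrative values only).
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]

# Assumption: start from K=2 randomly chosen data points.
start = random.Random(0).sample(points, 2)
labels, centroids = kmeans(points, start)
print(labels)
print(centroids)
```

The result depends on the starting centroids, which is why practical implementations usually run the loop several times from different starts and keep the best outcome.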

In K-Means clustering, K denotes the number of clusters formed.

Initially we pick the number of clusters somewhat arbitrarily, but that first guess is rarely the best one. The Elbow Method is used to select the optimal K: the algorithm is run for several values of K and the results are compared.

The final number of clusters is therefore not fixed up front in K-Means; the Elbow Method helps us choose it. As K increases, the average distortion (the average squared distance between each instance and its nearest centroid) decreases: each cluster has fewer constituent instances, and the instances are closer to their respective centroids. However, the improvements in average distortion decline as K increases. The value of K at which the improvement in distortion declines the most is called the elbow, and that is where we should stop dividing the data into further clusters.
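The elbow idea can be sketched in a few lines. This is only an illustration under stated assumptions: the toy data has three obvious groups, the starting centroids come from a deterministic farthest-first pick (chosen here so the run is reproducible), and the elbow is taken as the K after which the gain in distortion falls off the most, which is one of several possible heuristics.

```python
import math


def farthest_first(points, k):
    """Deterministic seeding (an assumption of this sketch): start at points[0],
    then keep adding the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids


def kmeans_distortion(points, k, max_iter=100):
    """Run K-Means and return the average squared distance to the assigned centroid."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points]
        if new_labels == labels:           # assignments stable: converged
            break
        labels = new_labels
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:                    # an empty cluster keeps its old centroid
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels)) / len(points)


# Toy data: three tight groups of four points each (illustrative values only).
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]
centers = [(0, 0), (10, 0), (5, 9)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

# Average distortion for K = 1..6.
distortions = {k: kmeans_distortion(points, k) for k in range(1, 7)}

# Improvement gained by going from K-1 to K clusters.
gain = {k: distortions[k - 1] - distortions[k] for k in range(2, 7)}

# Elbow heuristic: the K after which the improvement drops the most.
elbow = max(range(2, 6), key=lambda k: gain[k] - gain[k + 1])

for k, d in distortions.items():
    print(k, round(d, 3))
print("elbow at K =", elbow)
```

On this toy data the distortion falls sharply up to K = 3 and barely moves afterwards, so the heuristic settles on 3. On real traffic data the curve is usually smoother, and the elbow is often read off a plot rather than computed.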

It is a very useful approach for clustering in the security domain, but how?

Have you heard about botnets?

Let me simplify: a botnet is a sort of robot network, that is, a set of computers that are infected by malware and are under the control of a single attacking party, the bot-herder or bot master. The command and control (C2) server is the bot master's server; the IP address bound to its host name in DNS keeps changing so frequently that it is hard for anyone to trace the host machine. Through the C2 server, the bot master distributes malware via infected sites, social media, or spam emails, and infects other machines and their networks, which together form what is eventually called a botnet.

Using a network monitoring tool such as Wireshark, we collect traffic data from the network.
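Before clustering, the captured packets have to be turned into numeric features per host. The sketch below assumes the capture was exported as CSV with columns named Source, Destination, Protocol, and Length; the sample rows are synthetic, and the four features chosen (packet count, distinct destinations, total bytes, DNS packets) are only one plausible choice, not a recommendation from this post.

```python
import csv
import io
from collections import defaultdict

# Synthetic sample in the shape of a packet-list CSV export (illustrative values only).
SAMPLE = """No.,Time,Source,Destination,Protocol,Length
1,0.00,192.0.2.10,198.51.100.1,DNS,74
2,0.05,192.0.2.10,198.51.100.2,DNS,76
3,0.09,192.0.2.10,198.51.100.3,DNS,75
4,0.20,192.0.2.20,203.0.113.5,TCP,1500
5,0.30,192.0.2.20,203.0.113.5,TCP,1500
"""


def host_features(csv_text):
    """Aggregate packets per source host.

    Returns {host: (packets, distinct_destinations, total_bytes, dns_packets)}.
    """
    stats = defaultdict(lambda: {"packets": 0, "dsts": set(), "bytes": 0, "dns": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        s = stats[row["Source"]]
        s["packets"] += 1
        s["dsts"].add(row["Destination"])
        s["bytes"] += int(row["Length"])
        s["dns"] += row["Protocol"] == "DNS"   # True counts as 1
    return {
        host: (s["packets"], len(s["dsts"]), s["bytes"], s["dns"])
        for host, s in stats.items()
    }


features = host_features(SAMPLE)
for host, vector in features.items():
    print(host, vector)
```

Because K-Means relies on Euclidean distance, features on very different scales (bytes versus packet counts, for example) are usually standardized before clustering so that no single feature dominates.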

On that data we apply K-Means clustering to form the K centroids and their clusters. Plotting the clusters helps us study the number of infected systems, the rate at which systems are being affected by the malware attack, and various other important conclusions about the network.

Hence, K-Means clustering is an efficient algorithm for solving problems in the network and security domain. Another useful domain for this algorithm is human resource management, where employees and customers can be clustered or grouped based on some desired parameters.

But in the network security domain, this technique is useful for detecting faults and fraud in attacks such as spamming, phishing, and DDoS attacks.

Thank you….

Happy Learning🙂…

Written by Shristi Agarwal
