  • joydeepml2020

Machine Learning for Detecting Malicious Traffic in IoT-Based Healthcare Systems

Updated: Sep 30, 2021

The Internet of Things (IoT) has emerged as a key area of research in both the academic and industrial communities. In a nutshell, IoT is a collection of objects that are connected to the internet and can communicate with each other. In IoT, we are dealing with the connectivity of objects that have constraints on power, compute, and security. Smart systems are built by combining these embedded objects with analytics at the edge or in the cloud. Smart cities, smart watches, and similar systems are essentially groups of connected devices with analytical capabilities.

As the number of IoT devices grows, security has become one of the most active areas of research. Malicious attacks on IoT networks have become increasingly common in recent times. Because IoT devices operate with constrained power, bandwidth, and memory, traditional IT network security measures and Intrusion Detection Systems (IDS) cannot be implemented for IoT networks. Hence there is a need for an IDS designed specifically for IoT.

In this project, we are going to train a machine learning model which can detect a malicious attack on the IOT network.

Healthcare systems are a very sensitive area, and compromising their security can cost lives. For example, changing the volume and injection rate of an ICU patient's infusion pump can have catastrophic consequences. The MQTT and CoAP protocols are widely used in IoT networks, but obtaining a dataset to train machine learning models is a challenge. A team of researchers has simulated a dataset that replicates IoT sensor data in an ICU setup. My work is inspired by their paper, A Framework for Malicious Traffic Detection in IoT Healthcare Environment.

The authors of the paper used the setup below and proposed a dataset and a framework. They used the IoT-Flock tool to generate IoT traffic for this setup, along with open-source Python frameworks to convert the captured data into a dataset. Please refer to the full paper for more details on the dataset creation.

The authors of the paper simulated the complete ICU IoT network using IoT-Flock. The following steps were performed:

The figure below shows the sensor network used to generate the dataset (both normal and attack traffic).

Two kinds of sensors are used in the setup, patient monitoring and environment monitoring, and each device has two characteristics. The first characteristic is the data profile and the second is the time profile. Below are the details as per the research paper:

The data profiles were chosen after consulting the literature for each sensor, whereas the time profiles were assigned stochastically.

Machine Learning project: Now that we understand the data and the problem, let us look at how to solve it. The objective is: given network traffic, classify it as attack traffic or normal traffic.

Posing the machine learning problem: We can pose this as a binary supervised classification problem. Given the features, i.e. the sensor readings / captured network traffic, we have to classify the traffic as malicious or not.

Using google colab:

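from google.colab import drive
# Mount Google Drive so the CSV files under /content/drive can be read
drive.mount('/content/drive')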

>> Mounted at /content/drive

# Importing the necessary libraries to load the dataset and perform EDA
import pandas as pd 
import numpy as np
import matplotlib.pyplot as plt 
import seaborn as sns
# Loading the csv datasets
dataset_attack = pd.read_csv('/content/drive/MyDrive/ML_Projects_IOT/Detections of malacious attacks in healthcare/ICUDatasetProcessed/Attack.csv')
dataset_environmentMonitor = pd.read_csv("/content/drive/MyDrive/ML_Projects_IOT/Detections of malacious attacks in healthcare/ICUDatasetProcessed/environmentMonitoring.csv")
dataset_patientMonitor = pd.read_csv("/content/drive/MyDrive/ML_Projects_IOT/Detections of malacious attacks in healthcare/ICUDatasetProcessed/patientMonitoring.csv")
# Checking the features of the dataset 

Checking the features of each dataframe:

The first thing we have to do is check whether the features in all three CSV files are the same. Since we are going to merge these three CSV files into one to create our final dataset, we need to make sure that all the dataframes have the same feature names and that the sensor data is captured in the same order.

After running the following code, we found that all the datasets have the same columns.

# Checking the patientMonitor columns against columns_attack
columns_patientMonitor == columns_attack

Understanding the features of a particular CSV file using the following code:
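As a minimal sketch of the check and merge described above, the snippet below compares column names and order across three dataframes and then stacks them row-wise. The tiny dataframes and their values are assumptions made purely for illustration; the real CSV files have 52 columns each.

```python
import pandas as pd

# Toy stand-ins for the three CSV files (hypothetical values, only two columns).
attack = pd.DataFrame({"frame.time_delta": [0.1, 0.2], "mqtt.qos": [0, 1]})
environment = pd.DataFrame({"frame.time_delta": [0.3], "mqtt.qos": [0]})
patient = pd.DataFrame({"frame.time_delta": [0.4], "mqtt.qos": [2]})

frames = [attack, environment, patient]

# Comparing plain lists checks both the names and their order and yields
# a single True/False for each dataframe.
same_columns = all(list(df.columns) == list(attack.columns) for df in frames)

if same_columns:
    # Stack the rows; ignore_index gives the merged frame a fresh 0..n-1 index.
    merged = pd.concat(frames, ignore_index=True)
    print(merged.shape)
```

Converting the columns to lists before comparing is a deliberate choice here: comparing two pandas Index objects with == gives an element-wise result rather than one boolean.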


Checking the shape of the datasets:


(80126, 52)

We have 80126 compromised datapoints and 52 features. In the research paper, the authors proposed the following features: ['frame.time_delta', 'tcp.time_delta', 'tcp.flags.ack', 'tcp.flags.push', 'tcp.flags.reset', 'mqtt.hdrflags', 'mqtt.msgtype', 'mqtt.qos', 'mqtt.retain', 'mqtt.ver']

Some steps are skipped here; for the complete code, please refer to the GitHub repo.
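One way to narrow a dataframe down to these ten features is to select them by name. This is only a sketch: the helper name `select_paper_features` and the one-row demo dataframe are hypothetical, and the feature names are copied from the list above.

```python
import pandas as pd

# The ten features listed in the paper, as quoted above.
PAPER_FEATURES = [
    "frame.time_delta", "tcp.time_delta", "tcp.flags.ack", "tcp.flags.push",
    "tcp.flags.reset", "mqtt.hdrflags", "mqtt.msgtype", "mqtt.qos",
    "mqtt.retain", "mqtt.ver",
]

def select_paper_features(df):
    """Return only the paper's features, failing loudly if any is missing."""
    missing = [name for name in PAPER_FEATURES if name not in df.columns]
    if missing:
        raise KeyError(f"Missing expected columns: {missing}")
    return df[PAPER_FEATURES]

# Hypothetical one-row frame with one extra column, for illustration only.
row = {name: [0] for name in PAPER_FEATURES}
row["extra.column"] = [1]
demo = pd.DataFrame(row)

selected = select_paper_features(demo)
print(selected.shape)
```

Selecting by name rather than by position keeps the selection correct even if the column order in the CSV changes.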

# Visualising the features and label

Understanding the distribution of the dataset:


After running the code, we found that there are 108568 datapoints with label 0 and 80126 datapoints with label 1.

The dataset is not severely imbalanced.
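To put numbers on that statement, the short calculation below turns the two reported counts into class shares and a majority-to-minority ratio. Only the counts come from the article; the variable names are illustrative.

```python
import pandas as pd

# Class counts reported above: label 0 = normal, label 1 = attack.
counts = pd.Series({0: 108568, 1: 80126})

total = counts.sum()
shares = counts / total                          # fraction of each class
imbalance_ratio = counts.max() / counts.min()    # majority / minority

print(total)
print(shares.round(3).to_dict())
print(round(float(imbalance_ratio), 2))
```

The split works out to roughly 57.5% normal and 42.5% attack traffic, a ratio of about 1.35 to 1.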


Plotting the heat map to understand the collinearity

plt.figure(figsize = (20,8))

We found that a few features are negatively correlated.

Splitting the dataset into features and targets:
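A rough sketch of how negatively correlated pairs can be pulled out of a correlation matrix is given below. The synthetic columns `feat_a`, `feat_b`, and `feat_c` and the -0.5 cutoff are assumptions for illustration and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame({
    "feat_a": a,
    "feat_b": -a + rng.normal(scale=0.1, size=200),  # built to oppose feat_a
    "feat_c": rng.normal(size=200),                   # independent noise
})

corr = demo.corr()  # pairwise Pearson correlation

# Keep only the upper triangle so each pair is reported once.
upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(upper).stack()

# Pairs whose correlation is strongly negative (cutoff chosen arbitrarily).
negative_pairs = pairs[pairs < -0.5]
print(negative_pairs)
```

The same `corr` matrix can be passed to `sns.heatmap` to draw the plot; listing the pairs numerically is just easier to scan when there are many features.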

X = final_data_set.iloc[:,:10]

A few features are of object type. Hence, before applying machine learning algorithms, we need to convert them to float.

The below helper function will convert the string values into float.

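import re
def change_string(value):
  # Strip every non-digit character, then convert what is left to float.
  # The raw string avoids the invalid "\D" escape warning in recent Python versions.
  value = re.sub(r"\D", "", value)
  return float(value)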
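The example below shows how the helper behaves when applied to a column. The sample strings are hypothetical; the actual contents of the object-typed columns are not shown in this article.

```python
import re
import pandas as pd

def change_string(value):
    # Same logic as the helper above: drop non-digits, convert to float.
    return float(re.sub(r"\D", "", value))

# Hypothetical object-typed column of hex-style flag strings.
demo = pd.DataFrame({"mqtt.hdrflags": ["0x00000030", "0x00000018"]})
demo["mqtt.hdrflags"] = demo["mqtt.hdrflags"].apply(change_string)
print(demo["mqtt.hdrflags"].tolist())   # [30.0, 18.0]

# Caveat: the decimal point is a non-digit too, so it is removed as well.
decimal_result = change_string("0.5")
print(decimal_result)                   # 5.0
```

Because every non-digit is dropped, the helper also discards minus signs and hex letters, so it is best suited to columns where only the digits carry information.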


The code below splits the dataset into training and test sets: 70% is kept as training data and 30% as test data.

# importing the libraries for building logistic regression
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

x_train,x_test,y_train,y_test = train_test_split(X,Y,test_size=0.3,random_state=42)
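To illustrate what a 70/30 split produces, here is a small sketch on synthetic data. The arrays `X_demo` and `y_demo` are made up, and the `stratify` argument is an optional variation that is not part of the call above; it keeps the class proportions the same in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 1000 rows, 10 features, roughly 42% positive labels.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(1000, 10))
y_demo = (rng.random(1000) < 0.42).astype(int)

x_tr, x_te, y_tr, y_te = train_test_split(
    X_demo, y_demo,
    test_size=0.3,       # 30% of the rows go to the test set
    random_state=42,     # fixed seed so the split is reproducible
    stratify=y_demo,     # optional: preserve the class ratio in both parts
)

print(x_tr.shape, x_te.shape)
print(round(float(y_tr.mean()), 3), round(float(y_te.mean()), 3))
```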

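Creating the pipeline: the features are standardised and then passed to a logistic regression model with default parameters. We are not doing any hyperparameter tuning.

model_0 = Pipeline([
    ("scaler", StandardScaler()),            # step names here are assumed
    ("classifier", LogisticRegression()),    # default parameters
])
model_0.fit(x_train, y_train)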


The model is trained on the training dataset. The output is given below

Predicting the model performance on the test data

baseline_score_model_0 = model_0.score(x_test,y_test)
print(f"The accuracy of our model with logistic regression is {baseline_score_model_0*100:.2f}%")

The accuracy of our baseline model is 95.34%.

After building the baseline model, we move on to ensemble models such as Random Forest.

Creating another model with random forest:

from sklearn.ensemble import RandomForestClassifier

The RandomForestClassifier is trained on the training data.

Predicting the values on the test dataset
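The training and prediction steps can be sketched as follows. This runs on synthetic data from `make_classification`, and it assumes default hyperparameters apart from a fixed seed; the settings used for the real model are not shown in this article.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data standing in for the traffic features.
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=10, n_informative=5, random_state=42
)
x_tr, x_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42
)

# Default hyperparameters apart from the seed (an assumption for this sketch).
forest = RandomForestClassifier(random_state=42)
forest.fit(x_tr, y_tr)

# Predict labels for the held-out rows and score them.
y_hat = forest.predict(x_te)
demo_accuracy = accuracy_score(y_te, y_hat)
print(f"{demo_accuracy:.3f}")
```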


Evaluating our model on the test data

from sklearn import metrics

The accuracy of the model is

As the dataset is not imbalanced, accuracy can be used to evaluate the performance of the model.

For a clearer picture, we will also use other performance metrics: accuracy, precision, recall, and F1 score. The function below generates these scores for the test data.

from sklearn.metrics import accuracy_score,precision_recall_fscore_support

def model_evaluation_metric(y_true,y_pred):

  # Model accuracy
  model_accuracy = accuracy_score(y_true,y_pred)
  # Calculate precision, recall and F1 score (weighted average across classes)
  model_precision,model_recall,model_f1,_ = precision_recall_fscore_support(y_true,y_pred,average="weighted")

  model_results = {"accuracy": model_accuracy,
                   "precision": model_precision,
                   "recall": model_recall,
                   "f1 score": model_f1}
  return model_results

model_2_results = model_evaluation_metric(y_true=y_test,y_pred=y_predict)

Printing the values of performance metrics as a dictionary:


A confusion matrix is also a good evaluation tool, as it lets us visualise the false positives and false negatives. The utility function below creates the confusion matrix.

import itertools
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import confusion_matrix

def make_confusion_matrix(y_true, y_pred, classes=None, figsize=(10, 10), text_size=15, norm=False, savefig=False):
  """Makes a labelled confusion matrix comparing predictions and ground truth labels.

  If classes is passed, confusion matrix will be labelled, if not, integer class values
  will be used.

  Args:
    y_true: Array of truth labels (must be same shape as y_pred).
    y_pred: Array of predicted labels (must be same shape as y_true).
    classes: Array of class labels (e.g. string form). If `None`, integer labels are used.
    figsize: Size of output figure (default=(10, 10)).
    text_size: Size of output figure text (default=15).
    norm: normalize values or not (default=False).
    savefig: save confusion matrix to file (default=False).

  Returns:
    A labelled confusion matrix plot comparing y_true and y_pred.

  Example usage:
    make_confusion_matrix(y_true=test_labels, # ground truth test labels
                          y_pred=y_preds, # predicted labels
                          classes=class_names, # array of class label names
                          figsize=(15, 15))
  """
  # Create the confusion matrix
  cm = confusion_matrix(y_true, y_pred)
  cm_norm = cm.astype("float") / cm.sum(axis=1)[:, np.newaxis] # normalize it
  n_classes = cm.shape[0] # find the number of classes we're dealing with

  # Plot the figure and make it pretty
  fig, ax = plt.subplots(figsize=figsize)
  # colors will represent how 'correct' a class is, darker == better
  cax = ax.matshow(cm, cmap=plt.cm.Blues) # colormap is an assumption
  fig.colorbar(cax)

  # Is there a list of classes?
  if classes is not None:
    labels = classes
  else:
    labels = np.arange(cm.shape[0])

  # Label the axes
  ax.set(title="Confusion Matrix",
         xlabel="Predicted label",
         ylabel="True label",
         xticks=np.arange(n_classes), # create enough axis slots for each class
         yticks=np.arange(n_classes),
         xticklabels=labels, # axes will be labeled with class names (if they exist) or ints
         yticklabels=labels)

  # Make x-axis labels appear on bottom
  ax.xaxis.set_label_position("bottom")
  ax.xaxis.tick_bottom()

  # Set the threshold for different colors
  threshold = (cm.max() + cm.min()) / 2.

  # Plot the text on each cell
  for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
    if norm:
      plt.text(j, i, f"{cm[i, j]} ({cm_norm[i, j]*100:.1f}%)",
              horizontalalignment="center",
              color="white" if cm[i, j] > threshold else "black",
              size=text_size)
    else:
      plt.text(j, i, f"{cm[i, j]}",
              horizontalalignment="center",
              color="white" if cm[i, j] > threshold else "black",
              size=text_size)

  # Save the figure to the current working directory
  if savefig:
    fig.savefig("confusion_matrix.png") # file name is an assumption

After executing the custom function, we get the confusion matrix below.

We can see that a few attack scenarios (around 125) are predicted as normal. These are false negatives, since attack traffic is the positive class (label 1). The model performs well, but to make the system foolproof we need to bring this number much closer to zero. The system is able to detect 99.9% of the attacks.

Future improvements could include deep learning algorithms and possible deployment options. Generally speaking, such systems should be deployed on edge devices, so edge deployment strategies can be explored.

The complete code is available on my GitHub.
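For readers who want to derive the missed-attack count and the detection rate themselves, the sketch below reads them off a confusion matrix. The ten toy labels are invented for illustration and do not reproduce the results above; label 1 is treated as attack.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy labels: 1 = attack, 0 = normal (hypothetical values).
y_true_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_pred_demo = np.array([0, 0, 0, 1, 1, 1, 1, 1, 1, 0])

# For binary labels, ravel() returns the cells in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()

# Recall (detection rate): share of real attacks that were flagged.
recall = tp / (tp + fn)
# Miss rate: share of real attacks that were predicted as normal.
miss_rate = fn / (tp + fn)

print(tn, fp, fn, tp)
print(round(float(recall), 3), round(float(miss_rate), 3))
```

In an intrusion detection setting the false negatives are usually the costlier error, which is why recall on the attack class is worth tracking alongside accuracy.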

Reference :

Hussain, F.; Abbas, S.G.; Shah, G.A.; Pires, I.M.; Fayyaz, U.U.; Shahzad, F.; Garcia, N.M.; Zdravevski, E. A Framework for Malicious Traffic Detection in IoT Healthcare Environment. Sensors 2021, 21, 3025.
