serly_moghadas authored
958a8533

Introduction

This repository contains the source code for our paper "Spotting Deep Neural Network Vulnerabilities in Mobile Traffic Forecasting with an Explainable AI Lens", published at INFOCOM 2022.

Description

Dependencies

This repository uses Anaconda. All scripts are written in Python, and running them requires JupyterLab or Jupyter Notebook on your machine. After installing Anaconda and Jupyter, you can look up the required libraries and their compatible versions in the xai.yml file, or you can directly create a conda environment that downloads and installs all the required packages by running conda env create -f xai.yml on the command line.
After creating the environment, make sure to activate it with:
conda activate xai
Then, you can run the notebooks with:
jupyter notebook
This command prints a link. Opening it in your browser shows a page where you can navigate to and open the desired notebook. NOTE: If Jupyter is running on a remote server, opening the link on your own machine won't show anything. For this to work, after completing the steps above, open a new terminal on your local machine and type this command:
ssh -N -L yourremoteport:localhost:yourremoteport name@ip
Now, when you open the link, you can browse the folders on the remote machine and open the notebooks.

Datasets

We use two datasets, the Milan dataset and the EU Metropolitan Area (EUMA) dataset. Each dataset provides the temporally aggregated internet activity of each cell within its region. The Milan dataset is structured as a 100×100 grid, with each cell containing the internet activity as a proxy for load. The load in each cell was captured from 1 Nov 2013 to 1 Jan 2014 and is temporally aggregated every 10 minutes. More information regarding this dataset is available in their paper.
The EUMA dataset is structured as a 48×120 grid and contains more recent data, captured over 15 days in 2019. The load is directly measured and is temporally aggregated every minute in each cell. The Milan dataset is publicly available and can be accessed and downloaded from this link. After downloading the dataset, use the script extract_bs.py to extract the internet activity from the rest of the data and separate it per cell. Unfortunately, we cannot make the EUMA dataset public. Nevertheless, apart from minor adjustments, the scripts work for both cities.
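The per-cell separation described above can be pictured with the following minimal sketch. It is not the code of extract_bs.py. The record layout (cell id, slot start in milliseconds, internet activity) and the function name internet_activity_per_cell are assumptions made for illustration, as is the idea that several records may share a cell and a time slot and should be summed.

```python
from collections import defaultdict


def internet_activity_per_cell(rows):
    """Group (cell_id, slot_start_ms, internet_activity) records by cell.

    Returns {cell_id: {slot_start_ms: total_activity}}. Records that share
    a cell and a time slot are summed (assumed behaviour, see lead-in).
    """
    per_cell = defaultdict(lambda: defaultdict(float))
    for cell_id, slot_start_ms, activity in rows:
        per_cell[cell_id][slot_start_ms] += activity
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {cell: dict(series) for cell, series in per_cell.items()}


# Hypothetical records; the second slot starts 10 minutes after the first.
records = [
    (1, 1383260400000, 0.5),
    (1, 1383260400000, 0.25),
    (2, 1383260400000, 1.0),
    (1, 1383261000000, 2.0),
]
series = internet_activity_per_cell(records)
```

Each entry of series is then a single-cell time series that could be written to its own file.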

Methodology

Directory structure and running order

There are three notebook files in the main directory.

Train.ipynb

This notebook trains the models and saves them in the "Trainede_models" folder. The training parameters can be selected in the "Parameters" cell. Models can be trained with one of two loss functions, "Capacity forecasting" or "Traffic Forecasting". Other parameters can also be adjusted: "nr" (the input is an nr*nr grid of cells centered at "cell"), lookback, and alpha.
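To make the roles of "nr" and "cell" concrete, here is a minimal sketch that lists the cells of an nr*nr block centered at a given cell. It is not taken from Train.ipynb. The function name neighborhood, the 0-based row-major cell numbering, and the 100×100 default grid are assumptions for illustration; the notebook may index cells differently.

```python
def neighborhood(cell, nr, grid_rows=100, grid_cols=100):
    """Return the ids of the nr x nr block of cells centered at `cell`.

    Assumes 0-based, row-major ids (id = row * grid_cols + col) and odd nr.
    """
    if nr % 2 == 0:
        raise ValueError("nr must be odd so that the block has a center cell")
    half = nr // 2
    row, col = divmod(cell, grid_cols)
    # Reject blocks that would stick out of the grid.
    if not (half <= row < grid_rows - half and half <= col < grid_cols - half):
        raise ValueError("block does not fit inside the grid")
    return [
        [(row + dr) * grid_cols + (col + dc) for dc in range(-half, half + 1)]
        for dr in range(-half, half + 1)
    ]


# nr = 5 gives a block of 25 cells around the (hypothetical) cell 5050.
block = neighborhood(cell=5050, nr=5)
```

With nr = 5 the block holds 25 cells, which matches the 25 cells mentioned in the attack description.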

LRP.ipynb

This notebook runs the layer-wise relevance propagation (LRP) algorithm on the trained models using the test data.

Attack.ipynb

So far, we use the FGSM attack, which perturbs all the cells, as the baseline attack. To validate our explainability tool, however, we inject a variant of the FGSM attack only into the most or least relevant cells, as identified by the modified LRP tool. We introduce and implement three attack strategies:
1- Sum traffic injection (Denial of Service attack)
2- A subset of the DoS attack (Top 3 Max)
3- Max traffic injection
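As a rough illustration of the difference between the baseline and a relevance-guided variant, the sketch below computes the standard FGSM perturbation, epsilon times the sign of the loss gradient, and optionally restricts it to a chosen set of cells. It is not the code of Attack.ipynb. The helper names, the flat per-cell lists, and the gradient and relevance values are hypothetical, and in practice the gradient would come from the trained model.

```python
def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fgsm_perturbation(gradient, epsilon, target_cells=None):
    """Return epsilon * sign(gradient) per cell.

    If target_cells is given, only those cell indices are perturbed and
    all other cells get a zero perturbation (assumed variant).
    """
    return [
        epsilon * sign(g) if target_cells is None or i in target_cells else 0.0
        for i, g in enumerate(gradient)
    ]


def top_k_cells(relevance, k, most_relevant=True):
    """Return the indices of the k most (or least) relevant cells."""
    order = sorted(range(len(relevance)), key=lambda i: relevance[i],
                   reverse=most_relevant)
    return set(order[:k])


# Hypothetical per-cell loss gradients and LRP relevance scores.
gradient = [0.3, -1.2, 0.4, 2.5]
relevance = [0.1, 0.7, 0.05, 0.15]

baseline = fgsm_perturbation(gradient, epsilon=0.1)
on_most = fgsm_perturbation(gradient, 0.1, top_k_cells(relevance, 1))
on_least = fgsm_perturbation(gradient, 0.1, top_k_cells(relevance, 1, False))
```

Comparing the forecasting error under on_most and on_least is one way to check whether the relevance scores point at the cells the model actually depends on.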

Sum traffic injection

In this strategy, at each time instance and for each history step (T), we calculate the sum of the traffic that the FGSM algorithm injects across all 25 cells. This value is then added (injected) to the most or least relevant cell at that time instance and history step.
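The strategy can be sketched for a single time instance as follows. This is an illustration under stated assumptions rather than the notebook's implementation: the function name sum_traffic_injection and the nested-list layout perturbation[t][c] (history step t, cell c) are hypothetical, and the sketch assumes that cells other than the target receive no perturbation.

```python
def sum_traffic_injection(perturbation, target_cell):
    """Concentrate the per-step FGSM perturbation into one cell.

    perturbation[t][c] is the FGSM perturbation for history step t and
    cell c. For each step, the total over all cells is placed in
    target_cell and every other cell is set to zero.
    """
    injected = []
    for step in perturbation:
        row = [0.0] * len(step)
        row[target_cell] = sum(step)
        injected.append(row)
    return injected


# Hypothetical perturbation for 2 history steps and 3 cells (25 in the text).
perturbation = [
    [1.0, 2.0, 3.0],
    [0.5, 0.5, 0.0],
]
# target_cell would be the most or least relevant cell according to LRP.
injected = sum_traffic_injection(perturbation, target_cell=1)
```

The returned values are meant to be added to the clean input of the target cell, so the total injected traffic per history step stays the same as in the baseline attack.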