
Introduction

This repository contains the source code for our two papers:

  • "Spotting Deep Neural Network Vulnerabilities in Mobile Traffic Forecasting with an Explainable AI Lens", published at the IEEE International Conference on Computer Communications.
  • "DeExp: Revealing Model Vulnerabilities for Spatio-Temporal Mobile Traffic Forecasting with Explainable AI", accepted for publication in IEEE Transactions on Mobile Computing.

You can cite us as:

Moghadas Gholian, S., Fiandrino, C., Collet, A., Attanasio, G., Fiore, M., & Widmer, J. (2023). Spotting Deep Neural Network Vulnerabilities in Mobile Traffic Forecasting with an Explainable AI Lens. In IEEE International Conference on Computer Communications.

Moghadas Gholian, S., Fiandrino, C., Vallina-Rodriguez, N., Fiore, M., & Widmer, J. (2025). DeExp: Revealing Model Vulnerabilities for Spatio-Temporal Mobile Traffic Forecasting with Explainable AI. IEEE Transactions on Mobile Computing, pp. 1-18.

Dependencies

  • Python 3.8
  • Anaconda
  • JupyterLab or any IDE that supports .ipynb files

Cloning the environment

The required libraries and their compatible versions are listed in the xai.yml file. Alternatively, you can create a conda environment that downloads and installs all the required packages by running conda env create -f xai.yml on the command line.

Datasets

We use two datasets: the Milan dataset and the EU Metropolitan Area (EUMA) dataset. Unfortunately, we cannot make the EUMA dataset public; nevertheless, the scripts work for both cities.

  • Each dataset provides the temporally aggregated internet activity of each cell within its region.
  • The Milan dataset is structured as a 100x100 grid in which different CDR data are captured from various base stations in the region and distributed among all the cells using the Voronoi tessellation technique.
  • We extract only the internet activity from these cells, which serves as a proxy for the load in each cell. The load was captured from 1 Nov 2013 to 1 Jan 2014 and is temporally aggregated every 10 minutes. More information regarding this dataset is available here.
  • The EUMA dataset is structured as a 48x120 grid and contains more recent data captured over 15 days in 2019. The load is measured directly and is temporally aggregated every minute in each cell.
  • The Milan dataset is publicly available and can be downloaded from here.
  • After downloading the dataset, use the script extract_bs.py to extract the internet activity from the rest of the data and separate it by cell.
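
The per-cell aggregation described above can be illustrated with a minimal sketch. This is not the code of extract_bs.py; the record layout `(cell_id, timestamp_ms, internet)`, the function name `aggregate_internet`, and the toy values are assumptions made for illustration only.

```python
from collections import defaultdict

BIN_MS = 10 * 60 * 1000  # 10-minute bins, expressed in milliseconds


def aggregate_internet(records, bin_ms=BIN_MS):
    """Sum internet activity per cell and per time bin.

    records: iterable of (cell_id, timestamp_ms, internet) tuples
    (an assumed layout); internet is None when a row carries no
    internet activity.
    """
    per_cell = defaultdict(lambda: defaultdict(float))
    for cell_id, timestamp_ms, internet in records:
        if internet is None:
            continue  # skip rows without internet activity
        bin_start = (timestamp_ms // bin_ms) * bin_ms
        per_cell[cell_id][bin_start] += internet
    # One time-sorted list of (bin_start_ms, load) pairs per cell
    return {cell: sorted(bins.items()) for cell, bins in per_cell.items()}


# Toy records, not taken from either dataset
records = [
    (1, 0, 1.5),
    (1, 60_000, 0.5),    # falls into the same 10-minute bin as the row above
    (1, 600_000, 2.0),   # start of the next bin
    (2, 0, None),        # no internet activity in this row
    (2, 30_000, 4.0),
]
series = aggregate_internet(records)
print(series)
```

For a dataset aggregated every minute, such as EUMA, the same sketch would apply with `bin_ms=60_000`.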

Methodology

  • From each region, we select a 21x21 grid such that the distributions of the load in each 5x5 region of that 21x21 area are similar.
  • The 21x21 grid contains a total of 441 cells, and for each cell we construct a 5x5 grid centered at that cell. Then, we use two state-of-the-art ML architectures for the training:
    • Capacity forecasting model: Aims at forecasting the future traffic of the center cell with the goal of allocating sufficient resources for the operator to jointly minimize overprovisioning and the penalty for non-served demands (Service Level Agreement (SLA) violations).
    • Traffic forecasting model: Aims at forecasting the future traffic of the center cell of the 5x5 grid with the goal of minimizing the mean absolute error.
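
As a rough illustration of the steps above, the sketch below builds one 5x5 window per cell of a 21x21 selection and contrasts the two training objectives. Several parts are assumptions and not the repository's implementation: the 2-cell margin around the selection (so that border cells also get a full window), the helper names, and in particular `capacity_cost`, which is a generic asymmetric cost with a hypothetical `sla_penalty` weight; the loss used in the papers may differ.

```python
HALF = 2  # a 5x5 window extends 2 cells on each side of its center


def window_5x5(grid, row, col):
    """Return the 5x5 neighborhood of grid centered at (row, col)."""
    rows, cols = len(grid), len(grid[0])
    if not (HALF <= row < rows - HALF and HALF <= col < cols - HALF):
        raise ValueError("window would fall outside the grid")
    return [r[col - HALF:col + HALF + 1]
            for r in grid[row - HALF:row + HALF + 1]]


def mean_absolute_error(forecast, demand):
    """Objective of the traffic forecasting model."""
    return sum(abs(f - d) for f, d in zip(forecast, demand)) / len(demand)


def capacity_cost(forecast, demand, sla_penalty=10.0):
    """Illustrative asymmetric cost (assumption, not the papers' loss).

    Overprovisioning costs the unused capacity; underprovisioning
    (an SLA violation) costs the shortfall scaled by sla_penalty.
    """
    total = 0.0
    for f, d in zip(forecast, demand):
        total += (f - d) if f >= d else sla_penalty * (d - f)
    return total / len(demand)


# Toy 25x25 grid: a 21x21 selection plus an assumed 2-cell margin
SIZE = 21 + 2 * HALF
grid = [[r * SIZE + c for c in range(SIZE)] for r in range(SIZE)]
windows = [window_5x5(grid, r, c)
           for r in range(HALF, SIZE - HALF)
           for c in range(HALF, SIZE - HALF)]

print(len(windows))
print(mean_absolute_error([3.0, 1.0], [2.0, 2.0]))
print(capacity_cost([3.0, 1.0], [2.0, 2.0]))
```

With the same forecast errors, the mean absolute error treats over- and under-estimation equally, whereas the asymmetric cost penalizes the under-estimated sample more heavily, which is the intuition behind training the capacity forecasting model with a different objective.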

For any questions, reach out to serly.moghadas@imdea.org