Distributed Systems and Algorithms in Smart Cities

Ph.D. dissertation of Attila Mátyás Nagy
Supervisor: Simon Vilmos

Dissertation, Theses booklet (EN), Theses booklet (HU)

Abstract

Urbanization has made the road networks of large cities increasingly crowded. Traffic congestion is becoming more and more frequent, leading to environmental degradation and significantly reducing the quality of life of city inhabitants. Traffic jams also carry severe economic costs.

For large cities, it is crucial to reduce the number of emerging traffic jams and to mitigate their negative effects. Thanks to the rapid development of road infrastructure, more and more traffic data is becoming available, which can be used to optimize traffic control. Smart methods could be used in intelligent transportation systems to eliminate traffic congestion. This dissertation focuses on the analysis of traffic data collected from smart cities’ infrastructure and on the development of machine learning-based prediction and anomaly detection methods for traffic management systems.

In my dissertation, I first studied the propagation patterns of traffic congestion. My goal was to create a traffic prediction model that can incorporate real-time congestion data to improve prediction accuracy on frequently congested roads. To accomplish this, I first designed a novel algorithm that finds frequent congestion patterns in massive traffic datasets with linear time complexity. It is used together with another method of mine, which efficiently determines the expected propagation times and probabilities of the identified propagation patterns. Finally, I created a hybrid prediction model that integrates congestion information to accurately forecast upcoming traffic states. My studies showed that using congestion data significantly improves the accuracy of the forecasts. Accurate traffic forecasts can support traffic management systems in managing the road network and allocating resources systematically.

Since traffic incidents are one of the major causes of traffic jams, my next goal was to develop a real-time automatic incident detection algorithm that identifies incidents more reliably than other methods from the literature. My method uses a new transient-based approach, which focuses on changes in traffic. I also created a public incident dataset from real traffic data, as the available datasets were too small to build an accurate method. Besides the detection method, I designed another algorithm to discover how the effects of incidents propagate. To reduce the negative effects of unexpected traffic incidents, it is essential that intelligent city management systems can respond to them as quickly as possible. The evaluations showed that the proposed methods can provide fast and reliable information for intelligent city management systems.

In the last chapter, I focused on the effective handling of smart cities’ massive datasets. Currently, the efficient and accurate processing of enormous time-series datasets poses a particular challenge to data scientists. I presented a multivariate extension of Symbolic Aggregate Approximation (SAX), which allows a multivariate time series to be expressed as a single sequence of symbols. Since I also provided a distance function and a parameter optimization strategy for my method, it can be applied to clustering and classification tasks in smart cities, reducing the storage required for these enormous datasets and speeding up big data processing.
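For readers unfamiliar with SAX, the following minimal sketch illustrates the classic univariate technique that the multivariate extension builds on. It is not the method proposed in the dissertation: the function name `sax`, its parameters, and the restriction to series whose length is divisible by the number of segments are assumptions made for illustration only.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet_size):
    """Classic univariate SAX (illustrative sketch, not the dissertation's method).

    Assumes len(series) is divisible by n_segments.
    """
    # Step 1: z-normalize the series (a constant series maps to all zeros).
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # Step 2: Piecewise Aggregate Approximation, i.e. the mean of each segment.
    seg_len = len(z) // n_segments
    paa = [mean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]

    # Step 3: breakpoints that split N(0, 1) into equiprobable regions.
    std_normal = NormalDist()
    breakpoints = [std_normal.inv_cdf(k / alphabet_size)
                   for k in range(1, alphabet_size)]

    # Step 4: map each segment mean to the letter of the region it falls in.
    symbols = []
    for value in paa:
        region = sum(1 for b in breakpoints if value > b)
        symbols.append(chr(ord("a") + region))
    return "".join(symbols)


if __name__ == "__main__":
    # Eight hypothetical readings compressed to two symbols.
    print(sax([1, 1, 1, 1, 5, 5, 5, 5], n_segments=2, alphabet_size=2))
```

The symbolic string is far shorter than the raw series, which is the source of the storage and processing gains mentioned above. How several variables are combined into one symbol sequence is specific to the dissertation and is not reproduced here.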