Tesi di Dottorato
Ensemble learning techniques for cyber security applications (2017-07-13)
Pisani, Francesco Sergio; Crupi, Felice; Folino, Gianluigi
Cyber security involves protecting information and systems from major cyber threats. High-level techniques such as data mining are frequently used to fight cybercriminals, mitigate the effects of their actions, or prevent those actions altogether. In particular, classification can be used effectively in many cyber security applications, e.g. in intrusion detection systems, in the analysis of user behavior, and in risk and attack analysis. However, the complexity and diversity of modern systems have opened a wide range of new issues that are difficult to address. Security software has to deal with missing data, privacy limitations and heterogeneous sources. It is therefore unlikely that a single classification algorithm will perform well on all types of data, especially in the presence of changes and under real-time and scalability constraints. To cope with these problems, this thesis proposes a framework based on the ensemble paradigm. Ensemble is a learning paradigm where multiple learners are trained for the same task by a learning algorithm, and the predictions of the learners are combined to deal with new, unseen instances. The ensemble method helps to reduce the variance of the error, the bias, and the dependence on a single dataset; furthermore, it can be built incrementally and is well suited to distributed implementations. It is also particularly suitable for distributed intrusion detection, because it allows a network profile to be built by combining different classifiers that together provide complementary information. However, building the ensemble can be computationally expensive, because the training phase must be restarted whenever new data arrives.
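As a rough illustration of the ensemble paradigm described above, the following minimal sketch trains several simple learners on different portions of a toy dataset and combines their predictions by majority vote. It is not the framework proposed in the thesis: the threshold-based base learner, the toy data, and the names `train_stump` and `majority_vote` are all assumptions made for this example, and plain majority voting stands in for the combining function.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, int]            # (feature value, class label 0 or 1)
Classifier = Callable[[float], int]


def train_stump(data: Sequence[Sample]) -> Classifier:
    """Hypothetical base learner: a threshold midway between the class means.

    Assumes the portion it receives contains samples of both classes.
    """
    xs0 = [x for x, y in data if y == 0]
    xs1 = [x for x, y in data if y == 1]
    mean0 = sum(xs0) / len(xs0)
    mean1 = sum(xs1) / len(xs1)
    threshold = (mean0 + mean1) / 2
    if mean1 >= mean0:
        return lambda x: int(x > threshold)
    return lambda x: int(x < threshold)


def majority_vote(models: List[Classifier], x: float) -> int:
    """Combine the learners' predictions by simple majority voting."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy, made-up dataset: small values belong to class 0, large values to class 1.
data: List[Sample] = [
    (0.1, 0), (0.4, 0), (0.9, 1), (1.2, 1),
    (0.2, 0), (0.3, 0), (1.0, 1), (1.1, 1),
    (0.0, 0), (0.5, 0), (0.8, 1), (1.3, 1),
]

# Each learner is trained on only a portion of the training set.
portions = [data[i::3] for i in range(3)]
models = [train_stump(portion) for portion in portions]

prediction_high = majority_vote(models, 0.95)
prediction_low = majority_vote(models, 0.2)
```

Because each learner sees only its own portion of the data, a new learner could be trained on newly arrived data and appended to `models` without retraining the existing ones; only the combination step would need to be revisited.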
For this reason, the framework uses Genetic Programming to evolve a function that combines the classifiers composing the ensemble, an approach with several attractive characteristics. First, the models composing the ensemble can each be trained on only a portion of the training set, and they can then be combined and used without any extra training phase. Moreover, the models can be specialized for a single class, and they can be designed to handle the difficult problems of unbalanced classes and missing data. When the data changes, the function can be recomputed incrementally with moderate computational effort and, in a streaming environment, drift strategies can be used to update the models. In addition, all the phases of the algorithm are distributed and can exploit the advantages of running on parallel/distributed architectures to cope with real-time constraints. The framework is oriented and specialized towards cyber security applications. For this reason, the algorithm is designed to work with missing data, unbalanced classes, models specialized on particular tasks, and models working with streaming data. Two typical scenarios in the cyber security domain are provided, and experiments are conducted on artificial and real datasets to test the effectiveness of the approach. The first scenario deals with user behavior: the actions taken by users can lead to data breaches, and the resulting damage can be very costly. The second scenario deals with intrusion detection systems. In this research area, the ensemble paradigm is a very new technique, and researchers still need to fully understand the advantages of this solution.

User behavioral problems in complex social networks (2019-06-20)
Perna, Diego; Tagarelli, Andrea; Crupi, Felice
Over the past two decades, we have witnessed the advent and rapid growth of numerous social networking platforms. Their pervasive diffusion has dramatically changed the way we communicate and socialize with each other.
They introduce new paradigms and impose new constraints within their scope. On the other hand, online social networks (OSNs) offer scientists an unprecedented opportunity to observe human behavior in a controlled way. The goal of the research project described in this thesis is to design and develop tools, in the context of network science and machine learning, to analyze, characterize and ultimately describe user behaviors in OSNs. After a brief review of network-science centrality measures and ranking algorithms, we examine the role of trust in OSNs by proposing a new inference method for controversial situations. Afterward, we delve into social boundary spanning theory and define a ranking algorithm to identify users characterized by alternate behavior across OSNs. The second part of this thesis deals with machine-learning-based approaches to the problem of learning a ranking function to identify lurkers and bots in OSNs. In the last part of this thesis, we discuss methods and techniques for learning a new representational space of entities in a multilayer social network.