Browsing by Author "Flesca, Sergio"
Item Methodologies and Applications for Big Data Analytics (Università della Calabria, 2020-05-02)
Cassavia, Nunziato; Crupi, Felice; Flesca, Sergio; Masciari, Elio

Due to the emerging Big Data paradigm, driven by the increased availability of user-generated data, traditional data management techniques are inadequate in many real-life scenarios. The availability of huge amounts of data pertaining to user social interactions calls for advanced analysis strategies in order to extract meaningful information. Furthermore, the heterogeneity and high speed of user-generated data require suitable data storage and management, as well as a huge amount of computing power. This dissertation presents a Big Data framework able to enhance the users' quest for information by exploiting previous knowledge about their social environment. An introduction to Big Data and NoSQL systems is also provided, and two basic architectures for Big Data analysis are presented. The framework that enhances the users' quest for information leverages the extent of influence that users are potentially subject to and the influence they may exert on other users. User influence spread across the network is also computed dynamically, in order to improve the user search strategy by providing specific suggestions, represented as tailored faceted features. The approach is tested in an important application scenario, tourist recommendation, where several experiments have been performed to assess system scalability and data read/write efficiency. The study of this system and of advanced analyses on Big Data has shown the need for huge computing power. To this end, a high-performance computing system named Coremuniti™ is presented. This system is a P2P solution for solving complex tasks by using the idle computational resources of the users connected to the network. Users help each other by asking the network for computational resources when they face highly demanding computing tasks. Differently from many proposals available for volunteer computing, users providing their resources are rewarded with tangible credits. This approach is tested in an interesting scenario, 3D rendering, where its efficiency has been compared with "traditional" commercial solutions such as cloud platforms and render farms, showing shorter task completion times at low cost.
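As a rough illustration of the idea described in this abstract only, the sketch below ranks facet values for a user by the influence their neighbours exert on them. The user names, influence weights, facet values, and scoring rule are invented for the example and are not the framework's actual algorithm.

    from collections import defaultdict

    # Influence that each neighbour exerts on user "u1" (illustrative weights).
    influence_on_u1 = {"u2": 0.6, "u3": 0.3, "u4": 0.1}

    # Facet values (say, tourist-attraction categories) each neighbour engaged with.
    facet_choices = {
        "u2": ["museum", "old_town"],
        "u3": ["beach", "museum"],
        "u4": ["beach"],
    }

    def tailored_facets(weights, choices, top_k=2):
        # Score each facet value by the total influence of the users who chose it.
        scores = defaultdict(float)
        for user, facets in choices.items():
            for facet in facets:
                scores[facet] += weights.get(user, 0.0)
        return sorted(scores, key=scores.get, reverse=True)[:top_k]

    print(tailored_facets(influence_on_u1, facet_choices))  # ['museum', 'old_town']

With these illustrative weights, "museum" and "old_town" rank highest because the most influential neighbour chose them; the top-ranked values would then be offered as tailored faceted suggestions.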
Item A novel cooperative framework for Web 3.0 investigating recommendation and process mining issues (2013-11-25)
Bevacqua, Antonio; Flesca, Sergio; Greco, Sergio

Item Querying Inconsistent Data: Repairs and Consistent Answers (2012-11-09)
Parisi, Francesco; Flesca, Sergio; Talia, Domenico

In this dissertation we provide an extensive survey of the techniques for repairing and querying inconsistent relational databases. We distinguish four parameters for classifying and comparing the existing techniques. First, we discern two repairing paradigms, namely the tuple-based and the attribute-based repairing paradigm. According to the former, a repair for a database is obtained by inserting and/or deleting tuples, whereas according to the latter a repair is obtained by (also) modifying attribute values within tuples. Second, we distinguish several repair semantics, which entail different orders among the set of consistent database instances that can be obtained for an inconsistent database with respect to a given set of integrity constraints. Third, we classify the techniques on the basis of the classes of queries considered for computing consistent answers. Finally, we compare the different approaches in the literature on the basis of the classes of integrity constraints which are assumed to be defined on the database.
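As a toy illustration of the notions surveyed above, and not an excerpt from the thesis, the sketch below enumerates tuple-deletion repairs of a relation violating a primary key and computes the consistent (certain) answer to a simple query. The relation, the key constraint, and the query are invented for the example.

    from itertools import combinations

    # Relation emp(name, salary) with a key on name; the two "ann" tuples conflict.
    table = [("ann", 1000), ("ann", 2000), ("bob", 1500)]

    def consistent(rel):
        names = [name for name, _ in rel]
        return len(names) == len(set(names))  # key constraint on the first attribute

    # Tuple-deletion repairs: maximal consistent subsets of the relation.
    candidates = [set(c) for n in range(len(table) + 1)
                  for c in combinations(table, n) if consistent(c)]
    repairs = [r for r in candidates if not any(r < other for other in candidates)]

    # Consistent answer to "does ann earn more than 1200?": true only if it
    # holds in every repair.
    holds = [any(name == "ann" and salary > 1200 for name, salary in r) for r in repairs]
    print(repairs)
    print("consistent answer:", all(holds))  # False: the repairs disagree on ann's salary

The two repairs keep different tuples for ann, so the query is not true in every repair and its consistent answer is negative.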
2) We investigate the problem of repairing and extracting reliable information from data violating a given set of aggregate constraints. These constraints consist of linear inequalities on aggregate-sum queries issued on measure values stored in the database. This syntactic form enables meaningful constraints to be expressed; indeed, aggregate constraints frequently occur in many real-life scenarios where guaranteeing the consistency of numerical data is mandatory. We consider database repairs consisting of sets of value-update operations aiming at reconstructing the correct measure values of inconsistent data. We adopt two different criteria for determining whether a set of update operations repairing the data can be considered "reasonable" or not: the set-minimal semantics and the card-minimal semantics. Both these semantics aim at preserving the information represented in the source data as much as possible. They correspond to different repairing strategies which turn out to be well-suited for different application scenarios. We provide the complexity characterization of three fundamental problems: (i) repairability: is there at least one (possibly not minimal) repair for the given database with respect to the specified constraints? (ii) repair checking: given a set of update operations, is it a minimal repair? (iii) consistent query answer: is a given query true in every minimal repair?

3) We provide a method for computing card-minimal repairs for a database in the presence of steady aggregate constraints, a restricted but expressive class of aggregate constraints. Under steady aggregate constraints, an instance of the problem of computing a card-minimal repair can be transformed into an instance of a Mixed-Integer Linear Programming (MILP) problem. Thus, standard techniques and optimizations addressing MILP problems can be re-used for computing repairs. On the basis of this data-repairing framework, we propose an architecture providing robust data acquisition facilities from input documents containing tabular data. We exploit integrity constraints defined on the input data to support the detection and repair of inconsistencies arising from errors occurring in the acquisition phase performed on the input data.
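As a sketch of the kind of MILP encoding mentioned in the last contribution, and not the dissertation's implementation, the snippet below poses a card-minimal repair of numeric measure values under a single sum-to-total constraint (used here as a stand-in for the steady aggregate constraints the thesis considers). The data, the big-M bound, and the use of the PuLP library are assumptions for illustration.

    import pulp  # assumes the PuLP MILP modelling library is installed

    # Toy measure values: receipts + other_income should equal total_income,
    # but 500 + 300 != 1000, so the database is inconsistent.
    original = {"receipts": 500, "other_income": 300, "total_income": 1000}
    M = 10_000  # bound on how far a repaired value may move (illustrative)

    prob = pulp.LpProblem("card_minimal_repair", pulp.LpMinimize)
    repaired = {a: pulp.LpVariable(f"v_{a}", lowBound=0, cat="Integer") for a in original}
    changed = {a: pulp.LpVariable(f"c_{a}", cat="Binary") for a in original}

    # Card-minimality: minimise the number of updated values.
    prob += pulp.lpSum(changed.values())

    # A value may differ from the stored one only if its "changed" flag is on.
    for attr, value in original.items():
        prob += repaired[attr] - value <= M * changed[attr]
        prob += value - repaired[attr] <= M * changed[attr]

    # The aggregate-sum constraint on the measure values.
    prob += repaired["receipts"] + repaired["other_income"] == repaired["total_income"]

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    for attr in original:
        print(attr, original[attr], "->", int(pulp.value(repaired[attr])))

On this toy instance the solver repairs the data by updating a single value (which one depends on tie-breaking), which is exactly what the card-minimal semantics asks for; any off-the-shelf MILP solver can be reused in this way.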