Download book - Clustering

This is the first book to take a truly comprehensive look at clustering. It begins with an introduction to cluster analysis and goes on to explore: proximity measures; hierarchical clustering; partition clustering; neural network-based clustering; kernel-based clustering; sequential data clustering; large-scale data clustering; data visualization and high-dimensional data clustering; and cluster validation. The authors assume no previous background in clustering, and their generous inclusion of examples and references helps make the subject matter comprehensible for readers of varying levels and backgrounds.

Financial Models with Levy Processes and Volatility Clustering

Author: Frank J. Fabozzi

Year of publication:

An in-depth guide to understanding probability distributions and financial modeling for the purposes of investment management. In Financial Models with Levy Processes and Volatility Clustering, the expert author team provides a framework to model the behavior of stock returns in both a univariate and a multivariate setting, with practical applications to option pricing and portfolio management. They also explain the reasons for working with non-normal distributions in financial modeling and the best methodologies for employing them. The book's framework includes the basics of probability distributions and explains the alpha-stable distribution and the tempered stable distribution. The authors also explore discrete-time option pricing models, from the classical normal model with volatility clustering to more recent models that consider both volatility clustering and heavy tails. The book reviews the basics of probability distributions; analyzes a continuous-time option pricing model (the so-called exponential Levy model); defines a discrete-time model with volatility clustering and shows how to price options using Monte Carlo methods; and studies two multivariate settings that are suitable for explaining joint extreme events. Financial Models with Levy Processes and Volatility Clustering is a thorough guide to classical probability distribution methods and brand new methodologies for financial modeling.

Spectral Clustering and Biclustering. Learning Large Graphs and Contingency Tables

Author: Marianna Bolla

Year of publication:

Explores regular structures in graphs and contingency tables by spectral theory and statistical methods. This book bridges the gap between graph theory and statistics by answering the demanding questions that arise when statisticians are confronted with large weighted graphs or rectangular arrays. Classical and modern statistical methods applicable to biological, social, and communication networks, or to microarrays, are presented together with the theoretical background and proofs. This book is suitable for a one-semester course for graduate students in data mining, multivariate statistics, or applied graph theory; by skipping the proofs, specialists who just want to retrieve information from their data can also use the algorithms when analysing communication, social, or biological networks. Spectral Clustering and Biclustering: Provides a unified treatment for edge-weighted graphs and contingency tables via methods of multivariate statistical analysis (factoring, clustering, and biclustering). Uses spectral embedding and relaxation to estimate multiway cuts of edge-weighted graphs and bicuts of contingency tables. Goes beyond the expanders by describing the structure of dense graphs with a small spectral gap via the structural eigenvalues and eigen-subspaces of the normalized modularity matrix. Treats graphs like statistical data by combining methods of graph theory and statistics. Establishes a common outline structure for the contents of each algorithm, applicable to networks and microarrays, with unified notions and principles.

Co-Clustering

Author: Gerard Govaert

Year of publication:

Cluster or co-cluster analyses are important tools in a variety of scientific areas. The introduction of this book presents the state of the art of well-established as well as more recent methods of co-clustering. The authors mainly deal with two-mode partitioning under different approaches, but pay particular attention to a probabilistic approach. Chapter 1 concerns clustering in general and model-based clustering in particular. The authors briefly review the classical clustering methods and focus on the mixture model. They present and discuss the use of different mixtures adapted to different types of data. The algorithms used are described, and related work on different classical methods is presented and commented upon. This chapter is useful in tackling the problem of co-clustering under the mixture approach. Chapter 2 is devoted to the latent block model proposed in the mixture approach context. The authors discuss this model in detail and present its interest regarding co-clustering. Various algorithms are presented in a general context. Chapter 3 focuses on binary and categorical data. It presents, in detail, the appropriate latent block mixture models. Variants of these models and algorithms are presented and illustrated using examples. Chapter 4 focuses on contingency data. Mutual information, phi-squared and model-based co-clustering are studied. Models, algorithms and connections among different approaches are described and illustrated. Chapter 5 presents the case of continuous data. In the same way, the different approaches used in the previous chapters are extended to this situation.

Contents: 1. Cluster Analysis. 2. Model-Based Co-Clustering. 3. Co-Clustering of Binary and Categorical Data. 4. Co-Clustering of Contingency Tables. 5. Co-Clustering of Continuous Data.

About the Authors: Gerard Govaert is Professor at the University of Technology of Compiegne, France. He is also a member of the CNRS Laboratory Heudiasyc (Heuristic and diagnostic of complex systems). His research interests include latent structure modeling, model selection, model-based cluster analysis, block clustering and statistical pattern recognition. He is one of the authors of the MIXMOD (MIXtureMODelling) software. Mohamed Nadif is Professor at the University of Paris-Descartes, France, where he is a member of LIPADE (Paris Descartes computer science laboratory) in the Mathematics and Computer Science department. His research interests include machine learning, data mining, model-based cluster analysis, co-clustering, factorization and data analysis.

Cluster analysis is an important tool in a variety of scientific areas. Chapter 1 briefly presents the state of the art of well-established as well as more recent methods. The hierarchical, partitioning and fuzzy approaches are discussed, amongst others. The authors review the difficulty these classical methods have in tackling high dimensionality, sparsity and scalability. Chapter 2 discusses the interest of co-clustering, presenting different approaches and defining a co-cluster. The authors focus on co-clustering as a simultaneous clustering and discuss the cases of binary, continuous and co-occurrence data. The criteria and algorithms are described and illustrated on simulated and real data. Chapter 3 considers co-clustering as model-based co-clustering. A latent block model is defined for different kinds of data. The estimation of parameters and co-clustering is tackled under two approaches: maximum likelihood and classification maximum likelihood. Hard and soft algorithms are described and applied to simulated and real data. Chapter 4 considers co-clustering as a matrix approximation. The trifactorization approach is considered and algorithms based on update rules are described. Links with numerical and probabi

Advances in Fuzzy Clustering and its Applications

Author: Witold Pedrycz

Year of publication:

A comprehensive, coherent, and in-depth presentation of the state of the art in fuzzy clustering. Fuzzy clustering is now a mature and vibrant area of research with highly innovative advanced applications. Reflecting this through a careful selection of research contributions, this book addresses timely and relevant concepts and methods, whilst identifying major challenges and recent developments in the area. Split into five clear sections (Fundamentals; Visualization; Algorithms and Computational Aspects; Real-Time and Dynamic Clustering; and Applications and Case Studies), the book covers a wealth of novel, original and fully updated material, and in particular offers: a focus on the algorithmic and computational augmentations of fuzzy clustering and its effectiveness in handling high-dimensional problems, distributed problem solving and uncertainty management; presentations of the important and relevant phases of cluster design, including the role of information granules and of fuzzy sets in realizing the human-centricity facet of data analysis, as well as system modelling; demonstrations of how the results facilitate further detailed development of models and enhance interpretation aspects; and a carefully organized illustrative series of applications and case studies in which fuzzy clustering plays a pivotal role. This book will be of key interest to engineers associated with fuzzy control, bioinformatics, data mining, image processing, and pattern recognition, while computer engineers, students and researchers in most engineering disciplines will find it an invaluable resource and research tool.

Knowledge-Based Clustering

Author: Witold Pedrycz

Year of publication:

Comprehensive coverage of emerging and current technology dealing with heterogeneous sources of information, including data, design hints, reinforcement signals from external datasets, and related topics. Covers all necessary prerequisites and, where needed, additional explanations of more advanced topics to make abstract concepts more tangible. Includes illustrative material and well-known experiments to offer hands-on experience.