I. Paper Title
Multi-View Attribute Graph Convolution Networks for Clustering
II. Abstract
(1) Problems addressed
1. In GNNs, existing methods cannot assign learnable weights to different nodes in a neighborhood, and they lack robustness because they ignore the reconstruction of node attributes and graph structure;
2. Most multi-view GNNs focus on the case of multiple graphs, while GNNs designed for graph-structured data with multi-view attributes are still under-explored.
(2) Contributions
1. A Multi-view Attribute Graph Convolution Network (MAGCN) is proposed for the clustering task;
2. MAGCN is designed with two-pathway encoders that map graph embedding features and learn view-consistency information;
(1) First-pathway encoder: a multi-view attribute graph attention network is developed to reduce noise/redundant information and to learn the graph embedding features of multi-view graph data;
(2) Second-pathway encoder: consistent embedding encoders are developed to capture the consistency of geometric relationships and probability distributions between different views, thereby adaptively finding a consistent clustering embedding space for the multi-view attributes.
III. Introduction
(A) Multi-view clustering
1. Definition:
A fundamental task in machine learning. Its goal is to integrate multiple features and find consistent information across different views;
2. Current research:
Considerable results have been achieved in the Euclidean domain;
3. Challenges:
These methods no longer fit data from non-Euclidean domains, including the intensively studied graph data formed by social networks and citation networks;
4. Solution:
Graph embedding techniques, which can effectively explore graph-structured data;
(B) Graph Embedding
1. Definition:
Transforms graph data into a low-dimensional, compact, continuous feature space, usually via matrix factorization, random walks, or GNNs;
2. Current research:
(1) Owing to their efficiency and inductive-learning ability, GNNs have become the most popular approach;
(2) GNNs compute node embeddings by applying multiple graph convolution layers that collect information from a node's neighbors through nonlinear transformations and aggregation functions; in this way, both the topological information of the graph and the attribute information of the nodes are captured;
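The neighbor aggregation in (2) can be sketched as a single propagation step. This is a minimal illustration assuming symmetric normalization with self-loops and a ReLU nonlinearity (the toy graph, weights, and function name are illustrative, not the paper's configuration):

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN propagation step: H = ReLU(D^-1/2 (A+I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])            # add self-connections
    d = A_hat.sum(axis=1)                     # node degrees
    D_inv_sqrt = np.diag(d ** -0.5)           # D^{-1/2}
    S = D_inv_sqrt @ A_hat @ D_inv_sqrt       # normalized aggregation matrix
    return np.maximum(S @ X @ W, 0.0)         # aggregate, transform, ReLU

# toy graph: 3 nodes, edges 0-1 and 1-2
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
X = np.random.default_rng(0).normal(size=(3, 4))   # node attributes
W = np.random.default_rng(1).normal(size=(4, 2))   # trainable weights
H = gcn_layer(A, X, W)
print(H.shape)  # (3, 2)
```

Stacking several such layers lets information flow over multi-hop neighborhoods, which is how both topology and attributes end up encoded.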
3. Challenges:
Although the above GNNs can effectively handle single-view graph data, they are not applicable to multi-view graph data;
What is the difference between multi-view and single-view?
Multi-view means that, in one graph, each node carries attribute information and has several sets of attribute values, one per view.
4. Solution:
Study how to apply GNNs to multi-view data with multiple graph structures, i.e., use multi-view GNN models;
(C) Multi-view GNN
1. Definition:
2. Current research:
3. Challenges:
(1) They cannot assign learnable weights to different nodes in a neighborhood;
(2) They ignore the fact that reconstructing node attributes and graph structure could improve robustness;
(3) They do not explicitly consider a similarity distance measure for the consistency relations between different views;
(4) Existing multi-view GNN methods mainly focus on the case of multiple graphs while neglecting the equally important attribute diversity.
4. Solution:
The motivation of this paper, namely MAGCN.
(D) MAGCN
1. Key Contributions:
(1) A novel multi-view attribute graph convolution network is proposed for clustering graph-structured data with multi-view attributes;
(2) Multi-view attribute graph convolution encoders with an attention mechanism are developed to reduce the noise/redundancy of multi-view graph data; in addition, node attributes and graph structure are reconstructed to improve robustness;
(3) Consistent embedding encoders are designed to extract consistency information between multiple views by exploring the consistency of geometric relationships and probability distributions across views.
IV. Related Work
(A) Learning graph node embeddings with GNNs
1. Basic idea:
Existing GNN models for processing graph-structured data belong to a family of graph message-passing architectures that use different aggregation schemes for a node to aggregate feature messages from its neighbors in the graph.
2. Representative models:
(1) Graph Convolutional Networks (GCN) scale linearly in the number of graph edges and learn hidden-layer representations that encode both local graph structure and node features;
(2) Graph Attention Networks (GAT) enable specifying different weights to different nodes in a neighborhood by stacking self-attention layers in which nodes attend over their neighborhoods' features;
(3) GraphSAGE concatenates a node's feature with diversified pooled neighborhood information and effectively trades off performance and runtime by sampling node neighborhoods;
(4) Message Passing Neural Networks further incorporate edge information when doing the aggregation.
(B) Multi-view graph node embedding
1. Current research:
(1) The spatiotemporal multi-graph convolution network encodes the non-Euclidean correlations among regions using multiple graphs and explicitly captures them with a multi-graph convolution encoder;
(2) In applications on social networks, Multi-GCN incorporates non-redundant information from multiple views into the learning process;
(3) [Ma et al., 2018] utilize a multi-view graph auto-encoder, which integrates heterogeneous, noisy, nonlinearly related information to learn accurate similarity measures, especially when labels are scarce.
2. Challenges:
(1) These multi-view GNNs cannot assign learnable weights to different nodes in a neighborhood;
(2) Their clustering performance is limited because they do not consider structure and distribution consistency for the clustering embedding;
(3) Existing multi-view GNNs mainly focus on the case of multiple graphs and neglect the equally important attribute diversity.
V. Proposed Methodology
(A) Notation
1. Graph representation: $G = (V, E)$, where $V = \{v_1, v_2, \dots, v_n\}$ denotes the set of nodes and $E$ denotes the set of edges; the adjacency matrix is $A$.
2. The m-th view attribute features of the nodes are denoted $X^m = \{x_1^m, x_2^m, \dots, x_n^m\}$, where $x_i^m$ denotes the feature vector of node $v_i$ in the m-th view and $M$ denotes the number of views ($m = 1, \dots, M$).
(B) The Framework of MAGCN
Overall idea:
(1) First, encode the multi-view graph data into graph embeddings $Z^m$ by the multi-view attribute graph convolution encoders.
(2) Then, the $Z^m$ are fed into the consistent embedding encoders to obtain a consistent clustering embedding $Z$.
(3) The clustering process is eventually conducted on the ideal intrinsic embedding space computed this way, i.e., on $Z$.
(C) Multi-view Attribute Graph Convolution Encoder
1. Encoding
- The first-pathway encoders map the multi-view node attribute matrices and the graph structure into a graph embedding space. Specifically, for the m-th view, the graph $G$ and the m-th view attributes $X^m$ are mapped to $d$-dimensional graph embedding features $Z^m$ by the following GCN encoder model:
$Z^{m,(l)} = \phi\big(S^{m,(l)}\, Z^{m,(l-1)}\, W^{m,(l)}\big)$,
where $S^{m,(l)}$ is the relevance coefficient matrix with added self-connections. As for $Z^{m,(l)}$: when $l = 0$, $Z^{m,(0)} = X^m$ is the initial m-th view attribute matrix, and the output of the last layer, $Z^{m,(L)} = Z^m$, is the final graph embedding feature representation.
- To determine the relevance between nodes and their neighbors, an attention mechanism with parameters shared among nodes is used. In the multi-view encoder layer, the learnable relevance matrix $C^m$ takes a GAT-style decomposed form consistent with the description here:
$C^m = A \odot \big(Z^m W_1^m + (Z^m W_2^m)^{\top}\big)$,
where $W_1^m$ and $W_2^m$ represent the trainable parameters related to a node itself and to its neighbor nodes, respectively, and $\odot$ refers to element-wise multiplication with broadcasting capability. $C^m$ is normalized to get the final relevance coefficients $S^m$, so $s_{ij}^m$ is computed by:
$s_{ij}^m = \dfrac{\exp(c_{ij}^m)}{\sum_{k \in \mathcal{N}_i} \exp(c_{ik}^m)}$,
where $\mathcal{N}_i$ is the set of all nodes adjacent to node $v_i$.
- We consider that $Z^m$ preserves essentially all the information of the multi-view node attribute matrix $X^m$ and the graph structure $A$.
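The neighbor-masked attention normalization above can be sketched as follows, assuming the decomposed scores $c_{ij} = z_i w_1 + z_j w_2$ restricted to neighbors (the helper name and toy data are illustrative, not the paper's implementation):

```python
import numpy as np

def relevance_coefficients(A, Z, w1, w2):
    """Masked softmax attention: c_ij = z_i·w1 + z_j·w2 for neighbors j of i."""
    self_term = Z @ w1                  # (n,1): contribution of the node itself
    neigh_term = (Z @ w2).T             # (1,n): contribution of neighbor nodes
    C = self_term + neigh_term          # broadcasts to an (n,n) score matrix
    mask = A + np.eye(A.shape[0])       # keep neighbors plus self-connections
    C = np.where(mask > 0, C, -np.inf)  # mask out non-neighbors
    expC = np.exp(C - C.max(axis=1, keepdims=True))  # numerically stable softmax
    return expC / expC.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
A = np.array([[0., 1., 1.], [1., 0., 0.], [1., 0., 0.]])
Z = rng.normal(size=(3, 4))
S = relevance_coefficients(A, Z, rng.normal(size=(4, 1)), rng.normal(size=(4, 1)))
print(S.sum(axis=1))  # each row sums to 1
```

Each row of `S` is a probability distribution over a node's neighborhood, matching the softmax normalization of $c_{ij}$ over $\mathcal{N}_i$.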
2. Decoding
- In the decoding process, the decoders use the same number of layers as the encoders, and each decoder layer tries to reverse its corresponding encoder layer; in other words, the decoding process is the inverse of the encoding process.
- The final decoded outputs are the reconstructed node attribute matrix $\hat{X}^m$ and the reconstructed graph structure $\hat{A}^m$. The GCN decoder model is:
$\hat{Z}^{m,(l)} = \phi\big(S^{m,(l)}\, \hat{Z}^{m,(l-1)}\, \hat{W}^{m,(l)}\big)$,
so the last decoder layer outputs $\hat{X}^m$. The graph structure $\hat{A}^m$, on the other hand, is implemented by an inner product decoder of $Z^m$ and $(Z^m)^\top$; specifically,
$\hat{A}^m = \sigma\big(\langle Z^m, (Z^m)^{\top} \rangle\big) = \sigma\big(Z^m (Z^m)^{\top}\big)$,
where $\langle \cdot,\cdot \rangle$ is the inner product operator and $\sigma$ is the sigmoid activation.
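The inner product decoder fits in a few lines; the toy embedding is illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def inner_product_decoder(Z):
    """Reconstruct edge probabilities: A_hat = sigmoid(Z Z^T)."""
    return sigmoid(Z @ Z.T)

Z = np.array([[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0]])  # toy graph embedding
A_hat = inner_product_decoder(Z)
# similar embeddings (nodes 0 and 1) get a high edge probability,
# dissimilar ones (nodes 0 and 2) get a low one
print(A_hat[0, 1] > 0.5, A_hat[0, 2] < 0.5)  # True True
```

Because $Z Z^{\top}$ is symmetric, the reconstructed adjacency is symmetric as well, matching an undirected graph.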
3. The reconstruction loss
The reconstruction loss over the reconstructed multi-view node attribute matrices $\hat{X}^m$ and the reconstructed graph structures $\hat{A}^m$ can be computed as:
$L_{rec} = \sum_{m=1}^{M}\Big(\|X^m - \hat{X}^m\|_F^2 + \|A - \hat{A}^m\|_F^2\Big)$.
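A hedged sketch of this per-view loss, assuming squared Frobenius norms for both terms (the paper may weight or normalize the two terms differently):

```python
import numpy as np

def reconstruction_loss(Xs, X_hats, A, A_hats):
    """Sum of squared Frobenius reconstruction errors over all M views."""
    loss = 0.0
    for X, X_hat, A_hat in zip(Xs, X_hats, A_hats):
        loss += np.linalg.norm(X - X_hat) ** 2   # attribute reconstruction term
        loss += np.linalg.norm(A - A_hat) ** 2   # structure reconstruction term
    return loss

X = [np.eye(2)]          # one view of toy attributes
A = np.ones((2, 2))      # toy adjacency
perfect = reconstruction_loss(X, X, A, [A])
print(perfect)  # 0.0 when reconstruction is exact
```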
(D) Consistent Embedding Encoders
1. Geometric relationship consistency
- Each $Z^m$ is mapped into a low-dimensional space. $Z^m$ contains almost all the original information, so it is not suitable for direct multi-view integration. A consistent clustering layer is therefore used to learn a common clustering embedding $Z$, which is adaptively integrated from all the $Z^m$.
- Assume $F^p$ and $F^q$ are the low-dimensional feature matrices of views $p$ and $q$ obtained from the consistent embedding encoders. They can be used to compute a geometric relationship similarity score $s(F^p, F^q)$, where $s(\cdot,\cdot)$ is a similarity function that can be measured by the Manhattan distance, Euclidean distance, cosine similarity, etc. The loss of geometric relationship consistency then rewards agreement between every pair of views:
$L_{geo} = -\sum_{p=1}^{M}\sum_{q=p+1}^{M} s(F^p, F^q)$.
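As one concrete choice of $s(\cdot,\cdot)$, a mean row-wise cosine similarity and the resulting pairwise consistency loss could look like this (a sketch with illustrative names, not the paper's exact loss):

```python
import numpy as np

def cosine_score(Fp, Fq):
    """Mean row-wise cosine similarity between two views' embeddings."""
    Fp_n = Fp / np.linalg.norm(Fp, axis=1, keepdims=True)
    Fq_n = Fq / np.linalg.norm(Fq, axis=1, keepdims=True)
    return float((Fp_n * Fq_n).sum(axis=1).mean())

def geometric_consistency_loss(views):
    """Negative summed pairwise similarity: minimizing pulls views together."""
    loss = 0.0
    for p in range(len(views)):
        for q in range(p + 1, len(views)):
            loss -= cosine_score(views[p], views[q])
    return loss

F1 = np.array([[1.0, 0.0], [0.0, 1.0]])
print(geometric_consistency_loss([F1, F1]))  # -1.0 for two identical views
```

Minimizing the negative similarity drives the per-view geometric relationships toward agreement, which is what the consistent clustering layer needs for integration.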
2. The consistency of the probability distribution
- Following DEC-style clustering, a soft assignment $q_{ij}$ between embedding $z_i$ and cluster center $\mu_j$ is measured with a Student's t-distribution:
$q_{ij} = \dfrac{(1+\|z_i-\mu_j\|^2)^{-1}}{\sum_{j'}(1+\|z_i-\mu_{j'}\|^2)^{-1}}$.
- The auxiliary distribution $P$ of $Q$, with trade-off parameters weighting the views, is then:
$p_{ij} = \dfrac{q_{ij}^2 / \sum_i q_{ij}}{\sum_{j'}\big(q_{ij'}^2 / \sum_i q_{ij'}\big)}$.
- Computation details: the distribution consistency is enforced by minimizing the KL divergence $KL(P \,\|\, Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}$, which sharpens the assignments and aligns the views.
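The soft assignment and auxiliary distribution can be computed directly from the standard DEC-style formulas above (`soft_assignment` and `target_distribution` are illustrative names):

```python
import numpy as np

def soft_assignment(Z, mu):
    """Student's t soft assignment q_ij between embeddings and cluster centers."""
    d2 = ((Z[:, None, :] - mu[None, :, :]) ** 2).sum(-1)  # squared distances
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Auxiliary p_ij = (q_ij^2 / f_j) / normalizer, with f_j = sum_i q_ij."""
    weight = q ** 2 / q.sum(axis=0)
    return weight / weight.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 2))     # node embeddings
mu = rng.normal(size=(3, 2))    # 3 cluster centers
q = soft_assignment(Z, mu)
p = target_distribution(q)
print(q.shape, p.shape)  # (5, 3) (5, 3)
```

Squaring $q_{ij}$ before renormalizing emphasizes confident assignments, so training against $P$ progressively sharpens the clustering.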
(E) Task for Clustering
1. The total loss function of the proposed MAGCN is eventually formulated as:
$L = L_{rec} + \lambda_1 L_{geo} + \lambda_2\, KL(P \,\|\, Q)$,
where $\lambda_1, \lambda_2$ are trade-off parameters.
2. The cluster of each node is then predicted from the auxiliary distribution $P$. For node $v_i$, the index with the highest probability in the i-th row of $P$ is $v_i$'s cluster. Hence the cluster label of node $v_i$ is obtained as:
$y_i = \arg\max_j\; p_{ij}$.
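A minimal sketch of the KL consistency term and the final cluster prediction (names are illustrative):

```python
import numpy as np

def kl_loss(p, q):
    """KL(P || Q) = sum_ij p_ij log(p_ij / q_ij), with eps for stability."""
    eps = 1e-12
    return float((p * np.log((p + eps) / (q + eps))).sum())

def predict_clusters(p):
    """Cluster of node i = index of the highest probability in row i of P."""
    return p.argmax(axis=1)

p = np.array([[0.9, 0.1], [0.2, 0.8]])
print(predict_clusters(p))       # [0 1]
print(kl_loss(p, p) < 1e-9)      # True: KL of a distribution with itself is 0
```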