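This section continues our introduction to concepts related to rough sets.
The previous section covered measures of knowledge granularity; this section presents the matrix representation of knowledge granularity.
We begin with a brief review of the relevant matrix concepts.
Matrices
We start with the sum and difference of matrices.
Matrix sum:
If $A=(a_{ij})_{m \times n}$ and $B=(b_{ij})_{m \times n}$ are two $m \times n$ matrices, their sum $C=(c_{ij})_{m \times n}$ is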
\[
C = A+B \quad \Longrightarrow \quad c_{ij}=a_{ij}+b_{ij}
\]
\[
A+B=\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} \\
\end{bmatrix} +
\begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
b_{m1} & b_{m2} & \cdots & b_{mn} \\
\end{bmatrix}
\]
\[
=\begin{bmatrix}
a_{11}+b_{11} & a_{12}+b_{12} & \cdots & a_{1n}+b_{1n} \\
a_{21}+b_{21} & a_{22}+b_{22} & \cdots & a_{2n}+b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1}+b_{m1} & a_{m2}+b_{m2} & \cdots & a_{mn}+b_{mn} \\
\end{bmatrix}
\]
Similarly, the difference of two matrices is:
\[
C = A-B \quad \Longrightarrow \quad c_{ij}=a_{ij}-b_{ij}
\]
\[
A-B= \begin{bmatrix}
a_{11}-b_{11} & a_{12}-b_{12} & \cdots & a_{1n}-b_{1n} \\
a_{21}-b_{21} & a_{22}-b_{22} & \cdots & a_{2n}-b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1}-b_{m1} & a_{m2}-b_{m2} & \cdots & a_{mn}-b_{mn} \\
\end{bmatrix}
\]
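The element-wise definitions above can be sketched in a few lines of plain Python. This is only an illustration: the helper names `mat_add` and `mat_sub` and the sample matrices are assumptions made for the example, with matrices stored as lists of rows.

```python
def _check_same_shape(A, B):
    # Sum and difference are only defined for matrices of identical shape.
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same shape")


def mat_add(A, B):
    """Return C with c_ij = a_ij + b_ij."""
    _check_same_shape(A, B)
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_sub(A, B):
    """Return C with c_ij = a_ij - b_ij."""
    _check_same_shape(A, B)
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 2, 3], [4, 5, 6]]
B = [[6, 5, 4], [3, 2, 1]]
print(mat_add(A, B))  # [[7, 7, 7], [7, 7, 7]]
print(mat_sub(A, B))  # [[-5, -3, -1], [1, 3, 5]]
```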
Matrix transpose: given the $n \times n$ matrix
\[
A= \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} \\
\end{bmatrix}
\]
the transpose $A^T$ of $A$, with entries $(A^T)_{ij}=a_{ji}$, is:
\[
A^T= \begin{bmatrix}
a_{11} & a_{21} & \cdots & a_{n1} \\
a_{12} & a_{22} & \cdots & a_{n2} \\
\vdots & \vdots & \ddots & \vdots \\
a_{1n} & a_{2n} & \cdots & a_{nn} \\
\end{bmatrix}
\]
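As a small sketch of the transpose, rows and columns can be swapped with `zip`. The function name `transpose` and the sample matrix are assumptions chosen for illustration.

```python
def transpose(A):
    """Return the transpose of A: entry (i, j) of the result is a_ji."""
    return [list(col) for col in zip(*A)]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
At = transpose(A)
print(At)  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```

Transposing twice returns the original matrix, which is a quick sanity check for any implementation.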
Finally, consider the matrix product:
If $A=(a_{ij})_{m \times n}$ and $B=(b_{ij})_{n \times p}$ are two matrices,
their product $A \times B =C=(c_{ij})_{m \times p}$ is:
\[
C = A \times B \quad \Longrightarrow \quad (c_{ij})_{m \times p}=(\sum_{k=1}^{n} a_{ik}\cdot b_{kj})_{m \times p}
\]
\[
= \begin{bmatrix}
\sum_{k=1}^{n} a_{1k}b_{k1} & \sum_{k=1}^{n}a_{1k}b_{k2} & \cdots & \sum_{k=1}^{n} a_{1k}b_{kp} \\
\sum_{k=1}^{n} a_{2k}b_{k1} & \sum_{k=1}^{n}a_{2k}b_{k2} & \cdots & \sum_{k=1}^{n} a_{2k}b_{kp} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{k=1}^{n} a_{mk}b_{k1} & \sum_{k=1}^{n}a_{mk}b_{k2} & \cdots & \sum_{k=1}^{n} a_{mk}b_{kp} \\
\end{bmatrix}
\]
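The product formula $c_{ij}=\sum_{k=1}^{n} a_{ik}b_{kj}$ translates directly into a triple loop. The sketch below is a naive illustration, not an optimized routine; `mat_mul` and the sample matrices are assumptions made for the example.

```python
def mat_mul(A, B):
    """Return the m x p product of an m x n matrix A and an n x p matrix B."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    # c_ij is the sum over k of a_ik * b_kj.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]         # 3 x 2
C = mat_mul(A, B)
print(C)  # [[7, 8], [16, 17]]
```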
Matrix representation of knowledge granularity
We continue to use the following table:
| \(U\) | \(a\) | \(b\) | \(c\) | \(e\) | \(f\) | \(d\) |
|---|---|---|---|---|---|---|
| 1 | 0 | 1 | 1 | 1 | 0 | 1 |
| 2 | 1 | 1 | 0 | 1 | 0 | 1 |
| 3 | 1 | 0 | 0 | 0 | 1 | 0 |
| 4 | 1 | 1 | 0 | 1 | 0 | 1 |
| 5 | 1 | 0 | 0 | 0 | 1 | 0 |
| 6 | 0 | 1 | 1 | 1 | 1 | 0 |
| 7 | 0 | 1 | 1 | 1 | 1 | 0 |
| 8 | 1 | 0 | 0 | 1 | 0 | 1 |
| 9 | 1 | 0 | 0 | 1 | 0 | 0 |
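The equivalence relation matrix is defined as follows:
Let $S=(U,A=C \bigcup D,V,f)$ be a decision information system with universe $U=\{u_{1},u_{2},\ldots,u_{n}\}$, where $n$ is the number of objects in the universe, \(U/C=\{X_{1},X_{2},\ldots,X_{m}\}\), and $R_{C}$ is the equivalence relation on $U$ induced by $C$. The equivalence relation matrix $M_{U}^{R_{C}} = (m_{ij})_{n \times n}$ is defined as: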
\[
m_{ij}
=\begin{cases}
1 & (u_{i},u_{j}) \in R_{C} \\
0 & (u_{i},u_{j}) \notin R_{C}
\end{cases}
\]
where $1 \leq i,j \leq n$.
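To make the definition concrete, the following sketch builds the equivalence relation matrix for the table in this section: $m_{ij}=1$ exactly when objects $u_i$ and $u_j$ take the same values on every chosen attribute. The names `ATTRS`, `ROWS` and `relation_matrix` are assumptions introduced for this illustration.

```python
ATTRS = ("a", "b", "c", "e", "f", "d")
# One tuple per object 1..9, in the column order of ATTRS.
ROWS = [
    (0, 1, 1, 1, 0, 1),
    (1, 1, 0, 1, 0, 1),
    (1, 0, 0, 0, 1, 0),
    (1, 1, 0, 1, 0, 1),
    (1, 0, 0, 0, 1, 0),
    (0, 1, 1, 1, 1, 0),
    (0, 1, 1, 1, 1, 0),
    (1, 0, 0, 1, 0, 1),
    (1, 0, 0, 1, 0, 0),
]


def relation_matrix(attrs):
    """Return the n x n 0/1 matrix of the equivalence relation induced by attrs."""
    idx = [ATTRS.index(x) for x in attrs]
    # Objects are equivalent when their value tuples on attrs coincide.
    keys = [tuple(row[i] for i in idx) for row in ROWS]
    return [[1 if ki == kj else 0 for kj in keys] for ki in keys]


M_C = relation_matrix(("a", "b", "c", "e", "f"))
for row in M_C:
    print(row)
```

The resulting matrix is symmetric with ones on the diagonal, as expected for an equivalence relation.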
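The matrix-based knowledge granularity is defined as follows:
Let $S=(U,A=C \bigcup D,V,f)$ be a decision information system and let $M_{U}^{R_{C}} = (m_{ij})_{n \times n}$ be its equivalence relation matrix. The matrix-based knowledge granularity of the condition attribute set $C$ is defined as: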
\[
GP_{U}(C)=\frac{\operatorname{sum}\left(M_{U}^{R_{C}}\right)}{|U|^{2}}=\overline{M_{U}^{R_{C}}}
\]
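where $\operatorname{sum}\left(M_{U}^{R_{C}}\right)$ is the number of $1$ entries in the equivalence relation matrix and $\overline{M_{U}^{R_{C}}}$ is the mean of all entries of the matrix.
Using the table above, we can compute $GP_{U}(C)$: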
\[
GP_{U}(C)=\overline{M_{U}^{R_{C}}}=\frac{1}{81}\times\operatorname{sum}(\left[\begin{array}{ccccccccc}
{1} & {0} & {0} & {0} & {0} & {0} & {0} & {0} & {0} \\
{0} & {1} & {0} & {1} & {0} & {0} & {0} & {0} & {0} \\
{0} & {0} & {1} & {0} & {1} & {0} & {0} & {0} & {0} \\
{0} & {1} & {0} & {1} & {0} & {0} & {0} & {0} & {0} \\
{0} & {0} & {1} & {0} & {1} & {0} & {0} & {0} & {0} \\
{0} & {0} & {0} & {0} & {0} & {1} & {1} & {0} & {0} \\
{0} & {0} & {0} & {0} & {0} & {1} & {1} & {0} & {0} \\
{0} & {0} & {0} & {0} & {0} & {0} & {0} & {1} & {1} \\
{0} & {0} & {0} & {0} & {0} & {0} & {0} & {1} & {1}
\end{array}\right])=\frac{17}{81}
\]
This agrees with the result obtained in the previous section.
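The value $17/81$ can be checked with a short sketch. It assumes the partition $U/C=\{\{1\},\{2,4\},\{3,5\},\{6,7\},\{8,9\}\}$, read off the table by grouping objects with identical values on $a,b,c,e,f$; the helper names `matrix_from_classes` and `gp` are hypothetical.

```python
from fractions import Fraction

# U/C read off the table (assumption stated in the text above).
CLASSES_C = [[1], [2, 4], [3, 5], [6, 7], [8, 9]]


def matrix_from_classes(classes, n):
    """Build the n x n equivalence relation matrix from a partition of 1..n."""
    M = [[0] * n for _ in range(n)]
    for block in classes:
        for i in block:
            for j in block:
                M[i - 1][j - 1] = 1
    return M


def gp(M):
    """Knowledge granularity: number of ones divided by |U|^2."""
    n = len(M)
    return Fraction(sum(map(sum, M)), n * n)


M_C = matrix_from_classes(CLASSES_C, 9)
print(gp(M_C))  # 17/81
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand computation.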
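Similarly, the relative knowledge granularity is defined as follows:
Let $S=(U,A=C \bigcup D,V,f)$ be a decision information system, and let $M_{U}^{R_{C}}$ and $M_{U}^{R_{C \bigcup D}}$ be equivalence relation matrices. The matrix-based relative knowledge granularity of the decision attribute set $D$ with respect to the condition attribute set $C$ is defined as:
\[
GP_{U}(D\mid C)=\overline{M_{U}^{R_{C}}}-\overline{M_{U}^{R_{C \bigcup D}}}
\]
From the table above, we can compute $GP_{U}(D \mid C)$:
\[
GP_{U}(D \mid C)=\overline{M_{U}^{R_{C}}}-\overline{M_{U}^{R_{C \bigcup D}}}
\]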
\[
=\frac{1}{81}\times\operatorname{sum}(\left[\begin{array}{ccccccccc}
{1} & {0} & {0} & {0} & {0} & {0} & {0} & {0} & {0} \\
{0} & {1} & {0} & {1} & {0} & {0} & {0} & {0} & {0} \\
{0} & {0} & {1} & {0} & {1} & {0} & {0} & {0} & {0} \\
{0} & {1} & {0} & {1} & {0} & {0} & {0} & {0} & {0} \\
{0} & {0} & {1} & {0} & {1} & {0} & {0} & {0} & {0} \\
{0} & {0} & {0} & {0} & {0} & {1} & {1} & {0} & {0} \\
{0} & {0} & {0} & {0} & {0} & {1} & {1} & {0} & {0} \\
{0} & {0} & {0} & {0} & {0} & {0} & {0} & {1} & {1} \\
{0} & {0} & {0} & {0} & {0} & {0} & {0} & {1} & {1}
\end{array}\right] - \left[\begin{array}{ccccccccc}
{1} & {0} & {0} & {0} & {0} & {0} & {0} & {0} & {0} \\
{0} & {1} & {0} & {1} & {0} & {0} & {0} & {0} & {0} \\
{0} & {0} & {1} & {0} & {1} & {0} & {0} & {0} & {0} \\
{0} & {1} & {0} & {1} & {0} & {0} & {0} & {0} & {0} \\
{0} & {0} & {1} & {0} & {1} & {0} & {0} & {0} & {0} \\
{0} & {0} & {0} & {0} & {0} & {1} & {1} & {0} & {0} \\
{0} & {0} & {0} & {0} & {0} & {1} & {1} & {0} & {0} \\
{0} & {0} & {0} & {0} & {0} & {0} & {0} & {1} & {0} \\
{0} & {0} & {0} & {0} & {0} & {0} & {0} & {0} & {1}
\end{array}\right]) =\frac{2}{81}
\]
This agrees with our earlier result.
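The relative knowledge granularity can also be computed straight from the table. The sketch below is one possible way to do it, with hypothetical names (`ATTRS`, `ROWS`, `relation_matrix`, `mean`, `relative_gp`): it builds both equivalence relation matrices and subtracts their means.

```python
from fractions import Fraction

ATTRS = ("a", "b", "c", "e", "f", "d")
# One tuple per object 1..9, in the column order of ATTRS.
ROWS = [
    (0, 1, 1, 1, 0, 1), (1, 1, 0, 1, 0, 1), (1, 0, 0, 0, 1, 0),
    (1, 1, 0, 1, 0, 1), (1, 0, 0, 0, 1, 0), (0, 1, 1, 1, 1, 0),
    (0, 1, 1, 1, 1, 0), (1, 0, 0, 1, 0, 1), (1, 0, 0, 1, 0, 0),
]


def relation_matrix(attrs):
    """0/1 matrix with m_ij = 1 iff objects i and j agree on all attrs."""
    idx = [ATTRS.index(x) for x in attrs]
    keys = [tuple(row[i] for i in idx) for row in ROWS]
    return [[1 if ki == kj else 0 for kj in keys] for ki in keys]


def mean(M):
    """Exact mean of all entries of a square matrix."""
    n = len(M)
    return Fraction(sum(map(sum, M)), n * n)


def relative_gp(cond, dec):
    """GP(D | C) = mean(M^{R_C}) - mean(M^{R_{C u D}})."""
    cond, dec = tuple(cond), tuple(dec)
    return mean(relation_matrix(cond)) - mean(relation_matrix(cond + dec))


C = ("a", "b", "c", "e", "f")
D = ("d",)
print(relative_gp(C, D))  # 2/81
```

Adding $d$ only splits the class $\{8,9\}$, so the two matrices differ in exactly two entries, which is where the numerator $2$ comes from.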
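Similarly, the matrix-based inner and outer attribute significance are defined as follows:
Inner attribute significance:
Let $S=(U,A=C \bigcup D,V,f)$ be a decision information system, $B\subseteq C$, and let $M_{U}^{R_{B}}$, $M_{U}^{R_{B-\{a\}}}$, $M_{U}^{R_{B \bigcup D}}$ and $M_{U}^{R_{(B-\{a\}) \bigcup D}}$ be equivalence relation matrices. For any $a \in B$, the matrix-based inner significance of attribute $a$ in $B$ relative to the decision attribute set $D$ is defined as: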
\[
\operatorname{Sig}_{U}^{inner }(a, B, D)=GP_{U}(D \mid B-\{a\})-GP_{U}(D \mid B)
\]
\[
=\{ GP_{U}(B-\{a\})-GP_{U}((B-\{a\}) \bigcup D) \}-\{GP_{U}(B)-GP_{U}(B \bigcup D) \}
\]
\[
=\overline{M_{U}^{R_{B-\{a \}}}}-\overline{M_{U}^{R_{(B -\{a\}) \bigcup D}}}-\overline{M_{U}^{R_{B}}}+\overline{M_{U}^{R_{B \bigcup D}}}
\]
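Outer attribute significance:
Let $S=(U,A=C \bigcup D,V,f)$ be a decision information system, $B\subseteq C$, and let $M_{U}^{R_{B}}$, $M_{U}^{R_{B \bigcup D}}$, $M_{U}^{R_{B \bigcup \{a\}}}$ and $M_{U}^{R_{(B \bigcup \{a\}) \bigcup D}}$ be equivalence relation matrices. For any $a \in (C-B)$, the matrix-based outer significance of attribute $a$ with respect to $B$ relative to the decision attribute set $D$ is defined as: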
\[
\operatorname{Sig}_{U}^{outer }(a, B, D)=GP_{U}(D \mid B)-GP_{U}(D \mid B \bigcup \{a\})
\]
\[
=\{ GP_{U}(B)-GP_{U}(B\bigcup D)\} - \{ GP_{U}(B \bigcup \{a\})-GP_{U}((B\bigcup \{a\}) \bigcup D) \}
\]
\[
=\overline{M_{U}^{R_{B}}}-\overline{M_{U}^{R_{B \bigcup D}}}-\overline{M_{U}^{R_{B \bigcup \{a \} }}}+\overline{M_{U}^{R_{(B \bigcup \{a\}) \bigcup D}}}
\]
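Both significance measures reduce to differences of matrix means, so they can be sketched on top of a single helper. The code below is an illustration under assumptions: the names `mean_matrix`, `rel_gp`, `sig_inner` and `sig_outer` are hypothetical, and the choices $B=C$ for the inner measure and $B=\{b\}$ for the outer measure are made only for this example.

```python
from fractions import Fraction

ATTRS = ("a", "b", "c", "e", "f", "d")
# One tuple per object 1..9, in the column order of ATTRS.
ROWS = [
    (0, 1, 1, 1, 0, 1), (1, 1, 0, 1, 0, 1), (1, 0, 0, 0, 1, 0),
    (1, 1, 0, 1, 0, 1), (1, 0, 0, 0, 1, 0), (0, 1, 1, 1, 1, 0),
    (0, 1, 1, 1, 1, 0), (1, 0, 0, 1, 0, 1), (1, 0, 0, 1, 0, 0),
]


def mean_matrix(attrs):
    """Mean of the equivalence relation matrix induced by attrs."""
    idx = [ATTRS.index(x) for x in attrs]
    keys = [tuple(row[i] for i in idx) for row in ROWS]
    ones = sum(1 for ki in keys for kj in keys if ki == kj)
    return Fraction(ones, len(ROWS) ** 2)


def rel_gp(B, D=("d",)):
    """GP(D | B) = mean(M^{R_B}) - mean(M^{R_{B u D}})."""
    B = tuple(B)
    return mean_matrix(B) - mean_matrix(B + tuple(D))


def sig_inner(a, B, D=("d",)):
    """Sig_inner(a, B, D) = GP(D | B - {a}) - GP(D | B)."""
    rest = tuple(x for x in B if x != a)
    return rel_gp(rest, D) - rel_gp(B, D)


def sig_outer(a, B, D=("d",)):
    """Sig_outer(a, B, D) = GP(D | B) - GP(D | B u {a})."""
    return rel_gp(B, D) - rel_gp(tuple(B) + (a,), D)


C = ("a", "b", "c", "e", "f")
inner = {a: sig_inner(a, C) for a in C}
for a in C:
    print(a, inner[a])
print(sig_outer("f", ("b",)))  # 16/81
```

On this table only $b$ and $f$ have non-zero inner significance ($4/81$ each): removing any of $a$, $c$ or $e$ from $C$ leaves the partition unchanged.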
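For the example in the previous section, the matrix representation gives the same results. However, the matrix-based approach may not be a good choice for large datasets, since the equivalence relation matrix has $|U|^{2}$ entries.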
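One way to see why the explicit matrix can be avoided: an equivalence class with $k$ objects contributes exactly $k^{2}$ ones to the matrix, so the mean can be obtained from class sizes alone. The sketch below illustrates this observation; it is not a method taken from the cited thesis, and the name `gp_from_counts` is hypothetical.

```python
from collections import Counter
from fractions import Fraction

ATTRS = ("a", "b", "c", "e", "f", "d")
# One tuple per object 1..9, in the column order of ATTRS.
ROWS = [
    (0, 1, 1, 1, 0, 1), (1, 1, 0, 1, 0, 1), (1, 0, 0, 0, 1, 0),
    (1, 1, 0, 1, 0, 1), (1, 0, 0, 0, 1, 0), (0, 1, 1, 1, 1, 0),
    (0, 1, 1, 1, 1, 0), (1, 0, 0, 1, 0, 1), (1, 0, 0, 1, 0, 0),
]


def gp_from_counts(attrs):
    """Knowledge granularity from class sizes, without building the n x n matrix."""
    idx = [ATTRS.index(x) for x in attrs]
    # Count how many objects share each value tuple (one count per class).
    sizes = Counter(tuple(row[i] for i in idx) for row in ROWS).values()
    n = len(ROWS)
    return Fraction(sum(k * k for k in sizes), n * n)


print(gp_from_counts(("a", "b", "c", "e", "f")))  # 17/81
```

This needs one pass over the objects and memory proportional to the number of classes, rather than $|U|^{2}$ matrix entries.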
References:
- 景运革. 基于知识粒度的动态属性约简算法研究[D].西南交通大学,2017.