离散数学与结构课程笔记

设 $a,b$ 是代表集合的字母，规定 $(a \in b)$，$(a = b)$ 是合式公式 (well-formed formula)。$\phi$ 是合式公式，记作 $\phi \in \operatorname{WFF}$。
如果 $\phi \in \operatorname{WFF}$，那么规定 $(\lnot \phi) \in \operatorname{WFF}$，
如果 $\phi,\psi \in \operatorname{WFF}$，那么规定 $(\phi \land \psi),(\phi \lor \psi) \in \operatorname{WFF}$。
如果 $\phi,\psi \in \operatorname{WFF}$，那么规定 $(\phi \Rightarrow \psi),(\phi \Leftrightarrow \psi) \in \operatorname{WFF}$。
如果 $\phi(x) \in \operatorname{WFF}$，那么规定 $(\forall x,\phi(x)),(\exist x,\phi(x)) \in \operatorname{WFF}$。
其余均不是合式公式。

Zermelo–Fraenkel 集合论

外延公理：两个集合相等，当且仅当它们的所有元素都相同。
$$
(\forall X,(\forall Y,((\forall x,((x \in X) \Leftrightarrow (x \in Y))) \Leftrightarrow (X = Y))))
$$
意义：规定了 $=,\in$ 的含义，保证了集合的唯一性。
分离公理模式：对任何集合 $Y$，可以通过不包含 $\forall x,\exist x, X$ 的合式公式 $\phi(x)$，分离出一个新的集合 $X \coloneqq {x \in Y : \phi(x)}$，称为 $Y$ 的子集，记作 $X \sube Y$。
$$
(\forall Y,(\exist X,(\forall x,((x \in X)\Leftrightarrow((x \in Y)\land \phi(x))))))
$$
意义：避开了罗素悖论，定义了子集的含义。
定理：不存在万有集，即包括一切集合的集合。
证明：若存在，设为 $W$。由分离公理模式，存在集合 ${S \in W: (S \notin S)}$，这会导致罗素悖论。
注记：将 $((\lnot Y) \in X)$ 记作 $(Y \notin X)$。
引理：存在空集，记作 $\varnothing$。
$$
(\exist X, (\forall x, (x \notin X)))
$$
说明：事实上，我们首先需要一个公理来保证真的存在一个集合，即无穷公理。我们先承认真的存在一个集合，设为 $W$，则由分离公理模式，存在集合 ${S \in W: ((\lnot S) = S)}$，即为 $\varnothing$。
配对公理：对任何集合 $X,Y$，存在集合 ${X,Y}$。
$$
(\forall X,(\forall Y, (\exist Z, (\forall u, ((u \in Z)\Leftrightarrow((u = X)\or(u = Y)))))))
$$
意义：可以导出无序对和有序对的概念。
并集公理：对任何集合（集族）$I$，存在集合 $A$，使得 $A$ 中的元素恰为 $I$ 中一切集合的一切元素，记作 $A = \bigcup I$。
$$
\forall I,(\exist A,(\forall x,((x \in A)\Leftrightarrow(\exist i,((i \in I)\land(x \in i))))))
$$
结合配对公理可以进行取并操作，将 $\bigcup{A,B}$ 记作 $A \cup B$。
命题：对任何集合（集族）$I$，由分离公理模式，存在集合 $X \coloneqq {x : (\forall i,((i \in I)\Rightarrow(x \in i)))}$，记作 $X = \bigcap I$。
结合配对公理可以进行取交操作，将 $\bigcap{A,B}$ 记作 $A \cap B$。
类似地可定义不交并 $A \sqcup B$、差集 $A \setminus B$、对称差 $A \oplus B$ 等常用的集合操作。
设 $x,y$ 是集合，定义有序对 $(x,y) \coloneqq {{x},{x,y}}$。
幂集公理：对任何集合 $X$，存在其一切子集组成的集合，称为 $X$ 的幂集，记作 $2^X$。
$$
(\forall X,(\exist Y,(\forall Z,((Z \sube X)\Leftrightarrow(Z \in Y)))))
$$
命题：设 $X,Y$ 是集合，则存在笛卡尔积 $X \times Y \coloneqq {(x,y):x \in X,y \in Y}$。
证明：$\forall x \in X, y \in Y$，
$$
{x},{x,y} \sube X \cup Y \
\Rightarrow {x},{x,y} \in 2^{X \cup Y} \
\Rightarrow (x,y) = {{x},{x,y}} \sube 2^{2^{X \cup Y}} \
\Rightarrow X \times Y \coloneqq {(x,y):x \in X,y \in Y} \sube 2^{2^{X \cup Y}} \
$$
然后用分离公理。
称 $R \sube X \times Y$ 为集合 $X,Y$ 之间的一个关系，将 $(x,y) \in R$ 记作 $xRy$。
特别地，$R \sube X \times X$ 称为集合 $X$ 上的一个关系。
映射是指集合 $X,Y$ 之间的一个关系 $R$，满足 $\forall x \in X$，$\exist! y \in Y$ 满足 $xRy$，记作 $f:X \rightarrow Y$，$x \mapsto y$ 或 $f(x) = y$。
注记：$\exist!$ 表示存在唯一，可用轻易合式公式表示。
选择公理：对任何不包含空集的（无限）集合（集族），均存在一个选择映射，它从每个集合中选择一个元素。
$$
(\forall X,((\varnothing \notin X) \Rightarrow (\exist f:X \rightarrow \bigcup X, (\forall A \in X,(f(A) \in A)))))
$$
正则公理：对任何非空集合 $X$，都存在一个 $y \in X$ 使得 $y \cap X = \varnothing$。
$$
(\forall X,((X \ne \varnothing)\Rightarrow(\exist y,((y \in X) \land (y \cap X = \varnothing)))))
$$
推论：不存在以自身为元素之一的集合。
证明：假设存在 $X \in X$，由配对公理可得 $X \in {X}$，因此 $X \in X \cap {X}$，但由正则公理 $X \cap {X} = \varnothing$，矛盾！
替换公理模式：对任何集合 $X$，如果 $\forall x \in X$，满足合式公式 $\phi(x,y,p)$ 的 $y$ 唯一，那么这些 $y$ 构成集合。
$$
(\forall X,(\forall x,((x \in X)\Rightarrow(\exist!y,\phi(x,y,p)))) \Rightarrow (\exist Y,(\forall y,((y \in Y)\Leftrightarrow(\exist x,((x \in X)\land\phi(x,y,p)))))))
$$
意义：任何可定义函数下集合的像是集合。
无穷公理：存在归纳集
$$
(\exist X,((\varnothing \in X) \land (\forall x,((x \in X) \Rightarrow (x^+ \in X)))))
$$
其中后继 $x^+ \coloneqq x \cup {x}$。
定理：设 $X$ 是归纳集，则 $X$ 所有归纳子集的交良定且唯一，称为自然数集，记作 $\omega$ 或 $\N$。
$$
0 = \varnothing \
1 = {\varnothing} = {0} \
2 = {\varnothing, {\varnothing}} = {0,1} \
3 = {\varnothing, {\varnothing}, {\varnothing, {\varnothing}}} = {0,1,2} \
\cdots
$$
Peano 公理：自然数集 $\omega$ 和 $0$ 满足：
- $0$ 是自然数；
- 每个自然数都有唯一后继；
- $0$ 不是任何自然数的后继；
- 不同自然数有不同后继；
- $\omega$ 满足归纳原理：设集合 $A \sube \omega$ 满足 $0 \in A$，且 $\forall n \in \omega$，$n \in A \Rightarrow n^+ \in A$，那么 $A = \omega$。

序

良序原理：自然数集的任何一个非空子集都有最小数，即它不大于任何一个该子集中的自然数。
设 $R$ 是集合 $X$ 上的一个关系，若它满足自反性、反对称性和传递性，就称它是一个偏序，记作 $\le$，此时称 $X$ 为偏序集。若 $x \le y \land x \ne y$，记作 $x < y$。
若 $X$ 中所有元素对都可比，则 $\le$ 成为全序，称 $X$ 为全序集。偏序集的全序子集称为链，而元素两两不可比的子集称为反链。
集合 $X$ 上的一个序称为良序，若其任意一个非空子集都有最小元素。

基数

设集合 $A,B$，若存在一个双射 $f:A\to B$，则称 $A,B$ 对等，记作 $A \sim B$。如果 $f$ 只是单射，则称 $A$ 受制于 $B$，记作 $A \preceq B$。若 $A \preceq B \land A \not\sim B$，记作 $A \prec B$。
能够和某个自然数建立对等关系的集合称为有限集，否则称为无限集。
引理：集合 $A$ 为无限集当且仅当 $A$ 与其某个真子集对等。
证明：充分性显然，下证必要性。
显然 $A$ 非空，故取 $a_0 \in A$。依选择公理，定义映射 $f:\N \to A$，让 $f(0) = a_0$，$f(i)(i>0)$ 为集合 $A \setminus {f(0),f(1),\cdots,f(i-1)}$ 中任意一个元素 $a_i$。
考虑映射 $g:A\to A\setminus {a_0}$：
$$
g(x) = \begin{cases}a_{i+1}\quad\exist i \in \N, x=f(i) \ x \quad \forall i \in \N, x \ne f(i)\end{cases}
$$
则 $g$ 是双射。
推论：$\N$ 是最小的无限集。
推论：$\N \sim \Z \sim \Z \times \Z \sim \Q$。
关键：“蛇形游走”证明 $\Z \sim \Z \times \Z$。
推论：$\R \sim (0,1)\sim (0,1] \sim [0,1]$。
关键：考虑函数 $f(x) = \frac{\arctan x}{\pi} + \frac12$ 证明 $\R \sim (0,1)$。
引理：设集合 $X$ 非空，若映射 $f:2^X \to 2^X$ 为单调映射，即满足 $A \sube B \sube X \Rightarrow f(A) \sube f(B)$，则 $\exist T \sube X$，$f(T) = T$。
证明：取 $T = \bigcup {A \sube X:A \sube f(A)}$，则
$$
T \sube \bigcup {f(A):A \sube f(A)}
$$
又 $\forall A \in {A \sube X:A \sube f(A)}$，有 $A \sube T$，故 $f(A) \sube f(T)$，则
$$
\bigcup {f(A):A \sube f(A)} \sube f(T)
$$
故 $T \sube f(T)$。
再作用一次 $f$ 得到 $f(T) \sube f(f(T))$，则 $f(T)$ 也是满足条件的一个 $A$，故 $f(T) \sube T$。
故 $f(T) = T$。
Schröder–Bernstein 定理：若集合 $X$ 与集合 $Y$ 的一个子集对等，且集合 $Y$ 与集合 $X$ 的一个子集对等，则 $X \sim Y$。
证明：设有单射 $f:X \to Y$，$g:Y \to X$。考虑映射 $F:2^X \to 2^X$，$A \mapsto X \setminus g(Y \setminus f(A))$，易证其单调，于是 $\exist T \sube X$，$F(T) = T$。
考虑映射 $h:X \to Y$：
$$
h(x) = \begin{cases} f(x) \quad x \in T \ g^{-1}(x) \quad x \notin T\end{cases}
$$
则 $h$ 是双射。
Cantor 定理：若 $A \ne \varnothing$，则 $A \not\sim 2^A$。
证明：假设存在双射 $f:A \to 2^A$，考虑集合 $S = {a \in A: a \notin f(a)} \in 2^A$，则 $\exist s \in A$，$f(s) = S$，但 $s \in S \Leftrightarrow s \notin f(s) = S$，矛盾！
集合 $A$ 的基数记作 $\operatorname{card}(A)$。
若某集合和自然数存在一一对应，就称它可数，其基数为可数基数 $\omega$。
定理：$[0,1]$ 不是可数的。
关键：Cantor 对角线法。
若某集合和 $(0, 1)$ 存在一一对应，其基数为连续基数 $\mathfrak c$。
定理：$\operatorname{card}(2^\N) = \mathfrak c$。
关键：证明 $(0,1] \preceq 2^\N$，考虑实数的二进制小数表示（取字典序较大者为唯一表示）；证明 $2^\N \preceq (0,1]$，考虑实数的三进制小数表示（二进制小数表示则不为单射）。
连续统假设：$\omega,\mathfrak c$ 之间不存在其他基数。

数理逻辑

Propositional Logic

Alphabet:
- proposition symbols / atomic propositions: $p_0,p_1,p_2,\cdots$ and $\bot$,
- not: $(\lnot)$,
- connectives $\square$: $\land,(\lor),\to,(\leftrightarrow)$,
- auxiliary symbols: $(,)$,
- abbreviations:
  $$
  \lnot \phi \coloneqq (\phi \to\bot) \ \phi \lor \psi \coloneqq \lnot(\lnot \phi \land \lnot \psi) \ \phi \leftrightarrow \psi \coloneqq (\phi \to \psi) \and (\psi \to \phi)
  $$
The set $PROP$ of propositions is the smallest set with the properties:
- $\forall i \in \N$, $p_i \in PROP$,
- $\bot \in PROP$,
- $\phi \in PROP \Rightarrow \lnot \phi \in PROP$,
- $\phi,\psi \in PROP \Rightarrow (\phi\square\psi) \in PROP$.

Semantics

A mapping $v : PROP \rightarrow {0,1}$ is a valuation if
$$
v(\bot) = 0 \
v(\phi \land \psi) = \min(v(\phi), v(\psi)) \
v(\phi \lor \psi) = \max(v(\phi), v(\psi)) \
v(\phi \to \psi) = 0 \Leftrightarrow v(\phi) = 1 \operatorname{and} v(\psi) = 0 \
$$
$\vDash \phi$ means $v(\phi) \equiv 1$ for all valuations $v$.
Let $\Gamma$ be a set of propositions, $\Gamma \vDash \varphi$ means $(\forall \psi \in \Gamma, v(\psi) = 1) \Rightarrow v(\phi) = 1$ for all valuations $v$.

Natural Deduction

Derivations:
$$
\frac{\phi \quad \psi}{\phi \land \psi} (\land I)\quad \frac{\phi \land \psi}{\phi} (\land E_1) \quad \frac{\phi \land \psi}{\psi} (\land E_2) \
\frac{[\phi] \cdots \psi}{\phi \to \psi}(\to I) \quad \frac{\phi \quad \phi \to \psi}{\psi} (\to E) \
\frac{\bot}{\phi} (\bot) \quad \frac{[\neg \phi] \cdots \bot}{\phi} (\text {RAA})
$$
Let $\Gamma$ be a set of propositions, $\Gamma \vdash \phi$ means $\phi$ is derivable from $\Gamma$.
if $\Gamma = \varnothing$, we write $\vdash \phi$, which means $\phi$ is a theorem.

Completeness

Soundness: $\Gamma \vdash \phi \Rightarrow \Gamma \vDash \phi$.
Completeness Theorem: $\Gamma \vdash \phi \Leftrightarrow \Gamma \vDash \phi$.

Predicate Logic

Alphabet:
- predicate symbols: $P_1,\cdots,P_n$,
- function symbols: $f_1,\cdots,f_m$,
- constant symbols: $c_i(\forall i \in I)$,
- variables: $x_i(\forall i \in \N)$,
- bot: $\bot$,
- not: $(\lnot)$,
- connectives $\square$: $\land,(\lor),\to,(\leftrightarrow)$,
- quantifiers: $\forall,(\exist)$,
- auxiliary symbols: $(,)$,
- abbreviations:
  $$
  \lnot \phi \coloneqq (\phi \to\bot) \ \phi \lor \psi \coloneqq \lnot(\lnot \phi \land \lnot \psi) \ \phi \leftrightarrow \psi \coloneqq (\phi \to \psi) \and (\psi \to \phi) \
  \exist x_i, \phi \coloneqq \lnot(\forall x_i, \lnot \phi)
  $$
The set $TERM$ of terms is the smallest set with the properties:
- $\forall i \in I$, $c_i \in TERM$,
- $\forall i \in \N$, $x_i \in TERM$,
- $\forall i = 1,\cdots m$, $t_1,\cdots,t_{a_i} \in TERM \Rightarrow f_i(t_1,\cdots,t_{a_i}) \in TERM$.
The set $FORM$ of formulas is the smallest set with the properties:
- $\forall i = 1,\cdots,n$, $t_1,\cdots,t_{r_i} \in TERM \Rightarrow P_i(t_1,\cdots,t_{r_i}) \in FORM$,
- $\bot \in FORM$,
- $\phi \in FORM\Rightarrow \lnot \phi \in FORM$,
- $\phi,\psi \in FORM\Rightarrow (\phi\square\psi) \in FORM$.
- $\phi \in FORM \Rightarrow (\forall x_i, \phi),(\exist x_i, \phi) \in FORM$.
Free variables: $(\forall x_i, \phi)$ contains $x_i$ bound.
Substitution:
$$
y[t/x] \coloneqq \begin{cases}y\quad y \ne x \ z \quad y = x\end{cases}
$$

Semantics

A mapping $v : PROP \rightarrow {0,1}$:
$$
v(\bot) = 0 \
v(\phi \land \psi) = \min(v(\phi), v(\psi)) \
v(\phi \lor \psi) = \max(v(\phi), v(\psi)) \
v(\phi \to \psi) = 0 \Leftrightarrow v(\phi) = 1 \operatorname{and} v(\psi) = 0 \
v(\forall x,\phi(x)) = \min_x(\phi(x)) \
v(\exist x, \phi(x)) = \max_x(\phi(x))
$$

Natural Deduction

We adopt all the rules of propositional logic and we add:
$$
\frac{\phi(x)}{\forall x,\phi(x)}(\forall I)\quad \text{where $x$ is bound.}
\
\frac{\forall x,\phi(x)}{\phi(t)}(\forall E)\quad \text{where $t$ is free.}
$$

一致性与完备性

Godel 第一定理：任何一致的形式系统，只要蕴涵皮亚诺算术公理，就可以在其中构造在体系中不能被证明的真命题，因此通过推理演绎不能得到所有真命题（即体系是不完备的）。
Godel 第二定理：任何逻辑自洽的形式系统，只要蕴涵皮亚诺算术公理，它就不能用于证明其本身的一致性。

代数

设 $A$ 是非空集合，映射 $A \times A \to A$ 称为 $A$ 上的一个代数运算。

群

对非空集合 $G$，若 $G$ 上代数运算满足结合律，就称 $G$ 是半群。
对半群 $G$，若 $G$ 中有幺元，就称 $G$ 是幺半群。
对幺半群 $G$，若 $G$ 中每个元素都有逆元，就称 $G$ 是群。
注记：这意味着群中有消去律。
对群 $G$，若 $G$ 上代数运算满足交换律，就称 $G$ 是交换群（或 Abel 群）。
对有限群 $G$，其元素个数称为阶，记作 $|G|$。

子群

对群 $G$，若 $G$ 的非空子集 $H$ 在 $G$ 上的代数运算下也构成一个群，就称 $H$ 是 $G$ 的子群，记作 $H \le G$。
$$
H \le G \Leftrightarrow \forall a,b \in H, ab^{-1} \in H
$$
对群 $G$，若 $G$ 的有限子集 $H$ 满足 $\forall a,b \in H, ab \in H$，则 $H \le G$。
对群 $G$，$G$ 中所有包含非空子集 $S$ 的子群中最小的称为 $S$ 生成的子群，记作 $\lang S\rang$。
$$
\lang S \rang = \left{\prod_{i=1}^{|S|} s_i^{m_i}: s_i \in S,m_i \in \Z\right}
$$
对群 $G$，设 $g \in G$，称最小的使 $g^n=1$ 的正整数 $n$ 为 $g$ 的阶，记为 $o(g)$。如果不存在这样的 $n$，则 $o(g) = \infty$。
称 $\lang g \rang$ 为 $g$ 生成的循环群。如果 $o(g)$ 有限，则
$$
\lang g \rang = {1,g,g^2,\cdots,g^{o(g)-1}}
$$
对群 $G$，$H \le G$，对 $a \in G$，称 $aH$ 为 $H$ 的一个左陪集，$Ha$ 为 $H$ 的一个右陪集。
$$
aH = bH \Leftrightarrow ab^{-1} \in H
$$
陪集分解：$H$ 的任意两个左（右）陪集相等或不交，$G$ 可以表示为若干个不交的左（右）陪集的并。
Lagrange 定理：对有限群 $G$，$H \le G$，则 $|H|\mid |G|$。
称 $H$ 在 $G$ 中的不同陪集个数为 $H$ 的指数，记作 $[G:H]$，则 $|G| = |H|[G:H]$。
定理：对素数 $p$，$p$ 的既约剩余系 $\Z_p^$ 是循环群。
证明：记 $g(d)$ 表示 $\Z_p^$ 中 $d$ 阶元个数。
若存在 $d$ 阶元 $a$，则
$$
\lang a \rang = {1,a,a^2,\cdots,a^{d-1}}
$$
又 $\lang a \rang$ 中 $d$ 个元素均满足 $x^d=1$，而 $x^d=1$ 的解至多只有 $d$ 个，故 $\lang a \rang$ 构成了 $x^d=1$ 的解集，因此所有 $d$ 阶元均在 $\lang a \rang$ 中，得到 $g(d) = \varphi(d)$，其中 $\varphi$ 是欧拉函数。
于是 $g(d)$ 要么为 $\varphi(d)$，要么为 $0$。由 Lagrange 定理，只有当 $d$ 为 $|\Z_p^|=p-1$ 的因数时，$g(d)$ 才可能不为 $0$。因此有
$$
p-1 = \sum_{d=1}^{p-1} g(d) = \sum_{d \mid p-1} g(d) \le \sum_{d \mid p-1} \varphi(d) = p – 1
$$
故当 $d | p-1$ 时，$g(d) = \varphi(d)$。特别地，$g(p-1) = \varphi(p-1)$，故存在 $p-1$ 阶元（称为原根），所以 $\Z_p^$ 是循环群。
Wilson 定理：对素数 $p$，$(p-1)! \equiv -1 \pmod p$。
证明：由 $\Z_p^*$ 是循环群，有
$$
(p-1)! = \prod_{i=0}^{p-2} g^i = g^{\frac{p-1}2}
$$
注意到 $g^{\frac{p-1}2}$ 是二阶元，而二阶元只有 $\varphi(2) = 1$ 个，它就是 $-1$。
对群 $G$，$C = {c \in G: \forall g \in G, cg = gc}$ 称为 $G$ 的中心，$C \le G$。
对 $x \in G$，$C_x = {gxg^{-1}:g \in G}$ 称为 $x$ 的共轭类。
有限群的类方程：$|G| = |C| + \sum |C_i|$，其中 $C_i$ 是所有不止一个元素的共轭类。

同态与同构

对群 $G,H$，映射 $\sigma:G\to H$ 称为同态，若
$$
\forall x,y \in G, \sigma(x)\sigma(y) = \sigma(xy)
$$
如果同态 $\sigma$ 是双射，就称它是同构，记作 $G \cong H$。
对同态 $\sigma : G \to H$，同态核 $\ker \sigma \coloneqq {g \in G: \sigma(g) = 1_H}$。
称 $\ker \sigma$ 的所有陪集构成的群为 $G$ 模 $\ker \sigma$ 的商群，记作 $G/\ker \sigma$。
对群 $G$，$H \le G$，称 $H$ 为正规子群，若
$$
\forall a \in G, aH = Ha
$$
记作 $H \lhd G$。
$$
H \lhd G \Leftrightarrow \forall g \in G, h \in H, ghg^{-1} \in H
$$
考虑典范同态 $\sigma : G \to G/H : g \mapsto gH$，则 $\ker \sigma = H$。
同态基本定理：对同态 $\sigma : G \to H$，$G / \ker \sigma \cong \operatorname{im} \sigma$。
循环群分类定理：无限循环群都同构于 $\Z$，$n$ 阶循环群都同构于 $\Z_n$。

直积

对群 $G_1,G_2$，$G = G_1 \times G_2$ 在运算 $(g_1,g_2)(g_1^\prime, g_2^\prime) \coloneqq (g_1g_1^\prime,g_2g_2^\prime)$ 下构成群，称 $G$ 为 $G_1,G_2$ 的外直积。
对群 $G$，$G_1,G_2$ 是 $G$ 的正规子群，且 $G = G_1G_2$，$G_1 \cap G_2 = {1}$，则称 $G$ 为 $G_1,G_2$ 的内直积。
外直积和内直积在同构意义下是一致的。

环

对非空集合 $R$，若 $R$ 上有两种代数运算加法 $+$ 和乘法 $\cdot$，且满足：
- $(R,+)$ 构成交换群；
- $(R,\cdot)$ 构成半群；
- 乘法分配律：$a(b+c) = ab+ac$，$(b+c)a = ba+ca$；
就称 $R$ 是环。
对环 $R$，若 $(R \setminus {0}, \cdot)$ 构成群，就称 $R$ 为除环。
对环 $R$，若乘法是可交换的，就称 $R$ 为交换环。
交换除环称为域。
无零因子的交换环称为整环。域一定是整环，有限整环是域。
对环 $R$，若 $R$ 的非空子集 $S$ 在 $R$ 上的代数运算下也构成一个环，就称 $S$ 是 $R$ 的子环。
$S$ 是 $R$ 的子环当且仅当 $\forall x,y \in S, x – y \in S, xy \in S$。

理想

对环 $R$，称加法子群 $S \sube R$ 为理想，若 $\forall r \in R$，$rS,Sr \sube S$。
对环 $R$，$R$ 中所有含 $a \in R$ 的最小理想称为 $a$ 生成的主理想，记作 $(a)$。
$$
(a) = \left{ \sum_{i=1}^k r_ias_i:r_i,s_i \in R, k \in \N\right}
$$
若环 $R$ 的理想都是主理想，就称 $R$ 是主理想环。
$\Z$ 是主理想整环。
对环 $R$，$I$ 是 $R$ 的理想，称 $I$ 的所有加法陪集构成的环为 $R$ 模 $I$ 的商环，记作 $R/I$。
考虑典范同态 $\sigma : R \to R/I : r \mapsto r + I$，则 $\ker \sigma = I$。
对环 $R$，$I,J$ 是 $R$ 的理想，定义理想的乘法
$$
IJ \coloneqq \left{ \sum_{i=1}^k a_ib_i:a_i \in I, b_i \in J, k \in \N \right}
$$
则 $IJ \sube I \cap J \sube I + J$ 且均为 $R$ 的理想。

唯一分解整环 (UFD)，主理想整环 (PID)，欧几里得整环 (ED)

$\Z(\sqrt{-5}) \coloneqq {m+n\sqrt{-5}:m,n \in \Z }$ 不是唯一分解整环。考虑
$$
9=3\cdot 3 = (2 + \sqrt{-5})(2-\sqrt{-5})
$$
对整环 $R$，$R$ 是唯一分解整环当且仅当 $R$ 满足因子链条件和素性条件。
- 因子链条件：$R$ 中不存在无限序列 $a_1,a_2,\cdots$ 使得 $a_{i+1}$ 是 $a_i$ 的真因子。
- 素性条件：$R$ 中的任何不可约元 $p$ 满足 $(p \mid ab) \Rightarrow (p \mid a) \or (p \mid b)$。
对整环 $R$，若存在 $\delta:R \setminus {0} \to \N$ 满足：
- 若 $a,b \in R$ 且 $b \ne 0$，则 $\exist q,r \in R$ 使得 $a = bq+r$，且或者 $r = 0$ 或者 $\delta(r) < \delta(b)$；
- 若 $a,b \in R \setminus {0}$，则 $\delta(a) \le \delta(ab)$；
则称 $R$ 是欧几里得整环。
定理：$\text{ED} \Rightarrow \text{PID} \Rightarrow \text{UFD}$。

有限域

对域 $F$，使得 $n \times 1 = 0$ 的最小正整数 $n$ 称为域的特征，记为 $\operatorname{char}(F)$。特别地，如果不存在这样的正整数，则 $\operatorname{char}(F) = 0$。
域的特征要么是 $0$，$\Q$ 是最小的特征为 $0$ 的域；要么是素数 $p$，$\Z_p$ 是最小的特征为 $p$ 的域。$\Q$ 和 $\Z_p$ 统称素域。
定理：对域 $F$，多项式环 $F[x]$ 是主理想整环。
关键：取 $F[x]$ 的非平凡理想 $I$ 中次数最小的非零多项式 $g(x)$，则 $I = (g(x))$。
定理：对域 $F$，$F[x]/(f(x))$ 是域当且仅当 $f(x)$ 是 $F[x]$ 中的不可约多项式，且有 $|F[x]/(f(x))| = |F|^{\deg f}$。
关键：$f(x)$ 可约可推出 $F[x]/(f(x))$ 非整环，故必要性成立。由 Bézout 定理可找到非零元的逆元，故充分性成立。$F[x]/(f(x))$ 的每个等价类都可唯一地用一个次数小于 $\deg f$ 的多项式作为代表元，故等式成立。
定理：存在且在同构意义下仅存在唯一阶为 $p^n$ 的有限域，其中 $p$ 为素数，$n \in \N$；不存在其他阶的有限域。

Information Theory

Entropy

Definition 1 (Entropy).
$$
H(X)= \mathbb E\left(\log \frac{1}{P(X)}\right) = \sum_{x \in \mathcal X} P(x)\log \frac{1}{P(x)}.
$$
Example (Bernoulli and Geometric distribution).

Let $X \sim \operatorname{Bernoulli}(p)$, with $P(1) = p$ and $P(0) = 1-p$. Then,
$$
h(p) \triangleq H(X) = p\log\frac 1p + (1-p) \log \frac 1{1-p}.
$$
Let $X$ be geometrically distributed, with $P(i) = p(1-p)^i, i = 0,1,\cdots$. Then,
$$
H(X) = \mathbb E\left(\log \frac 1{p(1-p)^X}\right) = \log \frac 1p + \mathbb E(X) \log \frac 1{1-p} = \log \frac 1p + \left(\frac 1p – 1\right) \log \frac 1{1-p} = \frac{h(p)}p.
$$

Definition 2 (Joint entropy).
$$
H(X^n)= H(X_1,X_2,\cdots,X_n)=\mathbb E\left(\log \frac{1}{P(X_1,X_2,\cdots,X_n)}\right) = \sum_{{x_1,x_2,\cdots,x_n} \in \mathcal X} P(x_1,x_2,\cdots,x_n)\log \frac{1}{P(x_1,x_2,\cdots,x_n)}.
$$
Definition 3 (Conditional entropy).
$$
H(X|Y)= \mathbb E\left(\log \frac{1}{P(X|Y)}\right) = \sum_{{x,y} \in \mathcal X} P(x,y)\log \frac{1}{P(x|y)}.
$$
Theorem 4 (Jensen’s inequality). Let $f$ be a convex function. Then,
$$
f(\mathbb E(X)) \le \mathbb E(f(X)).
$$
Theorem 5 (Properties of entropy).

Positivity:
$$
H(X) \ge 0,
$$
with equality iff $X$ is a constant.
Uniform distribution maximizes entropy: For finite $\mathcal X$, $H(X) \le \log |\mathcal X|$, with equality iff $X$ is uniform on $\mathcal X$.
Proof. Notice that $x \mapsto \log x$ is concave. Then apply Jensen’s inequality,
$$
H(X)= \mathbb E\left(\log \frac{1}{P(X)}\right) \le \log\mathbb E \left(\frac{1}{P(X)}\right) = \log |\mathcal X|.
$$
Conditioning reduces entropy: $H(X|Y) \le H(X)$, with equality iff $X$ and $Y$ are independent.
Proof. Notice that $x \mapsto x \log \frac 1x$ is concave. Then apply Jensen’s inequality,
$$
H(X|Y) = \sum_{x \in \mathcal X} \mathbb E_Y \left(P(x|Y) \log \frac{1}{P(x|Y)}\right) \le \sum_{x \in \mathcal X} P(x) \log \frac{1}{P(x)} = H(x).
$$
Chain rule:
$$
H(X,Y) = H(X) + H(Y|X) \le H(X) + H(Y); \quad
H(X^n) = \sum_{i=1}^n H(X_i|X^{i-1}) \le \sum_{i=1}^nH(X_i).
$$
Proof.
$$
H(X,Y) = \mathbb E\left(\log \frac{1}{P(X,Y)}\right) = \mathbb E\left(\log \frac{1}{P(X)P(Y|X)}\right) = \mathbb E\left(\log \frac{1}{P(X)}\right) + \mathbb E\left(\log \frac{1}{P(Y|X)}\right) = H(X) +H(Y|X).
$$

Divergence (Relative entropy)

Definition 1 (Kullback-Leibler (KL) Divergence).
$$
D(P | Q)=\sum_{x \in \mathcal{X}} P(x) \log \frac{P(x)}{Q(x)}= \mathbb E_{X\sim P(X)}\left(\log \frac{P(X)}{Q(X)}\right)= \mathbb E_{X\sim Q(X)}\left(\frac{P(X)}{Q(X)}\log \frac{P(X)}{Q(X)}\right).
$$

Example (Binary divergence). Consider $P = \operatorname{Bernoulli}(p)$ and $Q = \operatorname{Bernoulli}(q)$. Then,
$$
d(p|q) \triangleq D(P | Q) = p \log \frac pq + (1-p) \log \frac{1-p}{1-q}.
$$
Theorem 2 (Entropy vs Divergence). For finite $\mathcal X$,
$$
H(P) = \log |\mathcal X| – D(P|U_{\mathcal X}),
$$
where $U_{\mathcal X}$ is the uniform distribution on $\mathcal X$.

Proof.
$$
D(P|U_{\mathcal X}) = \mathbb E_{X\sim P(X)}\left(\log \frac{P(X)}{U_{\mathcal X}(X)}\right) = \mathbb E_{X\sim P(X)}\left(\log \frac{1}{U_{\mathcal X}(X)}\right) – \mathbb E_{X\sim P(X)}\left(\log \frac{1}{P(X)}\right) = \log|\mathcal X| – H(P).
$$
Definition 3 (Conditional divergence).
$$
D(P_{Y|X}|Q_{Y|X}|P_X) = \mathbb E_{X \sim P_X(X)}(D(P_{Y|X}|Q_{Y|X})) = \sum_{x \in \mathcal X}P_X(x)D(P_{Y|x}|Q_{Y|x}).
$$
Theorem 4 (Properties of divergence).

Information inequality:
$$
D(P|Q) \ge 0,
$$
with equality iff $P = Q$.
Proof. Notice that $x \mapsto x \log x$ is convex. Then apply Jensen’s inequality,
$$
D(P | Q)=\mathbb E_{X\sim Q(X)}\left(\frac{P(X)}{Q(X)}\log \frac{P(X)}{Q(X)}\right) \ge 1 \log 1 = 0.
$$
Chain rule:
$$
D(P_{X,Y}|Q_{X,Y}) = D(P_{Y|X}|Q_{Y|X}|P_X) + D(P_X|Q_X); \quad D(P_{X^n}| Q_{X^n}) = \sum_{i=1}^n D(P_{X_i|X^{i-1}}|Q_{X_i|X^{i-1}}|P_{X^{i-1}}).
$$
Proof.
$$
D(P_{X, Y} | Q_{X, Y})
=\mathbb{E}\left(\log \frac{P_{X, Y}(X, Y)}{Q_{X, Y}(X, Y)}\right)
=\mathbb{E}\left(\log \frac{P_{Y|X}(Y|X)}{Q_{Y|X}(Y|X)}\right)+\mathbb{E}\left(\log \frac{P_X(X)}{Q_X(X)}\right)
= D(P_{Y|X}|Q_{Y|X}|P_X) + D(P_X|Q_X).
$$
Conditional divergence can be expressed unconditionally:
$$
D(P_{Y|X}|Q_{Y|X}|P_X) = D(P_XP_{Y|X}|P_XQ_{Y|X}).
$$
Proof. Apply chain rule to $P_X = Q_X$.
Monotonicity:
$$
D(P_{X,Y}| Q_{X,Y}) \ge D(P_X|Q_X).
$$
Tensorization: Consider chain rule in the special case of $Q_{X^n} = \prod_{i=1}^n Q_{X_i}$,
$$
D(P_{X^n}| Q_{X^n}) = D(P_{X^n}| Q_{X_1}Q_{X_2}\cdots Q_{X_n})
= D(P_{X^n}| P_{X_1}P_{X_2}\cdots P_{X_n}) + \sum_{i=1}^n D(P_{X_i} | Q_{X_i}) \ge \sum_{i=1}^n D(P_{X_i} | Q_{X_i}),
$$
with equality iff $P_{X^n} = \prod_{i=1}^n P_{X_i}$.
Conditioning increases divergence: Let $P_Y = P_{Y|X} \circ P_X$ and $Q_Y = Q_{Y|X} \circ P_X$. Then,
$$
D(P_Y|Q_Y) \le D(P_{Y|X} |Q_{Y|X}|P_X),
$$
with equality iff $D(P_{X|Y} |Q_{X|Y}|P_Y) = 0$.
Proof.
$$
\begin{aligned}
D(P_{X, Y} | Q_{X, Y}) & =D(P_{Y | X} | Q_{Y | X} | P_X)+\underbrace{D(P_X | P_X)}{=0} \
& =D(P{X | Y} | Q_{X | Y} | P_Y)+D(P_Y | Q_Y) .
\end{aligned}
$$
Data-Processing Inequality (DPI) for KL divergence. Let $P_Y = P_{Y|X} \circ P_X$ and $Q_Y = P_{Y|X} \circ Q_X$. Then,
$$
D(P_Y| Q_Y) \le D(P_X | Q_X),
$$
with equality iff $D(P_{X|Y}| Q_{X|Y}|P_Y) = 0$.
Proof.
$$
\begin{aligned}
D(P_{X, Y} | Q_{X, Y}) & =\underbrace{D(P_{Y | X} | Q_{Y | X} | P_X)}{=0}+D(P_X | Q_X) \
& =D(P{X | Y} | Q_{X | Y} | P_Y)+D(P_Y | Q_Y) .
\end{aligned}
$$

Mutual information

Definition 1 (Mutual information).
$$
I(X;Y) = D(P_{X,Y}|P_XP_Y).
$$
Theorem 2 (Properties of mutual information).

Mutual information as conditional divergence:
$$
I(X;Y) = D(P_{Y|X}|P_Y|P_X).
$$
Symmetry:
$$
I(X;Y)=I(Y;X).
$$
Positivity:
$$
I(X;Y)\ge 0,
$$
with equality iff $X \perp !!!! \perp Y$.
More data, more information:
$$
I(X,Y;Z) \ge I(X;Z).
$$

Theorem 3 (Mutual information and entropy).
$$
I(X;Y) = H(X) – H(X|Y)=H(X)+H(Y)-H(X,Y).
$$
Definition 4 (Conditional mutual information).
$$
I(X;Y|Z) = D(P_{X,Y|Z}|P_{X|Z}P_{Y|Z}|P_Z)
$$
Theorem 5 (Further properties of mutual information).

Positivity:
$$
I(X;Z|Y) \ge 0,
$$
with equality iff $X \to Y \to Z$.
Chain rule:
$$
I(X,Y;Z) = I(X;Z)+I(Y;Z|X); \quad
I(X^n;Y) = \sum_{i=1}^nI(X_i;Y|X^{i-1}).
$$
DPI for mutual information: Let $X \to Y \to Z$. Then,
$$
I(X;Z) \le I(X;Y),
$$
with equality iff $X \to Z \to Y$.

Probability Theory

Chernoff bounds

Chebyshev’s and Markov’s Inequality

Theorem 1 (Chebyshev’s Inequality). Let $X$ be a random variable. Then $\forall a>0$,
$$
\mathbb P(|X-\mathbb E(X)| \ge a) \le \frac{\mathbb D(X)}{a^2}.
$$
Theorem 2 (Markov’s Inequality). Let $X$ be a non-negative random variable. Then $\forall a>0$,
$$
\mathbb P(X \ge a) \le \frac{\mathbb E(X)}{a}.
$$
Proof.
$$
\mathbb E(X) = \sum_{x \ge a} p(x)\cdot x + \sum_{0 \le x < a} p(x)\cdot x \ge \sum_{x \ge a} p(x)\cdot x \ge a\cdot \sum_{x \ge a} p(x) = a\cdot \mathbb P(X \ge a).
$$
Proof of Theorem 1. Apply Markov’s Inequality to $(X – \mathbb E(X))^2$. Notice that
$$
\mathbb D(X) = \mathbb E((X – \mathbb E(X))^2).
$$
Remark. Markov’s and Chebyshev’s Inequality are essentially tight for a general random variable.

Deviation of a sum on independent random variables

Theorem 3. Let $X_1,X_2,\cdots,X_n$ be independent random variables with $\mathbb E(X_i) = \mu_i$ and $\mathbb D(X_i) = \sigma_i^2$. Then $\forall a>0$,
$$
\mathbb P\left(\left|\sum_{i=1}^n X_i – \sum_{i=1}^n \mu_i\right| \ge a\right) \le \frac{\sum_{i=1}^n \sigma_i^2}{a^2}.
$$
In particular, for identically distributed random variables with expectation $\mu$ and variance $\sigma^2$,
$$
\mathbb{P}\left(\left|\frac{\sum_{i=1}^n X_i}{n}-\mu\right| \geq \epsilon\right) \leq \frac{\sigma^2}{n \epsilon^2}.
$$
Proof. Apply Chebyshev’s Inequality to $\sum_{i=1}^n X_i$. Notice that
$$
\mathbb D\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \mathbb D(X_i).
$$
Remark. Theorem 3 is tight when all the variables are just guaranteed to be pairwise independent.

Chernoff Bound

Theorem 4 (Chernoff Bound of independent Bernoulli trials). Let $X = \sum_{i=1}^n X_i$, where $X_i=1$ with probability $p_i$ and $X_i = 0$ with probability $1-p_i$, and all $X_i$ are independent. Let $\mu = \mathbb E(X) = \sum_{i=1}^n p_i$. Then,

Upper Tail: $\forall \delta > 0$,
$$
\mathbb P(X \ge (1+\delta)\mu) \le e^{-\mu\delta^2/(2+\delta)};
$$
Lower Tail: $\forall \delta \in (0,1)$,
$$
\mathbb P(X \le (1-\delta)\mu) \le e^{-\mu\delta^2/2}.
$$

Corollary 5. $\forall \delta \in (0,1)$,
$$
\mathbb P(|X – \mu| \ge \delta\mu) \le 2e^{-\mu\delta^2/3}.
$$

Example (coin tossing). Suppose we have a fair coin. Repeatedly toss the coin, and let $S_n$ be the number of heads from the first $n$ tosses.

Theorem 3 tells us that
$$
\mathbb{P}(|S_n/n-1/2| \geq \epsilon) \leq \frac{1}{4n \epsilon^2}.
$$
Taking $\epsilon = 1/4$, we obtain $\mathbb{P}(|S_n/n-1/2| \geq 1/4) \leq 4/n$.

But Corollary 5 tells us that
$$
\mathbb P(|S_n – n/2| \ge \delta n/2) \le 2e^{-n\delta^2/6}.
$$
Taking $\delta = 1/2$, we obtain $\mathbb{P}(|S_n/n-1/2| \geq 1/4) \leq 2e^{-n/24}$, which is a massive improvement.

Discrete Fourier transform (DFT)

Definition.

$\Z_N$:
$$
\mathcal F:X_k = \frac1N \sum_{n=0}^{N-1}e^{-i 2\pi kn/N}x_n = \frac1N \sum_{n=0}^{N-1}\omega_{N}^{-kn}x_n, \quad
\mathcal F^{-1}:x_n = \sum_{k=0}^{N-1} e^{i2\pi kn/N}X_k = \sum_{k=0}^{N-1} \omega_{N}^{kn}X_k.
$$
$(\Z_2)^n$:
$$
\mathcal F : X_x = \frac{1}{2^n} \sum_{a} (-1)^{\operatorname{bitcount}(x \and a)}x_a, \quad
\mathcal F^{-1} : x_a =\sum_{x} (-1)^{\operatorname{bitcount}(x \and a)}X_x.
$$

xht37's blog

离散数学与结构课程笔记

公理集合论

集合论的公理化

Zermelo–Fraenkel 集合论

序

基数

数理逻辑

Propositional Logic

Semantics

Natural Deduction

Completeness

Predicate Logic

Semantics

Natural Deduction

一致性与完备性

代数

群

子群

同态与同构

直积

环

理想

唯一分解整环 (UFD)，主理想整环 (PID)，欧几里得整环 (ED)

有限域

Information Theory

Entropy

Divergence (Relative entropy)

Mutual information

Probability Theory

Chernoff bounds

Chebyshev’s and Markov’s Inequality

Deviation of a sum on independent random variables

Chernoff Bound

Discrete Fourier transform (DFT)

发表评论取消回复

离散数学与结构 课程笔记

公理集合论

集合论的公理化

Zermelo–Fraenkel 集合论

序

基数

数理逻辑

Propositional Logic

Semantics

Natural Deduction

Completeness

Predicate Logic

Semantics

Natural Deduction

一致性与完备性

代数

群

子群

同态与同构

直积

环

理想

唯一分解整环 (UFD)，主理想整环 (PID)，欧几里得整环 (ED)

有限域

Information Theory

Entropy

Divergence (Relative entropy)

Mutual information

Probability Theory

Chernoff bounds

Chebyshev’s and Markov’s Inequality

Deviation of a sum on independent random variables

Chernoff Bound

Discrete Fourier transform (DFT)

发表评论 取消回复

离散数学与结构课程笔记

发表评论取消回复