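Definition 1
Suppose the function \(f\) has partial derivatives at every point of an open set \(D\):
\[ D_if(\boldsymbol{x}) = \frac{\partial f}{\partial x_i}(\boldsymbol{x}) \quad (i=1,2,\cdots,n) \]
These are called the first-order partial derivative functions of \(f\). If these functions can themselves be differentiated partially, the results are the second-order partial derivatives of \(f\); third-order and higher-order partial derivatives are defined successively in the same way. For second-order partial derivatives, differentiating the first-order partial derivative \(\displaystyle \frac{\partial f}{\partial x_j}\) with respect to \(x_i\) gives \(\displaystyle \frac{\partial}{\partial x_i}\left(\frac{\partial f}{\partial x_j}\right)\), which is written \(\displaystyle \frac{\partial^2 f}{\partial x_i \partial x_j}\), where \(i,j\) range independently from \(1\) to \(n\). If \(i=j\), then \(\displaystyle \frac{\partial^2 f}{\partial x_i \partial x_i}\) is written \(\displaystyle \frac{\partial^2 f}{\partial x_i^2}\) \((i=1,2,\cdots,n)\); if \(i\ne j\), the second-order partial derivative is called a mixed partial derivative.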
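As a small illustration of this notation, consider \(f(x,y) = x^2y^3\) on \(\mathbb{R}^2\). This function is chosen here only as an example and does not come from the text above. Its first-order and second-order partial derivatives can be computed directly:
\[
\begin{aligned}
\frac{\partial f}{\partial x} &= 2xy^3, & \frac{\partial f}{\partial y} &= 3x^2y^2, \\
\frac{\partial^2 f}{\partial x^2} &= 2y^3, & \frac{\partial^2 f}{\partial y^2} &= 6x^2y, \\
\frac{\partial^2 f}{\partial y \partial x} &= \frac{\partial}{\partial y}\left(2xy^3\right) = 6xy^2, & \frac{\partial^2 f}{\partial x \partial y} &= \frac{\partial}{\partial x}\left(3x^2y^2\right) = 6xy^2
\end{aligned}
\]
In this example the two mixed partial derivatives coincide.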
Theorem 1
Let \(D \subset \mathbb{R}^2\) be an open set and \(f: D \to \mathbb{R}\). If \(\displaystyle \frac{\partial f}{\partial x},\frac{\partial f}{\partial y},\frac{\partial^2 f}{\partial y \partial x}\) exist in some neighborhood of \((x_0,y_0)\), and \(\displaystyle \frac{\partial^2 f}{\partial y\partial x}\) is continuous at \((x_0,y_0)\), then \(\displaystyle \frac{\partial^2 f}{\partial x \partial y}\) exists at \((x_0,y_0)\), and
\[ \frac{\partial^2 f}{\partial x \partial y}(x_0,y_0) = \frac{\partial^2 f}{\partial y \partial x}(x_0,y_0) \]
Proof. For sufficiently small nonzero \(h\) and \(k\), let
\[
\varphi(h, k) = f(x_0 + h, y_0 + k) - f(x_0+h, y_0) - f(x_0, y_0+k)
+ f(x_0, y_0)
\]
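For fixed \(k\), set
\[
g(x) = f(x, y_0+k) - f(x, y_0)
\]
Applying the mean value theorem first to \(g\) in the variable \(x\), and then to \(\displaystyle \frac{\partial f}{\partial x}\) in the variable \(y\), we obtain \(\theta_1, \theta_2 \in (0,1)\) such that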
\[
\begin{aligned}
\varphi(h, k) & = g(x_0 + h) - g(x_0) \\
& = g^\prime(x_0 + \theta_1 h)h \\
& = \left(\frac{\partial f}{\partial x}(x_0 + \theta_1 h, y_0 +
k) - \frac{\partial f}{\partial x}(x_0 + \theta_1 h, y_0) \right)h \\
& = \frac{\partial^2 f}{\partial y \partial x}(x_0 + \theta_1h,
y_0 + \theta_2k)hk
\end{aligned}
\]
Since \(\displaystyle \frac{\partial^2 f}{\partial y \partial x}\) is continuous at \((x_0, y_0)\), it follows that
\[
\lim \limits_{h\to 0, k \to 0} \frac{\varphi(h ,k)}{hk} =
\frac{\partial^2 f}{\partial y \partial x}(x_0, y_0)
\]
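On the other hand, for each fixed \(h \ne 0\), the existence of \(\displaystyle \frac{\partial f}{\partial y}\) gives
\[
\begin{aligned}
\lim \limits_{k \to 0} \frac{\varphi(h ,k)}{hk} &= \lim \limits_{k \to 0} \frac{1}{h} \left( \frac{f(x_0+h, y_0+k) - f(x_0+h, y_0)}{k} - \frac{f(x_0, y_0+k) - f(x_0, y_0)}{k}\right) \\
&= \frac{1}{h}\left( \frac{\partial f}{\partial y}(x_0 +h, y_0) - \frac{\partial f}{\partial y}(x_0, y_0)\right)
\end{aligned}
\]
Because the double limit exists and, for each fixed \(h \ne 0\), the inner limit as \(k \to 0\) exists, the iterated limit obtained by letting \(k \to 0\) first and then \(h \to 0\) also exists and equals the double limit. Therefore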
\[
\lim \limits_{h \to 0, k \to 0} \frac{\varphi(h ,k)}{hk} = \lim
\limits_{h \to 0} \frac{1}{h}\left( \frac{\partial f}{\partial y}(x_0
+h, y_0) - \frac{\partial f}{\partial y}(x_0, y_0)\right) =
\frac{\partial^2 f}{\partial x \partial y}(x_0, y_0)
\]
Hence \(\displaystyle \frac{\partial^2 f}{\partial x \partial y}(x_0, y_0)\) exists, and
\[
\frac{\partial^2 f}{\partial x \partial y}(x_0, y_0) =
\frac{\partial^2 f}{\partial y \partial x}(x_0, y_0)
\]
Q.E.D.
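The continuity hypothesis cannot simply be dropped. A standard counterexample, which is not taken from the text above and is included here only as an illustration, is
\[
f(x,y) = \begin{cases} \dfrac{xy(x^2-y^2)}{x^2+y^2}, & (x,y) \ne (0,0), \\ 0, & (x,y) = (0,0) \end{cases}
\]
Computing the first-order partial derivatives along the coordinate axes from the definition gives
\[
\begin{aligned}
\frac{\partial f}{\partial x}(0,y) &= \lim \limits_{h \to 0} \frac{f(h,y) - f(0,y)}{h} = \lim \limits_{h \to 0} \frac{y(h^2-y^2)}{h^2+y^2} = -y \\
\frac{\partial f}{\partial y}(x,0) &= \lim \limits_{k \to 0} \frac{f(x,k) - f(x,0)}{k} = \lim \limits_{k \to 0} \frac{x(x^2-k^2)}{x^2+k^2} = x
\end{aligned}
\]
and therefore
\[
\frac{\partial^2 f}{\partial y \partial x}(0,0) = -1 \ne 1 = \frac{\partial^2 f}{\partial x \partial y}(0,0)
\]
All the existence hypotheses of Theorem 1 hold for this \(f\), so the theorem implies that \(\displaystyle \frac{\partial^2 f}{\partial y \partial x}\) cannot be continuous at the origin.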
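Theorem 2
Let \(f\) be a differentiable function defined on a convex region \(D \subset \mathbb{R}^n\). Then for any two points \(\boldsymbol{a}, \boldsymbol{b} \in D\) there exists a point \(\boldsymbol{\xi}\) on the line segment determined by \(\boldsymbol{a}\) and \(\boldsymbol{b}\) such that
\[ f(\boldsymbol{b}) - f(\boldsymbol{a}) = Jf(\boldsymbol{\xi})(\boldsymbol{b} - \boldsymbol{a}) \]
Proof. The points of the line segment determined by \(\boldsymbol{a}\) and \(\boldsymbol{b}\) can be written as \(\boldsymbol{a} + t(\boldsymbol{b} - \boldsymbol{a})\), where \(t \in [0, 1]\). Let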
\[
\varphi(t) = f(\boldsymbol{a} + t(\boldsymbol{b} - \boldsymbol{a}))
\]
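Then \(\varphi\) is a differentiable function on \([0, 1]\), so by the single-variable mean value theorem there exists \(\theta \in (0,1)\) such that
\[
\varphi(1) - \varphi(0) = \varphi^\prime(\theta)
\]
By the chain rule, \(\varphi^\prime(t) = Jf(\boldsymbol{a} + t(\boldsymbol{b} - \boldsymbol{a}))(\boldsymbol{b} - \boldsymbol{a})\), so this reads
\[
f(\boldsymbol{b}) - f(\boldsymbol{a}) =
Jf(\boldsymbol{a} + \theta (\boldsymbol{b} -
\boldsymbol{a}))(\boldsymbol{b} - \boldsymbol{a})
\]
Setting \(\boldsymbol{\xi} = \boldsymbol{a} + \theta(\boldsymbol{b} - \boldsymbol{a})\) proves the claim.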
Q.E.D.
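To see the statement at work, take \(f(x,y) = x^2 + y^2\) on \(\mathbb{R}^2\) with \(\boldsymbol{a} = (0,0)\) and \(\boldsymbol{b} = (1,1)\). These choices are illustrative assumptions and do not appear in the text above. Here \(Jf(x,y) = (2x, 2y)\), and a point of the segment has the form \(\boldsymbol{\xi} = (\theta, \theta)\), so
\[
\begin{aligned}
f(\boldsymbol{b}) - f(\boldsymbol{a}) &= 2 \\
Jf(\boldsymbol{\xi})(\boldsymbol{b} - \boldsymbol{a}) &= 2\theta \cdot 1 + 2\theta \cdot 1 = 4\theta
\end{aligned}
\]
The two sides agree exactly when \(\theta = \dfrac{1}{2}\), that is, at the midpoint \(\boldsymbol{\xi} = \left(\dfrac{1}{2}, \dfrac{1}{2}\right)\) of the segment.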
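Theorem 3
Let \(D\) be a region in \(\mathbb{R}^n\). If for every \(\boldsymbol{x} \in D\)
\[ \frac{\partial f}{\partial x_1}(\boldsymbol{x}) = \cdots = \frac{\partial f}{\partial x_n}(\boldsymbol{x}) = 0 \]
then \(f\) is constant on \(D\).
Proof. The partial derivatives of \(f\) are identically zero, hence continuous, so \(f\) is differentiable on \(D\). If \(D\) is a convex region, the conclusion follows at once from Theorem 2. If \(D\) is not convex, fix any \(\boldsymbol{x}_0 \in D\) and let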
\[
\begin{aligned}
A &= \{ \boldsymbol{x} \in D: f(\boldsymbol{x}) =
f(\boldsymbol{x}_0)\} \\
B &= \{ \boldsymbol{x} \in D: f(\boldsymbol{x}) \ne
f(\boldsymbol{x}_0) \}
\end{aligned}
\]
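Clearly \(A\) is nonempty, \(A \cap B = \varnothing\), and \(D=A \cup B\). Since \(D\) is a connected open set, once \(A\) and \(B\) are shown to be open, Theorem 3 of the section on limits of point sequences (part six) gives \(B = \varnothing\), which proves the claim. To show that \(A\) is open, take any \(\boldsymbol{a} \in A \subset D\). Since \(D\) is open, there is a ball \(B_r(\boldsymbol{a}) \subset D\) with \(r > 0\). Because \(B_r(\boldsymbol{a})\) is a convex region, \(f\) is constant on \(B_r(\boldsymbol{a})\), and for every \(\boldsymbol{x} \in B_r(\boldsymbol{a})\)
\[
f(\boldsymbol{x}) = f(\boldsymbol{a}) = f(\boldsymbol{x}_0)
\]
Hence \(B_r(\boldsymbol{a}) \subset A\), which shows that \(A\) is open. The same argument shows that \(B\) is open: for \(\boldsymbol{b} \in B\), \(f\) is constant on some ball \(B_r(\boldsymbol{b}) \subset D\) with value \(f(\boldsymbol{b}) \ne f(\boldsymbol{x}_0)\), so \(B_r(\boldsymbol{b}) \subset B\). By the discussion above, the theorem follows.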
Q.E.D.
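The connectedness of \(D\) is essential. As an illustration (the set and the function below are assumptions made for this example, not part of the text above), let \(\boldsymbol{p}, \boldsymbol{q} \in \mathbb{R}^n\) satisfy \(\Vert \boldsymbol{p} - \boldsymbol{q} \Vert > 2\) and put
\[
D = B_1(\boldsymbol{p}) \cup B_1(\boldsymbol{q}), \qquad
f(\boldsymbol{x}) = \begin{cases} 0, & \boldsymbol{x} \in B_1(\boldsymbol{p}), \\ 1, & \boldsymbol{x} \in B_1(\boldsymbol{q}) \end{cases}
\]
Every partial derivative of \(f\) vanishes on \(D\), yet \(f\) is not constant. This does not contradict Theorem 3, because \(D\) is open but not connected and so is not a region. In the notation of the proof with \(\boldsymbol{x}_0 = \boldsymbol{p}\), both \(A = B_1(\boldsymbol{p})\) and \(B = B_1(\boldsymbol{q})\) are nonempty open sets.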
Theorem 4
Let \(k,n\) be two positive integers. Then
\[ (x_1 + \cdots + x_n)^k = \sum_{\alpha_1 + \cdots + \alpha_n = k} \frac{k!}{\alpha_1!\cdots\alpha_n!}x_1^{\alpha_1} \cdots x_n^{\alpha_n} \]
where \(\alpha_1,\cdots,\alpha_n\) are nonnegative integers. Writing \(\boldsymbol{\alpha} = (\alpha_1, \cdots, \alpha_n)\), \(\boldsymbol{x}=(x_1,\cdots,x_n)\), and
\[ \begin{aligned} |\boldsymbol{\alpha}| &= \alpha_1 + \cdots + \alpha_n \\ \boldsymbol{\alpha}! &= \alpha_1!\cdots\alpha_n! \\ \boldsymbol{x}^{\boldsymbol{\alpha}} &= x_1^{\alpha_1} \cdots x_n^{\alpha_n} \end{aligned} \]
the formula above can be abbreviated as
\[ (x_1 + \cdots + x_n)^k = \sum_{|\boldsymbol{\alpha}|=k}\frac{k!}{\boldsymbol{\alpha}!}\boldsymbol{x}^{\boldsymbol{\alpha}} \]
Proof. We argue by induction on the number \(n\) of summands. For \(n=1\) the statement is trivial, and for \(n=2\) it is the binomial theorem. Now suppose the statement holds for \(n-1\) summands; then for \(n\) summands we have
\[
\begin{aligned}
(x_1 + \cdots + x_n)^k &= ((x_1 + \cdots + x_{n-1}) + x_n)^k \\
&= \sum_{\alpha_n=0}^k
\frac{k!}{\alpha_n!(k-\alpha_n)!}(x_1+\cdots+x_{n-1})^{k-\alpha_n}x_n^{\alpha_n}
\\
&= \sum_{\alpha_n=0}^k \frac{k!}{\alpha_n!(k-\alpha_n)!}
\sum_{\alpha_1 + \cdots +
\alpha_{n-1}=k-\alpha_n}\frac{(k-\alpha_n)!}{\alpha_1!\cdots\alpha_{n-1}!}
x_1^{\alpha_1} \cdots x_{n-1}^{\alpha_{n-1}} x_n^{\alpha_n} \\
& = \sum_{\alpha_1 + \cdots + \alpha_n = k}
\frac{k!}{\alpha_1!\cdots\alpha_n!}x_1^{\alpha_1} \cdots x_n^{\alpha_n}
\end{aligned}
\]
Q.E.D.
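For a concrete check, take \(n = 3\) and \(k = 2\) (values chosen here only for illustration). The multi-indices with \(|\boldsymbol{\alpha}| = 2\) are \((2,0,0), (0,2,0), (0,0,2)\), each with coefficient \(\dfrac{2!}{2!\,0!\,0!} = 1\), and \((1,1,0), (1,0,1), (0,1,1)\), each with coefficient \(\dfrac{2!}{1!\,1!\,0!} = 2\). The formula therefore reads
\[
(x_1 + x_2 + x_3)^2 = x_1^2 + x_2^2 + x_3^2 + 2x_1x_2 + 2x_1x_3 + 2x_2x_3
\]
which agrees with direct expansion.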
Theorem 5: Taylor's formula
Let \(D \subset \mathbb{R}^n\) be a convex region, \(f \in C^{m+1}(D)\), and let \(\boldsymbol{a}=(a_1,\cdots,a_n)\) and \(\boldsymbol{a}+\boldsymbol{h} = (a_1+h_1,\cdots,a_n+h_n)\) be two points of \(D\). Then there exists \(\theta \in (0, 1)\) such that
\[ f(\boldsymbol{a} + \boldsymbol{h}) = \sum_{k=0}^m \sum_{|\boldsymbol{\alpha}|=k} \frac{D^{\boldsymbol{\alpha}}f(\boldsymbol{a})}{\boldsymbol{\alpha}!} \boldsymbol{h}^{\boldsymbol{\alpha}} + R_m \]
where
\[ D^{\boldsymbol{\alpha}}f(\boldsymbol{a}) = \frac{\partial^{\alpha_1+\cdots+\alpha_n}f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}(\boldsymbol{a}) \]
and
\[ R_m = \sum_{|\boldsymbol{\alpha}|=m+1} \frac{D^{\boldsymbol{\alpha}}f(\boldsymbol{a} + \theta \boldsymbol{h})}{\boldsymbol{\alpha}!} \boldsymbol{h}^{\boldsymbol{\alpha}} \]
is called the Lagrange remainder.
Proof. Fix \(\boldsymbol{a},\boldsymbol{h}\), let \(t \in [0,1]\), and consider the function \(\varphi(t) = f(\boldsymbol{a} + t \boldsymbol{h})\) on \([0,1]\). Since \(D\) is convex and \(f \in C^{m+1}(D)\), \(\varphi\) has continuous derivatives up to order \(m+1\) on \([0,1]\). Applying the single-variable Taylor formula to \(\varphi\) gives
\[
\varphi(1) = \varphi(0) + \varphi^\prime(0) +
\frac{1}{2!}\varphi^{\prime\prime}(0) + \cdots +
\frac{1}{m!}\varphi^{(m)}(0) + \frac{1}{(m+1)!}\varphi^{(m+1)}(\theta)
\tag{1}
\]
where \(\theta \in (0, 1)\). Clearly \(\varphi(1) = f(\boldsymbol{a} + \boldsymbol{h})\) and \(\varphi(0) = f(\boldsymbol{a})\). By the chain rule,
\[
\varphi^\prime(t) = \frac{\partial f}{\partial x_1}(\boldsymbol{a} +
t \boldsymbol{h})h_1 + \cdots + \frac{\partial f}{\partial
x_n}(\boldsymbol{a} + t \boldsymbol{h})h_n =
\left(h_1\frac{\partial}{\partial x_1} + \cdots +
h_n\frac{\partial}{\partial x_n}\right)f(\boldsymbol{a} + t
\boldsymbol{h})
\]
Differentiating repeatedly in the same way gives
\[
\begin{aligned}
\varphi^{\prime\prime}(t) &=
\left(h_1\frac{\partial}{\partial x_1} + \cdots +
h_n\frac{\partial}{\partial x_n}\right)^2f(\boldsymbol{a} +
t\boldsymbol{h}) \\
&\;\;\vdots \\
\varphi^{(m+1)}(t) &= \left(h_1\frac{\partial}{\partial x_1} +
\cdots + h_n\frac{\partial}{\partial x_n}\right)^{m+1}f(\boldsymbol{a} +
t\boldsymbol{h})
\end{aligned}
\]
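Since \(f \in C^{m+1}(D)\), the partial derivatives of order at most \(m+1\) do not depend on the order of differentiation, so the operators \(\displaystyle h_i\frac{\partial}{\partial x_i}\) commute and Theorem 4 can be applied to them. For \(k = 1, \cdots, m+1\) this gives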
\[
\varphi^{(k)}(t) =
\sum_{|\boldsymbol{\alpha}|=k}\frac{k!}{\boldsymbol{\alpha}!}\frac{\partial^{\alpha_1}}{\partial
x_1^{\alpha_1}}\cdots \frac{\partial ^{\alpha_n}}{\partial
x_n^{\alpha_n}}f(\boldsymbol{a} +
t\boldsymbol{h})\boldsymbol{h}^{\boldsymbol{\alpha}} =
\sum_{|\boldsymbol{\alpha}|=k}
\frac{k!}{\boldsymbol{\alpha}!}D^{\boldsymbol{\alpha}}f(\boldsymbol{a} +
t\boldsymbol{h}) \boldsymbol{h}^{\boldsymbol{\alpha}}
\]
In particular, at \(t = 0\),
\[
\varphi^{(k)}(0) = \sum_{|\boldsymbol{\alpha}|=k}
\frac{k!}{\boldsymbol{\alpha}!}D^{\boldsymbol{\alpha}}f(\boldsymbol{a})
\boldsymbol{h}^{\boldsymbol{\alpha}}
\]
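Substituting this into (1), and using the same formula with \(k = m+1\) and \(t = \theta\) for the last term, yields the stated conclusion.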
Q.E.D.
In particular
Written out, the first three terms of Taylor's formula (orders 0, 1, and 2) are
\[
f(\boldsymbol{a} + \boldsymbol{h}) = f(\boldsymbol{a}) +
\frac{\partial f}{\partial x_1}(\boldsymbol{a})h_1 + \cdots +
\frac{\partial f}{\partial x_n}(\boldsymbol{a})h_n + \frac{1}{2}
\sum_{i,j=1}^n\frac{\partial^2 f}{\partial x_i \partial
x_j}(\boldsymbol{a})h_ih_j + \cdots
\]
If we write
\[
Hf(\boldsymbol{a}) = \begin{bmatrix}
\frac{\partial^2 f}{\partial x_1^2}(\boldsymbol{a}) &
\cdots & \frac{\partial^2 f}{\partial x_1 \partial
x_n}(\boldsymbol{a}) \\
\vdots & & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial
x_1}(\boldsymbol{a}) & \cdots & \frac{\partial^2 f}{\partial
x_n^2}(\boldsymbol{a})
\end{bmatrix}
\]
then the formula above can be written as
\[
f(\boldsymbol{a} + \boldsymbol{h}) = f(\boldsymbol{a}) +
Jf(\boldsymbol{a})\boldsymbol{h} + \frac{1}{2}\boldsymbol{h}^T
Hf(\boldsymbol{a}) \boldsymbol{h} + \cdots
\]
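Here \(Hf\) is called the Hessian matrix of \(f\). It is a square matrix of order \(n\), and it is symmetric because the second-order partial derivatives of \(f\) are continuous (Theorem 1 applied to each pair of variables).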
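As a worked instance of the matrix form, consider \(f(x,y) = e^x \cos y\) at \(\boldsymbol{a} = (0,0)\). This function is an assumption made for illustration and is not part of the text above. Its partial derivatives up to order two at the origin are
\[
\begin{aligned}
f(0,0) &= 1, & \frac{\partial f}{\partial x}(0,0) &= 1, & \frac{\partial f}{\partial y}(0,0) &= 0, \\
\frac{\partial^2 f}{\partial x^2}(0,0) &= 1, & \frac{\partial^2 f}{\partial x \partial y}(0,0) &= 0, & \frac{\partial^2 f}{\partial y^2}(0,0) &= -1
\end{aligned}
\]
so that
\[
Jf(\boldsymbol{a}) = \begin{bmatrix} 1 & 0 \end{bmatrix}, \qquad
Hf(\boldsymbol{a}) = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\]
and the expansion begins
\[
f(h_1, h_2) = 1 + h_1 + \frac{1}{2}\left(h_1^2 - h_2^2\right) + \cdots
\]
This matches the product of the single-variable expansions \(e^{h_1} = 1 + h_1 + \dfrac{h_1^2}{2} + \cdots\) and \(\cos h_2 = 1 - \dfrac{h_2^2}{2} + \cdots\) up to second order.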
Theorem 6
Let \(D \subset \mathbb{R}^n\) be a convex region, \(f \in C^m(D)\), and let \(\boldsymbol{a}\) and \(\boldsymbol{a}+\boldsymbol{h}\) be two points of \(D\). Then
\[ f(\boldsymbol{a} + \boldsymbol{h}) = \sum_{k=0}^m \sum_{|\boldsymbol{\alpha}|=k} \frac{D^{\boldsymbol{\alpha}}f(\boldsymbol{a})}{\boldsymbol{\alpha}!} \boldsymbol{h}^{\boldsymbol{\alpha}} + o(\Vert \boldsymbol{h} \Vert^m) \quad (\boldsymbol{h} \to \boldsymbol{0}) \]
Proof. Applying Theorem 5 with \(m-1\) in place of \(m\) gives
\[
f(\boldsymbol{a} + \boldsymbol{h}) = \sum_{k=0}^{m-1}
\sum_{|\boldsymbol{\alpha}|=k}
\frac{D^{\boldsymbol{\alpha}}f(\boldsymbol{a})}{\boldsymbol{\alpha}!}
\boldsymbol{h}^{\boldsymbol{\alpha}} +
\sum_{|\boldsymbol{\alpha}|=m}\frac{D^{\boldsymbol{\alpha}}f(\boldsymbol{a}
+ \theta \boldsymbol{h})}{\boldsymbol{\alpha}!}
\boldsymbol{h}^{\boldsymbol{\alpha}} \tag{2}
\]
where \(\theta \in (0, 1)\) depends on \(\boldsymbol{h}\). Since the \(m\)-th order partial derivatives of \(f\) are continuous and \(\Vert \theta \boldsymbol{h} \Vert \le \Vert \boldsymbol{h} \Vert\),
\[
\lim \limits_{\boldsymbol{h} \to \boldsymbol{0}}
D^{\boldsymbol{\alpha}}f(\boldsymbol{a} + \theta \boldsymbol{h}) =
D^{\boldsymbol{\alpha}}f(\boldsymbol{a}) \quad (|\boldsymbol{\alpha}|=m)
\]
that is,
\[
D^{\boldsymbol{\alpha}}f(\boldsymbol{a} + \theta \boldsymbol{h}) =
D^{\boldsymbol{\alpha}}f(\boldsymbol{a}) + o(1) \quad (\boldsymbol{h}
\to \boldsymbol{0})
\]
Therefore
\[
\frac{D^{\boldsymbol{\alpha}}f(\boldsymbol{a} + \theta
\boldsymbol{h})}{\boldsymbol{\alpha}!}
\boldsymbol{h}^{\boldsymbol{\alpha}} =
\frac{D^{\boldsymbol{\alpha}}f(\boldsymbol{a})}{\boldsymbol{\alpha}!}
\boldsymbol{h}^{\boldsymbol{\alpha}} +
o(\boldsymbol{h}^{\boldsymbol{\alpha}}) \quad (\boldsymbol{h} \to
\boldsymbol{0})
\]
When \(|\boldsymbol{\alpha}|=m\), since \(|h_i| \le \Vert \boldsymbol{h} \Vert\) for every \(i\), we have
\[
|\boldsymbol{h}^{\boldsymbol{\alpha}}| = |h_1^{\alpha_1} \cdots
h_n^{\alpha_n}| = |h_1|^{\alpha_1} \cdots |h_n|^{\alpha_n} \le \Vert
\boldsymbol{h} \Vert^{m}
\]
Consequently
\[
\sum_{|\boldsymbol{\alpha}| =
m}\frac{D^{\boldsymbol{\alpha}}f(\boldsymbol{a} + \theta
\boldsymbol{h})}{\boldsymbol{\alpha}!}
\boldsymbol{h}^{\boldsymbol{\alpha}} = \sum_{|\boldsymbol{\alpha}| =
m}\frac{D^{\boldsymbol{\alpha}}f(\boldsymbol{a})}{\boldsymbol{\alpha}!}
\boldsymbol{h}^{\boldsymbol{\alpha}} + o(\Vert \boldsymbol{h} \Vert^{m})
\quad (\boldsymbol{h} \to \boldsymbol{0})
\]
Substituting this into (2) proves the claim.
Q.E.D.
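A polynomial example makes the size of the error term explicit. Take \(f(x,y) = xy + x^3\), \(\boldsymbol{a} = (0,0)\), and \(m = 2\); these choices are assumptions for illustration only. At the origin \(f\) and its first-order partial derivatives vanish, and the only nonzero second-order partial derivative is \(\displaystyle \frac{\partial^2 f}{\partial x \partial y}(0,0) = 1\), which belongs to \(\boldsymbol{\alpha} = (1,1)\) with \(\boldsymbol{\alpha}! = 1\). Hence
\[
\sum_{k=0}^{2} \sum_{|\boldsymbol{\alpha}|=k} \frac{D^{\boldsymbol{\alpha}}f(\boldsymbol{0})}{\boldsymbol{\alpha}!} \boldsymbol{h}^{\boldsymbol{\alpha}} = h_1h_2, \qquad
f(\boldsymbol{h}) - h_1h_2 = h_1^3
\]
and the error satisfies
\[
\frac{|h_1^3|}{\Vert \boldsymbol{h} \Vert^2} \le \frac{\Vert \boldsymbol{h} \Vert^3}{\Vert \boldsymbol{h} \Vert^2} = \Vert \boldsymbol{h} \Vert \to 0 \quad (\boldsymbol{h} \to \boldsymbol{0})
\]
so it is indeed \(o(\Vert \boldsymbol{h} \Vert^2)\).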
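Theorem 7: Quasi mean value theorem
Let \(\boldsymbol{f}: [a,b] \to \mathbb{R}^m\) be a continuous mapping on \([a,b]\) that is differentiable on the open interval \((a,b)\). Then there exists a point \(\xi \in (a,b)\) such that
\[ \Vert \boldsymbol{f}(b) - \boldsymbol{f}(a) \Vert \le \Vert J\boldsymbol{f}(\xi) \Vert (b-a) \]
Proof. Let \(\boldsymbol{u} = \boldsymbol{f}(b) - \boldsymbol{f}(a)\), and use the inner product of \(\mathbb{R}^m\) to define the function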
\[
\varphi(t) = \left<\boldsymbol{u}, \boldsymbol{f}(t)\right>
\quad (a \le t \le b)
\]
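It is easy to see that \(\varphi\) is continuous on \([a,b]\) and differentiable on the open interval \((a,b)\). Applying the mean value theorem to \(\varphi\), there exists a point \(\xi \in (a,b)\) such that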
\[
\varphi(b) - \varphi(a) = (b-a)\varphi^\prime(\xi) =
(b-a)\left<\boldsymbol{u}, J\boldsymbol{f}(\xi) \right>
\]
On the other hand,
\[
\varphi(b) - \varphi(a) = \left< \boldsymbol{u},
\boldsymbol{f}(b) \right> - \left< \boldsymbol{u},
\boldsymbol{f}(a) \right> = \left< \boldsymbol{u},
\boldsymbol{f}(b) - \boldsymbol{f}(a)\right> = \left<
\boldsymbol{u}, \boldsymbol{u} \right> = \Vert \boldsymbol{u} \Vert^2
\]
By the Cauchy-Schwarz inequality,
\[
\Vert \boldsymbol{u} \Vert^2 = (b-a)\left< \boldsymbol{u},
J\boldsymbol{f}(\xi) \right> \le (b-a)\Vert \boldsymbol{u} \Vert
\Vert J\boldsymbol{f}(\xi) \Vert
\]
When \(\boldsymbol{u} \ne \boldsymbol{0}\), dividing both sides by \(\Vert \boldsymbol{u} \Vert\) gives the claim; if \(\boldsymbol{u} = \boldsymbol{0}\), the claim holds trivially.
Q.E.D.
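The inequality cannot in general be sharpened to an equality of the form \(\boldsymbol{f}(b) - \boldsymbol{f}(a) = J\boldsymbol{f}(\xi)(b-a)\). A standard illustration, not taken from the text above, is the unit circle \(\boldsymbol{f}(t) = (\cos t, \sin t)\) on \([0, 2\pi]\):
\[
\boldsymbol{f}(2\pi) - \boldsymbol{f}(0) = \boldsymbol{0}, \qquad
J\boldsymbol{f}(t) = \begin{bmatrix} -\sin t \\ \cos t \end{bmatrix}, \qquad
\Vert J\boldsymbol{f}(t) \Vert = 1
\]
For every \(\xi \in (0, 2\pi)\) the vector \(J\boldsymbol{f}(\xi)(2\pi - 0)\) has norm \(2\pi\) and so cannot equal \(\boldsymbol{0}\), whereas the inequality of Theorem 7 reads \(0 \le 2\pi\) and holds.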
Theorem 8
Let \(D \subset \mathbb{R}^n\) be a convex region and let the mapping \(\boldsymbol{f}: D \to \mathbb{R}^m\) be differentiable on \(D\). Then for any \(\boldsymbol{a},\boldsymbol{b} \in D\) there is a point \(\boldsymbol{\xi}\) on the line segment determined by \(\boldsymbol{a}\) and \(\boldsymbol{b}\) such that
\[ \Vert \boldsymbol{f}(\boldsymbol{b}) - \boldsymbol{f}(\boldsymbol{a}) \Vert \le \Vert J\boldsymbol{f}(\boldsymbol{\xi}) \Vert \Vert \boldsymbol{b} - \boldsymbol{a}\Vert \]
Proof. The line segment determined by \(\boldsymbol{a}\) and \(\boldsymbol{b}\) can be parametrized as
\[
\boldsymbol{r}(t) = \boldsymbol{a} + t (\boldsymbol{b} -
\boldsymbol{a}) \quad (0 \le t \le 1)
\]
Let
\[
\boldsymbol{g}(t) = \boldsymbol{f} \circ \boldsymbol{r}(t)
\]
The mapping \(\boldsymbol{g}\) is continuous on \([0,1]\) and differentiable on \((0,1)\), and by the chain rule
\[
J\boldsymbol{g}(t) =
J\boldsymbol{f}(\boldsymbol{r}(t))(\boldsymbol{b} - \boldsymbol{a})
\]
By Theorem 7 applied on the interval \([0,1]\), there exists \(\tau \in (0, 1)\) such that
\[
\Vert \boldsymbol{g}(1) - \boldsymbol{g}(0)\Vert \le \Vert
J\boldsymbol{g}(\tau)\Vert
\]
that is,
\[
\Vert \boldsymbol{f}(\boldsymbol{b}) -
\boldsymbol{f}(\boldsymbol{a}) \Vert \le \Vert
J\boldsymbol{f}(\boldsymbol{r}(\tau))(\boldsymbol{b} - \boldsymbol{a})
\Vert
\]
Setting \(\boldsymbol{\xi} = \boldsymbol{r}(\tau)\), we obtain
\[
\Vert \boldsymbol{f}(\boldsymbol{b}) -
\boldsymbol{f}(\boldsymbol{a}) \Vert \le \Vert
J\boldsymbol{f}(\boldsymbol{\xi})(\boldsymbol{b} - \boldsymbol{a}) \Vert
\le \Vert J\boldsymbol{f}(\boldsymbol{\xi}) \Vert \Vert \boldsymbol{b} -
\boldsymbol{a} \Vert
\]
Q.E.D.
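A small check of the estimate, under assumptions made only for this example: take \(\boldsymbol{f}(x,y) = (x^2 - y^2, 2xy)\) on \(\mathbb{R}^2\), \(\boldsymbol{a} = (0,0)\), \(\boldsymbol{b} = (1,0)\), and read \(\Vert J\boldsymbol{f} \Vert\) as the operator norm (the text above does not say which matrix norm is meant). Then \(\boldsymbol{r}(t) = (t, 0)\) and
\[
J\boldsymbol{f}(x,y) = \begin{bmatrix} 2x & -2y \\ 2y & 2x \end{bmatrix}, \qquad
J\boldsymbol{f}(\boldsymbol{r}(\tau)) = \begin{bmatrix} 2\tau & 0 \\ 0 & 2\tau \end{bmatrix}, \qquad
\Vert J\boldsymbol{f}(\boldsymbol{r}(\tau)) \Vert = 2\tau
\]
while
\[
\Vert \boldsymbol{f}(\boldsymbol{b}) - \boldsymbol{f}(\boldsymbol{a}) \Vert = \Vert (1, 0) \Vert = 1, \qquad \Vert \boldsymbol{b} - \boldsymbol{a} \Vert = 1
\]
The inequality \(1 \le 2\tau\) holds for every \(\tau \in \left[\dfrac{1}{2}, 1\right)\). The construction in the proof of Theorem 7 uses \(\varphi(t) = \left<\boldsymbol{u}, \boldsymbol{g}(t)\right> = t^2\) with \(\boldsymbol{u} = (1,0)\), and the condition \(\varphi^\prime(\tau) = 1\) selects \(\tau = \dfrac{1}{2}\), where the estimate holds with equality.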