趣味の研究

I publish recreational mathematics here. If you are visiting for the first time, please see the about page.

Inequalities for information entropy, the gamma and beta functions, and binomial coefficients

We consider the cross entropy of probability density functions P(x) and Q(x):

-\int P(x)\log(Q(x))dx=E_P[-\log(Q(x))]

Here, \log denotes the natural logarithm.

1) Theorem

Define 

G(s)=-\frac{1}{s}\log(E_P[Q(x)^s])

and assume that Q(x)^s is integrable with respect to P, so that E_P[Q(x)^s] is finite.

If s>0,

G(s)\leq E_P[-\log(Q(x) )]

and 

G(-s)\geq E_P[-\log(Q(x) )].

 

Equivalently,

G(s_1)\leq E_P[-\log(Q(x) )]\leq G(s_2)    eq(1)

for s_2<0 <s_1

 

Furthermore,

as s\rightarrow 0,

G(s)\rightarrow E_P[-\log(Q(x) )].

 

If P(x)=Q(x), writing H(X) for the entropy, we obtain

G(s_1)\leq H(X) \leq G(s_2)   

for s_2<0 <s_1
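
As a quick numerical illustration (a sketch only, not part of the argument), the code below checks eq(1) and the limit s\rightarrow 0, assuming NumPy and SciPy are available; the two Gaussian densities used for P and Q are an arbitrary choice.

```python
# Numerical spot-check of eq(1) and the limit s -> 0 (sketch; assumes NumPy/SciPy).
# P and Q are two arbitrary Gaussian densities chosen for illustration.
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

P = norm(0.0, 1.0).pdf          # density P(x)
Q = norm(0.5, 1.0).pdf          # density Q(x)

def cross_entropy():
    # E_P[-log Q(x)] by numerical integration
    val, _ = quad(lambda x: -P(x) * np.log(Q(x)), -20, 20)
    return val

def G(s):
    # G(s) = -(1/s) log E_P[Q(x)^s]
    val, _ = quad(lambda x: P(x) * Q(x) ** s, -20, 20)
    return -np.log(val) / s

H_PQ = cross_entropy()
print("E_P[-log Q]            :", H_PQ)
print("G(+0.5) (should be <=) :", G(0.5))
print("G(-0.5) (should be >=) :", G(-0.5))
print("G(1e-4) (limit s -> 0) :", G(1e-4))
```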

 

2) Corollaries

Corollary 1.

\frac{\Gamma(y+1)}{\Gamma(x+1)} \geq \exp\{ (y-x)(\psi(x)-1 )\}{(\frac{y}{x})}^{y+1}

for x, y>0, where \psi(x) is the digamma function.

In particular, when x is an integer n,

\Gamma(y+1) \geq {n!}\exp( (y-n)(H_{n-1}-1-\gamma) ){(\frac{y}{n})}^{y+1}

Here, H_n is the harmonic number H_n=\sum_{k=1}^{n}\frac{1}{k}.
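
A minimal numerical spot-check of Corollary 1 (a sketch, assuming SciPy's gammaln and psi; the sample points (x, y) are arbitrary):

```python
# Spot-check of Corollary 1 on the log scale (sketch; assumes SciPy).
import numpy as np
from scipy.special import gammaln, psi

def lhs_log(x, y):
    # log of Gamma(y+1)/Gamma(x+1)
    return gammaln(y + 1) - gammaln(x + 1)

def rhs_log(x, y):
    # log of exp{(y-x)(psi(x)-1)} * (y/x)^(y+1)
    return (y - x) * (psi(x) - 1.0) + (y + 1) * np.log(y / x)

for x, y in [(1.0, 2.0), (2.0, 1.0), (3.5, 0.7), (0.3, 5.0)]:
    print(x, y, lhs_log(x, y) >= rhs_log(x, y))   # expect True
```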

 

Corollary 2.

\sum_{k=0}^n {\tbinom{n}{k}}^{1+s}\geq 2^{(1+s)n} {(\frac{2}{\pi en})}^{\frac{s}{2}} \exp(-sO(n^{-1}))

for s>-1
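
The sketch below checks this bound numerically for a few n and s, assuming SciPy. Since the O(n^{-1}) term is not written out explicitly, the check uses the exact entropy of the binomial distribution in place of the asymptotic \frac{1}{2}\log(\frac{\pi en}{2})+O(n^{-1}), which is exactly the quantity that enters eq(2).

```python
# Spot-check of Corollary 2 via eq(2), using the exact binomial entropy (sketch; assumes SciPy).
import numpy as np
from scipy.special import gammaln

def check(n, s):
    k = np.arange(n + 1)
    logC = gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)   # log binomial coefficients
    pk = np.exp(logC - n * np.log(2.0))                           # Binomial(n, 1/2) pmf
    H = -np.sum(pk * np.log(pk))                                  # exact entropy in nats
    lhs = np.log(np.sum(np.exp((1 + s) * logC)))                  # log sum_k C(n,k)^(1+s)
    rhs = (1 + s) * n * np.log(2.0) - s * H                       # log of the lower bound
    return lhs >= rhs

for n in (10, 100):
    for s in (-0.5, 0.5, 2.0):
        print(n, s, check(n, s))   # expect True
```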

 

Corollary 3. 

\frac{B(s\alpha+\alpha-s, s\beta+\beta-s)}{B(\alpha, \beta)}\geq \exp(s(\alpha-1)\psi(\alpha)+s(\beta-1)\psi(\beta)-s(\alpha+\beta-2)\psi(\alpha+\beta) )

for s>-1
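
A minimal numerical check of Corollary 3 (a sketch, assuming SciPy's betaln and psi; the values of \alpha, \beta, s are arbitrary, chosen so that the Beta-function arguments stay positive):

```python
# Spot-check of Corollary 3 on the log scale (sketch; assumes SciPy).
import numpy as np
from scipy.special import betaln, psi

def check(alpha, beta, s):
    lhs = betaln(s * alpha + alpha - s, s * beta + beta - s) - betaln(alpha, beta)
    rhs = s * ((alpha - 1) * psi(alpha) + (beta - 1) * psi(beta)
               - (alpha + beta - 2) * psi(alpha + beta))
    return lhs >= rhs

for alpha, beta in [(2.0, 3.0), (0.7, 1.4), (5.0, 5.0)]:
    for s in (-0.5, 0.5, 2.0):
        print(alpha, beta, s, check(alpha, beta, s))   # expect True
```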

 

Proof of Theorem

E_P[Q(x)^{s}]=E_P[\exp(s\log(Q(x)))]

Since \exp is convex, applying Jensen's inequality we derive

E_P[\exp(s\log(Q(x)))]\geq\exp(-sE_P[-\log(Q(x))])

 

Taking the logarithm of both sides and dividing by -s, we get the result.

Note that dividing by -s reverses the inequality when s>0 and preserves it when s<0, which gives the two directions in the theorem.

\log(E_P[Q(x)^s]) equals 0 if s=0.

Since \log(E_P[Q(x)^s]) vanishes at s=0, the limit of G(s)=-\frac{1}{s}\log(E_P[Q(x)^s]) as s\rightarrow 0 equals minus the derivative of \log(E_P[Q(x)^s]) with respect to s at s=0.

Differentiating \log(E_P[Q(x)^s]) with respect to s and substituting s=0 gives E_P[\log(Q(x))], so the limit equals E_P[-\log(Q(x))].

 

From eq(1),

E_P[P(x)^s]\geq \exp(-sH(X))  for s>-1    eq(2)

Here, H(X) is the entropy.

We derive some inequalities by using eq(2).
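
As a sanity check of eq(2) itself (a sketch only, assuming SciPy), take the exponential distribution P(x)=\exp(-x) on x>0, whose entropy is H(X)=1:

```python
# Numerical illustration of eq(2) for the exponential distribution (sketch; assumes SciPy).
import numpy as np
from scipy.integrate import quad

H = 1.0                                       # entropy of Exp(1) in nats

def E_P_Ps(s):
    # E_P[P(x)^s] = int_0^inf P(x)^(1+s) dx with P(x) = exp(-x)
    val, _ = quad(lambda x: np.exp(-(1 + s) * x), 0, np.inf)
    return val

for s in (-0.5, 0.5, 2.0):
    print(s, E_P_Ps(s), ">=", np.exp(-s * H))   # eq(2): the left value should dominate
```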

 

Proof of Corollary 1

We apply eq(2) to the gamma distribution.

For x>0, with shape parameter k and scale \theta=1,

P(x)=\frac{1}{\Gamma(k)}x^{k-1}\exp(-x).

For 1+s>0, we calculate E_P[P(x)^s]:

E_P[P(x)^s]=\int_0^{\infty} dx P(x)^{s+1}

With the substitution u=(1+s)x, this becomes

\frac{{(1+s)}^{-(k-1)(1+s)-1}}{\Gamma(k)^{1+s}}\int_0^{\infty} du\, u^{(k-1)(1+s)}\exp(-u)=\frac{{(1+s)}^{-(k-1)(1+s)-1}}{\Gamma(k)^s}\frac{\Gamma( (k-1)(1+s)+1)}{\Gamma(k)}

We substitute the gamma-distribution entropy H(X)=k+\log(\Gamma(k))+(1-k)\psi(k) into eq(2):

\frac{\Gamma( (k-1)(1+s)+1)}{\Gamma(k)}\geq\exp(-sk+s(k-1)\psi(k) )(1+s)^{(k-1)(1+s)+1}  for s >-1.

Here, \psi is the digamma function.

We put x=k-1 and y+1=(k-1)(1+s)+1, so that

s=\frac{y}{x}-1.

Using \psi(k)=\psi(x+1)=\psi(x)+\frac{1}{x}, we derive

\frac{\Gamma(y+1)}{\Gamma(x+1)} \geq \exp\{ (y-x)(\psi(x)-1 )\}{(\frac{y}{x})}^{y+1}

for x,y>0.

In particular, when x is an integer n,

\Gamma(y+1) \geq {n!}\exp( (y-n)(H_{n-1}-1-\gamma) ){(\frac{y}{n})}^{y+1}

Here, H_n is the harmonic number H_n=\sum_{k=1}^{n}\frac{1}{k}.

Next, we apply eq(2) to the Rayleigh distribution P(x)=\frac{x}{\sigma^2}\exp(-\frac{x^2}{2\sigma^2}), x>0.

We substitute \sigma=\frac{1}{\sqrt{2}}, so that P(x)=2x\exp(-x^2), and transform u=(1+s)x^2:

P(x)^{1+s}dx=2^s{(1+s)}^{-\frac{s}{2}-1}u^{\frac{s}{2}}\exp(-u)du

Integrating from 0 to \infty and substituting the Rayleigh entropy H(X)=1+\frac{\gamma}{2}-\log(2) into eq(2), we can derive

2^s{(1+s)}^{-\frac{s}{2}-1}\Gamma(\frac{s}{2}+1)\geq 2^s\exp(-s-\frac{s\gamma}{2}).

We put x=\frac{s}{2}+1, then

\Gamma(x)\geq (2x-1)^{x} \exp( -(x-1)(2+\gamma) )

for x>\frac{1}{2}.
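
A quick numerical check of this last bound (a sketch, assuming SciPy's gammaln; np.euler_gamma stands for \gamma, and the sample values of x are arbitrary):

```python
# Spot-check of Gamma(x) >= (2x-1)^x exp(-(x-1)(2+gamma)) for x > 1/2 (sketch; assumes SciPy).
import numpy as np
from scipy.special import gammaln

def check(x):
    lhs = gammaln(x)                                              # log Gamma(x)
    rhs = x * np.log(2 * x - 1) - (x - 1) * (2 + np.euler_gamma)  # log of the bound
    return lhs >= rhs

for x in (0.6, 1.0, 2.5, 10.0):
    print(x, check(x))   # expect True
```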

 

Proof of Corollary 2

We apply eq(2) to the binomial distribution with p=\frac{1}{2}:

\sum_{k=0}^nP(k)^{1+s}=\sum_{k=0}^n {\tbinom{n}{k}}^{1+s}2^{-n(1+s)}

We substitute H(X)=\frac{1}{2}\log(\frac{\pi en}{2})+O(n^{-1}) into eq(2):

\sum_{k=0}^n {\tbinom{n}{k}}^{1+s}\geq 2^{(1+s)n} {(\frac{2}{\pi en})}^{\frac{s}{2}} \exp(-sO(n^{-1}))

for s>-1
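
To support the entropy substituted above, the following sketch compares the exact entropy of the binomial distribution with p=\frac{1}{2} to \frac{1}{2}\log(\frac{\pi en}{2}), assuming SciPy; the last column scales the difference by n to exhibit the O(n^{-1}) behaviour.

```python
# Exact binomial entropy vs the asymptotic (1/2)log(pi*e*n/2) (sketch; assumes SciPy).
import numpy as np
from scipy.special import gammaln

for n in (10, 100, 1000):
    k = np.arange(n + 1)
    logpk = gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1) - n * np.log(2.0)
    H_exact = -np.sum(np.exp(logpk) * logpk)           # exact entropy in nats
    H_asym = 0.5 * np.log(np.pi * np.e * n / 2.0)      # leading asymptotic term
    print(n, H_exact, H_asym, n * (H_exact - H_asym))  # difference is O(1/n)
```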

 

Proof of Corollary 3

We apply eq(2) to the beta distribution P(x)=\frac{x^{\alpha-1}{(1-x)}^{\beta-1}}{B(\alpha, \beta)}.

P(x)^{1+s}=\frac{x^{s\alpha-s+\alpha-1}{(1-x)}^{s\beta-s+\beta-1}}{B(\alpha, \beta)^{1+s}}

Integrating from 0 to 1, we get

\frac{B(s\alpha-s+\alpha, s\beta-s+\beta)}{B(\alpha, \beta)^{1+s}}

We substitute H(X)=\log(B(\alpha,\beta))-(\alpha-1)\psi(\alpha)-(\beta-1)\psi(\beta)+(\alpha+\beta-2)\psi(\alpha+\beta) into eq(2):

\frac{B(s\alpha+\alpha-s, s\beta+\beta-s)}{B(\alpha, \beta)}\geq \exp(s(\alpha-1)\psi(\alpha)+s(\beta-1)\psi(\beta)-s(\alpha+\beta-2)\psi(\alpha+\beta) )

for s>-1