趣味の研究 (Hobby Research)

I publish my hobby mathematics here. If you are new to this blog, please see the about page.

Entropy inequalities

Definition.

Let h_p be the Renyi entropy of a probability density function f(x),

h_p=\frac{1}{1-p}\log(\int_{\mathrm{R^d}}dx f(x)^p).

h=h_1 is the Shannon entropy.

 h_{\infty}=-\log\|f\|_{\infty} and \|f\|_{\infty}=\mathrm{ess} \sup f(x).

and

 N_p=\exp(\frac{2h_p}{d}).

 

We derive some properties under the assumption

\det(K_X)\|f\|_{\infty}^2\leq C,

where K_X is the covariance matrix and C is a constant.
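
As a quick numerical illustration of these definitions (a minimal sketch of my own, using a 1-D Gaussian test density with arbitrary parameters), h_p, h_{\infty} and N_p can be computed on a grid and compared with the Gaussian closed form; for a Gaussian, \det(K_X)\|f\|_{\infty}^2=\frac{1}{2\pi}, so the assumption above holds with C=\frac{1}{2\pi}.

import numpy as np

# Minimal numerical sketch (not from the original article): Renyi entropy of a
# 1-D Gaussian test density, compared with the closed form; parameters are arbitrary.
p = 2.0
sigma = 1.5
x = np.linspace(-20, 20, 40001)
dx = x[1] - x[0]
f = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

h_p = np.log(np.sum(f**p) * dx) / (1 - p)   # h_p = 1/(1-p) * log( int f^p dx )
h_inf = -np.log(f.max())                    # h_infty = -log ||f||_inf
N_p = np.exp(2 * h_p)                       # d = 1, so N_p = exp(2 h_p)

# Gaussian closed form: h_p = 0.5*log(2*pi*sigma^2) + log(p)/(2*(p-1))
h_p_exact = 0.5 * np.log(2 * np.pi * sigma**2) + np.log(p) / (2 * (p - 1))
print(h_p, h_p_exact, h_inf, N_p)
print(sigma**2 * f.max()**2, 1 / (2 * np.pi))   # det(K_X)*||f||_inf^2 = 1/(2*pi)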

 

Theorem 1 (joint entropy inequality)

Let X=(X_1,X_2, \cdot\cdot\cdot, X_d) be a random variable in \mathrm{R^d} with density f, and let p\geq 1.

Assume \|f\|_{\infty}^2\det(K_{X})\leq C, and let \lambda_l be the eigenvalues of K_X.

If \lambda_l\geq 1 for all l, then

h_p(X_1,X_2, \cdot\cdot\cdot, X_d)\geq\frac{1}{d}\sum_l h_p(X_l) -\frac{1}{2}\log(2\pi eC).

This inequality is an extension of the inequality in the discrete case,

h_p(X_1,X_2, \cdot\cdot\cdot, X_d)\geq\frac{1}{d}\sum_l h_p(X_l).

 

Theorem 2 (d-dimensional reverse EPI)

Let X_k be uncorrelated random variables in \mathrm{R^d} with densities f_k.

Let \|f_k\|_{\infty}^2\det(K_{X_k})\leq C for all k, and let

\lambda_{k;l} be the eigenvalues of K_{X_k}.

 

If d = 1, 

 \sum_{k=1}^n {N_{p}(X_k)}\geq \frac{1}{2\pi eC}N_p(\sum_{k=1}^n X_k)

for p\geq 1.

If \lambda_{k;l}\geq 1 for all k,l and d\geq 2, then

 \sum_{k=1}^n {N_{p}(X_k)}^d\geq \frac{1}{2\pi eC}N_p(\sum_{k=1}^n X_k)

for p\geq 1.

These are reverse EPIs.

 

Theorem 3 (Renyi EPI for order p < 1)

Let X_k be independent random variables in \mathrm{R^d} with densities f_k.

Let \|f_k\|_{\infty}^2\det(K_{X_k})\leq C for all k,

where K_{X_k} is the covariance matrix of X_k.

 Then,

for \frac{d}{d+2}\leq p<1

N_p(\sum_{k=1}^n X_k)\geq \frac{A_{p,d}}{eC^{\frac{1}{d}}}\sum_{k=1}^n N_{p}(X_k)

This inequality is an extension of the Renyi EPI for p<1.

where

A_{p,d}=\frac{\beta(1-p)}{\pi}{\left[\frac{\Gamma(\frac{1}{1-p}-\frac{d}{2})^{\frac{p}{1-p}}\Gamma(\frac{p}{1-p})^{\frac{1}{1-p}}}{\Gamma(\frac{1}{1-p})^{\frac{p}{1-p}}\Gamma(\frac{p}{1-p}-\frac{d}{2})^{\frac{1}{1-p}}}\right]}^{\frac{2}{d}},

\beta=\frac{1}{2p-d(1-p)}.

 

Proposition 1.

h_p\geq -\log(\|f\|_{\infty})

N_p(X)\geq N_{\infty}(X)

These inequalities hold in both the discrete and the continuous case.

We easily show these results by using f\leq\|f\|_{\infty}.
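
For example, for p>1 the computation is one line (my own filling-in of the argument; E_f denotes the value of the integral):

\int_{\mathrm{R^d}}dx f(x)^p=\int_{\mathrm{R^d}}dx f(x)f(x)^{p-1}\leq \|f\|_{\infty}^{p-1}\int_{\mathrm{R^d}}dx f(x)=\|f\|_{\infty}^{p-1},

so h_p=\frac{1}{1-p}\log(\int_{\mathrm{R^d}}dx f(x)^p)\geq \frac{p-1}{1-p}\log\|f\|_{\infty}=-\log\|f\|_{\infty}. The case p<1 is analogous, with the inequality for \int f^p reversed and the sign of 1-p reversed as well.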

 

Proposition 2 (entropy upper bound)

For a discrete or continuous d-dimensional random variable X,

\exp(2h_1(X) )\leq {(2\pi e)}^d\det(K_X).
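
For example (a standard check, not in the original text), a Gaussian X with covariance matrix K_X attains equality: h_1(X)=\frac{1}{2}\log({(2\pi e)}^d\det(K_X)), so \exp(2h_1(X) )={(2\pi e)}^d\det(K_X).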

 

Lemma 1. 

For a continuous random variable,

h_p\leq h_q if q\leq p.
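
A short sketch for 1 < q\leq p (my own filling-in; the general case is standard): write \int f^p dx=E_f[f^{p-1}], where E_f denotes expectation with respect to the density f, and apply Jensen's inequality to the concave map t\mapsto t^{\frac{q-1}{p-1}}:

E_f[f^{q-1}]\leq {(E_f[f^{p-1}])}^{\frac{q-1}{p-1}},

so h_q=\frac{1}{1-q}\log E_f[f^{q-1}]\geq \frac{1}{1-q}\frac{q-1}{p-1}\log E_f[f^{p-1}]=\frac{1}{1-p}\log E_f[f^{p-1}]=h_p.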

 

Proof of Theorem 1.

Using Proposition 2 and Lemma 1,

\sum_l \exp(2h_p(X_l) )\leq \sum_l\exp(2h_1(X_l) )\leq 2\pi e \sum_l K_{X,ll}=2\pi e \mathrm{Tr}(K_X)=2\pi e\sum_l \lambda_l.   eq(1)

By the assumption \lambda_l\geq 1 and \|f\|_{\infty}^2\det(K_{X})\leq C,

\sum_l \lambda_l\leq d\Pi_l\lambda_l = d\det(K_X)\leq dC\|f\|_{\infty}^{-2}.   eq(2)

By combining Proposition 1., eq(1) and eq(2), 

\sum_l \exp(2h_p(X_l) )\leq 2\pi eCd\exp(2h_p(X_1,X_2, \cdot\cdot\cdot, X_d) )

 Using Jensen's Inequality,

\frac{1}{d}\sum_l \exp(2h_p(X_l) )\geq \exp(\sum_l \frac{2h_p(X_l)}{d}).

We derive

\frac{1}{d}\sum_l h_p(X_l)\leq\frac{1}{2}\log(2\pi eC)+h_p(X_1,X_2, \cdot\cdot\cdot, X_d)
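
As a numerical sanity check of Theorem 1 (a sketch of my own, using the closed-form Renyi entropy of a Gaussian; the covariance matrix K and the order p below are arbitrary choices with all eigenvalues at least 1, and for a Gaussian the assumption holds with C={(2\pi)}^{-d}):

import numpy as np

p = 2.0
K = np.array([[2.0, 0.5, 0.0],
              [0.5, 3.0, 0.3],
              [0.0, 0.3, 1.5]])      # symmetric, eigenvalues all >= 1 here
d = K.shape[0]
print("eigenvalues:", np.linalg.eigvalsh(K))

# Gaussian closed form: h_p(X) = 0.5*log((2*pi)^d det K) + d*log(p)/(2*(p-1))
h_p_joint = 0.5 * np.log((2 * np.pi) ** d * np.linalg.det(K)) + d * np.log(p) / (2 * (p - 1))
h_p_marg = [0.5 * np.log(2 * np.pi * K[l, l]) + np.log(p) / (2 * (p - 1)) for l in range(d)]

# For a Gaussian, ||f||_inf^2 * det(K_X) = (2*pi)^(-d), so C = (2*pi)^(-d) works.
C = (2 * np.pi) ** (-d)
rhs = np.mean(h_p_marg) - 0.5 * np.log(2 * np.pi * np.e * C)
print(h_p_joint, ">=", rhs, h_p_joint >= rhs)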

 

Proof of Theorem 2.

For Y=\sum_k X_k, using Lemma 1 and Proposition 2,

 N_p(Y)\leq N_1(Y)\leq 2\pi e\det(K_Y)^{\frac{1}{d}}.

Using \frac{1}{d}\mathrm{Tr}A\geq {(\det A)}^{\frac{1}{d}} and the uncorrelatedness condition \mathrm{Tr}K_Y=\sum_k \mathrm{Tr}K_{X_k},

we derive

2\pi e\det(K_Y)^{\frac{1}{d}}\leq \frac{2\pi e}{d}\mathrm{Tr}K_Y=\frac{2\pi e}{d}\sum_k\mathrm{Tr}K_{X_k}=\frac{2\pi e}{d}\sum_k\sum_l\lambda_{k;l}

 

 By the assumption \lambda_{k;l}\geq 1,

\sum_l\lambda_{k;l}\leq d\Pi_l\lambda_{k;l}=d\det(K_{X_k}).

By combining Proposition 1. and assumption \|f_k\|_{\infty}^2\det(K_{X_k})\leq C,

 N_p(Y)\leq 2\pi e\sum_k\det(K_{X_k})\leq 2\pi eC\sum_kN_{\infty}(X_k)^d\leq 2\pi eC\sum_k N_p(X_k)^d.
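
A quick numerical check of Theorem 2 in the d=1 Gaussian case (my own illustration; p and the variances are arbitrary, and for a 1-D Gaussian N_p=2\pi\sigma^2 p^{\frac{1}{p-1}} and C=\frac{1}{2\pi}):

import numpy as np

p = 1.5
var = np.array([0.7, 2.0, 1.3])             # variances of independent Gaussian X_k

def N_p_gauss(v, p):
    # entropy power N_p = exp(2 h_p) of a 1-D Gaussian with variance v
    return 2 * np.pi * v * p ** (1 / (p - 1))

C = 1 / (2 * np.pi)                         # ||f_k||_inf^2 * var(X_k) = 1/(2*pi) for Gaussians
lhs = np.sum(N_p_gauss(var, p))
rhs = N_p_gauss(var.sum(), p) / (2 * np.pi * np.e * C)   # Y = sum X_k has variance sum(var)
print(lhs, ">=", rhs, lhs >= rhs)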

 

Proof of Theorem 3. 

Lemma 2.

For \frac{d}{d+2}\leq p<1,  

N_p(X)\leq A_{p,d}^{-1}\det(K_X)^{\frac{1}{d}}

 

We can derive this inequality from Proposition 1.3 in [4].

By Proposition 1.3 in [4], the density

g(x)=\frac{B_p}{{(1+\beta(1-p)({\bf x-\mu})^TK_X^{-1}({\bf x-\mu}) )}^{\frac{1}{1-p}}}

maximises the Renyi entropy among densities with covariance matrix K_X, where B_p is the normalizing constant.

Using \int_{\mathrm{R^d}}d^dx=\frac{2\pi^{\frac{d}{2}}}{\Gamma(\frac{d}{2})}\int_0^{\infty}dx\, x^{d-1} and the definition of the Beta function,

we derive

\exp(h_{p,g})=\|g\|_p^{\frac{p}{1-p}}=\det(K_X)^{\frac{1}{2}}{\left[\frac{\pi}{\beta(1-p)}\right]}^{\frac{d}{2}}\frac{\Gamma(\frac{1}{1-p})^{\frac{p}{1-p}}\Gamma(\frac{p}{1-p}-\frac{d}{2})^{\frac{1}{1-p}}}{\Gamma(\frac{1}{1-p}-\frac{d}{2})^{\frac{p}{1-p}}\Gamma(\frac{p}{1-p})^{\frac{1}{1-p}}}=A_{p,d}^{-\frac{d}{2}}\det(K_X)^{\frac{1}{2}},

where

\beta=\frac{1}{2p-d(1-p)}.

For any random variable X with covariance matrix K_X,

\exp(2h_p(X) )\leq \exp(2h_{p,g})=A_{p,d}^{-d}\det(K_X), and taking the \frac{1}{d}-th power gives N_p(X)\leq A_{p,d}^{-1}\det(K_X)^{\frac{1}{d}}.

 

Lemma 3.

Let Y=\sum_k X_k for independent random variables X_k.

N_{\infty}(Y) \geq\frac{1}{e}\sum_{k=1}^n N_{\infty}(X_k)

This Lemma is shown as Theorem 2.14. in [1].

 

By combining the assumption and Lemma 2., 

N_p(X_k)\leq A_{p,d}^{-1}\det(K_{X_k})^{\frac{1}{d}} \leq A_{p,d}^{-1}C^{\frac{1}{d}}N_{\infty}(X_k).

 

By combining Lemma 3. and Proposition 1, 

 \sum_k N_p(X_k) \leq A_{p,d}^{-1}C^{\frac{1}{d}}\sum_k N_{\infty}(X_k) \leq A_{p,d}^{-1}C^{\frac{1}{d}}eN_{\infty}(Y)\leq A_{p,d}^{-1}C^{\frac{1}{d}}e N_p(Y) .

 

References.

[1] A. Marsiglietti, V. Kostina, P. Xu, "A lower bound on the differential entropy of log-concave random vectors with applications".

[2] E. Ram, I. Sason, "On Rényi Entropy Power Inequalities".

[3] A. Marsiglietti, V. Kostina, "A lower bound on the differential entropy of log-concave random vectors with applications".

[4] O. Johnson, C. Vignat, "Some results concerning maximum Rényi entropy distributions".

 

The graph of probability bound

Here I show graphs of the probability bound.

Theorem 1

Let X \in \mathcal{R} be a continuous random variable with expected value \mu and variance \sigma^2.

Let f(x) be its probability density function.

Let m_k be m_k=\sup_{|x-\mu|\geq k\sigma} f(x).

Then, 

 Pr(|x-\mu|\geq k\sigma)\leq\alpha\, m_k\sigma

where

\alpha is the positive real root of the cubic equation

N(x)=\frac{1}{2\pi}x^3+k x^2+e k^2 x -\frac{e}{m_k\sigma}=0.
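
A minimal sketch of how the bound can be evaluated (my own example, not part of the theorem; the distribution and the value of k are arbitrary choices): for a standard normal, the positive real root of N(x) can be found with numpy.roots.

import numpy as np
from scipy import stats

k = 3.0
sigma = 1.0
m_k = stats.norm.pdf(k)                       # sup of f over |x - mu| >= k*sigma for the standard normal
coeffs = [1 / (2 * np.pi), k, np.e * k**2, -np.e / (m_k * sigma)]
roots = np.roots(coeffs)
alpha = roots.real.max()                      # the positive real root has the largest real part here
print("new bound:", alpha * m_k * sigma, " Chebyshev:", 1 / k**2)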

 

Corollary 1.

Let X \in \mathcal{Z} be a discrete random variable with expected value \mu and variance \sigma^2.

Let p(x) be its probability mass function.

Let m_k be m_k=\sup_{\{x\leq\lceil {\mu-k\sigma} \rceil\}\cup\{x\geq\lfloor {\mu+k\sigma} \rfloor\}} p(x).

Then, 

 Pr(|x-\mu|\geq k\sigma)\leq\alpha\, m_k r\sigma+p(\lceil\mu-k\sigma\rceil)

where

\alpha is the positive real root of the cubic equation

N(x)=\frac{1}{2\pi}x^3+\frac{k}{r} x^2+e{(\frac{k}{r})}^2 x -\frac{e}{m_kr\sigma}=0.

r=\sqrt{1+\frac{1}{12\sigma^2}}.

 

For the proof, please see the article of April 11th.

The example of normal distribution.

f:id:hobbymath:20180414214851p:plain

The example of Poisson distribution.

f:id:hobbymath:20180414221908p:plain

"Probability" is the result of actual normal distribution, "Chebyshev" is Chebyshev inequality, and "New bound" is the result of Theorem 1. and Corollary 1.

 

The Python code is:

 

# -*- coding: utf-8 -*-
"""
Created on Tue Apr 10 22:09:53 2018

@author: HobbyMath
"""

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import sympy

k_max = 50
sigma = np.sqrt(3)
mu = 3
delta = 0.1
Pr = np.ones(k_max, dtype=float)
chebyshev = np.ones(k_max, dtype=float)
bound = np.ones(k_max, dtype=float)

t = sympy.Symbol('t')
for i in range(1, k_max):
    print(i)
    k = i * delta + 1
    pos = k * sigma + mu
    pos2 = -k * sigma + mu

    pos_int = np.floor(k * sigma + mu) + 1
    pos2_int = np.floor(-k * sigma + mu)

    # ######### normal distribution ######
    # m = stats.norm.pdf(x=pos, loc=mu, scale=sigma)
    # Pr[i] = 1 - stats.norm.cdf(x=pos, loc=mu, scale=sigma) + stats.norm.cdf(x=pos2, loc=mu, scale=sigma)

    # ######### Poisson distribution ######
    m = max(stats.poisson.pmf(pos_int, mu), stats.poisson.pmf(pos2_int - 1, mu))
    Pr[i] = 1 - stats.poisson.cdf(pos, mu) + stats.poisson.cdf(pos2, mu)

    # ######### binomial distribution ######
    # n = 10
    # p = 0.5
    # sigma = np.sqrt(n * p * (1 - p))
    # mu = n * p
    # m = max(stats.binom.pmf(pos_int, n, p), stats.binom.pmf(pos2_int - 1, n, p))
    # Pr[i] = 1 - stats.binom.cdf(pos, n, p) + stats.binom.cdf(pos2, n, p)

    chebyshev[i] = 1 / k**2
    r = np.sqrt(1 + 1 / (12 * sigma**2))
    # r = 1
    # cubic equation of Corollary 1; the bound uses its positive real root,
    # which is the root with the largest real part
    eq = 0.5 / np.pi * t**3 + k * t**2 / r + np.e * k**2 / r**2 * t - np.e / (m * r * sigma)
    tmp = np.array(sympy.solve(eq), dtype=complex)
    alpha = tmp.real.max()
    # bound[i] = alpha * m * r * sigma                                    # continuous case (Theorem 1)
    bound[i] = alpha * m * r * sigma + stats.poisson.pmf(pos2_int, mu)    # discrete case (Corollary 1)


plt.title("Poisson distribution (mu=3, sigma=sqrt(3))")
plt.ylabel("probability")
plt.xlabel("k")
# plt.yscale('log')

prange = np.arange(1, k_max)
x = prange.astype(np.float64) * delta + 1.0

p1 = plt.plot(x, Pr[prange])
p2 = plt.plot(x, chebyshev[prange])

p3 = plt.plot(x, bound[prange])

plt.legend((p1[0], p2[0], p3[0]), ("Probability", "Chebyshev", "New bound"), loc=5)
plt.show()

 

Summary: Renyi entropy and moments

The motivation for this study was the observation that the exponential of the Renyi entropy, \exp(h_p(X) ), and the q-th root of the q-th moment, E[|X|^q]^{\frac{1}{q}}, all transform in the same way under the scaling X\rightarrow sX.

This led me to the following conjecture: there exist constants such that

c_{p,q}\exp(h_p(X) )\leq E[|X|^q]^{\frac{1}{q}} \leq C_{p,q}\exp(h_p(X) ).

The left inequality appears to hold for general continuous random variables. The right inequality does not hold in general.

The Renyi entropy with p=\infty is connected to the supremum of the probability density function.

For p=1 it reduces to the Shannon entropy.

In particular, I showed that both inequalities hold if the probability density function is assumed to be log-concave.


Using the left inequality with p=1, q=1,2, I derived an upper bound that is tighter than the Chebyshev inequality, and using the two-sided inequality with q=2, I derived an entropy power inequality for the Renyi entropy.

The extension of the Chebyshev inequality (continued)

For the normal distribution, I compare the inequality derived in a previous article with the log-concave version derived this time.

 

f:id:hobbymath:20180331223550p:plain

"1-CDF"が実際の正規分布の値、"New bound1"がr=3で計算した値、"New bound2(log-concavity)"がlog凹関数版の不等式でr=1で計算した値です。

log凹関数版の不等式は、kが大きいところでは、rの値を0に近づけた方が近似精度が高くなります。

一方で、kが小さいところでの精度は悪化します。

計算に用いたpythonのコードです。

 

# -*- coding: utf-8 -*-
"""
Created on Fri Mar 30 22:52:40 2018
@author: HobbyMath
"""
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

k_max = 50
sigma = 1
mu = 1
delta = 0.1
cdf = np.ones(k_max, dtype=float)
chebyshev = np.ones(k_max, dtype=float)
bound = np.ones(k_max, dtype=float)
bound2 = np.ones(k_max, dtype=float)
bound3 = np.ones(k_max, dtype=float)

for i in range(1, k_max):
    print(i)
    k = i * delta + 1
    # ######### normal distribution ######

    pos = k * sigma + mu
    m = stats.norm.pdf(x=pos, loc=mu, scale=sigma)
    cdf[i] = stats.norm.cdf(x=pos, loc=mu, scale=sigma)

    chebyshev[i] = 1 / k**2

    # previous bound, computed with r = 3
    r = 3
    bound[i] = (2 * m * sigma * k**3 / (2 * r - 1)) ** (1 / (r + 1)) / k**2

    # r = 0.5
    # bound2[i] = (2 * m * sigma * k**3 / (k**2 * 0.5 + 2 * r - 1)) ** (1 / (r + 1)) / k**2

    # log-concave version, computed with r = 1 (log(f(mu)/m(k)) = k^2 / 2 for the normal)
    r = 1
    bound2[i] = (2 * m * sigma * k**3 / (k**2 * 0.5 + 2 * r - 1)) ** (1 / (r + 1)) / k**2
    # bound[i] = (m * sigma * k**3 / (k + 2 * r - 1)) ** (1 / (r + 1)) / k**2


plt.title("normal distribution(mu=1, sigma=1)")
plt.ylabel("probability")
plt.xlabel("k")
# plt.yscale('log')

prange = np.arange(1, k_max)
x = prange.astype(np.float64) * delta + 1.0

p1 = plt.plot(x, 2 * (1 - cdf[prange]))
# p2 = plt.plot(x, chebyshev[prange])

p2 = plt.plot(x, bound[prange])
p3 = plt.plot(x, bound2[prange])
# p4 = plt.plot(x, bound3[prange])
plt.legend((p1[0], p2[0], p3[0]), ("1-CDF", "New bound1(r=3)", "New bound2(log-concavity r=1)"), loc=5)
plt.show()
 

The extension of the Chebyshev inequality (continued)

Theorem

Let X \in \mathcal{R} be a random variable with expected value \mu and variance \sigma^2.

Let f(x) be its probability density function and assume that f is log-concave.

Assume that f(x)\leq f(\mu) holds for all |x-\mu|\geq k\sigma,

and that f(x)x\rightarrow 0 as |x|\rightarrow \infty.

Define m(k)=\max(f(\mu+k\sigma), f(\mu-k\sigma) ).

Then, for all r\geq \frac{1}{2},

 Pr(|x-\mu|\geq k\sigma)\leq \frac{1}{k^2}{[\frac{\sigma k^3 \{f(\mu+k\sigma) + f(\mu-k\sigma)\}}{\log(\frac{f(\mu)}{m(k)}) + 2r-1}]}^{\frac{1}{r+1}}

holds.

For 0\leq r\leq \frac{1}{2}, the same inequality holds for every k with \log(\frac{f(\mu)}{m(k)}) + 2r-1\geq 0.

 

In particular, for r=0, r=\frac{1}{2} and r=1 we obtain, respectively,

 Pr(|x-\mu|\geq k\sigma)\leq \frac{k\sigma \{f(\mu+k\sigma) + f(\mu-k\sigma)\}}{\log(\frac{f(\mu)}{em(k)})}.

 Pr(|x-\mu|\geq k\sigma)\leq [{\frac{\sigma \{f(\mu+k\sigma) + f(\mu-k\sigma)\}}{\log(\frac{f(\mu)}{m(k)})}]}^{\frac{2}{3}}.

 Pr(|x-\mu|\geq k\sigma)\leq [{\frac{\sigma \{f(\mu+k\sigma) + f(\mu-k\sigma)\}}{k\log(\frac{ef(\mu)}{m(k)})}]}^{\frac{1}{2}}.

  

Proof

By the assumption of log-concavity, the inequality f(x+\lambda(y-x) )\geq f(x){(\frac{f(y)}{f(x)})}^{\lambda}

holds for 0\leq \lambda\leq 1.

Subtracting f(x) from both sides, dividing by \lambda and taking the limit \lambda\rightarrow 0, we obtain

(y-x)\frac{df(x)}{dx}\geq f(x)(\log(f(y) )-\log(f(x) ) ).

Setting y=\mu gives

(\mu-x)\frac{df(x)}{dx}\geq f(x)(\log(f(\mu) )-\log(f(x) ) ).           eq(1)

By this inequality, \frac{df(x)}{dx}\leq 0 holds for every x with f(x)\leq f(\mu) and x-\mu\geq 0.

That is, f(x) is monotonically decreasing on x-\mu\geq 0.

Similarly, f(x) is monotonically increasing on x-\mu\leq 0.

Therefore \sup_{|x-\mu|\geq k\sigma} f(x)=\max(f(\mu+k\sigma), f(\mu-k\sigma) ) holds.

Without loss of generality we take \mu=0 below (all quantities involved are translation invariant), so that the reflected tail density is written f(-x). Multiplying eq(1) by |x-\mu|^{-2r}, integrating over |x-\mu|\geq k\sigma and integrating the left-hand side by parts gives

\int_{|x-\mu|\geq k\sigma}dx (\mu-x)|x-\mu|^{-2r}\frac{df(x)}{dx}=(k\sigma)^{1-2r}\{f(\mu+k\sigma) + f(\mu-k\sigma)\}+(1-2r)\int_{x-\mu\geq k\sigma}dx (x-\mu)^{-2r} \{f(x) + f(-x) \}.

For the right-hand side, we obtain

\int_{|x-\mu|\geq k\sigma}dx |x-\mu|^{-2r}f(x)(\log(f(\mu) )-\log(f(x) ) )\geq (\log(f(\mu) )-\log(m(k) ) )\int_{x-\mu\geq k\sigma}dx (f(x) + f(-x) ){(x-\mu)}^{-2r}.

Combining these,

(k\sigma)^{1-2r}\{f(\mu+k\sigma) + f(\mu-k\sigma)\} \geq (\log(f(\mu) )-\log(m(k) )+2r-1)Pr_k\int_{x-\mu\geq k\sigma}dx \frac{f(x) + f(-x)}{Pr_k}{(x-\mu)}^{-2r},   eq(2)

where we have put

Pr_k=Pr(|x-\mu|\geq k\sigma).

By the definition of Pr_k, \int_{x-\mu\geq k\sigma}dx\,\frac{f(x) + f(-x)}{Pr_k}=1, and,

\frac{1}{x^r} is convex for x>0, and \int_{x-\mu \geq k\sigma }dx (x-\mu)^2 \frac{f(x)+f(-x)}{Pr_k}\geq {(k\sigma)}^2

holds.

Applying Jensen's inequality to the term on the right-hand side of eq(2), we obtain

\int_{x-\mu\geq k\sigma}dx\frac{f(x)+f(-x)}{Pr_k}\frac{1}{(x-\mu)^{2r}}\geq\frac{1}{s_k^r},

where

s_k=\int_{x-\mu\geq k\sigma}dx \frac{f(x)+f(-x)}{Pr_k}(x-\mu)^2, and

s_k=\frac{1}{Pr_k}\int_{x-\mu\geq k\sigma}dx (f(x) + f(-x) )(x-\mu)^2 =\frac{1}{Pr_k}\int_{|x-\mu|\geq k\sigma}dx f(x){(x-\mu)}^2 \leq\frac{\sigma^2}{Pr_k}

holds.

Putting these together, eq(2) becomes

(k\sigma)^{1-2r}\{f(\mu+k\sigma) + f(\mu-k\sigma)\} \geq (\log(f(\mu) )-\log(m(k) )+2r-1) \frac{Pr_k^{r+1}}{\sigma^{2r}},

and if \log(\frac{f(\mu)}{m(k)})+2r-1\geq 0 holds, rearranging this inequality gives

 Pr(|x-\mu|\geq k\sigma)\leq \frac{1}{k^2}{(\frac{\sigma k^3 \{f(\mu+k\sigma) + f(\mu-k\sigma)\}}{\log(\frac{f(\mu)}{m(k)})+2r-1})}^{\frac{1}{r+1}},

 which is the desired inequality.

 

The extension of Chebyshev inequality 2

Theorem

Let X \in \mathcal{R} be a log-concave random variable with expected value \mu and variance \sigma^2.

Let f(x) be its probability density function and let f(x)\leq f(\mu) hold for all |x-\mu|\geq k\sigma.

Assume f(x)x\rightarrow 0 as |x|\rightarrow \infty.

Let m(k)=\max(f(\mu+k\sigma), f(\mu-k\sigma) ).

Then, for any r\geq \frac{1}{2},

 Pr(|x-\mu|\geq k\sigma)\leq \frac{1}{k^2}{[\frac{\sigma k^3 \{f(\mu+k\sigma) + f(\mu-k\sigma)\}}{\log(\frac{f(\mu)}{m(k)}) + 2r-1}]}^{\frac{1}{r+1}}.

For 0\leq r\leq \frac{1}{2}, if k satisfies the condition \log(\frac{f(\mu)}{m(k)}) + 2r-1\geq 0, the same inequality holds.

Simple examples are r=0, r=\frac{1}{2} and r=1.

We have

 Pr(|x-\mu|\geq k\sigma)\leq \frac{k\sigma \{f(\mu+k\sigma) + f(\mu-k\sigma)\}}{\log(\frac{f(\mu)}{em(k)})}.

 Pr(|x-\mu|\geq k\sigma)\leq [{\frac{\sigma \{f(\mu+k\sigma) + f(\mu-k\sigma)\}}{\log(\frac{f(\mu)}{m(k)})}]}^{\frac{2}{3}}.

 Pr(|x-\mu|\geq k\sigma)\leq [{\frac{\sigma \{f(\mu+k\sigma) + f(\mu-k\sigma)\}}{k\log(\frac{ef(\mu)}{m(k)})}]}^{\frac{1}{2}}.
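
As a quick numerical illustration (my own sketch, not part of the theorem), these three special cases can be evaluated for the standard normal, for which \log(\frac{f(\mu)}{m(k)})=\frac{k^2}{2}; k=3 is an arbitrary choice, and the r=0 case needs k\geq\sqrt{2}.

import numpy as np
from scipy import stats

k = 3.0
fmu = stats.norm.pdf(0.0)                  # f(mu) for the standard normal
mk = stats.norm.pdf(k)                     # m(k) = max(f(mu+k), f(mu-k)) = f(k)
tails = 2 * stats.norm.pdf(k)              # f(mu + k sigma) + f(mu - k sigma)
L = np.log(fmu / mk)                       # = k^2 / 2

b_r0 = k * tails / (L - 1)                 # r = 0, requires L >= 1, i.e. k >= sqrt(2)
b_r05 = (tails / L) ** (2 / 3)             # r = 1/2
b_r1 = (tails / (k * (L + 1))) ** 0.5      # r = 1
print(b_r0, b_r05, b_r1, 2 * (1 - stats.norm.cdf(k)))   # bounds vs actual tail probability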

  

Proof of Theorem 

By the assumption of log-concavity, the inequality f(x+\lambda(y-x) )\geq f(x){(\frac{f(y)}{f(x)})}^{\lambda} holds for 0\leq \lambda\leq 1.

Subtracting f(x) from both sides, dividing by \lambda, and taking the limit \lambda\rightarrow 0, we have

(y-x)\frac{df(x)}{dx}\geq f(x)(\log(f(y) )-\log(f(x) ) ).

We put y=\mu, then we have

(\mu-x)\frac{df(x)}{dx}\geq f(x)(\log(f(\mu) )-\log(f(x) ) ).           eq(1)

By this inequality, \frac{df(x)}{dx}\leq 0 holds for x which satisfies f(x)\leq f(\mu) and x-\mu\geq 0. So f(x) monotonically decreases on x-\mu\geq 0.

In the same way, we find f(x) monotonically increases on x-\mu\leq 0.

Then, we have \sup_{|x-\mu|\geq k\sigma} f(x)=\max(f(\mu+k\sigma), f(\mu-k\sigma) ).

Without loss of generality we take \mu=0 below (all quantities involved are translation invariant), so that the reflected tail density is written f(-x). We multiply eq(1) by |x-\mu|^{-2r}, integrate over |x-\mu|\geq k\sigma, and apply integration by parts to the LHS:

\int_{|x-\mu|\geq k\sigma}dx (\mu-x)|x-\mu|^{-2r}\frac{df(x)}{dx}=(k\sigma)^{1-2r}\{f(\mu+k\sigma) + f(\mu-k\sigma)\}+(1-2r)\int_{x-\mu\geq k\sigma}dx (x-\mu)^{-2r} \{f(x) + f(-x) \}.

For the RHS, we have

\int_{|x-\mu|\geq k\sigma}dx |x-\mu|^{-2r}f(x)(\log(f(\mu) )-\log(f(x) ) )\geq (\log(f(\mu) )-\log(m(k) ) )\int_{x-\mu\geq k\sigma}dx (f(x) + f(-x) ){(x-\mu)}^{-2r}.

From these results, we have

(k\sigma)^{1-2r}\{f(\mu+k\sigma) + f(\mu-k\sigma)\} \geq (\log(f(\mu) )-\log(m(k) )+2r-1)Pr_k\int_{x-\mu\geq k\sigma}dx \frac{f(x) + f(-x)}{Pr_k}{(x-\mu)}^{-2r}.   eq(2)

Here, we put Pr_k=Pr(|x-\mu|\geq k\sigma).

By the definition of Pr_k, \int_{x-\mu\geq k\sigma}dx\,\frac{f(x) + f(-x)}{Pr_k}=1 , and

\frac{1}{x^r} is a convex function for x>0, and \int_{x-\mu \geq k\sigma }dx (x-\mu)^2 \frac{f(x)+f(-x)}{Pr_k}\geq {(k\sigma)}^2.

 

We can apply Jensen's inequality to the term on the RHS of eq(2):

\int_{x-\mu\geq k\sigma}dx\frac{f(x)+f(-x)}{Pr_k}\frac{1}{(x-\mu)^{2r}}\geq\frac{1}{s_k^r}.

We define s_k=\int_{x-\mu\geq k\sigma}dx \frac{f(x)+f(-x)}{Pr_k}(x-\mu)^2.

s_k satisfies

s_k=\frac{1}{Pr_k}\int_{x-\mu\geq k\sigma}dx (f(x) + f(-x) )(x-\mu)^2 =\frac{1}{Pr_k}\int_{|x-\mu|\geq k\sigma}dx f(x){(x-\mu)}^2 \leq\frac{\sigma^2}{Pr_k}.

Then we have

(k\sigma)^{1-2r}\{f(\mu+k\sigma) + f(\mu-k\sigma)\} \geq (\log(f(\mu) )-\log(m(k) )+2r-1) \frac{Pr_k^{r+1}}{\sigma^{2r}}.

If the condition \log(\frac{f(\mu)}{m(k)}) + 2r-1\geq 0 holds, rearranging this inequality, we have

 Pr(|x-\mu|\geq k\sigma)\leq \frac{1}{k^2}{(\frac{\sigma k^3 \{f(\mu+k\sigma) + f(\mu-k\sigma)\}}{\log(\frac{f(\mu)}{m(k)})+2r-1})}^{\frac{1}{r+1}}.