Liu YL,Xu XG,Guo YW et al.Pores-preserving face cleaning based on improved empirical mode decomposition.JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 24(3):557-567 May 2009 Pores-Preserving Face Cleaning Based on Improved Empirical Mode Decomposition Yan-Li Liul,2(刘艳丽),Xiao-Gang Xu(徐晓刚)2,Yam-Wen Guo3(郭延文),Jin Wang2(王进) Xin Duan!(段鑫),Xi Chen1(陈曦),and Qun-Sheng Peng1,2(彭群生),Senior Member,CCF State Key Lab of CAD CG,Zhejiang University,Hangzhou 310027,China 2Department of Mathematics,Zhejiang University,Hangzhou 310027,China 2Department of Equipment Automatization,Dalian Naval Academy,Dalian 116026,China 3State Key Lab of Novel Software Technology,Nanjing University,Nanjing 210000,China E-mail:liuyanli@cad.zju.edu.cn:ywguo@nju.edu.cn:jwang@cad.zju.edu.cn Received May 29,2008;revised February 28,2009 Abstract In this paper,we propose a novel method of cleaning up facial imperfections such as bumps and blemishes that may detract from a pleasing digital portrait.Contrasting with traditional methods which tend to blur facial details, our method fully retains fine scale skin textures (pores etc.)of the subject.Our key idea is to find a quantity,namely normalized local energy,to capture different characteristics of fine scale details and distractions,based on empirical mode decomposition,and then build a quantitative measurement of facial skin appearance which characterizes both imperfections and facial details in a unified framework.Finally,we use the quantitative measurement as a guide to enhance facial skin.We also introduce a few high-level,intuitive parameters for controlling the amount of enhancement.In addition,an adaptive local mean and neighborhood limited empirical mode decomposition algorithm is also developed to improve in two respects the performance of empirical mode decomposition.It can effectively avoid the gray spots effect commonly associated with traditional empirical mode decomposition when dealing with high-nonstationary images. Keywords image enhancement,empirical mode decomposition,normalized local energy Introduction to preserve the former and remove the latter make face cleaning a challenging problem.For approaches to Facial skin not only exhibits various fine scale de- eliminating distractions with denoising techniques-3, tails,e.g.,pores,but also sometimes exhibits distrac- which are widely adopted by commercial software,fa- tions such as blemishes,bumps,acne scarring etc.From cial details such as pores are often accounted as noise many existing photo enhancement softwares that can and thereby removed. be used to remove facial distractions,a common char- In this paper,based on empirical mode decomposi- acteristic is observed:the resulting face is often over- tion](EMD),we set a quantity,i.e.,normalized local smoothed,of which the natural pores disappear com- energy (NLE),to reveal the characteristics of fine scale pletely.However,human skin is perception sensitive, and distractions.Building upon NLE,we further de- in which fine scale texture accounts for an important velop a quantitative characterization of facial skin ap- part in overall appearance.Over-smoothed face not pearance,namely imperfect degree,to depict the visual only looks unnatural but also dissolves the individua- perception of facial skin.Finally,we employ imperfect lity of a person.In this paper,we focus our attention degree as a guide to enhance facial skin.In addition on retaining fine skin texture features,e.g.,pores struc- we also propose an algorithm called adaptive local mean ture of the skin,while simultaneously removing facial and neighborhood limited empirical mode decomposi- imperfections such as scars,bumps and blemishes. tion (ALNEMD)to reduce the gray spots effect com- Since the pores and some distractions of imperfect monly associated with traditional EMD when dealing faces all belong to high frequency component of images with high nonstationary images. Regular Paper This work is supported by the National Natural Science Foundation of China under Grant Nos.60403038 and 60703084,the Natural Science Foundation of Jiangsu Province under Grant No.BK2007571,and the Natural Science Foundation of Liaoning Province under Grant No.20082176
Liu YL, Xu XG, Guo YW et al. Pores-preserving face cleaning based on improved empirical mode decomposition. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 24(3): 557–567 May 2009 Pores-Preserving Face Cleaning Based on Improved Empirical Mode Decomposition Yan-Li Liu1,2 (刘艳丽), Xiao-Gang Xu (徐晓刚) 2 , Yan-Wen Guo3 (郭延文), Jin Wang1 (王 进) Xin Duan1 (段 鑫), Xi Chen1 (陈 曦), and Qun-Sheng Peng1,2 (彭群生), Senior Member, CCF 1State Key Lab of CAD & CG, Zhejiang University, Hangzhou 310027, China 2Department of Mathematics, Zhejiang University, Hangzhou 310027, China 2Department of Equipment Automatization, Dalian Naval Academy, Dalian 116026, China 3State Key Lab of Novel Software Technology, Nanjing University, Nanjing 210000, China E-mail: liuyanli@cad.zju.edu.cn; ywguo@nju.edu.cn; jwang@cad.zju.edu.cn Received May 29, 2008; revised February 28, 2009. Abstract In this paper, we propose a novel method of cleaning up facial imperfections such as bumps and blemishes that may detract from a pleasing digital portrait. Contrasting with traditional methods which tend to blur facial details, our method fully retains fine scale skin textures (pores etc.) of the subject. Our key idea is to find a quantity, namely normalized local energy, to capture different characteristics of fine scale details and distractions, based on empirical mode decomposition, and then build a quantitative measurement of facial skin appearance which characterizes both imperfections and facial details in a unified framework. Finally, we use the quantitative measurement as a guide to enhance facial skin. We also introduce a few high-level, intuitive parameters for controlling the amount of enhancement. In addition, an adaptive local mean and neighborhood limited empirical mode decomposition algorithm is also developed to improve in two respects the performance of empirical mode decomposition. It can effectively avoid the gray spots effect commonly associated with traditional empirical mode decomposition when dealing with high-nonstationary images. Keywords image enhancement, empirical mode decomposition, normalized local energy 1 Introduction Facial skin not only exhibits various fine scale details, e.g., pores, but also sometimes exhibits distractions such as blemishes, bumps, acne scarring etc. From many existing photo enhancement softwares that can be used to remove facial distractions, a common characteristic is observed: the resulting face is often oversmoothed, of which the natural pores disappear completely. However, human skin is perception sensitive, in which fine scale texture accounts for an important part in overall appearance. Over-smoothed face not only looks unnatural but also dissolves the individuality of a person. In this paper, we focus our attention on retaining fine skin texture features, e.g., pores structure of the skin, while simultaneously removing facial imperfections such as scars, bumps and blemishes. Since the pores and some distractions of imperfect faces all belong to high frequency component of images, to preserve the former and remove the latter make face cleaning a challenging problem. For approaches to eliminating distractions with denoising techniques[1−3] , which are widely adopted by commercial software, facial details such as pores are often accounted as noise and thereby removed. In this paper, based on empirical mode decomposition[4] (EMD), we set a quantity, i.e., normalized local energy (NLE), to reveal the characteristics of fine scale and distractions. Building upon NLE, we further develop a quantitative characterization of facial skin appearance, namely imperfect degree, to depict the visual perception of facial skin. Finally, we employ imperfect degree as a guide to enhance facial skin. In addition, we also propose an algorithm called adaptive local mean and neighborhood limited empirical mode decomposition (ALNEMD) to reduce the gray spots effect commonly associated with traditional EMD when dealing with high nonstationary images. Regular Paper This work is supported by the National Natural Science Foundation of China under Grant Nos. 60403038 and 60703084, the Natural Science Foundation of Jiangsu Province under Grant No. BK2007571, and the Natural Science Foundation of Liaoning Province under Grant No. 20082176
558 J.Comput.Sci.Technol.,May 2009,Vol.24.No.3 In the process of facial image cleaning,our algorithm In the field of face recognition,Lin et al.o]used only requests the user to tune a few high level.intuitive scale-invariant feature transform (SIFT)framework to parameters to interactively control the amount of en- detect irregular skin details.Later,Pierrard et al.] hancement.The advantage of our technique is that employed 3D morphable model reconstruction to rec- while effectively removing facial imperfections,it does ognize facial moles.Their techniques are effective in not blur fine scale facial details.Another important detecting moles with relatively fixed pattern,for exam- feature of our technique is:it is a general model which ple,circular shape.However,as these methods detect characterizes both imperfections and facial details in a and localize facial moles in spatial domain,they are unified framework,thereby it does not require user to not suitable for detecting and localizing general skin ir- interactively mark imperfections on facial images. regularities,due to a large number of possible spatial The rest of this paper is structured as follows.Sec- variations of distractions.Instead of explicitly detect- tion 2 reviews some related work about face cleaning ing and localizing distractions,our new approach sug- and EMD.Section 3 describes our improved EMD al- gests a quantitative characterization of both pores and gorithm.The algorithm of face cleaning is introduced distractions,and uses it to enhance facial skin. in detail in Section 4.Experimental results are demon- Among the current image filtering techniques,bilat- strated in Section 5.Finally.Section 6 concludes the eral filteringlil is one of the most powerful filters and whole paper and highlights future work. has been used in various fields.For the typical case in which the spatial and intensity weighting functions are 21 Related Work and Background Gaussian,there are two important parameters,namely geometric spread and photometric spread.While small 2.1 Related Work geometric spread corresponds to filtering small intensity Face Cleaning.The problem of facial image editing changes,large photometric spread would preserve edges has received much attention both from the computer with large discontinuity.For face cleaning,of which the graphics and image processing communities.As a re- aim is to smooth large discontinuity and preserve small sult,many techniques have been developed to achieve scale texture,the settings for the two parameters are various facial appearance effects.Leyvand et al.5]pro- difficult:when the photometric spread is set to a very posed an approach to enhanceing the facial attractive- large value in order to smooth imperfections,the bilate- ness by adjusting facial features.Nguyen et al.6]pre- ral filter nearly degenerates into a Gaussian filter.In sented a layer extraction method to remove and syn- such a case,the smoothing effect of bilateral filter only thesize beard.Using reflectance transfer,Peers et al7 relies on the geometric spread.Apparently,the geo- produced face relighting in the post-production process. metric spread cannot be set to a large value,otherwise Most recently,Bitouks introduced an algorithm that the pores of skin will be removed.On the other hand, can automatically replace faces in photographs.More- the edges of bumps or scars in the skin are left aside if over.the algorithm also allows the user to interactively it is set to a small value. edit the illumination and colors of face. To the best of our knowledge,the most related work However,little research work has been published on to ours is developed by Matsui et al.2.This work also face cleaning in literature.Nevertheless,in practice, aims at removing spots and at the same time preserv- several methods have been employed to clean or en- ing skin natural roughness.The method uses e-filters to hance facial images.While cleaning face with denois- decompose image into several different frequency com- ing techniques is popular,an alternative approach to ponents.It assumes spots are of medium frequency serving this problem is the so-called interactive cut- and pores of high frequency,and then discards medium and-paste method,of which poison image editing] frequency component and retains the high frequency and healing brush in Adobe Photoshop are well-known component.The major drawback of this approach is: This approach repairs imperfections on the facial im- since spots may exist also in low frequency and high ages by seamless cloning,with little effects on preserv frequency components,it cannot remove spots com- ing skin details.Moreover,for facial images with many pletely.Moreover,the discarded layers are determined imperfections,it would incur a lot of user interactions by several parameters,which are not intuitive,hence to mark out source and destination areas.Since a great requiring heavy user interactions.The users also need deal of skill is demanded to achieve satisfactory results to use unsharp mask to avoid image blurring,which it is a suitable tool only for trained designers.Our goal might magnify image noise. is to develop a technique with a few high level,intuitive EMD.Traditional energy-frequency analysis parameters that can be easily adopted by naive users. are based on Fourier transformation and wavelet
558 J. Comput. Sci. & Technol., May 2009, Vol.24, No.3 In the process of facial image cleaning, our algorithm only requests the user to tune a few high level, intuitive parameters to interactively control the amount of enhancement. The advantage of our technique is that while effectively removing facial imperfections, it does not blur fine scale facial details. Another important feature of our technique is: it is a general model which characterizes both imperfections and facial details in a unified framework, thereby it does not require user to interactively mark imperfections on facial images. The rest of this paper is structured as follows. Section 2 reviews some related work about face cleaning and EMD. Section 3 describes our improved EMD algorithm. The algorithm of face cleaning is introduced in detail in Section 4. Experimental results are demonstrated in Section 5. Finally, Section 6 concludes the whole paper and highlights future work. 2 Related Work and Background 2.1 Related Work Face Cleaning. The problem of facial image editing has received much attention both from the computer graphics and image processing communities. As a result, many techniques have been developed to achieve various facial appearance effects. Leyvand et al.[5] proposed an approach to enhanceing the facial attractiveness by adjusting facial features. Nguyen et al. [6] presented a layer extraction method to remove and synthesize beard. Using reflectance transfer, Peers et al. [7] produced face relighting in the post-production process. Most recently, Bitouk[8] introduced an algorithm that can automatically replace faces in photographs. Moreover, the algorithm also allows the user to interactively edit the illumination and colors of face. However, little research work has been published on face cleaning in literature. Nevertheless, in practice, several methods have been employed to clean or enhance facial images. While cleaning face with denoising techniques is popular, an alternative approach to serving this problem is the so-called interactive cutand-paste method, of which poison image editing[9] and healing brush in Adobe Photoshop are well-known. This approach repairs imperfections on the facial images by seamless cloning, with little effects on preserving skin details. Moreover, for facial images with many imperfections, it would incur a lot of user interactions to mark out source and destination areas. Since a great deal of skill is demanded to achieve satisfactory results, it is a suitable tool only for trained designers. Our goal is to develop a technique with a few high level, intuitive parameters that can be easily adopted by naive users. In the field of face recognition, Lin et al.[10] used scale-invariant feature transform (SIFT) framework to detect irregular skin details. Later, Pierrard et al.[11] employed 3D morphable model reconstruction to recognize facial moles. Their techniques are effective in detecting moles with relatively fixed pattern, for example, circular shape. However, as these methods detect and localize facial moles in spatial domain, they are not suitable for detecting and localizing general skin irregularities, due to a large number of possible spatial variations of distractions. Instead of explicitly detecting and localizing distractions, our new approach suggests a quantitative characterization of both pores and distractions, and uses it to enhance facial skin. Among the current image filtering techniques, bilateral filtering[1] is one of the most powerful filters and has been used in various fields. For the typical case in which the spatial and intensity weighting functions are Gaussian, there are two important parameters, namely geometric spread and photometric spread. While small geometric spread corresponds to filtering small intensity changes, large photometric spread would preserve edges with large discontinuity. For face cleaning, of which the aim is to smooth large discontinuity and preserve small scale texture, the settings for the two parameters are difficult: when the photometric spread is set to a very large value in order to smooth imperfections, the bilateral filter nearly degenerates into a Gaussian filter. In such a case, the smoothing effect of bilateral filter only relies on the geometric spread. Apparently, the geometric spread cannot be set to a large value, otherwise the pores of skin will be removed. On the other hand, the edges of bumps or scars in the skin are left aside if it is set to a small value. To the best of our knowledge, the most related work to ours is developed by Matsui et al.[12]. This work also aims at removing spots and at the same time preserving skin natural roughness. The method uses ε-filters to decompose image into several different frequency components. It assumes spots are of medium frequency and pores of high frequency, and then discards medium frequency component and retains the high frequency component. The major drawback of this approach is: since spots may exist also in low frequency and high frequency components, it cannot remove spots completely. Moreover, the discarded layers are determined by several parameters, which are not intuitive, hence requiring heavy user interactions. The users also need to use unsharp mask to avoid image blurring, which might magnify image noise. EMD. Traditional energy-frequency analysis are based on Fourier transformation and wavelet
Yan-Li Liu et al.:Pores-Preserving Face Cleaning 559 transformation.The Fourier transform is designed to simplicity,we call it directional BEMD.The methods of work with liner and stationary signals.The wavelet the second category are genuine sense 2D EMD which transform,on the other hand,is well-suited to handle use radial basis functions(RBF)or cubic spline to in- non-stationary data,but is poor at processing non- terpolate envelopeslis-201.Although they have been linear data.However,few energy-frequency data are successfully used in some cases,these algorithms suffer truly linear and stationary.To address this problem from the problem of overshoot or undershoot because the Hilbert-Huang transform (HHT)has been recently of their interpolation scheme.In this paper,we pro- developedl4l.The HHT comprises two steps:EMD pose two solutions,namely limiting minimum frequency generates a finite number of intrinsic mode functions and adaptive local mean,to improve the performance (IMF),and then the Hilbert transform is applied to of EMD. IMF.The combination of IMF and its Hilbert trans- form forms an analytic signal,which can be used to 2.21 Background of EMD generate a "time-frequency-energy"representation of The principle of EMD is to identify the intrinsic os- the data. cillatory modes embedded in the signals based on their In this paper,we choose EMD and HHT rather than characteristic time scales,and then to decompose the wavelet transform due to the following considerations. signals into a sum of IMFs.In order to permit phys- First,imperfect facial images are in general nonlinear ically meaningful instantaneous frequency to be de- and nonstationary in nature,EMD is suitable to pro- fined over it,an IMF must satisfy the following two cess these data.Second,based on the local time scale conditionsl4l:first,the numbers of extrema and zero- of the data,EMD decomposes a signal into IMFs adap- crossings must differ by at most one;second,it is sym- tively without using a prior basis which do not neces- metric with respect to the local mean of the data.Ide- sarily match the varying nature of the signals.With- ally,the requirement of the second condition should be out the need of pre-specifying a decomposition level as "the local mean of the data is zero".Since it is difficult requested by wavelet decomposition to extract out all to define a local averaging time scale in general case,in the image's oscillatory modes,EMD allows us to use practice,the local mean of the data is usually replaced all the oscillatory modes to further define imperfect de- by the mean of the envelopes defined by interpolating gree.Thus,by exploiting EMD,we can obtain adap- the local maxima and local minima. tive multi-resolution representation of image.Due to EMD is an iterative process.To illustrate its con- unpredictable appearances of various kinds of distrac- cept,here we describe the algorithm of EMD in 1D tions,such an adaptive multi-resolution representation case.Given a signal x(t),the algorithm of EMD can be is favorable.Third,the IMF is generally in good agree- summarized as follows. ment with intuitive and physical signal interpretations, 1)Identify all the local extrema of x(t) and has well-defined instantaneous frequency,allowing 2)Interpolate between local minima (resp.local us to define local energy to capture the characteris- maxima),ending up with the envelope emin(t)(resp. tics of distractions and pores.On the contrary.the emax(t)). local energy extracted by Fourier or wavelet3-15]suf 3)Compute the mean m(t)=(emin(t)+emax(t))/2. fers from the problem of energy leakage,due to the 4)Extract the detail d(t)=x(t)-m(t).Check prop- infinite or limited finite length of the basis.The energy erty of d(t): leakage makes the quantitative definition of the energy- if it meets the above two conditions,an IMF frequency-space distribution of image difficult is derived,and replace x(t)with the residual r(t)= Due to the nature of image data,when process- x(t)-d(t): ing images with idea of EMD,bidimensional empir- if it is not an IMF,replace r(t)with d(t). ical mode decomposition (BEMD)is necessary.Ac- 5)Repeat Steps 1)~4)until the residual satisfies cordingly,to perform 2D space-frequency analysis of some stopping criterion. IMFs,Riesz transform,an isotropic generalization of Finally,x(t)is represented as the sum of a finite the Hilbert transform to multiple dimensions is also number of IMFs (both amplitude and frequency mod- needed[16).For BEMD,existing methods can be di- ulated)and a residual trend.The process of iteratively vided into two categories.The first category of the extracting an IMF is also called sifting process.One can approaches divide an image into one-dimensional data. observe from the above formulation that there is no pre- Then the 1D EMD is applied to a limited number of ori- fixed basis involving in the sifting process,instead,the entations:two (horizontal and vertical)or morel171.For decomposition proceeds depending on the data itself
Yan-Li Liu et al.: Pores-Preserving Face Cleaning 559 transformation. The Fourier transform is designed to work with liner and stationary signals. The wavelet transform, on the other hand, is well-suited to handle non-stationary data, but is poor at processing nonlinear data. However, few energy-frequency data are truly linear and stationary. To address this problem, the Hilbert-Huang transform (HHT) has been recently developed[4]. The HHT comprises two steps: EMD generates a finite number of intrinsic mode functions (IMF), and then the Hilbert transform is applied to IMF. The combination of IMF and its Hilbert transform forms an analytic signal, which can be used to generate a “time-frequency-energy” representation of the data. In this paper, we choose EMD and HHT rather than wavelet transform due to the following considerations. First, imperfect facial images are in general nonlinear and nonstationary in nature, EMD is suitable to process these data. Second, based on the local time scale of the data, EMD decomposes a signal into IMFs adaptively without using a prior basis which do not necessarily match the varying nature of the signals. Without the need of pre-specifying a decomposition level as requested by wavelet decomposition to extract out all the image’s oscillatory modes, EMD allows us to use all the oscillatory modes to further define imperfect degree. Thus, by exploiting EMD, we can obtain adaptive multi-resolution representation of image. Due to unpredictable appearances of various kinds of distractions, such an adaptive multi-resolution representation is favorable. Third, the IMF is generally in good agreement with intuitive and physical signal interpretations, and has well-defined instantaneous frequency, allowing us to define local energy to capture the characteristics of distractions and pores. On the contrary, the local energy extracted by Fourier or wavelet[13−15] suffers from the problem of energy leakage, due to the infinite or limited finite length of the basis. The energy leakage makes the quantitative definition of the energyfrequency-space distribution of image difficult. Due to the nature of image data, when processing images with idea of EMD, bidimensional empirical mode decomposition (BEMD) is necessary. Accordingly, to perform 2D space-frequency analysis of IMFs, Riesz transform, an isotropic generalization of the Hilbert transform to multiple dimensions is also needed[16]. For BEMD, existing methods can be divided into two categories. The first category of the approaches divide an image into one-dimensional data. Then the 1D EMD is applied to a limited number of orientations: two (horizontal and vertical) or more[17]. For simplicity, we call it directional BEMD. The methods of the second category are genuine sense 2D EMD which use radial basis functions (RBF) or cubic spline to interpolate envelopes[18−20]. Although they have been successfully used in some cases, these algorithms suffer from the problem of overshoot or undershoot because of their interpolation scheme. In this paper, we propose two solutions, namely limiting minimum frequency and adaptive local mean, to improve the performance of EMD. 2.2 Background of EMD The principle of EMD is to identify the intrinsic oscillatory modes embedded in the signals based on their characteristic time scales, and then to decompose the signals into a sum of IMFs. In order to permit physically meaningful instantaneous frequency to be de- fined over it, an IMF must satisfy the following two conditions[4]: first, the numbers of extrema and zerocrossings must differ by at most one; second, it is symmetric with respect to the local mean of the data. Ideally, the requirement of the second condition should be “the local mean of the data is zero”. Since it is difficult to define a local averaging time scale in general case, in practice, the local mean of the data is usually replaced by the mean of the envelopes defined by interpolating the local maxima and local minima. EMD is an iterative process. To illustrate its concept, here we describe the algorithm of EMD in 1D case. Given a signal x(t), the algorithm of EMD can be summarized as follows. 1) Identify all the local extrema of x(t). 2) Interpolate between local minima (resp. local maxima), ending up with the envelope emin(t) (resp. emax(t)). 3) Compute the mean m(t) = (emin(t) + emax(t))/2. 4) Extract the detail d(t) = x(t)−m(t). Check property of d(t): • if it meets the above two conditions, an IMF is derived, and replace x(t) with the residual r(t) = x(t) − d(t); • if it is not an IMF, replace x(t) with d(t). 5) Repeat Steps 1)∼4) until the residual satisfies some stopping criterion. Finally, x(t) is represented as the sum of a finite number of IMFs (both amplitude and frequency modulated) and a residual trend. The process of iteratively extracting an IMF is also called sifting process. One can observe from the above formulation that there is no pre- fixed basis involving in the sifting process, instead, the decomposition proceeds depending on the data itself
560 J.Comput.Sci.Technol.,May 2009,Vol.24.No.3 3 Improved EMD Sk=2k+1+1. .Initialize hk.o(p)=rk-1(p),11. Theoretically,there are two factors incurring the drawback of classic EMD mentioned in the above Detect all local extreme points of hk_1.Estab- lish the local maximum point set and local minimum section.First,since there is no bandwidth constraint during the decomposition process in classical EMD, point set respectively.For every pixel p of hk.1,if over-iterating on the whole signal for a better local ap- I(p)is strictly higher or lower than all its 8-connected neighbors,it is a local extreme point.Otherwise,it is proximation has the drawback of contaminating other parts of the signal,in particular,in uniformizing the not a local extreme point. Compute the geometric distance (Euclidean dis- amplitude and in over-decomposing it by spreading out its components over adjacent modes21.Second,clas- tance)between every two adjacent local maximum sical EMD uses the mean of envelopes interpolated by points row-wisely and then column-wisely in For the local maxima and minima respectively as a sub- example,suppose two adjacent local maximum points stitute for the local mean of data.Unfortunately,the in the same column are (1,y),(2,y)and 1 2, interpolation method,for example spline fitting,often then the distance S between them is z2-z1.If S>S, overshoots or undershoots the strongly nonlinear and then put the pixels(1+ix S,y),i=1,2,...,(S/S)] nonstationary data.Such overshoots and undershoots into The purpose of supplementing is to make the distance between every two adjacent local maxi- would be amplified through many times of spline fitting in the sifting process,and eventually result in diver- mum points not more than S,ensuring the distribution gence of IMF.Therefore,unless a better approximation of extreme points is not too sparse.Furthermore,to re- duce the boundary effects,we add extreme surrounding to local mean is conducted,EMD cannot be improved significantly. the original image boundaries by mirror symmetry.The For each of the above two problems,we propose a so- same operations are also performed on. For every pixel p,compute its local mean lution to get around it.The first solution is related with time-frequency uncertainty principle.As it is known, emean,1-1(p)using the adaptive local mean algorithm de- scribed below. uncertainty principle formulated as ·hk,l(p)=hk,l-1(p)-Cmean,.l-1(p): 7f×7t≥1/2 (1) If the stop criterion is met4,imf(p)=hk.(p), and stop the sifting process;else I=I+1. defines the relation between the width of frequency win- 3)rk(p)=rk-1(p)-imfk(p). dow Vf and that of the time window Vt.According 4)If the r(p)is a monotonic or constant signal, to (1),if we set Vt s a (a 0),then it must have terminate the decomposition process.Otherwise go to Vf >b(b=1/2a).That is,if we limit the maximal Step 2). temporal resolution,a minimal frequency resolution can The image I is finally expressed as the sum of finite be confined correspondingly.In the paper,we specify IMFs and a residue,as shown in the following equation: a maximum neighborhood in sifting process,as a re- sult,the minimal frequency of the resulting IMF is de- fined.Our solution to the second problem is:instead of I(p)= im时fk(p)+rK(p) (2) k=1 endeavoring to develop a fitting method which always cause overshoot or undershoot,we directly use the local The adaptive local mean algorithm is shown in Ta- mean of the image data to ascertain symmetry.To do ble 1:the function NumErtrema(p)returns the num- so,the only issue is to define a local space scale.Fortu- ber of local extreme points in p.r.Symmetry(p) nately,the above specified maximum neighborhood is tests whether the local extreme points in p.r are spa- a good candidate for defining such a local space scale. tially symmetric,and is defined as follows.The col- Taking computation cost and locality into account,an umn and row of p equally divide pr into four quad- adaptive local mean algorithm is developed. rants.Symmetry(p)is true if the following two condi- Integrating the above two strategies into one frame- tions are both satisfied:first,in each of the four quad- work.for a given image I with size of M x N in pixels, rants one local extreme point can be found;second, our ALNEMD algorithm is formulated as the following among these four local extreme points,two of them are steps. local maximum points and the other two are local min- 1)Initialize ro(p)=I(p). imum points.Otherwise Symmetry(p)is false.The 2)Extract k-th IMF imf&. threshold of ur is used to guarantee that the number ·Set the size of maximum neighborhood Sx×Sk, of local extreme points is not too few and typically set
560 J. Comput. Sci. & Technol., May 2009, Vol.24, No.3 3 Improved EMD Theoretically, there are two factors incurring the drawback of classic EMD mentioned in the above section. First, since there is no bandwidth constraint during the decomposition process in classical EMD, over-iterating on the whole signal for a better local approximation has the drawback of contaminating other parts of the signal, in particular, in uniformizing the amplitude and in over-decomposing it by spreading out its components over adjacent modes[21]. Second, classical EMD uses the mean of envelopes interpolated by the local maxima and minima respectively as a substitute for the local mean of data. Unfortunately, the interpolation method, for example spline fitting, often overshoots or undershoots the strongly nonlinear and nonstationary data. Such overshoots and undershoots would be amplified through many times of spline fitting in the sifting process, and eventually result in divergence of IMF. Therefore, unless a better approximation to local mean is conducted, EMD cannot be improved significantly. For each of the above two problems, we propose a solution to get around it. The first solution is related with time-frequency uncertainty principle. As it is known, uncertainty principle formulated as ∇f × ∇t > 1/2 (1) defines the relation between the width of frequency window ∇f and that of the time window ∇t. According to (1), if we set ∇t 6 a (a > 0), then it must have ∇f > b (b = 1/2a). That is, if we limit the maximal temporal resolution, a minimal frequency resolution can be confined correspondingly. In the paper, we specify a maximum neighborhood in sifting process, as a result, the minimal frequency of the resulting IMF is de- fined. Our solution to the second problem is: instead of endeavoring to develop a fitting method which always cause overshoot or undershoot, we directly use the local mean of the image data to ascertain symmetry. To do so, the only issue is to define a local space scale. Fortunately, the above specified maximum neighborhood is a good candidate for defining such a local space scale. Taking computation cost and locality into account, an adaptive local mean algorithm is developed. Integrating the above two strategies into one framework, for a given image I with size of M × N in pixels, our ALNEMD algorithm is formulated as the following steps. 1) Initialize r0(p) = I(p). 2) Extract k-th IMF imf k . • Set the size of maximum neighborhood Sk × Sk, Sk = 2k+1 + 1. • Initialize hk,0(p) = rk−1(p), l = 1. • Detect all local extreme points of hk,l−1. Establish the local maximum point set A and local minimum point set B respectively. For every pixel p of hk,l−1, if I(p) is strictly higher or lower than all its 8-connected neighbors, it is a local extreme point. Otherwise, it is not a local extreme point. • Compute the geometric distance (Euclidean distance) between every two adjacent local maximum points row-wisely and then column-wisely in A . For example, suppose two adjacent local maximum points in the same column are (x1, y), (x2, y) and x1 < x2, then the distance S between them is x2 −x1. If S > Sl , then put the pixels (x1+i×Sl , y), i = 1, 2, . . . , b(S/Sl)c into A . The purpose of supplementing A is to make the distance between every two adjacent local maximum points not more than Sl , ensuring the distribution of extreme points is not too sparse. Furthermore, to reduce the boundary effects, we add extreme surrounding the original image boundaries by mirror symmetry. The same operations are also performed on B. • For every pixel p, compute its local mean emean,l−1(p) using the adaptive local mean algorithm described below. • hk,l(p) = hk,l−1(p) − emean,l−1(p). • If the stop criterion is met[4] , imf k (p) = hk,l(p), and stop the sifting process; else l = l + 1. 3) rk(p) = rk−1(p) − imf k (p). 4) If the rk(p) is a monotonic or constant signal, terminate the decomposition process. Otherwise go to Step 2). The image I is finally expressed as the sum of finite IMFs and a residue, as shown in the following equation: I(p) = X K k=1 imf k (p) + rK(p). (2) The adaptive local mean algorithm is shown in Table 1: the function NumExtrema(p) returns the number of local extreme points in Ωp,T . Symmetry(p) tests whether the local extreme points in Ωp,T are spatially symmetric, and is defined as follows. The column and row of p equally divide Ωp,T into four quadrants. Symmetry(p) is true if the following two conditions are both satisfied: first, in each of the four quadrants one local extreme point can be found; second, among these four local extreme points, two of them are local maximum points and the other two are local minimum points. Otherwise Symmetry(p) is false. The threshold of µT is used to guarantee that the number of local extreme points is not too few and typically set
Yan-Li Liu et al.:Pores-Preserving Face Cleaning 561 as(T2-1)/3. 4.1 Extract the Local Property Compared with interpolation-based EMD,in our adaptive local mean algorithm,the approximation to Since our characterization and cleaning operation the local mean of image data is much more accurate, are only performed on skin parts of face,for a given therefore the problem of overshoot or undershoot of image,we first need to know which regions on the im- interpolation-based EMD is reduced.On the other age are skin parts.This information can be obtained hand,by setting a maximum neighborhood in sifting either by requiring users to indicate the skin regions, process,the minimal frequency of IMF is kept un- or by using facial feature detection algorithms to au- der control and hence image overdecomposition is pro- tomatize the process.In the paper,we manually mark hibited.Integrating both adaptive local mean algo- out the contour of the face,eyes and lips that are es- rithm and neighborhood limitation schemes,more ac- sential to the character of individuals which we do not curate frequency and amplitude can be achieved by intend to adjust.In the following,the operations are our method,laying a foundation to make quantitative all performed on the skin parts.Assuming that we energy-space analysis of facial images. have decomposed I into K IMFs with the ALNEMD algorithm described in the above section,now we apply Table 1.Adaptive Local Mean Algorithm Riesz transform to each IMF. Defining x=(1,r2)T,r is the pixel set The l-th Decomposition in Sifting Process of I,the Riesz kernel in the spatial domain is given for every pixel p of hk.1-1 as R(z)=z/2m3.We can see from the kernel that set T=3,flag false,ur; the integration in the Reisz transform is extremely lo- let p.r be the T x T neighborhood of p cal,which is a crucial requirement for analyzing non- while T<S stationary signals.The combination of a signal and its search for all local extreme points in p.T; Riesz transformed is called the monogenic signal.Let if NumExtreme(p)>r and Symmetry(p)==tue fu(z)be the k-th IMF of the image,the corresponding Cmeam))=TxT∑qen.rhk-1(g monogenic signal thus takes the following form: flag =true; fk.M()=fr()+(R*fk)()=fk()+fk.R() break; else (6(t)+t/2)fk(-t)dt. (3) T=T+2: end if From the monogenic signal,meaningful local ampli- end while tude and local phase can be extracted.The local am- if flag==false plitude of fi.M()is: 1 Cmean(p)=x5∑qe.rhkI-1(gh end if f,M(川=VfM() end for =Vf(x)+lfk.R()2 (4) 4 Algorithm of Face Cleaning As local amplitude offers image's energetic informa- We will present in this section our face cleaning al- tion,we use the square of local amplitude,namely local energy,to characterize the oscillatory degree of every gorithm.First,we give an overview of the algorithm pixel x at level k. Formally,for the facial image I to be processed,we first use ALNEMD to decompose it into a series of both fre- LEk()=lfk.M()2. (5) quency and amplitude modulated IMFs.Next,Riesz transform is applied to every IMF,generating a corre- According to the theory of EMD.with the increase sponding monogenic signal22]which is the extension of of k,LEk(r)decreases.In order to study the behavior the analytical signal from 1D to 2D.Built upon mono- of LEL(x),for a given pixel x,across different levels k genic signals,the meaningful local energy of every pixel in a unified fashion,we normalize LEk(r),x E at ev- is extracted.By analyzing normalized local energy,we ery level,getting the normalized local energy NLE(). propose a function to quantitatively measure the im- Being a relative value,NLEk(z)reflects the relative perfect degree of image.The cleaning result is finally weight of local energy of pixel z at level k.Figs.1(b), produced after the coefficients of IMFs are adaptively 1(c),1(d)show the first three normalized local energy adjusted under the guide of imperfect degree. maps of Fig.1(f),a selected part of Fig.1(a)
Yan-Li Liu et al.: Pores-Preserving Face Cleaning 561 as (T 2 − 1)/3. Compared with interpolation-based EMD, in our adaptive local mean algorithm, the approximation to the local mean of image data is much more accurate, therefore the problem of overshoot or undershoot of interpolation-based EMD is reduced. On the other hand, by setting a maximum neighborhood in sifting process, the minimal frequency of IMF is kept under control and hence image overdecomposition is prohibited. Integrating both adaptive local mean algorithm and neighborhood limitation schemes, more accurate frequency and amplitude can be achieved by our method, laying a foundation to make quantitative energy-space analysis of facial images. Table 1. Adaptive Local Mean Algorithm The l-th Decomposition in Sifting Process for every pixel p of hk,l−1 set T = 3, flag = false, µT ; let Ωp,T be the T × T neighborhood of p; while T < Sl search for all local extreme points in Ωp,T ; if NumExtreme(p) > µT and Symmetry(p) == ture emean(p) = 1 T × T P q∈Ωp,T hk,l−1(q); flag = true; break; else T = T + 2; end if end while if flag == false emean(p) = 1 Sl × Sl P q∈Ωp,T hk,l−1(q); end if end for 4 Algorithm of Face Cleaning We will present in this section our face cleaning algorithm. First, we give an overview of the algorithm. Formally, for the facial image I to be processed, we first use ALNEMD to decompose it into a series of both frequency and amplitude modulated IMFs. Next, Riesz transform is applied to every IMF, generating a corresponding monogenic signal[22] which is the extension of the analytical signal from 1D to 2D. Built upon monogenic signals, the meaningful local energy of every pixel is extracted. By analyzing normalized local energy, we propose a function to quantitatively measure the imperfect degree of image. The cleaning result is finally produced after the coefficients of IMFs are adaptively adjusted under the guide of imperfect degree. 4.1 Extract the Local Property Since our characterization and cleaning operation are only performed on skin parts of face, for a given image, we first need to know which regions on the image are skin parts. This information can be obtained either by requiring users to indicate the skin regions, or by using facial feature detection algorithms to automatize the process. In the paper, we manually mark out the contour of the face, eyes and lips that are essential to the character of individuals which we do not intend to adjust. In the following, the operations are all performed on the skin parts. Assuming that we have decomposed I into K IMFs with the ALNEMD algorithm described in the above section, now we apply Riesz transform to each IMF. Defining x = (x1, x2) T, x ∈ Ω, Ω is the pixel set of I, the Riesz kernel in the spatial domain is given as R(x) = x/2π|x| 3 . We can see from the kernel that the integration in the Reisz transform is extremely local, which is a crucial requirement for analyzing nonstationary signals. The combination of a signal and its Riesz transformed is called the monogenic signal. Let fk(x) be the k-th IMF of the image, the corresponding monogenic signal thus takes the following form: fk,M(x) = fk(x) + (R ∗ fk)(x) = fk(x) + fk,R(x) = Z Ω (δ(t) + t/2π|t| 3 )fk(x − t)dt. (3) From the monogenic signal, meaningful local amplitude and local phase can be extracted. The local amplitude of fk,M(x) is: |fk,M(x)| = q f 2 k,M(x) = q f 2 k (x) + |fk,R(x)| 2. (4) As local amplitude offers image’s energetic information, we use the square of local amplitude, namely local energy, to characterize the oscillatory degree of every pixel x at level k. LEk(x) = |fk,M(x)| 2 . (5) According to the theory of EMD, with the increase of k, LEk(x) decreases. In order to study the behavior of LEk(x), for a given pixel x, across different levels k in a unified fashion, we normalize LEk(x), x ∈ Ω at every level, getting the normalized local energy NLEk(x). Being a relative value, NLEk(x) reflects the relative weight of local energy of pixel x at level k. Figs. 1(b), 1(c), 1(d) show the first three normalized local energy maps of Fig.1(f), a selected part of Fig.1(a)