Course Paper L.-K. Hua Seminar July 12, 2023

Higher intensity means brighter, so a black pixel has intensity 0 and a white pixel has intensity 255. A pixel-wise operation is then simply a transform applied to the entries of this matrix.

Representing a color image is considerably more complicated. The simplest way is to separate the image into three channels: red (R), green (G), and blue (B). Since red, green, and blue light can be mixed to form the other colors of light, a color image can be represented as a combination of three grayscale images; see figure 3.

Figure 3: RGB representation [2]

Though the RGB representation is intuitive and easy for a computer to display, it does not fit our visual perception of colors. Our sensitivity to luminance is much stronger than our sensitivity to color intensity, so we apply a linear transform to RGB space to separate out the luminance:

\[
\begin{pmatrix} Y \\ U \\ V \end{pmatrix}
=
\begin{pmatrix}
0.299 & 0.587 & 0.114 \\
-0.1687 & -0.3313 & 0.5 \\
0.5 & -0.4187 & -0.0813
\end{pmatrix}
\begin{pmatrix} R \\ G \\ B \end{pmatrix}
+
\begin{pmatrix} 0 \\ 128 \\ 128 \end{pmatrix}
\tag{1}
\]

Y is the luminance of the image or, in simpler words, its "lightness". Different weights for averaging RGB give different representations of luminance. U and V are called "chrominance"; they carry the color of the pixel and still lie in the range [0, 255]. This representation is called YUV.

Separating luminance from chrominance makes it easier to deal with problems relating to grayscale, and because the transform is linear, the RGB image can be restored quickly. Still, U and V do not conform to our perception. A better but much more complex representation is HSI: hue, saturation, and intensity.

\[
\begin{aligned}
I &= \frac{1}{3}(R + G + B), \\
S &= 1 - \frac{3\min\{R, G, B\}}{R + G + B}, \\
H &= \begin{cases} \theta, & B \le G, \\ 2\pi - \theta, & B > G, \end{cases} \\
\theta &= \arccos\frac{2R - B - G}{2 (R - G)^{1/2} (R - B)^{1/2}}.
\end{aligned}
\tag{2}
\]

Here the luminance is represented as the arithmetic mean of RGB. Unlike the RGB or YUV forms, the HSI representation cannot easily be discretized into the uint8 data type (integers between 0 and 255).
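As an illustration of equation (1), the following is a minimal Python sketch of the RGB to YUV transform for a single pixel. The function name `rgb_to_yuv` and the constants `YUV_MATRIX` and `YUV_OFFSET` are assumptions made for this example and do not come from the paper; the coefficients are those of equation (1). The sketch returns floating-point values and leaves rounding or clipping to uint8 to the caller.

```python
# Illustrative sketch of equation (1); names are assumptions, not the paper's code.

# Rows of the 3x3 matrix in equation (1).
YUV_MATRIX = (
    (0.299, 0.587, 0.114),
    (-0.1687, -0.3313, 0.5),
    (0.5, -0.4187, -0.0813),
)
# Constant offset added to (Y, U, V).
YUV_OFFSET = (0.0, 128.0, 128.0)


def rgb_to_yuv(r, g, b):
    """Map one RGB pixel (channels in [0, 255]) to a (Y, U, V) tuple of floats."""
    rgb = (r, g, b)
    return tuple(
        sum(weight * channel for weight, channel in zip(row, rgb)) + offset
        for row, offset in zip(YUV_MATRIX, YUV_OFFSET)
    )


if __name__ == "__main__":
    for name, pixel in [("black", (0, 0, 0)), ("gray", (100, 100, 100))]:
        y, u, v = rgb_to_yuv(*pixel)
        print(f"{name}: Y={y:.3f} U={u:.3f} V={v:.3f}")
```

Because the Y weights sum to 1 and the U and V weights each sum to 0, any gray pixel (R = G = B) keeps its intensity in Y and maps to U = V = 128, which is why a grayscale image corresponds to the Y channel alone.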
With the YUV or HSI representation, the re-colorization problem can be summarized as follows: given the matrix Y (or I), find the most likely matrices U and V (or S and H). Clearly this cannot be done without prior information, since we are trying to construct three dimensions from one. Different kinds of prior information lead to different views of the problem.

1.2 Two Views of Colorization

1.2.1 From Certain Color Style

A common situation is that we already know the color style of a grayscale image. For example, an image of a flower might be re-colorized with any color, but if we have a color image of the same flower, we can fill the grayscale image with similar colors, which is called transferring the color style. As figure 4 shows, if we know that the lighting condition in the left image is the same as in the middle one, we can transfer the color style and obtain the image on the right.

Figure 4: Style-transferring

Different ways of transferring lead to different effects, but the principle is clear. Let S be the source image that contains the color style and T be the target grayscale image; style transferring is then a problem of the form

\[
I = \arg\min_{X} \{ \alpha\, d_c(X, T) + \beta\, d_s(X, S) \}
\]

Here $d_c$ and $d_s$ are distances on content and style, and $\alpha$, $\beta$ are weight parameters. A simple choice is $d_c(X, T) = \| Y_X - Y_T \|_F^2$, which uses the Frobenius norm to measure the distance. Finding a proper form for $d_s$ is much harder: color style should be independent of location and still capture the overall impression of the image.

In the following section, we discuss some empirical choices of the metrics $d_c$ and $d_s$, together with the methods they lead to. It is worth noting that style transferring is not necessarily an explicit optimization problem. In many cases $d_c$ and $d_s$ are independent and can therefore be minimized separately.
With style transferring and a sufficiently large image gallery, we can form a straightforward semi-interactive method: compare the target image with every image in the gallery, find the one with the nearest content, and then transfer its style to the grayscale image. In formula, S is chosen by ($G$ denotes the gallery):

\[
S = \arg\min_{X \in G} d_c(T, X)
\]

"Semi-interactive" means that the method does not require the user to designate a particular source image, although the range of the gallery is still important.
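The gallery search above can be sketched in Python as follows. This is only an illustration under stated assumptions: the content distance is taken to be the squared Frobenius norm of the luminance difference mentioned earlier, every gallery image is assumed to have been resized to the shape of the target beforehand, and the names `content_distance`, `choose_source`, and the gallery keys are hypothetical.

```python
# Illustrative sketch of S = argmin_{X in G} d_c(T, X); names are assumptions.

def content_distance(y_a, y_b):
    """Squared Frobenius distance between two luminance matrices of equal shape."""
    same_shape = len(y_a) == len(y_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(y_a, y_b)
    )
    if not same_shape:
        raise ValueError("luminance matrices must have the same shape")
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(y_a, y_b)
        for a, b in zip(row_a, row_b)
    )


def choose_source(target_y, gallery):
    """Return the key of the gallery entry with the smallest content distance.

    `gallery` maps a (hypothetical) image name to its luminance matrix.
    """
    return min(gallery, key=lambda name: content_distance(target_y, gallery[name]))


if __name__ == "__main__":
    target = [[10, 200], [30, 220]]
    gallery = {
        "sunset": [[12, 190], [35, 225]],
        "forest": [[120, 90], [100, 80]],
    }
    print(choose_source(target, gallery))
```

The search is linear in the size of the gallery, which matches the "compare with every image" description; a larger gallery would probably need some form of indexing, which is outside the scope of this sketch.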
1.2.2 From Given Points

Another view comes from cases where part of the image has already been colored. Using the same image as an example, figure 5 shows a partially colored image together with a mask indicating which pixels have been colored (the white pixels).

Figure 5: Given mask and color

Re-colorization is thereby converted into the problem of coloring the black part of the mask so as to minimize a function that measures the conformity of the result. Because the white pixels in the mask are randomly distributed, no algorithm can produce the result directly, which makes optimization essential.

One attribute is particularly important in this kind of optimization: the gradient of the image. By spatial continuity, the matrix can be regarded as a uniform sampling of some smooth function, so its gradient can be estimated. Since the step size is 1, define the difference operators (here the pixel $(i, j)$ is not on the edge of the image):

\[
\begin{aligned}
\partial_x a_{ij} &= \frac{a_{i+1,j} - a_{i-1,j}}{2}, \\
\partial_y a_{ij} &= \frac{a_{i,j+1} - a_{i,j-1}}{2}, \\
\partial_x^2 a_{ij} &= a_{i+1,j} + a_{i-1,j} - 2a_{ij}, \\
\partial_y^2 a_{ij} &= a_{i,j+1} + a_{i,j-1} - 2a_{ij}, \\
\Delta a_{ij} &= a_{i-1,j} + a_{i+1,j} + a_{i,j-1} + a_{i,j+1} - 4a_{ij}.
\end{aligned}
\tag{3}
\]

These operators describe local intensity differences. Because they are all linear, combining them with the 2-norm turns the optimization problem into a least squares problem, which can be solved exactly.

2 Style Transferring Methods

2.1 Pixel-wise LUT

2.1.1 Description of Content and Style

Both RGB and YUV take discrete values, so the problem can be regarded as a combinatorial optimization problem. For two pixels $(g, h)$, $(i, j)$, we can define a content distance between two matrices of the same dimension:

\[
d_{gh,ij}(X, Y) =
\begin{cases}
1, & (x_{gh} - x_{ij})(y_{gh} - y_{ij}) \le 0 \text{ and } y_{gh} \ne y_{ij}, \\
0, & \text{otherwise,}
\end{cases}
\qquad
d_c(X, Y) = \sum_{(g,h),(i,j)} d_{gh,ij}(X, Y)
\]

This "distance" is not symmetric, for its null point only requires that two pixels that are equal in X be equal in Y, but not the reverse. In addition, it requires X and Y to have the same partial order. Visually speaking, this distance means that we recognize content by the order of intensities among pixels: as long as the order is preserved, we see the same content.

For style, the metric is defined as follows ($I_l$ is the characteristic function, which is 1 when $l$ is true and 0 otherwise):

\[
h(X) = (h_k(X)), \qquad
h_k(X) = \frac{1}{\operatorname{size}(X)} \sum_{ij} I_{x_{ij} \le k}, \qquad
d_s(X, Y) = \| h(X) - h(Y) \|
\]

Instead of comparing every pair of points as the content metric does, the style metric just counts how many pixels fall at or below each intensity level and compares the resulting intensity distributions. From a probabilistic point of view, h is the distribution function of a pixel sampled at random from X; it is called the histogram of the matrix.

2.1.2 Look-Up Table

Letting $\alpha = \infty$, $\beta = 1$, the optimization problem reduces to the case in which the content must be preserved. Since equal pixels in X must give equal pixels in Y, we need to find a monotonically increasing function f such that $y_{ij} = f(x_{ij})$. Because x and y take discrete values, f can be recorded as an array, which is called a look-up table. It is not hard to construct the table with the following code:

    % g, h: cumulative histograms stored as 1x256 arrays (indices start at 1)
    LUT = zeros(1, 256);
    j = 0;
    for i = 0:255
        % advance j until g(j) reaches h(i); j is capped at 255
        while (j < 255) && (g(j+1) < h(i+1))
            j = j + 1;
        end
        LUT(i+1) = j;
    end

The table is indexed by the intensity i of the image being remapped and returns the new intensity j. Accordingly, h is the histogram of the image being remapped, that is, the target grayscale image, and g is the histogram of the source image, whose intensity distribution is to be reproduced. By such matching, the color style of the source image is transferred to the target image.

By transforming R, G, B or Y, U, V independently, we get three look-up tables. From equation (1), the inverse transform is also clear:

\[
\begin{pmatrix} R \\ G \\ B \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 1.402 \\
1 & -0.34414 & -0.71414 \\
1 & 1.772 & 0
\end{pmatrix}
\begin{pmatrix} Y \\ U - 128 \\ V - 128 \end{pmatrix}
\tag{4}
\]
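The look-up table construction of section 2.1.2 can also be sketched end to end in Python for a single channel. This is an illustrative sketch and not the code used for the results in the paper: the function names are assumptions, images are flattened into plain lists of integers in [0, 255], and the histogram is the cumulative one defined by $h_k$ above.

```python
# Illustrative single-channel histogram matching; names are assumptions.

def cumulative_histogram(pixels, levels=256):
    """Return h with h[k] = fraction of pixels whose value is <= k."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    total = len(pixels)
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / total)
    return cdf


def build_lut(cdf_in, cdf_ref):
    """For each input level i, find the smallest j with cdf_ref[j] >= cdf_in[i]."""
    levels = len(cdf_in)
    lut, j = [], 0
    for i in range(levels):
        # j never decreases, so the table is monotonically non-decreasing.
        while j < levels - 1 and cdf_ref[j] < cdf_in[i]:
            j += 1
        lut.append(j)
    return lut


def match_channel(target_pixels, source_pixels):
    """Remap target intensities so their distribution follows the source channel."""
    lut = build_lut(
        cumulative_histogram(target_pixels),
        cumulative_histogram(source_pixels),
    )
    return [lut[p] for p in target_pixels]


if __name__ == "__main__":
    gray_target = [10, 20, 30, 40]
    red_of_source = [100, 150, 200, 250]
    print(match_channel(gray_target, red_of_source))
```

To colorize in RGB under these assumptions, `match_channel` would be called three times with the same grayscale target and the R, G, and B channels of the source in turn, giving the three look-up tables mentioned above.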
2.1.3 RGB Matching Results

The matching result of the algorithm above is shown in figure 6.

Figure 6: Pixel-wise LUT result

The histograms in figure 7 show how the histogram of the result is obtained from that of the source image.

Figure 7: Histogram of result and source image

Though the result seems to color the rising sun correctly, two main problems remain. First, transforming in RGB space might reduce the sharpness of the colors and make the image look "dirty". Second, the method totally ignores spatial information, which is sometimes useful for coloring similar images. To solve the first problem, we tried transforming in YUV space.