Tutorial Vol. 11, No. 3 / September 2019 / Advances in Optics and Photonics
2. ORGANIZATION OF THIS PAPER

The mathematics of SVD is relatively straightforward for finite matrices; such matrices arise, for example, if we have a finite number of small sources communicating to a finite number of small receivers. The mathematics is particularly simple if we also initially consider just scalar waves, such as acoustic waves in air. Such scalar waves allow a good tutorial introduction to communications modes and more generally to the ideas of this SVD approach. We start with the mathematics of such sources, receivers, and waves in Section 3. In Section 4, we go through a simple “toy” example explicitly, showing both the mathematical results and physical systems that would implement them.

Many quite general physical and mathematical behaviors emerge as we look at wave systems this way, only some of which are currently well known. Though these behaviors are relatively simple and even intuitive, only some have simple analytic solutions. On the other hand, numerical “experiments” and examples are straightforward, at least for finite numbers of “point” sources and receivers. Then the main calculation is just finding eigenvalues and eigenvectors of finite matrices. So, we introduce these behaviors informally through a sequence of further numerical examples in Section 5 (supported by additional heuristic arguments in Appendices B, C, and D). Pretending we can approximate any set of “smooth” source and receiver functions with sufficiently many such point sources and receivers, we can reveal much of the behavior of the more general case and many of the results.

To be general enough for real problems in optics and electromagnetism, we need two major sophistications. First, we need to expand the mathematics to handle sources and received waves that are continuous functions of space, and to consider possibly infinite sets of source and/or wave functions.
A key point is that we will be able to show that even with continuous source and wave functions, and with possibly infinite sets of them, we end up with finite numbers of useful communications channels or mode-converter basis functions for describing devices. Furthermore, this gives a general statement of diffraction limits for any volumes and any form of waves.

The mathematics has to go beyond that of finite matrices, and cannot be deduced from it [121]. Fortunately, that mathematics—functional analysis—exists. Unfortunately, this field is often impenetrable to the casual reader; necessarily it has to introduce ideas of convergence of functions, and that involves an additional set of concepts, mathematical tools, and terminology. The important results can, however, be stated relatively simply; in Section 6, I summarize key mathematical results, deferring some detail to Appendices E and F. I have also written a separate (and hopefully accessible) introduction [122] to this functional analysis mathematics, including all required proofs. With the results from functional analysis, continuous sources and waves for the simple scalar case can then be understood quite simply. In Section 7, we relate these mathematical results to known families of functions in the “paraxial” case often encountered in optics.

The second major sophistication we require is the extension to electromagnetic waves, and we summarize the key results in Section 8. Scalar waves are often a good first model in optics, and much of their behavior carries over into the full electromagnetic case; dealing with electromagnetic waves properly is, however, more complicated. Not only are electromagnetic waves vectors rather than scalars, but, on the face of it, we have two kinds of fields to deal with—electric and magnetic. Maxwell’s equations relate these two kinds of fields, of course.
Existing sophisticated approaches to electromagnetism, such as the use of scalar and vector potentials, are helpful here in
understanding just how many independent field components or “degrees of freedom” there really are, but standard approaches are not quite sufficient for clarifying this number. This difficulty can be resolved by proposing a new “gauge” (the “M-gauge”) for the electromagnetic field. We provide a full explanation and derivation of the necessary electromagnetism in Appendix G, supported with a derivation in Appendix H and additional notation and identities in Appendix I.

This new M-gauge, together with the results of the functional analysis and the SVD approach, allows a revised quantization of the electromagnetic field, summarized in Section 9 and in more detail in Appendix J. This resolves several difficulties. In particular, we can avoid artificial “boxes” and “running waves” in quantizing radiation fields, and they can be quantized for any shape of volume. This quantization means that our results here are generally valid and meaningful for both classical and quantum-mechanical radiation fields.

In Section 10, we describe how to apply this same mathematics and physics in considering mode-converter basis sets for devices or scatterers. Section 11 includes discussion of the fundamental aspects of such mode-converter basis sets, including new radiation laws and a revised and simplified “Einstein’s A&B” coefficient argument (with a full derivation in Appendix K) in modal form. Finally, in Section 12, I draw some conclusions.

Length constraints here mean some relevant topics are omitted. First, in discussing waves, mostly I consider just the monochromatic case, but the underlying mathematics and electromagnetism support general time-dependent fields [123] (and I give those results explicitly for electromagnetic fields) and hence “temporal modes” [25,124,125]. Second, though the communication channels are well-suited for adding information theory to calculate capacities (e.g., [18,75]), I have to omit that discussion here.
Third, though we could certainly extend the approach to, say, two-dimensional systems such as waves in slab waveguides, for simplicity all of our explicit examples and calculations will be for three-dimensional systems and waves, even if graphically we may show waves just in particular planes.

To improve the narrative flow of the paper, I have avoided extensive historical and research review in the main body of the text, but I have included these important discussions in Appendix A. The subject is easier to explain without the constraint of the historical order and way in which the concepts arose, and the history and other research and connections are easier to explain once we understand the concepts. Though some aspects of this material are well known in the literature, and our treatment of those aspects is therefore purely a tutorial, for some other aspects, we have to present some original material. To clarify this, and to allow the reader to make their own judgements of the approaches and validity of any new results, I have listed what I believe to be novel results in Appendix L.

This work is quite long overall, and that length might be daunting. I suggest that the reader starts with Sections 3, 4, and 5—which will convey much of this new way of thinking about waves—followed by Section 10 to understand how this approach describes optical devices and scatterers. Sections 6, 7, and 8 add depth and rigor to the wave discussion, and Sections 9 and 11 add discussion of fundamental physical results from this approach.

3. INTRODUCTION TO SVD AND WAVES—SETS OF POINT SOURCES AND RECEIVERS

We start here by introducing the main ideas of this SVD approach with the simple example of scalar waves with point sources and detectors.
3.1. Scalar Wave Equation and Green’s Functions

Suppose we have some uniform (and isotropic) medium, such as air, with a wave propagation velocity v. For simplicity, we presume we are interested in waves of only one (angular) frequency ω. Then we could propose a simple Helmholtz wave equation; this would be appropriate, for example, for acoustic pressure waves in air [126], with v being the velocity of sound in air. Then, for a spatial source function ψ_{ωS}(r) and a resulting wave ϕ_{ωR}(r), this Helmholtz wave equation would be

\[ \nabla^2 \phi_{\omega R}(\mathbf{r}) + k^2 \phi_{\omega R}(\mathbf{r}) = \psi_{\omega S}(\mathbf{r}), \tag{2} \]

with

\[ k^2 = \omega^2 / v^2. \tag{3} \]

Now, for such an equation, the Green’s function (i.e., the wave that results from a “unit amplitude” point source δ(r − r′) at position r′) is

\[ G_{\omega}(\mathbf{r}; \mathbf{r}') = -\frac{1}{4\pi} \frac{\exp(ik|\mathbf{r} - \mathbf{r}'|)}{|\mathbf{r} - \mathbf{r}'|}. \tag{4} \]

As usual with Green’s functions (see, e.g., [127] for an introduction), for an actual continuous source function ψ_{ωS}(r), the resulting wave would be

\[ \phi_{\omega R}(\mathbf{r}) = \int_{V_S} G_{\omega}(\mathbf{r}; \mathbf{r}')\, \psi_{\omega S}(\mathbf{r}')\, d^3\mathbf{r}'. \tag{5} \]

Such a superposition of Green’s functions (through the integral here) works because the medium (e.g., air) is presumed to be linear so that superpositions of solutions to the wave equation are also solutions to this (linear) wave equation (2).

3.2. Matrix-Vector Description of the Coupling of Point Sources and Receivers

We presume a set of N_S point sources (Fig. 2) at positions r_{Sj} (j = 1, …, N_S) in the source volume, and with (complex [128]) amplitudes h_j. These might be the (complex) amplitudes of the drives to each of a set of N_S small loudspeakers, for example, that we pretend we can approximate as point sources. Then the resulting wave at a point r_{Ri} in the receiving volume would be

\[ \phi_{\omega R}(\mathbf{r}_{Ri}) = -\frac{1}{4\pi} \sum_{j=1}^{N_S} \frac{\exp(ik|\mathbf{r}_{Ri} - \mathbf{r}_{Sj}|)}{|\mathbf{r}_{Ri} - \mathbf{r}_{Sj}|}\, h_j = \sum_{j=1}^{N_S} g_{ij} h_j, \tag{6} \]

where

\[ g_{ij} = -\frac{1}{4\pi} \frac{\exp(ik|\mathbf{r}_{Ri} - \mathbf{r}_{Sj}|)}{|\mathbf{r}_{Ri} - \mathbf{r}_{Sj}|}. \tag{7} \]

Suppose, then, that we had a set of N_R small microphones at a set of positions r_{Ri} in the receiving volume (Fig. 2); we presume these are omnidirectional (so their response has no angular dependence).
Then the received signal at one such microphone or point would be the sum of the waves from all the point sources, added up at the point r_{Ri} [as in Eq. (6)]

\[ f_i = \sum_{j=1}^{N_S} g_{ij} h_j. \tag{8} \]
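As a concrete numerical sketch (not code from the paper; the positions, wavelength, and drive amplitudes here are made-up values purely for illustration), Eqs. (7) and (8) can be evaluated directly with NumPy:

```python
import numpy as np

# Assumed wavelength (arbitrary units); k = 2*pi/wavelength, consistent with Eq. (3)
wavelength = 0.5
k = 2 * np.pi / wavelength

# Hypothetical geometry: N_S = 3 point sources, N_R = 4 point receivers
r_S = np.array([[0.0, 0.0, 0.0],
                [0.3, 0.0, 0.0],
                [0.6, 0.0, 0.0]])
r_R = np.array([[0.0, 0.0, 10.0],
                [0.4, 0.0, 10.0],
                [0.8, 0.0, 10.0],
                [1.2, 0.0, 10.0]])

# Pairwise distances |r_Ri - r_Sj|, shape (N_R, N_S), via broadcasting
dist = np.linalg.norm(r_R[:, None, :] - r_S[None, :, :], axis=-1)

# Eq. (7): g_ij = -(1/(4 pi)) exp(ik |r_Ri - r_Sj|) / |r_Ri - r_Sj|
G = -np.exp(1j * k * dist) / (4 * np.pi * dist)

# Eq. (8): received amplitudes f_i for some made-up complex source drives h_j
h = np.array([1.0, 0.5j, -0.25])
f = G @ h

print(G.shape)  # (4, 3): N_R rows, N_S columns
```

Each row of `G` collects the Green's-function couplings from every source to one receiving point, so the sums of Eq. (8) become a single matrix-vector product.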
Equivalently, if we define the vectors |ψ_S⟩ and |ϕ_R⟩ and the matrix G_{SR} for such a problem as

\[ |\psi_S\rangle = \begin{bmatrix} h_1 \\ h_2 \\ \vdots \\ h_{N_S} \end{bmatrix}, \quad |\phi_R\rangle = \begin{bmatrix} f_1 \\ f_2 \\ \vdots \\ f_{N_R} \end{bmatrix}, \quad \text{and} \quad G_{SR} = \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{1N_S} \\ g_{21} & g_{22} & \cdots & g_{2N_S} \\ \vdots & \vdots & \ddots & \vdots \\ g_{N_R 1} & g_{N_R 2} & \cdots & g_{N_R N_S} \end{bmatrix}, \tag{9} \]

then we can write the set of relations Eq. (8) for all i compactly as the matrix-vector expression

\[ |\phi_R\rangle = G_{SR} |\psi_S\rangle. \tag{10} \]

3.3. Hermitian Adjoints and Dirac Bra-Ket Notation

At this point, we can usefully introduce the final part of the Dirac notation, which involves the Hermitian adjoint (or Hermitian conjugate or conjugate transpose) [129]. Generally, this is notated with a superscript “dagger,” written as “†”. The Hermitian adjoint of a matrix is formed by reflecting around the “top-left” to “bottom-right” diagonal of the matrix and taking the complex conjugate of the elements. For some matrix G, with matrix elements g_{ij} in the ith row and jth column, the corresponding “row-i, column-j” matrix element of the matrix G† is the number g*_{ji}. The Hermitian adjoint of a column vector is, similarly, a row vector whose elements are the complex conjugates of the corresponding elements of the column vector. In Dirac notation, such a row vector is notated using the “bra” notation ⟨ϕ|. So, explicitly, for our matrices and vectors here,

\[ |\psi_S\rangle^{\dagger} \equiv \begin{bmatrix} h_1 \\ h_2 \\ \vdots \\ h_{N_S} \end{bmatrix}^{\dagger} = \begin{bmatrix} h_1^* & h_2^* & \cdots & h_{N_S}^* \end{bmatrix} \equiv \langle \psi_S |, \tag{11} \]

and similarly for |ϕ_R⟩, and the Hermitian adjoint of the operator G_{SR} is

\[ G_{SR}^{\dagger} \equiv \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{1N_S} \\ g_{21} & g_{22} & \cdots & g_{2N_S} \\ \vdots & \vdots & \ddots & \vdots \\ g_{N_R 1} & g_{N_R 2} & \cdots & g_{N_R N_S} \end{bmatrix}^{\dagger} = \begin{bmatrix} g_{11}^* & g_{21}^* & \cdots & g_{N_R 1}^* \\ g_{12}^* & g_{22}^* & \cdots & g_{N_R 2}^* \\ \vdots & \vdots & \ddots & \vdots \\ g_{1N_S}^* & g_{2N_S}^* & \cdots & g_{N_R N_S}^* \end{bmatrix}. \tag{12} \]

Figure 2. Set of point sources at positions r_{Sj} in a source volume, and a set of point receivers at positions r_{Ri} in a receiving volume, coupled through the coupling operator G_{SR}.
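Continuing the numerical sketch (the matrix and amplitudes below are made-up values, not from the paper), Eqs. (9)-(12) map directly onto NumPy arrays: the ket |ψ_S⟩ is a column of drive amplitudes, Eq. (10) is a matrix-vector product, and the Hermitian adjoint of Eq. (12) is just the conjugate transpose:

```python
import numpy as np

# Hypothetical 2x3 coupling matrix (N_R = 2, N_S = 3), for illustration only
G_SR = np.array([[1 + 2j, 0 - 1j, 3 + 0j],
                 [2 - 1j, 1 + 1j, 0 + 2j]])

# |psi_S>: the source amplitudes h_j of Eq. (9)
psi_S = np.array([1 + 0j, 0 + 1j, 2 - 1j])

# Eq. (10): |phi_R> = G_SR |psi_S>
phi_R = G_SR @ psi_S

# Eq. (12): the Hermitian adjoint is the conjugate transpose; element (i, j)
# of G_SR^dagger is the complex conjugate of element (j, i) of G_SR
G_dag = G_SR.conj().T
assert G_dag[0, 1] == np.conj(G_SR[1, 0])

# Eq. (11): the bra <psi_S| is the conjugated row vector
bra_psi_S = psi_S.conj()
```

In NumPy a 1-D array serves as either the ket or (after conjugation) the bra, with `@` supplying the row-vector/column-vector products of the Dirac notation.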
Note too that the Hermitian adjoint of a product is the “flipped round” product of the Hermitian adjoints, i.e., for two operators G and H,

\[ (GH)^{\dagger} = H^{\dagger} G^{\dagger} \tag{13} \]

(which is easily proved by writing such a product out explicitly using the elements of the matrices and summing them appropriately) and for matrix-vector products

\[ (G|\psi\rangle)^{\dagger} = \langle \psi | G^{\dagger}. \tag{14} \]

The Hermitian adjoint of a Hermitian adjoint just brings us back to where we started, i.e.,

\[ (G^{\dagger})^{\dagger} = G \tag{15} \]

and for some vector

\[ \left[ (|\phi\rangle)^{\dagger} \right]^{\dagger} = \left[ \langle \phi | \right]^{\dagger} = |\phi\rangle, \tag{16} \]

both of which results are obvious from the process of reflecting and complex conjugating matrices and vectors.

For a simple scalar wave, for an amplitude f_i at a given receiving point (or microphone), the corresponding received power (in appropriate units) would typically be

\[ P_i = f_i^* f_i. \tag{17} \]

So the sum of all the detected powers would be

\[ P = \sum_{i=1}^{N_R} f_i^* f_i = \langle \phi_R | \phi_R \rangle = \left( \langle \psi_S | G_{SR}^{\dagger} \right) \left( G_{SR} | \psi_S \rangle \right) = \langle \psi_S | G_{SR}^{\dagger} G_{SR} | \psi_S \rangle, \tag{18} \]

where we have substituted from Eq. (10) and used the “bra-ket” shorthand notation for the “row-vector column-vector” product

\[ \langle \alpha | \beta \rangle \equiv \left( \langle \alpha | \right) \left( | \beta \rangle \right). \tag{19} \]

3.4. Orthogonality and Inner Products

In general, a “bra-ket” expression such as ⟨α|β⟩ is an example of an inner product, and one formed in this way, as the matrix product of a row vector on the left and a column vector on the right, is an example of a Cartesian inner product. Inner products are very important in our mathematics, and we will be expanding on this concept substantially. (One of the simplest common examples of an inner product is the usual “dot” product of two geometrical vectors; this Cartesian inner product can be thought of as a generalization of this idea to vectors of arbitrary dimensionality and with complex amplitudes.)

A key point about inner products is that they can define the concept of orthogonality of functions. Specifically, for two non-zero vectors |α⟩ and |β⟩, if and only if their inner product is zero, then the functions are said to be orthogonal.
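These relations are easy to check numerically. The short sketch below (using made-up random matrices, not anything from the paper) verifies the adjoint product rule of Eq. (13), the power expression of Eq. (18), and the fact that the powers of mutually orthogonal components simply add:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random complex matrices and a random source vector, for illustration only
G = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
psi = rng.normal(size=3) + 1j * rng.normal(size=3)

dag = lambda A: A.conj().T  # Hermitian adjoint (conjugate transpose)

# Eq. (13): (GH)^dagger = H^dagger G^dagger
assert np.allclose(dag(G @ H), dag(H) @ dag(G))

# Eq. (18): sum of detected powers |f_i|^2 equals <psi| G^dagger G |psi>
phi = G @ psi
P = np.sum(np.abs(phi) ** 2)
assert np.isclose(P, (psi.conj() @ dag(G) @ G @ psi).real)

# Orthogonality: <a|b> = 0, so the power of (a + b) is the sum of the powers
a = np.array([1.0 + 1j, 0.0, 0.0])
b = np.array([0.0, 2.0 - 1j, 0.0])
assert np.isclose(np.vdot(a, b), 0)  # np.vdot conjugates its first argument
assert np.isclose(np.sum(np.abs(a + b) ** 2),
                  np.sum(np.abs(a) ** 2) + np.sum(np.abs(b) ** 2))
```

Note that `np.vdot` conjugates its first argument, so it computes exactly the bra-ket inner product ⟨a|b⟩ of Eq. (19).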
(This is also a generalization of the concept of two (non-zero) geometrical vectors being at right angles or “orthogonal” if and only if their dot product is zero.)

An immediate consequence of the idea of orthogonality from the inner product is that, for a wave that is the sum of multiple different orthogonal components, a power as in Eq. (18) is simply the sum of the powers of the individual components; all the