latest Blogs

Subscribe to latest Blogs feed
Cleve Moler is the author of the first MATLAB, one of the founders of MathWorks, and is currently Chief Mathematician at the company. He writes here about MATLAB, scientific computing and interesting mathematics.
Updated: 10 hours 47 min ago

Wilkinson and Reinsch Handbook on Linear Algebra

Tue, 12/05/2017 - 05:00

The ACM Special Interest Group on Programming Languages, SIGPLAN, expects to hold the fourth in a series of conferences on the History of Programming Languages in 2020, see HOPL-IV. The first drafts of papers are to be submitted by August, 2018. That long lead time gives me the opportunity to write a detailed history of MATLAB. I plan to write the paper in sections, which I'll post in this blog as they are available.

The mathematical basis for the first version of MATLAB begins with a volume edited by J. H. Wilkinson and C. Reinsch, Handbook for Automatic Computation, Volume II, Linear Algebra, published by the eminent German publisher Springer-Verlag, in 1971.

ContentsEigenvalues

The mathematical term "eigenvalue" is a linguistic hybrid. In German the term is "eigenwert". One Web dictionary says the English translation is "intrinsic value". But after a few decades using terms like "characteristic value" and "latent root", mathematicians have given up trying to translate the entire word and generally translate only the second half.

The eigenvalue problem for a given matrix $A$ is the task of finding scalars $\lambda$ and, possibly, the corresponding vectors $x$ so that

$$Ax = \lambda x$$

It is important to distinguish the case where $A$ is symmetric. After the linear equation problem,

$$Ax = b$$

the eigenvalue problem is the most important computational task in numerical linear algebra.

The state of the art in numerical linear algebra 50 years ago, in the 1960's, did not yet provide reliable, efficient methods for solving matrix eigenvalue problems. Software libraries had a hodgepodge of methods, many of them based upon polynomial root finders.

Jim Wilkinson told a story about having two such subroutines, one using something called Bairstow's method and one using something called Laguerre's method. He couldn't decide which one was best, so he put them together in a program that first tried Bairstow's method, then printed a message and switched to Laguerre if Bairstow failed. After several months of frequent use at the National Physical Laboratory, he never saw a case where Bairstow failed. He wrote a paper with his recommendation for Bairstow's method and soon got a letter from a reader who claimed to have an example which Bairstow could not handle. Wilkinson tried the example with his program and found a bug. The program didn't print the message that it was switching to Laguerre.

Handbook for Automatic Computation

Over the period 1965-1970, Wilkinson and 18 of his colleagues published a series of papers in the Springer journal Numerische Mathematik. The papers offered procedures, written in Algol 60, for solving different versions of the linear equation problem and the eigenvalue problem. These were research papers presenting results about numerical stability, subtle details of implementation, and, in some cases, new methods.

In 1971, these papers were collected, sometimes with modifications, in the volume edited by Wilkinson and Reinsch. This book marks the first appearance is an organized library of the algorithms for the eigenvalue problem for dense, stored matrices that we still use in MATLAB today. The importance of using orthogonal transformations wherever possible was exposed by Wilkinson and the other authors. The effectiveness of the newly discovered QR algorithms of J. Francis and the related LR algorithm of H. Rutishauser had only recently been appreciated.

Contents

The Algol 60 procedures are the focus of each chapter. These codes remain a clear, readable reference for the important ideas of modern numerical linear algebra. Part I of the volume is about the linear equation problem; part II is about the eigenvalue problem. There are 40 procedures in part I and 43 in part II.

Here is a list of the procedures in Part II. A suffix 2 in the procedure name indicates that it computes both eigenvalues and eigenvectors. The "bak" procedures apply reduction transformations to eigenvectors.

Many of the procedures work with a reduced form, which is tridiagonal for symmetric or Hermitian matrices and Hessenberg (upper triangular plus one subdiagonal) for nonsymmetric matrices., Since Algol does not have a complex number data type, the complex arrays are represented by pairs of real arrays.

Symmetric matrices

Reduction to tridiagonal tred1, tred2, tred3, trbak1, trbak3 Orthogonal tridiagonalization. Band bandrd Tridiagonalization. symray Eigenvectors. bqr One eigenvalue. Tridiagonal imtql1,imtql2 All eigenvalues and vectors, implicit QR. tql1, tql2 All eigenvalues and vectors, explicit QR. ratqr Few eigenvalues, rational QR. bisect Few eigenvalues, bisection. tristurm Few eigenvalues and vectors. Few eigenvalues ritzit Simultaneous iteration. Jacobi method jacobi Jacobi method. Generalized problem, Ax = \lambda Bx reduc1, reduc2, rebaka, rebakb Symmetric A, positive definite B.

Nonsymmetric matrices

Reduction to Hessenberg balance, balbak Balance (scaling) dirhes, dirbak, dirtrans Elementary, accumulated innerprod. elmhes, elmbak, elmtrans Elementary. orthes, ortbak, ortrans Orthogonal. Band unsray Eigenvectors. Hessenberg hqr, hqr2 All eigenvalues and vectors, implicit QR. invit Few eigenvectors, inverse iteration. Norm reducing eigen Eigenvalues.

Complex matrices

comeig Norm reducing Jacobi. comhes, combak Reduce to Hessenberg form. comlr, comlr2 Complex LR algorithm. cxinvit Inverse iteration.

Preferred paths

The preferred path for finding all the eigenvalues of a real, symmetric matrix is tred1 followed by imtql1. The preferred path for finding all the eigenvalues and eigenvectors of a real, symmetric matrix is tred2 followed by imtql2.

The preferred path for finding all the eigenvalues of a real, nonsymmetric matrix is balanc, elmhes, and finally hqr. The preferred path for finding all the eigenvalues and eigenvectors of a real, nonsymmetric matrix is balanc, elmhes, elmtrans, hqr2, and finally balbak.

QR vs. QL

"QR" and "QL" are right- and left-handed, or forward and backward, versions of the same algorithm. The algorithm is usually described in terms of factoring a matrix into an orthogonal factor, Q, and an upper or right triangular factor, R. This leads to QR algorithms. But for reasons having to do with graded matrices and terminating a loop at 1 rather than n-1, the authors of the Handbook decided to use left triangular and QL algorithms.

Historical note

When the papers from Numerische Mathematik were collected to form the 1971 Handbook for Automatic Computation, Volume II, Linear Algebra, almost all of them where reprinted without change. But, despite what its footnote says, Contribution II/4, The Implicit QL Algorithm, never appeared in the journal. The paper is the merger of a half-page paper by Austin Dubrulle and earlier contributions by R.S. Martin and Wilkinson. Dubrulle was able to reduce the operation count in the inner loop.

The authors of Contribution II/4 of the Handbook are listed as A. Dubrulle, R.S.Martin, and J.H.Wilkinson, although the three of them never actually worked together. Proper credit is given, but I'm afraid that an interesting little bit of history has been lost.

Dubrulle's version of the implicit tridiagonal QR algorithm continues to be important today. In future posts I will describe how the Handbook led to EISPACK, LINPACK and LAPACK. In LAPACK the imtql1 and imtql2 functions are combined into one subroutine named DSTEQR.F The code for the inner loop is very similar to Austin's note.

Years after making this contribution, Austin returned to grad school and was one of my Ph.D. students at the University of New Mexico. His thesis was about the distribution of the singular value of the matrices derived from photographs.

\n'); d.write(code_string); // Add copyright line at the bottom if specified. if (copyright.length > 0) { d.writeln(''); d.writeln('%%'); if (copyright.length > 0) { d.writeln('% _' + copyright + '_'); } } d.write('\n'); d.title = title + ' (MATLAB code)'; d.close(); } -->


Get the MATLAB code (requires JavaScript)

Published with MATLAB® R2017a

. The first drafts of % papers are to be submitted by August, 2018. That long lead time % gives me the opportunity to write a detailed history of MATLAB. % I plan to write the paper in sections, which I'll post in % this blog as they are available. % % The mathematical basis for the first version of MATLAB begins % with a volume edited by J. H. Wilkinson and C. Reinsch, % _Handbook for Automatic Computation, Volume II, Linear Algebra_, % published by the eminent German publisher Springer-Verlag, in 1971. % % <> %% Eigenvalues % The mathematical term "eigenvalue" is a linguistic hybrid. In German % the term is "eigenwert". One Web dictionary says the English translation % is "intrinsic value". But after a few decades using terms like % "characteristic value" and "latent root", mathematicians have given up % trying to translate the entire word and generally translate only the % second half. %% % The _eigenvalue problem_ for a given matrix $A$ is the task of finding % scalars $\lambda$ and, possibly, the corresponding vectors $x$ so that % % $$Ax = \lambda x$$ % % It is important to distinguish the case where $A$ is symmetric. % After the _linear equation problem_, % % $$Ax = b$$ % % the eigenvalue problem is the most important computational task % in numerical linear algebra. %% % The state of the art in numerical linear algebra 50 years ago, in the % 1960's, did not yet provide reliable, efficient methods for solving % matrix eigenvalue problems. Software libraries had a hodgepodge % of methods, many of them based upon polynomial root finders. %% % told a story about having two such subroutines, one using % something called Bairstow's method and one using something called % Laguerre's method. He couldn't decide which one was best, % so he put them together in a program that first tried % Bairstow's method, then printed a message and switched to Laguerre % if Bairstow failed. After several months of frequent use at the National % Physical Laboratory, he never saw a case where Bairstow failed. % He wrote a paper with his recommendation for Bairstow's method and % soon got a letter from a reader who claimed to have an example which % Bairstow could not handle. Wilkinson tried the example with his program % and found a bug. The program didn't print the message that it was % switching to Laguerre. %% Handbook for Automatic Computation % Over the period 1965-1970, Wilkinson and 18 of his colleagues % published a series of % papers in the Springer journal _Numerische Mathematik_. % The papers offered procedures, written in Algol 60, for solving different % versions of the linear equation problem and the eigenvalue problem. % These were _research_ papers presenting results about numerical stability, % subtle details of implementation, and, in some cases, new methods. %% % In 1971, these papers were collected, sometimes with modifications, % in the volume edited by Wilkinson and Reinsch. This book marks the % first appearance is an organized library of the algorithms for the % eigenvalue problem for dense, stored matrices that we still use in % MATLAB today. The importance of using orthogonal transformations % wherever possible was exposed by Wilkinson and the other authors. The % effectiveness of the newly discovered QR algorithms of J. Francis % and the related LR algorithm of H. Rutishauser had only recently been % appreciated. %% Contents % The Algol 60 procedures are the focus of each chapter. These codes % remain a clear, readable reference for the important ideas of modern % numerical linear algebra. Part I of the volume is about the linear % equation problem; part II is about the eigenvalue problem. % There are 40 procedures in part I and 43 in part II. %% % Here is a list of the procedures in Part II. A suffix 2 in the procedure % name indicates that it computes both eigenvalues and eigenvectors. % The "bak" procedures apply reduction transformations to eigenvectors. %% % Many of the procedures work with a reduced form, which is tridiagonal for % symmetric or Hermitian matrices and Hessenberg (upper triangular plus % one subdiagonal) for nonsymmetric matrices., % Since Algol does not have a complex number data type, the complex % arrays are represented by pairs of real arrays. %% Symmetric matrices % % % Reduction to tridiagonal % tred1, tred2, tred3, trbak1, trbak3 Orthogonal tridiagonalization. % Band % bandrd Tridiagonalization. % symray Eigenvectors. % bqr One eigenvalue. % Tridiagonal % imtql1,imtql2 All eigenvalues and vectors, implicit QR. % tql1, tql2 All eigenvalues and vectors, explicit QR. % ratqr Few eigenvalues, rational QR. % bisect Few eigenvalues, bisection. % tristurm Few eigenvalues and vectors. % Few eigenvalues % ritzit Simultaneous iteration. % Jacobi method % jacobi Jacobi method. % Generalized problem, Ax = \lambda Bx % reduc1, reduc2, rebaka, rebakb Symmetric A, positive definite B. % %% Nonsymmetric matrices % % % Reduction to Hessenberg % balance, balbak Balance (scaling) % dirhes, dirbak, dirtrans Elementary, accumulated innerprod. % elmhes, elmbak, elmtrans Elementary. % orthes, ortbak, ortrans Orthogonal. % Band % unsray Eigenvectors. % Hessenberg % hqr, hqr2 All eigenvalues and vectors, implicit QR. % invit Few eigenvectors, inverse iteration. % Norm reducing % eigen Eigenvalues. % %% Complex matrices % % % comeig Norm reducing Jacobi. % comhes, combak Reduce to Hessenberg form. % comlr, comlr2 Complex LR algorithm. % cxinvit Inverse iteration. % %% Preferred paths % The preferred path for finding all the eigenvalues of a real, symmetric % matrix is _tred1_ followed by _imtql1_. % The preferred path for finding all the eigenvalues and eigenvectors % of a real, symmetric matrix is _tred2_ followed by _imtql2_. % % The preferred path for finding all the eigenvalues of a real, nonsymmetric % matrix is _balanc_, _elmhes_, and finally _hqr_. % The preferred path for finding all the eigenvalues and eigenvectors % of a real, nonsymmetric matrix is _balanc_, _elmhes_, _elmtrans_, _hqr2_, % and finally _balbak_. %% QR vs. QL % "QR" and "QL" are right- and left-handed, or forward and backward, % versions of the same algorithm. The algorithm is usually described in % terms of factoring a matrix into an orthogonal factor, Q, and an upper % or right triangular factor, R. This leads to QR algorithms. But for % reasons having to do with graded matrices and terminating a loop at |1| % rather than |n-1|, the authors of the Handbook decided to use left % triangular and QL algorithms. %% Historical note % When the papers from _Numerische Mathematik_ were collected to form the % 1971 _Handbook for Automatic Computation, Volume II, Linear Algebra_, % almost all of them where reprinted without change. But, despite what % its footnote says, Contribution II/4, _The Implicit QL Algorithm_, % never appeared in the journal. The paper is the merger of a half-page % paper by % and earlier contributions by R.S. Martin and Wilkinson. % Dubrulle was able to reduce the operation count in the inner loop. %% % The authors of Contribution II/4 of the Handbook are listed as % A. Dubrulle, R.S.Martin, and J.H.Wilkinson, although the three of them % never actually worked together. % Proper credit is given, but I'm afraid that an interesting little bit % of history has been lost. % % Dubrulle's version of the implicit tridiagonal QR algorithm continues to % be important today. In future posts I will describe how the Handbook % led to EISPACK, LINPACK and LAPACK. % In LAPACK the _imtql1_ and _imtql2_ functions are combined into one % subroutine named % % The code for the inner loop is very similar to Austin's note. % % Years after making this contribution, Austin returned to grad school % and was one of my Ph.D. students at the University of New Mexico. % His thesis was about the distribution of the singular value of the % matrices derived from photographs. ##### SOURCE END ##### 26cefaedad044f33a20d375f994408b9 -->

Leslie Fox

Mon, 11/20/2017 - 22:30

Leslie Fox (1918-1992) is a British numerical analyst who was a contemporary of three men who played such an important role in my education and early professional life, John Todd, George Forsythe, and Jim Wilkinson. Although I saw less of Fox than I did of the others, he was still an important influence in my life.

ContentsNPL

The National Physical Laboratory, in Teddington, England, just west of London, is the British equivalent of the U.S. agency that was originally called the National Bureau of Standards and is now known as the National Institute of Science and Technology. Fox worked at NPL from 1945 until 1956. Wilkinson worked there his entire professional life. Despite the fact that Fox was from Oxford and Wilkinson was from Cambridge, they became close friends.

In the first few years after World War II, both NPL and NBS sponsored the development of first generation stored program electronic computers. An initial design for the NPL machine, the Automatic Computing Engine, was proposed by Alan Turing. But the design was based in part on secret work that Turing had done during the War for breaking German military codes. The management of NPL at the time was not allowed access to Turing's secret work and judged the machine to be impractical with current technology. Disillusioned, Turing left NPL and went to the University of Manchester, where they were also building a computer.

It fell to one of Turing's colleagues, Jim Wilkinson, to take over the project and build a simplified version of the ACE, the Pilot Ace. Wilkinson went on to become the world's leading authority on matrix computation. He was my inspiration and friend. His work provided the mathematical basis for the first MATLAB. Fox was less involved in the hardware development, but he was certainly an early user of the Pilot Ace.

NBS

Meanwhile in U.S., NBS was sponsoring the development of two early machines, one at its headquarters in Maryland, the Standards Eastern Automatic Computer, or SEAC, and one in southern California, at UCLA, the Standards Western Automatic Computer, or SWAC.

NBS also established a research group at UCLA, the Institute for Numerical Analysis, or INA, to contribute to the work on the SWAC. Two members of the INA were John Todd, who a few years later as a faculty member at Caltech, would introduce me to numerical analysis and computing, and George Forsythe, who would go on to establish the Computer Science Department at Stanford and also be my Ph.D. thesis advisor.

The groups at NPL in England and INA in California were in close contact with each other and often exchanged visits. As a result, the four men -- Fox, Wilkinson, Todd, and Forsythe -- became friends and colleagues and leading figures in the budding field that today would be called computational science.

Modern Computing Methods

When I took my first numerical analysis course from Todd at Caltech in 1959, we didn't have a regular textbook. There weren't many to choose from. Todd was in the process of writing what would become his Survey of Numerical Analysis. He passed out purple dittoed copies of an early draft of the book. (Today, Amazon has 19 used copies available for $8.23 with free shipping.)

In lieu of a formal textbook, Todd also referred to this 130-page paper bound pamphlet. Its title says "Modern", but it was first published 60 years ago, in 1957. Here is a link to the hardcover second addition, Modern Computing Methods. (Amazon has 3 used copies available for $7.21, plus $3.99 shipping.)

The publisher of the first edition is listed as the picturesque Her Majesty's Stationery Office. But there are no authors listed. The source is just the National Physical Laboratory. For many years I believed the authors were Fox and Wilkinson. Now, while writing this post, I have come across this review by Charles Saltzer in SIAM Review. The authors are revealed to be "E. T. Goodwin et al". Goodwin was another member of the NPL group. So, I suspect that Fox and Wilkinson were that " et al ".

Oxford

Fox resigned from NPL in 1956 and, after a year in the U.S. visiting U. C. Berkeley, joined the faculty at his old school, Oxford. He remained there the rest of his life. He established the Oxford Computing Laboratory. In 1963 he was appointed Professor of Numerical Analysis and Fellow of Balliol College. (My good friend Nick Trefethen now holds both of those appointments.)

Fox was a prolific author. The bibliography in Leslie Fox booklet lists eight books. He was also a popular professor. The same booklet lists 18 D. Phil. students that he supervised and 71 D. Phil. students where he a member of the committee.

NAG

The Numerical Algorithms Group is a British company, founded in 1970 and with headquarters in Oxford, which markets mathematical software, consulting, and related services. Fox helped the company get started and served on its Board of Directors for many years.

Householder VIII

Fox hosted the eighth Householder Symposium at Oxford in 1981. Here is the group photo. Fox is in the center, in the brown suit. (I am in the second row, second from left, in the blue shirt.)

Fox Prize

The Leslie Fox Prize for Numerical Analysis is awarded every two years by the IMA, the British Institute of Mathematics. The prize honors "young numerical analysts worldwide", that is anyone who is less than 31 years old. One or two first prize winners and usually several second prize winners are announced at symposia in the UK in June of odd numbered years. Eighteen meetings and prize announcements have been held since the award was established in 1985. The 2017 call for papers is IMA Leslie Fox Prize.

The Prize Winners for 2017 were announced in a meeting in Glasgow on June 26. The first prize was awarded to Nicole Spillain of Ecole Polytechnique. Her winning paper was An Adaptive Multipreconditioned Conjugate Gradient Algorithm.

Blue Plaque

Wikipedia explains that "a blue plaque is a permanent sign installed in a public place in the United Kingdom and elsewhere to commemorate a link between that location and a famous person or event, serving as a historical marker." In June, two blue plaques were unveiled in the Dewsbury Railroad Station, in West Yorkshire, about 40 miles west of Manchester. (Google maps tells me that today trains to Manchester leave every half hour and usually can make the trip in 40 minites.)

Photo credit: Huddersfield Daily Examiner

The plaques honor two computer scientists, Leslie Fox and Tom Kilburn. Here is a local newspaper story covering the unveiling. Huddersfield Daily Examiner. It turns out the two were school mates in the local grammar school.

Tom Kilburn worked with Freddie Williams at the University of Manchester to develop the Williams-Kilburn Tube, an early random access memory employing the static charge left by a grid of dots on a cathode ray tube. The scheme provided fast access time, but required constant tuning.

Kilburn and Williams needed a stored program digital computer to test their storage device. So they built a relatively simple machine that came to be known as "Manchester Baby". One day in June, 1948, on the train from Dewsbury to Manchester, Kilburn wrote a program to find the prime factors of an integer. The code was probably only a few dozen machine instructions. When the team got the machine in shape to actually run the program it worked correctly the first time. This was the world's first working computer program, and first instance of mathematical software.

Kilburn became the Professor of Computer Science at Manchester about the same time as Fox became the Professor of Numerical Analysis at Oxford.

Further Reading

Two articles about Leslie Fox are available on the Web at Wikipedia and at the MacTutor History of Mathematics Archive.

The booklet prepared by Oxford for his memorial service is available at Leslie Fox booklet.

\n'); d.write(code_string); // Add copyright line at the bottom if specified. if (copyright.length > 0) { d.writeln(''); d.writeln('%%'); if (copyright.length > 0) { d.writeln('% _' + copyright + '_'); } } d.write('\n'); d.title = title + ' (MATLAB code)'; d.close(); } -->


Get the MATLAB code (requires JavaScript)

Published with MATLAB® R2017a

, % , and % . % Although I saw less of Fox than I did of the others, he was still % an important influence in my life. % % <> %% NPL % The National Physical Laboratory, in Teddington, England, just west % of London, is the British equivalent of the U.S. agency that was % originally called the National Bureau of Standards and is now known as % the National Institute of Science and Technology. Fox worked at % NPL from 1945 until 1956. Wilkinson worked there his entire % professional life. Despite the fact that Fox was from Oxford and % Wilkinson was from Cambridge, they became close friends. %% % In the first few years after World War II, both NPL and NBS sponsored % the development of first generation stored program electronic computers. % An initial design for the NPL machine, the % , was proposed by Alan Turing. But the % design was based in part on secret work that Turing had done during the % War for breaking German military codes. The management of NPL at % the time was not allowed access to Turing's secret work and judged the % machine to be impractical with current technology. Disillusioned, Turing % left NPL and went to the University of Manchester, where they were also % building a computer. %% % It fell to one of Turing's colleagues, Jim Wilkinson, to take over the % project and build a simplified version of the ACE, the % . Wilkinson went % on to become the world's leading authority on matrix computation. % He was my inspiration and friend. His work provided the mathematical % basis for the first MATLAB. Fox was less involved in the hardware % development, but he was certainly an early user of the Pilot Ace. %% NBS % Meanwhile in U.S., NBS was sponsoring the development of two early % machines, one at its headquarters in Maryland, the Standards Eastern % Automatic Computer, or % , % and one in southern California, at UCLA, the Standards Western % Automatic Computer, or % . %% % NBS also established a research group at UCLA, the Institute for % Numerical Analysis, or % , to contribute to the work on the SWAC. % Two members of the INA were John Todd, who a few years later as a % faculty member at Caltech, would introduce me to numerical analysis % and computing, and George Forsythe, who would go on to establish % the Computer Science Department at Stanford and also be my Ph.D. thesis % advisor. %% % The groups at NPL in England and INA in California were in close % contact with each other and often exchanged visits. As a result, the % four men REPLACE_WITH_DASH_DASH Fox, Wilkinson, Todd, and Forsythe REPLACE_WITH_DASH_DASH became friends and % colleagues and leading figures in the budding field that today would % be called computational science. %% Modern Computing Methods % When I took my first numerical analysis course from Todd at Caltech % in 1959, we didn't have a regular textbook. There weren't many to % choose from. Todd was in the process of writing what would become his % . He passed out purple dittoed copies of % an early draft of the book. (Today, Amazon has 19 used copies available % for $8.23 with free shipping.) %% % In lieu of a formal textbook, Todd also referred to this 130-page % paper bound pamphlet. Its title says "Modern", but it was first % published 60 years ago, in 1957. % Here is a link to the hardcover second addition, % . (Amazon has 3 used copies available for % $7.21, plus $3.99 shipping.) % % <> %% % The publisher of the first edition is listed as the picturesque % _Her Majesty's Stationery Office_. But there are no authors listed. % The source is just the National Physical Laboratory. For many years % I believed the authors were Fox and Wilkinson. Now, while writing this % post, I have come across % by % Charles Saltzer in _SIAM Review_. The authors are revealed to be % "E. T. Goodwin _et al_". Goodwin was another member of the NPL group. % So, I suspect that Fox and Wilkinson were that " _et al_ ". %% Oxford % Fox resigned from NPL in 1956 and, after a year in the U.S. visiting % U. C. Berkeley, joined the faculty at his old school, Oxford. % He remained there the rest of his life. He established the % Oxford Computing Laboratory. In 1963 he was appointed Professor of % Numerical Analysis and Fellow of Balliol College. (My good friend % Nick Trefethen now holds both of those appointments.) %% % Fox was a prolific author. The bibliography in % lists eight books. % He was also a popular professor. The same booklet lists 18 D. Phil. % students that he supervised and 71 D. Phil. students where he a % member of the committee. %% NAG % The is a British % company, founded in 1970 and with headquarters in Oxford, which markets % mathematical software, consulting, and related services. Fox helped % the company get started and served on its Board of Directors for many % years. %% Householder VIII % Fox hosted the eighth Householder Symposium at Oxford in 1981. % Here is the group photo. Fox is in the center, in the brown suit. % (I am in the second row, second from left, in the blue shirt.) % % <> % %% Fox Prize % The % is awarded every two years % by the IMA, the British Institute of Mathematics. The prize honors % "young numerical analysts worldwide", that is anyone who is less than % 31 years old. One or two first prize winners and usually several second % prize winners are announced at symposia in the UK in June of odd numbered % years. Eighteen meetings and prize announcements have been held since % the award was established in 1985. The 2017 call for papers is % . %% % The were announced in a meeting in Glasgow % on June 26. The first prize was awarded to % of Ecole Polytechnique. % Her winning paper was % . %% Blue Plaque % Wikipedia explains that "a is a permanent sign installed in a public place in the % United Kingdom and elsewhere to commemorate a link between that location % and a famous person or event, serving as a historical marker." % In June, two blue plaques were unveiled in the Dewsbury Railroad % Station, in West Yorkshire, about 40 miles west of Manchester. % (Google maps tells me that today trains to Manchester leave every half % hour and usually can make the trip in 40 minites.) % % <> % % Photo credit: Huddersfield Daily Examiner %% % The plaques honor two computer scientists, Leslie Fox and Tom Kilburn. % Here is a local newspaper story covering the unveiling. % . It turns out the two were school mates in % the local grammar school. %% % worked with % Freddie Williams at the University of Manchester to develop the % , % an early random access memory employing the static charge left by % a grid of dots on a cathode ray tube. The scheme provided fast access % time, but required constant tuning. % % Kilburn and Williams needed a stored program digital computer to test % their storage device. So they built a relatively simple machine that % came to be known as % . % One day in June, 1948, on the train from Dewsbury to Manchester, % Kilburn wrote a program to find the prime factors of an integer. % The code was probably only a few dozen machine instructions. % When the team got the machine in shape to actually run the program % it worked correctly the first time. This was the world's first working % computer program, and first instance of mathematical software. % % Kilburn became the Professor of Computer Science at Manchester about % the same time as Fox became the Professor of Numerical Analysis at % Oxford. %% Further Reading % Two articles about Leslie Fox are available on the Web at % and at the % . % % The booklet prepared by Oxford for his memorial service is available at % . ##### SOURCE END ##### ed1184c94b7b4a1da68df57a87340027 -->

Three-Term Recurrence Relations and Bessel Functions

Mon, 11/06/2017 - 22:30
d

Three-term recurrence relations are the basis for computing Bessel functions.

ContentsA Familiar Three-Term Recurrence

Start with two large integers, say 514229 and 317811. For reasons that will soon become clear, I'll label them $f_{30}$ and $f_{29}$.

clear n = 30 f(30) = 514229; f(29) = 317811 n = 30 f = Columns 1 through 6 0 0 0 0 0 0 Columns 7 through 12 0 0 0 0 0 0 Columns 13 through 18 0 0 0 0 0 0 Columns 19 through 24 0 0 0 0 0 0 Columns 25 through 30 0 0 0 0 317811 514229

Now take repeated differences, and count backwards.

$$ f_{n-1} = f_{n+1} - f_n, \ n = 29, 28, ..., 2, 1 $$

for n = 29:-1:2 f(n-1) = f(n+1) - f(n); end n f n = 2 f = Columns 1 through 6 0 1 1 2 3 5 Columns 7 through 12 8 13 21 34 55 89 Columns 13 through 18 144 233 377 610 987 1597 Columns 19 through 24 2584 4181 6765 10946 17711 28657 Columns 25 through 30 46368 75025 121393 196418 317811 514229

Recognize those integers? They're the venerable Fibonacci numbers. I've just reversed the usual procedure for computing them. And, I didn't pick the first two values at random. In fact, their choice is very delicate. Try picking any other six-digit integers, and see if you can take 30 differences before you hit a negative value. Of course, we usually start with $f_0 = 0$, $f_1 = 1$ and go forward with additions.

The relation

$$ f_n = f_{n+1} - f_{n-1} $$

is an example of a three-term recurrence. I've written it in such a way that I can go either forward or backward.

Friedrich Bessel

Friedrich Bessel was a German astronomer, mathematician and physicist who lived from 1784 until 1846. He was a contemporary of Gauss, but the two had some sort of falling out. Although he did not have a university education, he made major contributions to precise measurements of planetary orbits and distances to stars. He has a crater on the Moon and an asteroid named after him.

In mathematics, the special functions that bear his name were actually discovered by someone else, Daniel Bernoulli, and christened after Bessel's death. They play the same role in polar coordinates that the trig functions play in Cartesian coordinates. They are applicable in a wide range of fields, from physics to finance.

Bessel Functions

Bessel functions are defined by an ordinary differential equation that generalizes the harmonic oscillator to polar coordinates. The equation contains a parameter $n$, the order.

$$ x^2 y'' + x y' + (x^2 - n^2) y = 0 $$

Solutions to this equation are also sometimes called cylindrical harmonics.

For my purposes here, their most important property is the three-term recurrence relation:

$$ \frac{2n}{x} J_n(x) = J_{n-1}(x) + J_{n+1}(x) $$

This has a variable coefficient that depends upon the two parameters, the order $n$ and the argument $x$.

To complete the specification, it would be necessary to supply two initial conditions that depend upon $x$. But, as we shall see, using the recurrence in the forward direction is disastrous numerically. Alternatively, if we could somehow supply two end conditions, we could use the recurrence in the reverse direction. This the idea behind a classical method known as Miller's algorithm.

Miller's Algorithm

Miller's algorithm also employs this interesting identity, which I'll call the normalizer.

$$ J_0(x) + 2 J_2(x) + 2 J_4(x) + 2 J_6(x) + \ ... = 1 $$

It's not obvious where this identity comes from and I won't try to derive it here.

We don't know the end conditions, but we can pick arbitrary values, run the recurrence backwards, and then use the normalizer to rescale.

Step 1. Pick a large $N$ and temporarily let

$$ \tilde{J}_N = 0 $$

$$ \tilde{J}_{N-1} = 1 $$

Then for $n = N-1, ..., 1$ compute

$$ \tilde{J}_{n-1} = 2n/x \tilde{J}_n \ - \tilde{J}_{n+1} $$

Step 2. Normalize:

$$ \sigma = \tilde{J}_0 + 2 \tilde{J}_2 + 2 \tilde{J}_4 + ... $$

$$ J_n = \tilde{J}_n/\sigma, \ n = 0, ..., N $$

I'll show some results in a moment, after I introduce an alternative formulation.

Matrix Formulation

You know that I like to describe algorithms in matrix terms. The three-term recurrence generates a tridiagonal matrix. But which three diagonals? I can put the three below the diagonal, above the diagonal, or centered on the diagonal. This is accomplished with the second argument of the sparse matrix generator, spdiags. Here is the code.

type blt function B = blt(n,x,loc) % BesseL Three term recurence. % B = blt(n,x,loc) is an n-by-n sparse tridiagonal coefficent matrix % that generates the Bessel functions besselj((0:n-1)',x). % loc specifies the location of the diagonals. % loc = 'center', the default, centers the three diagonals. % loc = 'lower' puts the three diagonals in the lower triangle. % loc = 'upper' puts the three diagonals in the upper triangle. if nargin == 2 loc = 'center'; end switch loc case 'center', locd = -1:1; case 'lower', locd = -2:0; case 'upper', locd = 0:2; end % Three term recurrence: 2*n/x * J_n(x) = J_(n-1)(x) + J_n+1(x). d = -2*(0:n-1)'/x; e = ones(n,1); B = spdiags([e d e],locd,n,n); end Lower

If there were no roundoff error, and if we knew the values of the first two Bessel functions, $J_0(x)$ and $J_1(x)$, we could use backslash, the linear equation solver, with the diagonals in the lower triangle to compute $J_n(x)$ for $n > 1$. Here, for example, is $x = 1$.

format long J = @(n,x) besselj(n,x); n = 8 x = 1.0 B = blt(n,x,'lower'); rhs = zeros(n,1); rhs(1:2) = J((0:1)',x); v = B\rhs n = 8 x = 1 v = 0.765197686557967 0.440050585744934 0.114903484931901 0.019563353982668 0.002476638964110 0.000249757730213 0.000020938338022 0.000001502326053

These values are accurate to almost all the digits shown. Except for the last one. It's beginning to show the effects of severe numerical instability. Let's try a larger value of n.

n = 18 B = blt(n,x,'lower'); rhs = zeros(n,1); rhs(1:2) = J((0:1)',x); v = B\rhs; semilogy(0:n-1,v,'o-') title('J_n(1)') xlabel('n') n = 18

The values beyond about n = 9 are suspect and, in fact, totally wrong. The forward elimination that backslash uses to solve this lower triangular system is just the three-term recurrence for increasing n. And that recurrence has two solutions -- the one we're trying to compute and a second one corresponding to $Y_n(x)$, the Bessel function of second kind. $Y_n(x)$ is stimulated by roundoff error and completely takes over for $n > 9$.

So the lower triangular option -- the forward recurrence -- is unsatisfactory for two reasons: it requires two Bessel values to get started and it is unstable.

Upper

We need to make one modification to the upper triangular coefficient matrix.

n = 30; B = blt(n,x,'upper'); B(n-1,n) = 0;

Now the last two rows and columns are a 2-by-2 identity.

full(B(n-1:n,n-1:n)) ans = 1 0 0 1

The back substitution process just runs the three-term recursion backwards. We need to start with $J_{n-1}(x)$ and $J_{n-2}(x)$.

rhs = zeros(n,1); rhs(n-1:n) = J((n-2:n-1)',x); format long e v = B\rhs v = 7.651976865579656e-01 4.400505857449330e-01 1.149034849319004e-01 1.956335398266838e-02 2.476638964109952e-03 2.497577302112342e-04 2.093833800238925e-05 1.502325817436807e-06 9.422344172604491e-08 5.249250179911870e-09 2.630615123687451e-10 1.198006746303136e-11 4.999718179448401e-13 1.925616764480172e-14 6.885408200044221e-16 2.297531532210343e-17 7.186396586807488e-19 2.115375568053260e-20 5.880344573595754e-22 1.548478441211652e-23 3.873503008524655e-25 9.227621982096665e-27 2.098223955943776e-28 4.563424055950103e-30 9.511097932712488e-32 1.902951751891381e-33 3.660826744416801e-35 6.781552053554108e-37 1.211364502417112e-38 2.089159981718163e-40

These values are accurate to almost all the significant digits shown. The backwards recursion suppresses the unwanted $Y_n(x)$ and the process is stable.

Center

Can we avoid having to know those two starting values? Yes, by using a matrix formulation of the Miller algorithm. Insert the normalization condition in the first row. Now $B$ is tridiagonal, except for first row, which is

[1 0 2 0 2 0 2 ... ]

Other rows have $1$ s on the super- and subdiagonal and $-2n/x$ on the diagonal of the $(n+1)$ -st row.

n = 30; B = blt(n,x); B(1,1:2) = [1 0]; B(1,3:2:n) = 2; spy(B) title('B')

The beauty of this approach is that it does not require any values of the Bessel function a priori. The right hand side is simply a unit vector.

rhs = eye(n,1); format long e v = B\rhs semilogy(0:n-1,v,'o-') xlabel('n') title('J_n(1)') v = 7.651976865579665e-01 4.400505857449335e-01 1.149034849319005e-01 1.956335398266840e-02 2.476638964109954e-03 2.497577302112343e-04 2.093833800238926e-05 1.502325817436807e-06 9.422344172604494e-08 5.249250179911872e-09 2.630615123687451e-10 1.198006746303136e-11 4.999718179448401e-13 1.925616764480171e-14 6.885408200044220e-16 2.297531532210343e-17 7.186396586807487e-19 2.115375568053260e-20 5.880344573595753e-22 1.548478441211652e-23 3.873503008524655e-25 9.227621982096663e-27 2.098223955943775e-28 4.563424055950102e-30 9.511097932712487e-32 1.902951751891381e-33 3.660826744416762e-35 6.781552053355331e-37 1.211364395117367e-38 2.088559301926495e-40 Relative error w = J((0:n-1)',x); relerr = (v - w)./w; semilogy(0:n-1,abs(relerr) + eps*(v==w),'o-') xlabel('n') title('relative error')

The relative error deteriorates for the last few values of n. This is not unexpected because we have effectively truncated a semi-infinite matrix.

Triangular Factors

The LU decomposition of $B$ shows no pivoting for this value of $x$, but there is fill-in. The normalization condition is mixing in with the recurrence.

[L,U] = lu(B); spy(L) title('L') snapnow spy(U) title('U') References

J. C. P. Miller, "On the choice of standard solutions for a homogeneous linear equation of the second order", Quart. J. Mech. Appl. Math. 3, 1950, 225-235.

Walter Gautschi, "Computational Aspects of Three-Term Recurrence Relations", SIAM Review 9, 1967, 24-82.

\n'); d.write(code_string); // Add copyright line at the bottom if specified. if (copyright.length > 0) { d.writeln(''); d.writeln('%%'); if (copyright.length > 0) { d.writeln('% _' + copyright + '_'); } } d.write('\n'); d.title = title + ' (MATLAB code)'; d.close(); } -->


Get the MATLAB code (requires JavaScript)

Published with MATLAB® R2017a

. I've just reversed the usual procedure for % computing them. And, I didn't pick the first two values % at random. In fact, their choice is very delicate. Try picking % any other six-digit integers, and see if you can take 30 differences % before you hit a negative value. Of course, we usually start with % $f_0 = 0$, $f_1 = 1$ and go forward with additions. %% % The relation % % $$ f_n = f_{n+1} - f_{n-1} $$ % % is an example of a _three-term recurrence_. I've written it in such % a way that I can go either forward or backward. %% Friedrich Bessel % Friedrich Bessel was a German astronomer, mathematician and physicist % who lived from 1784 until 1846. % He was a contemporary of Gauss, but the two had some sort of falling % out. Although he did not have a university education, he made major % contributions to precise measurements of planetary orbits and % distances to stars. He has a crater on the Moon and an asteroid % named after him. %% % In mathematics, the special functions that bear his name were actually % discovered by someone else, Daniel Bernoulli, and christened after % Bessel's death. They play the same role in polar coordinates that the % trig functions play in Cartesian coordinates. They are applicable % in a wide range of fields, from physics to finance. %% Bessel Functions % Bessel functions are defined by an ordinary differential % equation that generalizes the harmonic oscillator to polar % coordinates. The equation contains a parameter $n$, the _order_. % % $$ x^2 y'' + x y' + (x^2 - n^2) y = 0 $$ % % Solutions to this equation are also sometimes called % _cylindrical harmonics_. %% % For my purposes here, their most important property is the % three-term recurrence relation: % % $$ \frac{2n}{x} J_n(x) = J_{n-1}(x) + J_{n+1}(x) $$ % % This has a variable coefficient that depends upon the two parameters, % the order $n$ and the argument $x$. %% % To complete the specification, it would be necessary to supply two % initial conditions that depend upon $x$. % But, as we shall see, using the recurrence in the forward direction % is disastrous numerically. % Alternatively, if we could somehow supply two end conditions, % we could use the recurrence in the reverse direction. % This the idea behind a classical method known as Miller's algorithm. %% Miller's Algorithm % Miller's algorithm also employs this interesting identity, % which I'll call the _normalizer_. % % $$ J_0(x) + 2 J_2(x) + 2 J_4(x) + 2 J_6(x) + \ ... = 1 $$ % % It's not obvious where this identity comes from and I won't try to % derive it here. %% % We don't know the end conditions, but we can pick arbitrary values, % run the recurrence backwards, and then use the normalizer to rescale. %% % Step 1. Pick a large $N$ and temporarily let % % $$ \tilde{J}_N = 0 $$ % % $$ \tilde{J}_{N-1} = 1 $$ % % Then for $n = N-1, ..., 1$ compute % % $$ \tilde{J}_{n-1} = 2n/x \tilde{J}_n \ - \tilde{J}_{n+1} $$ % %% % Step 2. Normalize: % % $$ \sigma = \tilde{J}_0 + 2 \tilde{J}_2 + 2 \tilde{J}_4 + ... $$ % % $$ J_n = \tilde{J}_n/\sigma, \ n = 0, ..., N $$ %% % I'll show some results in a moment, after I introduce an % alternative formulation. %% Matrix Formulation % You know that I like to describe algorithms in matrix terms. % The three-term recurrence generates a tridiagonal matrix. % But which three diagonals? I can put the three below the diagonal, % above the diagonal, or centered on the diagonal. This is accomplished % with the second argument of the sparse matrix generator, |spdiags|. % Here is the code. type blt %% Lower % If there were no roundoff error, and if we knew the values of the % first two Bessel functions, $J_0(x)$ and $J_1(x)$, we could use % backslash, the linear equation solver, with the diagonals in the % lower triangle to compute $J_n(x)$ for $n > 1$. Here, for example, % is $x = 1$. format long J = @(n,x) besselj(n,x); n = 8 x = 1.0 B = blt(n,x,'lower'); rhs = zeros(n,1); rhs(1:2) = J((0:1)',x); v = B\rhs %% % These values are accurate to almost all the digits shown. % Except for the last one. It's beginning to show the effects of % severe numerical instability. Let's try a larger value of |n|. n = 18 B = blt(n,x,'lower'); rhs = zeros(n,1); rhs(1:2) = J((0:1)',x); v = B\rhs; semilogy(0:n-1,v,'o-') title('J_n(1)') xlabel('n') %% % The values beyond about |n = 9| are suspect and, in fact, totally % wrong. The forward elimination that backslash uses to solve this % lower triangular system is just the three-term recurrence for % increasing |n|. And that recurrence has two solutions REPLACE_WITH_DASH_DASH the one % we're trying to compute and a second one corresponding to $Y_n(x)$, % the _Bessel function of second kind_. $Y_n(x)$ is stimulated % by roundoff error and completely takes over for $n > 9$. %% % So the lower triangular option REPLACE_WITH_DASH_DASH the forward recurrence REPLACE_WITH_DASH_DASH is % unsatisfactory for two reasons: it requires two Bessel values to get % started and it is unstable. %% Upper % We need to make one modification to the upper triangular coefficient % matrix. n = 30; B = blt(n,x,'upper'); B(n-1,n) = 0; %% % Now the last two rows and columns are a 2-by-2 identity. full(B(n-1:n,n-1:n)) %% % The back substitution process just runs the three-term recursion % backwards. We need to start with $J_{n-1}(x)$ and $J_{n-2}(x)$. rhs = zeros(n,1); rhs(n-1:n) = J((n-2:n-1)',x); format long e v = B\rhs %% % These values are accurate to almost all the significant digits % shown. The backwards recursion suppresses the unwanted $Y_n(x)$ and % the process is stable. %% Center % Can we avoid having to know those two starting values? % Yes, by using a matrix formulation of the Miller algorithm. % Insert the normalization condition in the first row. % Now $B$ is tridiagonal, except for first row, which is % % [1 0 2 0 2 0 2 ... ] % % Other rows have $1$ s on the super- and subdiagonal and $-2n/x$ % on the diagonal of the $(n+1)$ -st row. n = 30; B = blt(n,x); B(1,1:2) = [1 0]; B(1,3:2:n) = 2; spy(B) title('B') %% % The beauty of this approach is that it does not require any % values of the Bessel function _a priori_. The right hand side % is simply a unit vector. rhs = eye(n,1); format long e v = B\rhs semilogy(0:n-1,v,'o-') xlabel('n') title('J_n(1)') %% Relative error w = J((0:n-1)',x); relerr = (v - w)./w; semilogy(0:n-1,abs(relerr) + eps*(v==w),'o-') xlabel('n') title('relative error') %% % The relative error deteriorates for the last few values of |n|. % This is not unexpected because we have effectively truncated % a semi-infinite matrix. %% Triangular Factors % The LU decomposition of $B$ shows no pivoting for this value of $x$, % but there is fill-in. The normalization condition is mixing in % with the recurrence. [L,U] = lu(B); spy(L) title('L') snapnow spy(U) title('U') %% References % J. C. P. Miller, "On the choice of standard solutions for a % homogeneous linear equation of the second order", % _Quart. J. Mech. Appl. Math._ 3, 1950, 225-235. % % Walter Gautschi, "Computational Aspects of Three-Term Recurrence % Relations", _SIAM Review_ 9, 1967, 24-82. % ##### SOURCE END ##### eac8550e3f7a4d2ea24b0951cd0803d1 -->

Three-Term Recurrence Relations and Bessel Functions

Mon, 11/06/2017 - 22:30
d

Three-term recurrence relations are the basis for computing Bessel functions.

ContentsA Familiar Three-Term Recurrence

Start with two large integers, say 514229 and 317811. For reasons that will soon become clear, I'll label them $f_{30}$ and $f_{29}$.

clear n = 30 f(30) = 514229; f(29) = 317811 n = 30 f = Columns 1 through 6 0 0 0 0 0 0 Columns 7 through 12 0 0 0 0 0 0 Columns 13 through 18 0 0 0 0 0 0 Columns 19 through 24 0 0 0 0 0 0 Columns 25 through 30 0 0 0 0 317811 514229

Now take repeated differences, and count backwards.

$$ f_{n-1} = f_{n+1} - f_n, \ n = 29, 28, ..., 2, 1 $$

for n = 29:-1:2 f(n-1) = f(n+1) - f(n); end n f n = 2 f = Columns 1 through 6 0 1 1 2 3 5 Columns 7 through 12 8 13 21 34 55 89 Columns 13 through 18 144 233 377 610 987 1597 Columns 19 through 24 2584 4181 6765 10946 17711 28657 Columns 25 through 30 46368 75025 121393 196418 317811 514229

Recognize those integers? They're the venerable Fibonacci numbers. I've just reversed the usual procedure for computing them. And, I didn't pick the first two values at random. In fact, their choice is very delicate. Try picking any other six-digit integers, and see if you can take 30 differences before you hit a negative value. Of course, we usually start with $f_0 = 0$, $f_1 = 1$ and go forward with additions.

The relation

$$ f_n = f_{n+1} - f_{n-1} $$

is an example of a three-term recurrence. I've written it in such a way that I can go either forward or backward.

Friedrich Bessel

Friedrich Bessel was a German astronomer, mathematician and physicist who lived from 1784 until 1846. He was a contemporary of Gauss, but the two had some sort of falling out. Although he did not have a university education, he made major contributions to precise measurements of planetary orbits and distances to stars. He has a crater on the Moon and an asteroid named after him.

In mathematics, the special functions that bear his name were actually discovered by someone else, Daniel Bernoulli, and christened after Bessel's death. They play the same role in polar coordinates that the trig functions play in Cartesian coordinates. They are applicable in a wide range of fields, from physics to finance.

Bessel Functions

Bessel functions are defined by an ordinary differential equation that generalizes the harmonic oscillator to polar coordinates. The equation contains a parameter $n$, the order.

$$ x^2 y'' + x y' + (x^2 - n^2) y = 0 $$

Solutions to this equation are also sometimes called cylindrical harmonics.

For my purposes here, their most important property is the three-term recurrence relation:

$$ \frac{2n}{x} J_n(x) = J_{n-1}(x) + J_{n+1}(x) $$

This has a variable coefficient that depends upon the two parameters, the order $n$ and the argument $x$.

To complete the specification, it would be necessary to supply two initial conditions that depend upon $x$. But, as we shall see, using the recurrence in the forward direction is disastrous numerically. Alternatively, if we could somehow supply two end conditions, we could use the recurrence in the reverse direction. This the idea behind a classical method known as Miller's algorithm.

Miller's Algorithm

Miller's algorithm also employs this interesting identity, which I'll call the normalizer.

$$ J_0(x) + 2 J_2(x) + 2 J_4(x) + 2 J_6(x) + \ ... = 1 $$

It's not obvious where this identity comes from and I won't try to derive it here.

We don't know the end conditions, but we can pick arbitrary values, run the recurrence backwards, and then use the normalizer to rescale.

Step 1. Pick a large $N$ and temporarily let

$$ \tilde{J}_N = 0 $$

$$ \tilde{J}_{N-1} = 1 $$

Then for $n = N-1, ..., 1$ compute

$$ \tilde{J}_{n-1} = 2n/x \tilde{J}_n \ - \tilde{J}_{n+1} $$

Step 2. Normalize:

$$ \sigma = \tilde{J}_0 + 2 \tilde{J}_2 + 2 \tilde{J}_4 + ... $$

$$ J_n = \tilde{J}_n/\sigma, \ n = 0, ..., N $$

I'll show some results in a moment, after I introduce an alternative formulation.

Matrix Formulation

You know that I like to describe algorithms in matrix terms. The three-term recurrence generates a tridiagonal matrix. But which three diagonals? I can put the three below the diagonal, above the diagonal, or centered on the diagonal. This is accomplished with the second argument of the sparse matrix generator, spdiags. Here is the code.

type blt function B = blt(n,x,loc) % BesseL Three term recurence. % B = blt(n,x,loc) is an n-by-n sparse tridiagonal coefficent matrix % that generates the Bessel functions besselj((0:n-1)',x). % loc specifies the location of the diagonals. % loc = 'center', the default, centers the three diagonals. % loc = 'lower' puts the three diagonals in the lower triangle. % loc = 'upper' puts the three diagonals in the upper triangle. if nargin == 2 loc = 'center'; end switch loc case 'center', locd = -1:1; case 'lower', locd = -2:0; case 'upper', locd = 0:2; end % Three term recurrence: 2*n/x * J_n(x) = J_(n-1)(x) + J_n+1(x). d = -2*(0:n-1)'/x; e = ones(n,1); B = spdiags([e d e],locd,n,n); end Lower

If there were no roundoff error, and if we knew the values of the first two Bessel functions, $J_0(x)$ and $J_1(x)$, we could use backslash, the linear equation solver, with the diagonals in the lower triangle to compute $J_n(x)$ for $n > 1$. Here, for example, is $x = 1$.

format long J = @(n,x) besselj(n,x); n = 8 x = 1.0 B = blt(n,x,'lower'); rhs = zeros(n,1); rhs(1:2) = J((0:1)',x); v = B\rhs n = 8 x = 1 v = 0.765197686557967 0.440050585744934 0.114903484931901 0.019563353982668 0.002476638964110 0.000249757730213 0.000020938338022 0.000001502326053

These values are accurate to almost all the digits shown. Except for the last one. It's beginning to show the effects of severe numerical instability. Let's try a larger value of n.

n = 18 B = blt(n,x,'lower'); rhs = zeros(n,1); rhs(1:2) = J((0:1)',x); v = B\rhs; semilogy(0:n-1,v,'o-') title('J_n(1)') xlabel('n') n = 18

The values beyond about n = 9 are suspect and, in fact, totally wrong. The forward elimination that backslash uses to solve this lower triangular system is just the three-term recurrence for increasing n. And that recurrence has two solutions -- the one we're trying to compute and a second one corresponding to $Y_n(x)$, the Bessel function of second kind. $Y_n(x)$ is stimulated by roundoff error and completely takes over for $n > 9$.

So the lower triangular option -- the forward recurrence -- is unsatisfactory for two reasons: it requires two Bessel values to get started and it is unstable.

Upper

We need to make one modification to the upper triangular coefficient matrix.

n = 30; B = blt(n,x,'upper'); B(n-1,n) = 0;

Now the last two rows and columns are a 2-by-2 identity.

full(B(n-1:n,n-1:n)) ans = 1 0 0 1

The back substitution process just runs the three-term recursion backwards. We need to start with $J_{n-1}(x)$ and $J_{n-2}(x)$.

rhs = zeros(n,1); rhs(n-1:n) = J((n-2:n-1)',x); format long e v = B\rhs v = 7.651976865579656e-01 4.400505857449330e-01 1.149034849319004e-01 1.956335398266838e-02 2.476638964109952e-03 2.497577302112342e-04 2.093833800238925e-05 1.502325817436807e-06 9.422344172604491e-08 5.249250179911870e-09 2.630615123687451e-10 1.198006746303136e-11 4.999718179448401e-13 1.925616764480172e-14 6.885408200044221e-16 2.297531532210343e-17 7.186396586807488e-19 2.115375568053260e-20 5.880344573595754e-22 1.548478441211652e-23 3.873503008524655e-25 9.227621982096665e-27 2.098223955943776e-28 4.563424055950103e-30 9.511097932712488e-32 1.902951751891381e-33 3.660826744416801e-35 6.781552053554108e-37 1.211364502417112e-38 2.089159981718163e-40

These values are accurate to almost all the significant digits shown. The backwards recursion suppresses the unwanted $Y_n(x)$ and the process is stable.

Center

Can we avoid having to know those two starting values? Yes, by using a matrix formulation of the Miller algorithm. Insert the normalization condition in the first row. Now $B$ is tridiagonal, except for first row, which is

[1 0 2 0 2 0 2 ... ]

Other rows have $1$ s on the super- and subdiagonal and $-2n/x$ on the diagonal of the $(n+1)$ -st row.

n = 30; B = blt(n,x); B(1,1:2) = [1 0]; B(1,3:2:n) = 2; spy(B) title('B')

The beauty of this approach is that it does not require any values of the Bessel function a priori. The right hand side is simply a unit vector.

rhs = eye(n,1); format long e v = B\rhs semilogy(0:n-1,v,'o-') xlabel('n') title('J_n(1)') v = 7.651976865579665e-01 4.400505857449335e-01 1.149034849319005e-01 1.956335398266840e-02 2.476638964109954e-03 2.497577302112343e-04 2.093833800238926e-05 1.502325817436807e-06 9.422344172604494e-08 5.249250179911872e-09 2.630615123687451e-10 1.198006746303136e-11 4.999718179448401e-13 1.925616764480171e-14 6.885408200044220e-16 2.297531532210343e-17 7.186396586807487e-19 2.115375568053260e-20 5.880344573595753e-22 1.548478441211652e-23 3.873503008524655e-25 9.227621982096663e-27 2.098223955943775e-28 4.563424055950102e-30 9.511097932712487e-32 1.902951751891381e-33 3.660826744416762e-35 6.781552053355331e-37 1.211364395117367e-38 2.088559301926495e-40 Relative error w = J((0:n-1)',x); relerr = (v - w)./w; semilogy(0:n-1,abs(relerr) + eps*(v==w),'o-') xlabel('n') title('relative error')

The relative error deteriorates for the last few values of n. This is not unexpected because we have effectively truncated a semi-infinite matrix.

Triangular Factors

The LU decomposition of $B$ shows no pivoting for this value of $x$, but there is fill-in. The normalization condition is mixing in with the recurrence.

[L,U] = lu(B); spy(L) title('L') snapnow spy(U) title('U') References

J. C. P. Miller, "On the choice of standard solutions for a homogeneous linear equation of the second order", Quart. J. Mech. Appl. Math. 3, 1950, 225-235.

Walter Gautschi, "Computational Aspects of Three-Term Recurrence Relations", SIAM Review 9, 1967, 24-82.

\n'); d.write(code_string); // Add copyright line at the bottom if specified. if (copyright.length > 0) { d.writeln(''); d.writeln('%%'); if (copyright.length > 0) { d.writeln('% _' + copyright + '_'); } } d.write('\n'); d.title = title + ' (MATLAB code)'; d.close(); } -->


Get the MATLAB code (requires JavaScript)

Published with MATLAB® R2017a

. I've just reversed the usual procedure for % computing them. And, I didn't pick the first two values % at random. In fact, their choice is very delicate. Try picking % any other six-digit integers, and see if you can take 30 differences % before you hit a negative value. Of course, we usually start with % $f_0 = 0$, $f_1 = 1$ and go forward with additions. %% % The relation % % $$ f_n = f_{n+1} - f_{n-1} $$ % % is an example of a _three-term recurrence_. I've written it in such % a way that I can go either forward or backward. %% Friedrich Bessel % Friedrich Bessel was a German astronomer, mathematician and physicist % who lived from 1784 until 1846. % He was a contemporary of Gauss, but the two had some sort of falling % out. Although he did not have a university education, he made major % contributions to precise measurements of planetary orbits and % distances to stars. He has a crater on the Moon and an asteroid % named after him. %% % In mathematics, the special functions that bear his name were actually % discovered by someone else, Daniel Bernoulli, and christened after % Bessel's death. They play the same role in polar coordinates that the % trig functions play in Cartesian coordinates. They are applicable % in a wide range of fields, from physics to finance. %% Bessel Functions % Bessel functions are defined by an ordinary differential % equation that generalizes the harmonic oscillator to polar % coordinates. The equation contains a parameter $n$, the _order_. % % $$ x^2 y'' + x y' + (x^2 - n^2) y = 0 $$ % % Solutions to this equation are also sometimes called % _cylindrical harmonics_. %% % For my purposes here, their most important property is the % three-term recurrence relation: % % $$ \frac{2n}{x} J_n(x) = J_{n-1}(x) + J_{n+1}(x) $$ % % This has a variable coefficient that depends upon the two parameters, % the order $n$ and the argument $x$. %% % To complete the specification, it would be necessary to supply two % initial conditions that depend upon $x$. % But, as we shall see, using the recurrence in the forward direction % is disastrous numerically. % Alternatively, if we could somehow supply two end conditions, % we could use the recurrence in the reverse direction. % This the idea behind a classical method known as Miller's algorithm. %% Miller's Algorithm % Miller's algorithm also employs this interesting identity, % which I'll call the _normalizer_. % % $$ J_0(x) + 2 J_2(x) + 2 J_4(x) + 2 J_6(x) + \ ... = 1 $$ % % It's not obvious where this identity comes from and I won't try to % derive it here. %% % We don't know the end conditions, but we can pick arbitrary values, % run the recurrence backwards, and then use the normalizer to rescale. %% % Step 1. Pick a large $N$ and temporarily let % % $$ \tilde{J}_N = 0 $$ % % $$ \tilde{J}_{N-1} = 1 $$ % % Then for $n = N-1, ..., 1$ compute % % $$ \tilde{J}_{n-1} = 2n/x \tilde{J}_n \ - \tilde{J}_{n+1} $$ % %% % Step 2. Normalize: % % $$ \sigma = \tilde{J}_0 + 2 \tilde{J}_2 + 2 \tilde{J}_4 + ... $$ % % $$ J_n = \tilde{J}_n/\sigma, \ n = 0, ..., N $$ %% % I'll show some results in a moment, after I introduce an % alternative formulation. %% Matrix Formulation % You know that I like to describe algorithms in matrix terms. % The three-term recurrence generates a tridiagonal matrix. % But which three diagonals? I can put the three below the diagonal, % above the diagonal, or centered on the diagonal. This is accomplished % with the second argument of the sparse matrix generator, |spdiags|. % Here is the code. type blt %% Lower % If there were no roundoff error, and if we knew the values of the % first two Bessel functions, $J_0(x)$ and $J_1(x)$, we could use % backslash, the linear equation solver, with the diagonals in the % lower triangle to compute $J_n(x)$ for $n > 1$. Here, for example, % is $x = 1$. format long J = @(n,x) besselj(n,x); n = 8 x = 1.0 B = blt(n,x,'lower'); rhs = zeros(n,1); rhs(1:2) = J((0:1)',x); v = B\rhs %% % These values are accurate to almost all the digits shown. % Except for the last one. It's beginning to show the effects of % severe numerical instability. Let's try a larger value of |n|. n = 18 B = blt(n,x,'lower'); rhs = zeros(n,1); rhs(1:2) = J((0:1)',x); v = B\rhs; semilogy(0:n-1,v,'o-') title('J_n(1)') xlabel('n') %% % The values beyond about |n = 9| are suspect and, in fact, totally % wrong. The forward elimination that backslash uses to solve this % lower triangular system is just the three-term recurrence for % increasing |n|. And that recurrence has two solutions REPLACE_WITH_DASH_DASH the one % we're trying to compute and a second one corresponding to $Y_n(x)$, % the _Bessel function of second kind_. $Y_n(x)$ is stimulated % by roundoff error and completely takes over for $n > 9$. %% % So the lower triangular option REPLACE_WITH_DASH_DASH the forward recurrence REPLACE_WITH_DASH_DASH is % unsatisfactory for two reasons: it requires two Bessel values to get % started and it is unstable. %% Upper % We need to make one modification to the upper triangular coefficient % matrix. n = 30; B = blt(n,x,'upper'); B(n-1,n) = 0; %% % Now the last two rows and columns are a 2-by-2 identity. full(B(n-1:n,n-1:n)) %% % The back substitution process just runs the three-term recursion % backwards. We need to start with $J_{n-1}(x)$ and $J_{n-2}(x)$. rhs = zeros(n,1); rhs(n-1:n) = J((n-2:n-1)',x); format long e v = B\rhs %% % These values are accurate to almost all the significant digits % shown. The backwards recursion suppresses the unwanted $Y_n(x)$ and % the process is stable. %% Center % Can we avoid having to know those two starting values? % Yes, by using a matrix formulation of the Miller algorithm. % Insert the normalization condition in the first row. % Now $B$ is tridiagonal, except for first row, which is % % [1 0 2 0 2 0 2 ... ] % % Other rows have $1$ s on the super- and subdiagonal and $-2n/x$ % on the diagonal of the $(n+1)$ -st row. n = 30; B = blt(n,x); B(1,1:2) = [1 0]; B(1,3:2:n) = 2; spy(B) title('B') %% % The beauty of this approach is that it does not require any % values of the Bessel function _a priori_. The right hand side % is simply a unit vector. rhs = eye(n,1); format long e v = B\rhs semilogy(0:n-1,v,'o-') xlabel('n') title('J_n(1)') %% Relative error w = J((0:n-1)',x); relerr = (v - w)./w; semilogy(0:n-1,abs(relerr) + eps*(v==w),'o-') xlabel('n') title('relative error') %% % The relative error deteriorates for the last few values of |n|. % This is not unexpected because we have effectively truncated % a semi-infinite matrix. %% Triangular Factors % The LU decomposition of $B$ shows no pivoting for this value of $x$, % but there is fill-in. The normalization condition is mixing in % with the recurrence. [L,U] = lu(B); spy(L) title('L') snapnow spy(U) title('U') %% References % J. C. P. Miller, "On the choice of standard solutions for a % homogeneous linear equation of the second order", % _Quart. J. Mech. Appl. Math._ 3, 1950, 225-235. % % Walter Gautschi, "Computational Aspects of Three-Term Recurrence % Relations", _SIAM Review_ 9, 1967, 24-82. % ##### SOURCE END ##### eac8550e3f7a4d2ea24b0951cd0803d1 -->

Two Other MATLABs, in Bangladesh and in Hindi

Mon, 10/23/2017 - 22:30

This post is about the words "Matlab" and "matlab", in upper and lower case, and without a trademark symbol. Matlab with a capital "M" is a district in Bangladesh. "matlab" with a lower case "m" is a common word in the Hindi language.

ContentsMatlab, in Bangladesh

Matlab is the name of both a rural district in central Bangladesh and it largest city. The population of the district is only about a quarter of a million people, compared to over 160 million in all of Bangladesh.

For over 50 years, since the early 1960s, Matlab has served as important laboratory for the study of health issues in impoverished rural areas. The studies began with a hospital aboard a barge floating around the cholera-prone rivers of the district. The Health and Demographic Surveillance System (HDSS) was established in 1966.

Almost everyone in Matlab has a personal heath identification number and records are kept of births, deaths, marriages, and various illnesses. The relative isolation of the area makes it a nearly self-contained population. Dozens of studies and hundreds of reports and academic papers detail all aspects of public health, especially fertility, reproductive health, maternal and child health, cause of death and child morbidity, health equity, and effects of climate change.

Matlab played a central role the development of ORS, oral rehydration salts, a home-made mixture of water, salt and molasses that prevents dehydration in children sick with cholera. The British medical journal The Lancet described ORS as "potentially the most important medical advance of this century."

photo credit <http://ghes.berkeley.edu>

matlab, in Hindi

In the Hindi language, the word "matlab" means "means", as in "What do you mean?". With English characters the word is spelled matlab, but is pronounced a little bit like mutlub. Let's try to get your browser to display Hindi chacters,

This video, by Anil Mihato, has an excellent explanation. YouTube video.

And here is an introduction to MATLAB in Hindi. MATLAB in Hindi.

\n'); d.write(code_string); // Add copyright line at the bottom if specified. if (copyright.length > 0) { d.writeln(''); d.writeln('%%'); if (copyright.length > 0) { d.writeln('% _' + copyright + '_'); } } d.write('\n'); d.title = title + ' (MATLAB code)'; d.close(); } -->


Get the MATLAB code (requires JavaScript)

Published with MATLAB® R2017a

> %% % Matlab is the name of both a rural district in central Bangladesh % and it largest city. The population of the district is only about a % quarter of a million people, compared to over 160 million in all of % Bangladesh. %% % For over 50 years, since the early 1960s, Matlab has served as important % laboratory for the study of health issues in impoverished rural areas. % The studies began with a hospital aboard a barge floating around % the cholera-prone rivers of the district. The Health and % Demographic Surveillance System (HDSS) was established in 1966. %% % Almost everyone in Matlab has a personal heath identification number % and records are kept of births, deaths, marriages, and various illnesses. % The relative isolation of the area makes it a nearly self-contained % population. Dozens of studies and hundreds of reports and academic % papers detail all aspects of public health, especially fertility, % reproductive health, maternal and child health, cause of death and % child morbidity, health equity, and effects of climate change. %% % Matlab played a central role the development of ORS, oral rehydration % salts, a home-made mixture of water, salt and molasses that prevents % dehydration in children sick with cholera. The British medical journal % _The Lancet_ described ORS as "potentially the most important medical % advance of this century." %% % % <> % % %% matlab, in Hindi % In the Hindi language, the word "matlab" means "means", as in % "What do you _mean_?". With English characters the word is spelled % _matlab_, but is pronounced a little bit like _mutlub_. % Let's try to get your browser to display Hindi chacters, % % <> %% % This video, by Anil Mihato, has an excellent explanation. % . %% % And here is an introduction to MATLAB in Hindi. % . ##### SOURCE END ##### 6a8630bce901459286954cbc2f307292 -->

My First Computer, the Burroughs 205 Datatron

Mon, 10/09/2017 - 22:30

The first computer that I seriously used was the Burroughs 205 Datatron. I actually used two different machines, one at Caltech for two years in 1959-61 and one at the University of Utah in the summer of 1960. Both the technology and the style of usage were wildly different from today's computers.

The blinking lights on the console made the B205 a featured player in many movies and TV shows during the late '50s and early '60s, including Batman and Lost in Space. Here is an animated gif on a reproduction of the console.

image credit: http://www.angelfire.com/scifi/B205

ContentsBurroughs Corporation

Wikipedia tells us that the American Arithmometer Company was founded in 1886 to manufacture and sell adding machines. The company name was changed to Burroughs Adding Machine in 1904 to honor the founder, William Seward Burroughs, who, coincidentally, was the grandfather of beat generation poet William S. Burroughs. At one time it was the largest adding machine in the country. The company was merged with Sperry Corporation to form Unisys in 1986.

In 1956 Burroughs got into the new computer business by acquiring Electro Data and began marketing their Datatron, which had been introduced two years earlier. Throughout the '50s, '60s, and '70s, Burroughs was a force in computing, particularly in the academic and banking markets. In the 1960's, IBM completely dominated the industry, but there were enough smaller companies that all together they were known as "IBM and the Seven Dwarfs."

Room Full

In 1959 the Burroughs 205 Datatron cost a few hundred thousand dollars and occupied an entire room. (Here is a 1957 price list). In the foreground of this photo are the console and an electric typewriter. In the background are a magnetic tape drive and several cabinets containing a total of about 1600 vacuum tubes.

I like to joke that this was a "personal computer". Only one person could use it at a time. We would sign up for machine time, usually in half-hour chunks, at all hours of the day or night.

Vacuum Tubes

This was before transistors and semiconductors. The primary logic unit was the electronic vacuum tube. Here is one plugin module with eight tubes. The computer worked in base 10 and this module held one decimal digit. Four of the tubes held one digit using the binary combinations 0 through 9. The binary values 10 through 15 were never used; if they showed up, it was an error condition and caused an immediate halt.

photo credit: http://www.cs.kent.edu/~rothstei/10051/history/UVa-page_files/ComputerHistory1.html

Drum

Before the magnetic core, a huge problem with early computers was memory. People knew how to produce the electronics for the arithmetic, logic and control units, but where do you store programs and data?

The 205 had drum memory. Think of the rotating drum in a small clothes dryer. Here is a photo of a machine that has not been used for years and that has not been stored carefully. The drum is at the lower right. Four registers, each containing ten of the single decimal digit modules, are at the upper left.

photo credit: http://www.angelfire.com/scifi/B205/ecd.html

The drum held 4000 ten-digit words. If you regard each digit as half a byte on a modern binary machine, that is 20K bytes. The words were stored in 20 bands of 200 words each. The drum's speed was 3570 rpm. So, on average, it would take about 1/30-th of a second to access the start of a band.

In addition to the primary drum storage, there were four "high speed" bands where a block of 20 words is duplicated ten times around the drum. This was sort of like cache memory on today's computers. One machine instruction would duplicate 20 words, usually a piece of a program, ten times around a high speed band where it could be accessed ten times as fast.

Caltech and Utah

I learned a little about the 205 during my sophomore year at Caltech in the spring of 1959, and more when I was a junior and took numerical analysis from Prof. John Todd in 1959-60. We used Caltech's 205 for some of the homework in that the class. Then I went back home to Salt Lake City for the summer of 1960. It turned out the University of Utah also had a 205 and they were happy to hire me for a summer job.

There were just four people working in Utah's computer center at the beginning of that summer -- a woman who served as director, me, and two part timers, a secretary and a grad student from EE who maintained the machine. Then, a few weeks into the summer, the director resigned. So I found myself -- two months before I turned 20 -- Acting Director of the University of Utah Computer Center.

I returned to Caltech in the fall, and had a chance to do individual study under Todd and more work on that 205 until I graduated in 1961. I did a project involving Hilbert matrices. It was my introduction to matrix computation.

Paper Tape

At first, at both Caltech and Utah, we did not have a compiler or an assembler. We wrote our programs in absolute numeric machine language and punched them on paper tape, like this.

The machine was designed so that op code 00 meant "read one word from paper tape into the current location and go to the next location". There was a button on the console that would erase the drum, set the location counter to 0000 and the op code to 00, and then start the tape reader. That served as the "operating system". It would read the entire contents of the paper tape onto the drum. When the tape ran off the end, you would turn off the reader, set the location counter back to 0000, and press start. That would begin executing the program you had just read from tape onto the drum.

Registers

There were five registers. The A register -- the accumulator -- is where arithmetic happens. It holds ten decimal digits, plus a sign bit. For example, one instruction is "Add the contents of memory location XXXX to the A register". The R register -- the remainder -- is an extension of the A register. Multiplication of two ten digit (fixed point) numbers produces a twenty digit result across A and R. Division produces a quotient in A and a remainder in R.

The C register -- the command register -- holds the instructions as they are being executed. The D register holds a single word as it is being transferred to or from the drum or external I/O equipment.

The B register -- the index register -- is only four digits wide. If the sign bit of an instruction is a 1, the contents of the B register is added to the address portion of the instruction, thereby providing an array index. The value of the B register is automatically increased or decreased by one after each indexing instruction.

Floating Point

A floating point arithmetic option was available for an additional \$12,500. (Floating point was also an option with the Intel chips used in the first PC's. The 8087 chip cost about \$150.) The floating point word has a two-digit exponent and an eight-digit fraction. So floating point numbers range from about 10^(-50) to 10^50 and have an accuracy of about 10^(-8).

Program

Here is an example program from the appendix of the machine manual. This writeup is much fancier than the programs we would actually write. It's on a coding sheet; we would just use lined yellow notebooks. We would never need to write out 6980, 6981, etc. And the Remarks are much more detailed. This example is intended to be carefully explanatory.

Skip on Minus

One of the programs that I wrote during the summer at the University of Utah involved a loop that was exactly 20 instructions long. That meant the entire loop could be loaded into the high speed memory. It didn't even need the jump instruction from the end of the loop back to the beginning. It would just continue execution from the duplicate of the code in the next block on the drum. The trouble was there was no way to exit the loop.

I needed a "Skip on Minus" instruction. There was a "Skip on Plus", but no "Skip on Minus". The EE grad student who maintained the machine was a guy named Joe. I wish I could remember his last name. Anyway, I told him I needed a "Skip on Minus" instruction. He said, "I'll one in". He disappeared behind the cabinets with circuit diagrams and a soldering iron. He called out, "the op code will be 51".

I wrote my program with a "51" instruction. It worked perfectly. And so the University of Utah had the only Burroughs 205 Datatron in the world with a "Skip on Minus" instruction.

\n'); d.write(code_string); // Add copyright line at the bottom if specified. if (copyright.length > 0) { d.writeln(''); d.writeln('%%'); if (copyright.length > 0) { d.writeln('% _' + copyright + '_'); } } d.write('\n'); d.title = title + ' (MATLAB code)'; d.close(); } -->


Get the MATLAB code (requires JavaScript)

Published with MATLAB® R2017a

> % % image credit: http://www.angelfire.com/scifi/B205 %% Burroughs Corporation % Wikipedia tells us that the American Arithmometer Company was founded % in 1886 to manufacture and sell adding machines. The company name was % changed to Burroughs Adding Machine in 1904 to honor the founder, % William Seward Burroughs, who, coincidentally, was the grandfather of % beat generation poet William S. Burroughs. At one time it was the % largest adding machine in the country. The company was merged with % Sperry Corporation to form Unisys in 1986. %% % In 1956 Burroughs got into the new computer business by acquiring % Electro Data and began marketing their Datatron, which had been % introduced two years earlier. % Throughout the '50s, '60s, and '70s, Burroughs was a force in computing, % particularly in the academic and banking markets. In the 1960's, IBM % completely dominated the industry, but there were enough smaller % companies that all together they were known as "IBM and the Seven % Dwarfs." %% Room Full % In 1959 the Burroughs 205 Datatron cost a few hundred thousand dollars % and occupied an entire room. (Here is a 1957 % ). In the foreground % of this photo are the console and an electric typewriter. % In the background are a magnetic tape drive % and several cabinets containing a total of about 1600 vacuum tubes. % % <> %% % I like to joke that this was a "personal computer". Only one person % could use it at a time. We would sign up for machine time, usually % in half-hour chunks, at all hours of the day or night. %% Vacuum Tubes % This was before transistors and semiconductors. The primary logic unit % was the electronic vacuum tube. Here is one plugin module with eight % tubes. The computer worked in base 10 and this module held one decimal % digit. Four of the tubes held one digit % using the binary combinations 0 through 9. The binary values 10 through % 15 were never used; if they showed up, it was an error condition and % caused an immediate halt. % % <> % % photo credit: http://www.cs.kent.edu/~rothstei/10051/history/UVa-page_files/ComputerHistory1.html %% Drum % Before the magnetic core, a huge problem with early computers was % memory. People knew how to produce the electronics for the arithmetic, % logic and control units, but where do you store programs and data? %% % The 205 had drum memory. Think of the rotating drum in a small clothes % dryer. Here is a photo of a machine that has not been used for years % and that has not been stored carefully. The drum is at the lower right. % Four registers, each containing ten of the single decimal digit modules, % are at the upper left. % % <> % % photo credit: http://www.angelfire.com/scifi/B205/ecd.html % % The drum held 4000 ten-digit words. If you regard each digit as half % a byte on a modern binary machine, that is 20K bytes. The words were % stored in 20 bands of 200 words each. The drum's speed was 3570 rpm. % So, on average, it would take about 1/30-th of a second to access the % start of a band. %% % In addition to the primary drum storage, there were four "high speed" % bands where a block of 20 words is duplicated ten times around the drum. % This was sort of like cache memory on today's computers. One machine % instruction would duplicate 20 words, usually a piece of a program, % ten times around a high speed band where it could be accessed % ten times as fast. %% Caltech and Utah % I learned a little about the 205 during my sophomore year at Caltech % in the spring of 1959, and more when I was a junior and took numerical % analysis from Prof. John Todd in 1959-60. We used Caltech's 205 % for some of the homework in that the class. Then I went back home % to Salt Lake City for the summer of 1960. It turned out the % University of Utah also had a 205 and they were happy % to hire me for a summer job. %% % There were just four people working in Utah's computer center at the % beginning of that summer REPLACE_WITH_DASH_DASH a woman who served as director, me, and % two part timers, a secretary and a grad student from EE who maintained % the machine. Then, a few weeks into the summer, the director resigned. % So I found myself REPLACE_WITH_DASH_DASH two months before I turned 20 REPLACE_WITH_DASH_DASH Acting Director % of the University of Utah Computer Center. %% % I returned to Caltech in the fall, and had a chance to do individual % study under Todd and more work on that 205 until I graduated in 1961. % I did a project involving Hilbert matrices. It was my introduction % to matrix computation. %% Paper Tape % At first, at both Caltech and Utah, we did not have a compiler or an % assembler. We wrote our programs in absolute numeric machine language % and punched them on paper tape, like this. % % <> %% % The machine was designed so that op code 00 meant "read one word from % paper tape into the current location and go to the next location". % There was a button on the console that would erase the drum, % set the location % counter to 0000 and the op code to 00, and then start the tape reader. % That served as the "operating system". It would read the entire contents % of the paper tape onto the drum. % When the tape ran off the end, you would turn off the % reader, set the location counter back to 0000, and press start. That % would begin executing the program you had just read from tape onto % the drum. %% Registers % There were five registers. The A register REPLACE_WITH_DASH_DASH the accumulator REPLACE_WITH_DASH_DASH is % where arithmetic happens. It holds ten decimal digits, plus a sign bit. % For example, one instruction is "Add the contents of memory location % XXXX to the A register". The R register REPLACE_WITH_DASH_DASH the remainder REPLACE_WITH_DASH_DASH is an % extension of the A register. Multiplication of two ten digit % (fixed point) numbers produces a twenty digit result across A and R. % Division produces a quotient in A and a remainder in R. %% % The C register REPLACE_WITH_DASH_DASH the command register REPLACE_WITH_DASH_DASH holds the instructions as they % are being executed. The D register holds a single word as it is being % transferred to or from the drum or external I/O equipment. %% % The B register REPLACE_WITH_DASH_DASH the index register REPLACE_WITH_DASH_DASH is only four digits wide. % If the sign bit of an instruction is a 1, the contents of the B register % is added to the address portion of the instruction, thereby providing % an array index. The value of the B register is automatically increased % or decreased by one after each indexing instruction. %% Floating Point % A floating point arithmetic option was available for an additional % \$12,500. (Floating point was also an option with the Intel chips used % in the first PC's. The 8087 chip cost about \$150.) The floating point % word has a two-digit exponent and an eight-digit fraction. So floating % point numbers range from about 10^(-50) to 10^50 and have an accuracy % of about 10^(-8). %% Program % Here is an example program from the appendix of the % . This writeup is much fancier than the programs we would % actually write. It's on a coding sheet; we would just use lined yellow % notebooks. We would never need to write out 6980, 6981, etc. % And the Remarks are much more detailed. This example is intended to be % carefully explanatory. % % <> %% Skip on Minus % One of the programs that I wrote during the summer at the University of % Utah involved a loop that was exactly 20 instructions long. That meant % the entire loop could be loaded into the high speed memory. It didn't % even need the jump instruction from the end of the loop back to the % beginning. It would just continue execution from the duplicate of the % code in the next block on the drum. The trouble was there was no way % to exit the loop. %% % I needed a "Skip on Minus" instruction. There was a "Skip on Plus", % but no "Skip on Minus". The EE grad student who maintained the machine % was a guy named Joe. I wish I could remember his last name. Anyway, % I told him I needed a "Skip on Minus" instruction. He said, "I'll % one in". He disappeared behind the cabinets with circuit diagrams and % a soldering iron. He called out, "the op code will be 51". %% % I wrote my program with a "51" instruction. It worked perfectly. % And so the University of Utah had the only Burroughs 205 Datatron in the % world with a "Skip on Minus" instruction. ##### SOURCE END ##### 95f11082495b4f85b8b30cb5f23fe0b2 -->

How Far Apart Are Two Random Points in a Hypercube?

Thu, 09/28/2017 - 04:29

Two days ago I wrote about random points in a square. At the last minute I added the paragraph asking about the generalization to random points in a cube. I have to admit that I didn't check the Web to see what was known about the question.

Shortly after my post appeared, Pieterjan Robbe of KU Leuven in Belgium submitted a comment pointing out that the problem has been studied extensively. He provided several pointers. The solution even has a name; it's called Robbins Constant. I have listed a few of the references below.

ContentsRobbins Constant

Robbins Constant is the value of this six-fold integral.

$$\delta = \int_0^1 \int_0^1 \int_0^1 \int_0^1 \int_0^1 \int_0^1 \! \sqrt{(x_1-y_1)^2 + (x_2-y_2)^2 + (x_3-y_3)^2} \ \mathrm{d}x_1 \mathrm{d}y_1 \mathrm{d}x_2 \mathrm{d}y_2 \mathrm{d}x_3 \mathrm{d}y_3 $$

A change of variables makes it a triple integral.

$$\delta = 8 \int_0^1\int_0^1 \int_0^1 \! \sqrt{x^2 + y^2 + z^2} \ (1-x)(1-y)(1-z) \ \mathrm{d}x \mathrm{d}y \mathrm{d}z$$

Numerical Integration

Numerical integration with the default tolerances gets about nine decimal places.

format long F = @(x,y,z) sqrt(x.^2+y.^2+z.^2).*(1-x).*(1-y).*(1-z) delta = 8*integral3(F,0,1,0,1,0,1) F = function_handle with value: @(x,y,z)sqrt(x.^2+y.^2+z.^2).*(1-x).*(1-y).*(1-z) delta = 0.661707182095901 Analytic Solution

The exact analytic solution is impressive. I can't explain where it comes from.

$$\delta = \frac{1}{105}(4 + 17\sqrt{2} - 6\sqrt{3} + 21\log{(1+\sqrt{2})} + 84\log{(1+\sqrt{3})} - 42\log{(2)} - 7\pi)$$

Here is the numeric value.

delta = (1/105)*(4 + 17*sqrt(2) - 6*sqrt(3) + 21*log(1+sqrt(2)) + ... 84*log(1+sqrt(3)) - 42*log(2) - 7*pi) delta = 0.661707182267176 Hypercubes and Vectorize

I also received email from MathWorks colleague Matt Tearle informing me about vecnorm, a new function that is in MATLAB Release R2017b. Unlike the norm function that computes matrix norms, vecnorm treats the elements along a specified dimension as vectors and calculates the norm of each vector. Here is the documentation page.

I don't yet have R2017b on this computer, but Matt's email prompted me to think about vectorizing the simulation, even without vecnorm. While we're at it, let's generalize to d dimensional hypercubes. And, I'll make it a one-liner.

n = 10000000; % n samples. for d = 1:10 % d dimensions. delta = mean(sqrt(sum((rand(n,d)-rand(n,d)).^2,2))); fprintf('%6d %7.4f\n',d,delta) end 1 0.3335 2 0.5215 3 0.6617 4 0.7778 5 0.8786 6 0.9692 7 1.0517 8 1.1282 9 1.1997 10 1.2675 Thanks

Thanks to Pieterjan Robbe and Matt Tearle.

References

D. P. Robbins, Problem 2629, Average distance between two points in a box, Amer. Mathematical Monthly, 85.4, 1978, p. 278.

D. H. Bailey, J. M. Borwein and R. E. Crandall, Advances in the Theory of Box Integrals, <http://crd-legacy.lbl.gov/~dhbailey/dhbpapers/BoxII.pdf>.

Johan Philip, The Probability Distribution of the Distance Between Two Random Points in a Box, https://people.kth.se/~johanph/habc.pdf.

Eric W. Weisstein, Hypercube Line Picking. MathWorld, <http://mathworld.wolfram.com/HypercubeLinePicking.html>.

N. J. A. Sloane, editor, The On-Line Encyclopedia of Integer Sequences, Decimal Expansion of Robbins Constant, <http://oeis.org/A073012>.

\n'); d.write(code_string); // Add copyright line at the bottom if specified. if (copyright.length > 0) { d.writeln(''); d.writeln('%%'); if (copyright.length > 0) { d.writeln('% _' + copyright + '_'); } } d.write('\n'); d.title = title + ' (MATLAB code)'; d.close(); } -->


Get the MATLAB code (requires JavaScript)

Published with MATLAB® R2017a

. At the last minute I added the paragraph % asking about the generalization to random points in a cube. % I have to admit that I didn't check the Web to see what was known % about the question. % % Shortly after my post appeared, Pieterjan Robbe of KU Leuven in Belgium % submitted a comment pointing out that the problem has been studied % extensively. He provided several pointers. The solution even has % a name; it's called Robbins Constant. I have listed a few of the % references below. %% Robbins Constant % Robbins Constant is the value of this six-fold integral. % % $$\delta = \int_0^1 \int_0^1 \int_0^1 \int_0^1 \int_0^1 \int_0^1 \! \sqrt{(x_1-y_1)^2 + (x_2-y_2)^2 + (x_3-y_3)^2} \ \mathrm{d}x_1 \mathrm{d}y_1 \mathrm{d}x_2 \mathrm{d}y_2 \mathrm{d}x_3 \mathrm{d}y_3 $$ % % A change of variables makes it a triple integral. % % $$\delta = 8 \int_0^1\int_0^1 \int_0^1 \! \sqrt{x^2 + y^2 + z^2} \ (1-x)(1-y)(1-z) \ \mathrm{d}x \mathrm{d}y \mathrm{d}z$$ %% Numerical Integration % Numerical integration with the default tolerances gets about nine % decimal places. format long F = @(x,y,z) sqrt(x.^2+y.^2+z.^2).*(1-x).*(1-y).*(1-z) delta = 8*integral3(F,0,1,0,1,0,1) %% Analytic Solution % The exact analytic solution is impressive. I can't explain where it % comes from. % % $$\delta = \frac{1}{105}(4 + 17\sqrt{2} - 6\sqrt{3} + 21\log{(1+\sqrt{2})} + % 84\log{(1+\sqrt{3})} - 42\log{(2)} - 7\pi)$$ %% % Here is the numeric value. delta = (1/105)*(4 + 17*sqrt(2) - 6*sqrt(3) + 21*log(1+sqrt(2)) + ... 84*log(1+sqrt(3)) - 42*log(2) - 7*pi) %% Hypercubes and Vectorize % I also received email from MathWorks colleague Matt Tearle informing % me about |vecnorm|, a new function that is in MATLAB Release R2017b. % Unlike the |norm| function that computes matrix norms, |vecnorm| % treats the elements along a specified dimension as vectors and % calculates the norm of each vector. % % the documentation page. %% % I don't yet have R2017b on this computer, but Matt's email prompted % me to think about vectorizing the simulation, even without |vecnorm|. % While we're at it, let's generalize to |d| dimensional hypercubes. % And, I'll make it a one-liner. n = 10000000; % n samples. for d = 1:10 % d dimensions. delta = mean(sqrt(sum((rand(n,d)-rand(n,d)).^2,2))); fprintf('%6d %7.4f\n',d,delta) end %% Thanks % Thanks to Pieterjan Robbe and Matt Tearle. %% References % D. P. Robbins, Problem 2629, Average distance between two points in a box, % Amer. Mathematical Monthly, 85.4, 1978, p. 278. % % D. H. Bailey, J. M. Borwein and R. E. Crandall, % Advances in the Theory of Box Integrals, % . % % Johan Philip, The Probability Distribution of the Distance Between % Two Random Points in a Box, % . % % Eric W. Weisstein, Hypercube Line Picking. MathWorld, % . % % N. J. A. Sloane, editor, The On-Line Encyclopedia of Integer Sequences, % Decimal Expansion of Robbins Constant, % . ##### SOURCE END ##### c1cdb632a661445c81ccd107144f0003 -->

How Far Apart Are Two Random Points in a Square?

Tue, 09/26/2017 - 01:02

How far apart can you expect two points chosen at random in the unit square to be? I found this problem on the YouTube channel maintained by Presh Talwalkar, Mind Your Decisions. He correctly calls it a very hard puzzle. At first, I guessed the answer might be $1/2$. But the correct answer is more interesting than that.

ContentsSimulation

Let's do a Monte Carlo simulation to get a numerical estimate. Sampling one million pairs of points doesn't take much time.

n = 1000000; sum = 0; rng(0) for k = 1:n x = rand(1,2); y = rand(1,2); delta = norm(x-y); sum = sum + delta; end format short delta = sum/n delta = 0.5214

It turns out that this run of the simulation generates a result that is accurate to the four digit precision of format short. But can we find the exact value?

Quadruple Integral

The expected distance, $\delta$, can be expressed as this quadruple integral, but the Symbolic Toolbox cannot find a closed form.

$$\delta = \int_0^1 \int_0^1 \int_0^1 \int_0^1 \! \sqrt{(x_1-y_1)^2 + (x_2-y_2)^2} \ \mathrm{d}x_1 \mathrm{d}y_1 \mathrm{d}x_2 \mathrm{d}y_2$$

Double Integral

Make the substitutions $x = x_1 - y_1$ and $y = x_2 - y_2$ and consider the integral over the four regions where these variables are positive or negative. The four integrals are equal to each other and we obtain this double integral.

$$\delta = 4 \int_0^1 \int_0^1 \! \sqrt{x^2 + y^2} \ (1-x)(1-y) \ \mathrm{d}x \mathrm{d}y$$

Numerical Integration

Let's tackle this double integral numerically.

F = @(x,y) 4*sqrt(x.^2+y.^2).*(1-x).*(1-y); delta = integral2(F,0,1,0,1) delta = 0.5214 Polar Coordinates

Switch to polar coordinates, $r$ and $\theta$. The $\sqrt{x^2+y^2}$ term is simply $r$ and the double integral has two equal halves about the $45^o$ line, $\theta = \pi/4$.

$$\delta/8 = \int_0^{\pi/4} \int_0^{\sec{\theta}} r (1-r\cos{\theta}) (1-r\sin{\theta}) \ r \mathrm{d}r \mathrm{d}\theta$$

Symbolic Integration

The integrand is a polynomial in $r$.

syms r theta real F = expand(r^2*(1-r*cos(theta))*(1-r*sin(theta))) F = r^2 - r^3*sin(theta) - r^3*cos(theta) + r^4*cos(theta)*sin(theta)

The Toolbox can integrate this polynomial easily.

inner = int(F,r,0,sec(theta)) inner = 1/(12*cos(theta)^3) - sin(theta)/(20*cos(theta)^4)

The Toolbox can also do the outer integral over $\theta$.

outer = int(inner,theta,0,pi/4) outer = log(2^(1/2) + 1)/24 + 2^(1/2)/120 + 1/60

Multiply by 8.

delta = 8*outer delta = log(2^(1/2) + 1)/3 + 2^(1/2)/15 + 2/15

Generate a latex representation for $\delta$ to cut and paste later into this post.

latex(delta); Numerical value format long delta = double(delta) delta = 0.521405433164721 This Is My Final Answer

Here is the result.

$$\delta = \frac{\ln\left(\sqrt{2}+1\right)}{3}+\frac{\sqrt{2}}{15}+\frac{2}{15}$$

Three Dimensions

What about three dimensions? How far apart can you expect two points chosen at random in the unit cube to be? I'll leave that as a challenge and invite anyone who thinks they know the answer to post a comment.

Thanks

Thanks to Presh Talwalkar for this little nugget.

\n'); d.write(code_string); // Add copyright line at the bottom if specified. if (copyright.length > 0) { d.writeln(''); d.writeln('%%'); if (copyright.length > 0) { d.writeln('% _' + copyright + '_'); } } d.write('\n'); d.title = title + ' (MATLAB code)'; d.close(); } -->


Get the MATLAB code (requires JavaScript)

Published with MATLAB® R2017a

. % He correctly calls it . % At first, I guessed the answer might be $1/2$. But the correct % answer is more interesting than that. % % <> %% Simulation % Let's do a Monte Carlo simulation to get a numerical estimate. % Sampling one million pairs of points doesn't take much time. n = 1000000; sum = 0; rng(0) for k = 1:n x = rand(1,2); y = rand(1,2); delta = norm(x-y); sum = sum + delta; end format short delta = sum/n %% % It turns out that this run of the simulation generates a result % that is accurate to the four digit precision of |format short|. % But can we find the exact value? %% Quadruple Integral % The expected distance, $\delta$, can be expressed as this quadruple % integral, but the Symbolic Toolbox cannot find a closed form. % % $$\delta = \int_0^1 \int_0^1 \int_0^1 \int_0^1 \! \sqrt{(x_1-y_1)^2 + (x_2-y_2)^2} \ \mathrm{d}x_1 \mathrm{d}y_1 \mathrm{d}x_2 \mathrm{d}y_2$$ %% Double Integral % Make the substitutions $x = x_1 - y_1$ and $y = x_2 - y_2$ and consider % the integral over the four regions where these variables are positive % or negative. The four integrals are equal to each other and we obtain % this double integral. % % $$\delta = 4 \int_0^1 \int_0^1 \! \sqrt{x^2 + y^2} \ (1-x)(1-y) \ \mathrm{d}x \mathrm{d}y$$ %% Numerical Integration % Let's tackle this double integral numerically. F = @(x,y) 4*sqrt(x.^2+y.^2).*(1-x).*(1-y); delta = integral2(F,0,1,0,1) %% Polar Coordinates % Switch to polar coordinates, $r$ and $\theta$. The $\sqrt{x^2+y^2}$ % term is simply $r$ and the double integral has two equal halves about % the $45^o$ line, $\theta = \pi/4$. % % $$\delta/8 = \int_0^{\pi/4} \int_0^{\sec{\theta}} r (1-r\cos{\theta}) (1-r\sin{\theta}) \ r \mathrm{d}r \mathrm{d}\theta$$ %% Symbolic Integration % The integrand is a polynomial in $r$. syms r theta real F = expand(r^2*(1-r*cos(theta))*(1-r*sin(theta))) %% % The Toolbox can integrate this polynomial easily. inner = int(F,r,0,sec(theta)) %% % The Toolbox can also do the outer integral over $\theta$. outer = int(inner,theta,0,pi/4) %% % Multiply by 8. delta = 8*outer %% % Generate a |latex| representation for $\delta$ to cut and paste later % into this post. latex(delta); %% Numerical value format long delta = double(delta) %% This Is My Final Answer % Here is the result. % % $$\delta = % \frac{\ln\left(\sqrt{2}+1\right)}{3}+\frac{\sqrt{2}}{15}+\frac{2}{15}$$ %% Three Dimensions % What about three dimensions? % How far apart can you expect two points chosen at random in the unit % cube to be? % I'll leave that as a challenge and invite anyone who thinks they know % the answer to post a comment. %% Thanks % Thanks to Presh Talwalkar for this little nugget. ##### SOURCE END ##### 4605bcb308bf4a25bc7dcabdf3c3a242 -->

A Logo Zoetrope

Thu, 09/14/2017 - 02:47
A Logo Zoetrope

A quick post to point my readers to Ned Gulley's post in MATLAB Community, A Galloping Logo Zoetrope. Don't know what a Zoetrope is? Ned explains.

\n'); d.write(code_string); // Add copyright line at the bottom if specified. if (copyright.length > 0) { d.writeln(''); d.writeln('%%'); if (copyright.length > 0) { d.writeln('% _' + copyright + '_'); } } d.write('\n'); d.title = title + ' (MATLAB code)'; d.close(); } -->


Get the MATLAB code (requires JavaScript)

Published with MATLAB® R2017a

. Don't know what a _Zoetrope_ is? % Ned explains. ##### SOURCE END ##### 8b8b9a9f6ea84a5f99ae99beb8679770 -->

Brad Efron’s Nontransitive Dice

Mon, 09/11/2017 - 22:30

Do not get involved in a bar bet with Brad Efron and his dice until you have read this.

ContentsBrad Efron

Bradley Efron is the Max H. Stein Professor of Humanities and Sciences, Professor of Statistics, and Professor of Biostatistics at Stanford. He is best known for his invention of the bootstrap resampling technique, which is a fundamental tool in computational statistics. In 2005, he received the National Medal of Science. But I think of him for something much less esoteric -- his nontransitive dice. And I also remember him as one of my roommates when we were both students at Caltech.

Brad's wager

If you come across Brad in a bar some night, he might offer you the following wager. He would show you four individual dice, like these.

Photo credit: http://www.grand-illusions.com

You would get to pick one of the dice. He would then pick one for himself. You would then roll the dice a number of times, with the player rolling the highest number each time winning a dollar. (Rolling the same number is a push. No money changes hands.) Do you want to play? I'll reveal the zinger -- no matter which of the four dice you choose, Brad can pick one that will win money from you in the long run.

Transitivity

A binary mathematical relationship, $\rho$, is said to be transitive if $x \ \rho \ y$ and $y \ \rho \ z$ implies $x \ \rho \ z$. Equality is the obvious example. If $x = y$ and $y = z$ then $x = z$. Other examples include greater than, divides, and subset.

On the other hand, a relationship is nontransitive if the implication does not necessarily hold. Despite their efforts to make it transitive, Facebook friend is still nontransitive.

Ordinarily, the term dominates, or beats is transitive. And, ordinarily, you might think this applies to dice games. But, no, such thinking could cost you money. Brad's dice are nontransitive.

Brad's dice

Here are the six values on the faces of Brad's four dice.

A = [4 4 4 4 0 0]; B = [3 3 3 3 3 3]; C = [6 6 2 2 2 2]; D = [5 5 5 1 1 1];

How do these compare when paired up against each other? It is obvious that A wins money from B because B rolls a 3 all of the time, while A rolls a winning 4 two-thirds of the time. The odds are 2:1 in A's favor. Similarly B's constant 3's dominate C's 2's two-thirds of the time. The odds are 2:1 in B's favor.

Comparing C with D is more complicated because neither rolls a constant value.. We have to compare each possible roll of one die with all the possible rolls of the other. That's 36 comparisons. The comparison is facilitated by constructing the following matrix. The comparison of a row vector with a column vector is allowed under MATLAB's new rules of singleton expansion.

Z = (C > D') - (C < D') Z = 1 1 -1 -1 -1 -1 1 1 -1 -1 -1 -1 1 1 -1 -1 -1 -1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

The result is a matrix Z with +1 where C beats D and a -1 where D beats C. There is a submatrix of -1s in the upper right, but no solid column or row. All that remains is to count the number of +1's and -1's to get the odds.

odds = [nnz(Z == +1) nnz(Z == -1)] odds = 24 12

There are twice as many +1s as -1s, so, again, the odds are 2:1 that C beats D.

But now how does D fair against A?

Z = (D > A') - (D < A') odds = [nnz(Z == +1) nnz(Z == -1)] Z = 1 1 1 -1 -1 -1 1 1 1 -1 -1 -1 1 1 1 -1 -1 -1 1 1 1 -1 -1 -1 1 1 1 1 1 1 1 1 1 1 1 1 odds = 24 12

Again, there are twice as many +1s as -1s. The odds are 2:1 in D's favor.

To summarize, A beats B, B beats C, C beats D, and D beats A. No matter which die you choose, Brad can pick one that will beat it -- and by a substantial margin. It's like "Rock, Paper, Scissors" with a fourth option, but you have to make the first move.

Compare

Let's capture that comparison in a function. The function can print and label the matrix Z, as well as provide for pushes, which we'll need later.

type compare function odds = compare(X,Y) Z = (X > Y') - (X < Y'); odds = [nnz(Z==1) nnz(Z==0) nnz(Z==-1)]; fprintf('\n X %3d %5d %5d %5d %5d %5d\n',X) fprintf(' Y\n') fprintf(' %5d %5d %5d %5d %5d %5d %5d\n',[Y' Z]') end

Use the function to compare A with C.

AvC = compare(A,C) X 4 4 4 4 0 0 Y 6 -1 -1 -1 -1 -1 -1 6 -1 -1 -1 -1 -1 -1 2 1 1 1 1 -1 -1 2 1 1 1 1 -1 -1 2 1 1 1 1 -1 -1 2 1 1 1 1 -1 -1 AvC = 16 0 20

Dividing out a common factor, that is odds of 4:5 against A.

The final possible comparison is B versus D. B's unimaginative play results in complete rows of +1 or -1.

BvD = compare(B,D) X 3 3 3 3 3 3 Y 5 -1 -1 -1 -1 -1 -1 5 -1 -1 -1 -1 -1 -1 5 -1 -1 -1 -1 -1 -1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 BvD = 18 0 18

So B and D are an even match. Here is the complete summary of all possible pairings among Efron's Dice,

A B C D A 2:1 4:5 1:2 B 1:2 2:1 1:1 C 5:4 1:2 2:1 D 2:1 1:1 1:2Simulate

In case we don't believe these comparisons, or because we'd like to see the shape of the distribution, or just because it's fun, let's do some Monte Carlo simulations. Because 108 is divisible by 36, I'll make 108 rolls one game and simulate 10000 games. Here is the code.

type simulate function simulate(X,Y) m = 108; % Number of rolls in a single game. n = 10000; % Number of games in a simulation. s = zeros(1,n); for k = 1:n for l = 1:m i = randi(6); j = randi(6); if X(i) > Y(j) s(k) = s(k) + 1; elseif X(i) < Y(j) s(k) = s(k) - 1; % else X(i) == Y(j) % push counts as a roll end end end clf shg histogram(s) line([0 0],get(gca,'ylim'),'color','k','linewidth',2) title(sprintf('mean = %6.2f',mean(s))) end

I'll initialize the random number generator so that I get the same results every time I publish this post.

rng(1)

Now verify that A beats B.

simulate(A,B);

Sure enough, the 2:1 odds imply that A can be expected to be ahead by 108/3 dollars after 108 flips. There is a vertical black line at zero to emphasize that this isn't a fair game.

How about those 4:5 odds for A versus C? A can be expected to lose 108/9 dollars with 108 flips.

simulate(A,C) Three dice nontransitive set

Efron's dice were introduced in a Scientific American article by Martin Gardner in 1970. Since then a number of people have found other sets of nontransitive dice with interesting properties. The Wikipedia article on nontransitive dice has an abundance of information, including the following set.

A = [1 1 3 5 5 6]; B = [2 3 3 4 4 5]; C = [1 2 2 4 6 6];

This set is very close to a conventional set. The sum of the numbers on each die is 21, the same as a conventional die. And each die differs from a standard 1:6 die in only two places. Nevertheless, the set is nontransitive.

Since this is so close to a conventional set, we expect the odds to be closer to even. Let's see.

AvB = compare(A,B) X 1 1 3 5 5 6 Y 2 -1 -1 1 1 1 1 3 -1 -1 0 1 1 1 3 -1 -1 0 1 1 1 4 -1 -1 -1 1 1 1 4 -1 -1 -1 1 1 1 5 -1 -1 -1 0 0 1 AvB = 17 4 15

Since the dice have matching values, we have some zeros in the matrix. A has only 17 winning values compared to B's 15.

simulate(A,B)

Here is the result all three pairwise comparisons.

A B C A 17:15 15:17 B 15:17 17:15 C 17:15 15:17 \n'); d.write(code_string); // Add copyright line at the bottom if specified. if (copyright.length > 0) { d.writeln(''); d.writeln('%%'); if (copyright.length > 0) { d.writeln('% _' + copyright + '_'); } } d.write('\n'); d.title = title + ' (MATLAB code)'; d.close(); } -->


Get the MATLAB code (requires JavaScript)

Published with MATLAB® R2017a

> % % Photo credit: http://www.grand-illusions.com % % You would get to pick one of the dice. He would then pick one for % himself. You would then roll the dice a number of times, with the % player rolling the highest number each time winning a dollar. % (Rolling the same number is a _push_. No money changes hands.) % Do you want to play? I'll reveal the zinger REPLACE_WITH_DASH_DASH no matter which of % the four dice you choose, Brad can pick one that will win money from % you in the long run. %% Transitivity % A binary mathematical relationship, $\rho$, is said to be _transitive_ % if $x \ \rho \ y$ and $y \ \rho \ z$ implies $x \ \rho \ z$. % _Equality_ is the obvious example. If $x = y$ and $y = z$ then $x = z$. % Other examples include _greater than_, _divides_, and _subset_. %% % On the other hand, a relationship is _nontransitive_ if the % implication does not necessarily hold. Despite their efforts to % make it transitive, _Facebook friend_ is still nontransitive. %% % Ordinarily, the term _dominates_, or _beats_ is % transitive. And, ordinarily, you might think this applies to % dice games. But, no, such thinking could cost you money. % Brad's dice are _nontransitive_. %% Brad's dice % Here are the six values on the faces of Brad's four dice. A = [4 4 4 4 0 0]; B = [3 3 3 3 3 3]; C = [6 6 2 2 2 2]; D = [5 5 5 1 1 1]; %% % How do these compare when paired up against each other? % It is obvious that |A| wins money from |B| because |B| rolls a % 3 all of the time, while |A| rolls a winning 4 two-thirds of the % time. The odds are 2:1 in |A|'s favor. Similarly |B|'s constant % 3's dominate |C|'s 2's two-thirds of the time. The odds are 2:1 % in |B|'s favor. %% % Comparing |C| with |D| is more complicated because neither rolls % a constant value.. We have to compare each possible % roll of one die with all the possible rolls of the other. That's % 36 comparisons. The comparison is facilitated by constructing % the following matrix. % The comparison of a row vector with a column vector is allowed under % MATLAB's new rules of _singleton expansion_. Z = (C > D') - (C < D') %% % The result is a matrix |Z| % with +1 where |C| beats |D| and a -1 where |D| beats |C|. There is a % submatrix of -1s in the upper right, but no solid column or row. % All that remains is to count the number of +1's and -1's to get the odds. odds = [nnz(Z == +1) nnz(Z == -1)] %% % There are twice as many +1s as -1s, so, again, the odds are 2:1 that % |C| beats |D|. %% % But now how does |D| fair against |A|? Z = (D > A') - (D < A') odds = [nnz(Z == +1) nnz(Z == -1)] %% % Again, there are twice as many +1s as -1s. The odds are 2:1 in |D|'s % favor. %% % To summarize, |A| beats |B|, |B| beats |C|, |C| beats |D|, and % |D| beats |A|. No matter which die you choose, Brad can pick % one that will beat it REPLACE_WITH_DASH_DASH and by a substantial margin. It's % like "Rock, Paper, Scissors" with a fourth option, but you have % to make the first move. %% Compare % Let's capture that comparison in a function. % The function can print and label the matrix |Z|, as well as % provide for pushes, which we'll need later. type compare %% % Use the function to compare |A| with |C|. AvC = compare(A,C) %% % Dividing out a common factor, that is odds of 4:5 against |A|. %% % The final possible comparison is |B| versus |D|. |B|'s unimaginative % play results in complete rows of +1 or -1. BvD = compare(B,D) %% % So |B| and |D| are an even match. Here is the complete summary % of all possible pairings among Efron's Dice, %% % A B C D % A 2:1 4:5 1:2 % B 1:2 2:1 1:1 % C 5:4 1:2 2:1 % D 2:1 1:1 1:2 %% Simulate % In case we don't believe these comparisons, or because we'd like to see % the shape of the distribution, or just because it's fun, % let's do some Monte Carlo simulations. Because 108 is divisible % by 36, I'll make 108 rolls one game and simulate 10000 games. % Here is the code. type simulate %% % I'll initialize the random number generator so that I get the same % results every time I |publish| this post. rng(1) %% % Now verify that |A| beats |B|. simulate(A,B); %% % Sure enough, the 2:1 odds imply that |A| can be expected to be % ahead by 108/3 dollars after 108 flips. There is a vertical % black line at zero to emphasize that this isn't a fair game. %% % How about those 4:5 odds for |A| versus |C|? |A| can be expected % to lose 108/9 dollars with 108 flips. simulate(A,C) %% Three dice nontransitive set % Efron's dice were introduced in a _Scientific American_ article % by Martin Gardner in 1970. Since then a number of people have % found other sets of nontransitive dice with interesting properties. % The has an abundance of information, % including the following set. A = [1 1 3 5 5 6]; B = [2 3 3 4 4 5]; C = [1 2 2 4 6 6]; %% % This set is very close to a conventional set. The sum of the % numbers on each die is 21, the same as a conventional die. % And each die differs from a standard |1:6| die in only two % places. Nevertheless, the set is nontransitive. %% % Since this is so close to a conventional set, we expect the % odds to be closer to even. Let's see. AvB = compare(A,B) %% % Since the dice have matching values, we have some zeros in % the matrix. |A| has only 17 winning values compared to |B|'s 15. simulate(A,B) %% % Here is the result all three pairwise comparisons. % % A B C % A 17:15 15:17 % B 15:17 17:15 % C 17:15 15:17 ##### SOURCE END ##### fbd59e85e13747dcaaa91b54ebefd563 -->

C^5, Cleve’s Corner Collection Card Catalog

Mon, 08/28/2017 - 22:30

I have been writing books, programs, newsletter columns and blogs since 1990. I have now collected all of this material into one repository. Cleve's Corner Collection consists of 458 "documents", all available on the internet. There are

  • 150 posts from Cleve's Corner blog.
  • 43 columns from Cleve's Corner News and Notes edition.
  • 33 chapters from two books, Experiments with MATLAB and Numerical Computing with MATLAB.
  • 218 programs from Cleve's Laboratory, EXM and NCM.
  • 14 video transcripts from MIT Open Courseware and elsewhere.

C^5 is an app, a search tool that acts like a traditional library card catalog. It allows you to do keyword based searches through the collection and follow links to the material on the net. Responses to queries are ordered by the scores generated by Latent Semantic Indexing, LSI, which employs the singular value decomposition of a term-document matrix of key word counts.

ContentsOpening figure

Here is the opening window for C^5.

c5

Enter a query, usually just a single key word, in the edit box at the top. This is a term. The names of the various documents that are relevant to the term are then displayed, one at a time, in the document box.

The arrow keys allow the document list to be scanned and changed. The LSI score determines the ordering of the list. The term count is the number of times, if any, that the query term appears in the document. The web button accesses a copy of the document on the internet.

Don Knuth

For my first example, let's search for material I have written that mentions Stanford Emeritus Professor of Computer Science, Donald Knuth. Enter "knuth" in the query box or on the command line.

c5 knuth

The first document name "blog/c5_blog.m" refers to this blog post, so there is a bit of self reference happening here. The document suffix is .m because the source texts for my blog are MATLAB programs processed by the publish command.

The term count "10, 10/29" indicates that "knuth" appears 10 times in this document and that so far we have seen 10 out of the 29 times that "knuth" appears in the entire collection.

Click on the log button and then click a dozen or so times on the right arrow. This log of the successive displays is printed in the command window. Document names, dates, and term counts are displayed in decreasing order of LSI score.

% knuth % arrow document term counts lsi date % blog/c5_blog.m 10 10/29 0.594 28-Aug-2017 % > blog/easter_blog.m 5 15/29 0.553 18-Mar-2013 % > blog/lambertw_blog.m 3 18/29 0.183 02-Sep-2013 % > news/stiff_equations.txt 2 20/29 0.182 01-May-2003 % > blog/hilbert_blog.m 2 22/29 0.139 02-Feb-2013 % > blog/random_blog.m 2 24/29 0.139 17-Apr-2015 % > exmm/easter.m 1 25/29 0.112 2016 % > blog/gef.m 3 28/29 0.100 07-Jan-2013 % > ncmm/ode23tx.m 0 28/29 0.086 2016 % > news/normal_behavior.txt 0 28/29 0.070 01-May-2001 % > blog/magic_2.m 0 28/29 0.059 05-Nov-2012 % .......... % >> blog/denorms.m 1 29/29 0.010 21-Jul-2014 .........

The second most relevant document, "easter_blog.m", is a post from 2013 that describes an algorithm, popularized by Knuth, for computing the date each year in the Western, or Gregorian Calendar that Easter Sunday is celebrated. The term count is "5, 15/29", so the first two documents account for slightly over half of the total appearances of the search term.

The next six lines tell us that "knuth" appears in blog posts about the Lambert W function, Hilbert matrices, random numbers, and George Forsythe (gef), as well as a MATLAB News and Notes column in 2003 about stiff differential equations, and the actual MATLAB program from EXM for computing the date of Easter.

The following results with term counts of zero are blog posts that do not contain "knuth", but which have LSI scores indicating they might be relevant. Finally, the blog post named "denorms" is about denormal floating point numbers. It is reached by right-clicking on the right arrow to skip over documents with term counts of zero.

c5setup

I don't know how to parse .html or .pdf files, so I have collected the original source material for everything that I have written that is now available on the web. There are .m files for the blog and MATLAB programs, .tex files for the LaTeX of the book chapters, and .txt files for the newsletter columns and transcripts of the videos. There are 458 files totaling about 3.24 megabytes of text.

I have a program, c5setup, that I run on my own laptop to extract all the individual words and produce the term-document matrix. This is a sparse matrix whose (k,j) -th entry is the number of times that the k -th term appears in the j -th document. It is saved in c5database.mat for use by the c^5 app.

This setup processing eliminates frequently occurring English language words, like "the", on a list of stopwords.

length(stopwords) ans = 177 c5database clear load c5database whos Name Size Bytes Class Attributes A 16315x458 1552776 double sparse D 458x1 40628 string L 1x1 120156 struct T 16315x1 1132930 string
  • A is the term-document matrix.
  • D is a string array of the file names in my personal repository of the source documents.
  • L is a struct containing string arrays used to generate URLs of the documents on the web.
  • T is a string array of key words or terms.
Sparsity

The sparsity of the term-document matrix is a little over one percent.

sparsity = nnz(A)/numel(A) sparsity = 0.0130 Spy

Spy plot of the first 1000 rows of the term-document matrix.

clf spy(A(1:1000,:)) Most frequent terms

The row sums are the total term counts.

ttc = sum(A,2);

Find the terms that occur at least 1000 times.

k = find(ttc >= 1000); fprintf('%-10s %6s\n',[T(k) num2str(ttc(k))]') function 1806 matlab 1407 matrix 1499 one 1262 two 1090

Surprise. I write a lot about MATLAB and matrices.

Singular values

We might as well compute all the singular values of the full matrix. It takes less than a second. It's important to use the economical version of the SVD that produces a U the same size as A. Otherwise we'd have a 16,315-by-16,315 U.

tic [U,S,V] = svd(full(A),'econ'); toc Elapsed time is 0.882556 seconds.

A logarithmic plot of the singular values shows that they do not decrease very rapidly.

clf semilogy(diag(S),'.','markersize',10) axis([-10 450 1 1000]) title('singular values') Reduced rank approximation

I wrote a post about Latent Semantic Indexing a month ago. LSI employs a reduced rank approximation to the term-document matrix. c^5 has a slider for choosing the rank. The plot of the singular values shows that the accuracy of the approximation is pretty much independent of the chosen value. Any value except very small values or large values near full rank gives an approximation good to between one and ten percent. The power of LSI does not derive from the approximation accuracy. I usually take the rank to be about half the number of columns.

n = size(A,2); k = n/2; Uk = U(:,1:k); Sk = S(1:k,1:k); Vk = V(:,1:k); relerr = norm(Uk*Sk*Vk'-A)/S(1,1) relerr = 0.0357 Arrow keys

The three arrow keys in the c^5 app can be clicked with either the left or right mouse button (or control-click on a one-button mouse).

  • left >: next document, any term count.
  • right >: next document with nonzero term count.
  • left <: previous document, any term count.
  • right <: previous document with nonzero term count.
  • left ^: use the root of the current document for the query.
  • right ^: use a random term for the query.

Repeatedly clicking the up arrow with the right button (an alt click) is a good way to browse the entire collection.

Lothar Collatz

Let's see the logs for two more examples. Lothar Collatz has a short log.

c5 Collatz % collatz % % arrow document term counts lsi date % blog/threenplus1_blog.m 9 9/19 0.904 19-Jan-2015 % >> blog/collatz_inequality.m 4 13/19 0.108 16-Mar-2015 % >> blog/c5_blog.m 5 18/19 0.075 28-Aug-2017 % >> ncm/intro.tex 1 19/19 -0.003 2004

Collatz appears in two posts from 2015, one on his 3n+1 problem and one on an elegant inequality that produces a surprising graphic, and in the section of this blog post about c^5 that you are now reading, He is also mentioned in the introduction to the NCM book, but the LSI value of very small. The double arrow at the beginning of each line signifies a right click, skipping over documents that do not mention him.

Blackjack

I have written a lot about the card game Blackjack.

c5 blackjack % blackjack % arrow document term counts lsi date % news/simulating_blackjack.txt 19 19/68 0.536 01-Oct-2012 % >> ncmm/ncmgui.m 4 23/68 0.372 2016 % >> blog/random_blog2.m 4 27/68 0.266 04-May-2015 % >> ncmm/Contents.m 2 29/68 0.244 2016 % >> blog/c5_blog.m 5 34/68 0.206 28-Aug-2017 % >> ncm/random.tex 13 47/68 0.148 2004 % >> lab/thumbnails2.m 2 49/68 0.088 2017 % >> lab/lab2.m 1 50/68 0.061 2017 % >> news/numerical_computing.txt 1 51/68 0.025 01-Jun-2004 % >> blog/lab_blog.m 1 52/68 0.004 31-Oct-2016 % >> ncmm/blackjack.m 8 60/68 -0.023 2016 % >> lab/blackjack.m 8 68/68 -0.026 2017

We can see two newsletter columns, three blogs, a portion of a book chapter, several code segments, and two copies of the blackjack app. Again, I am using right clicks.

Levenshtein distance

I recently wrote a blog post about Levenshtein Edit Distance Between Strings. If c^5 does not recognize the key word in a query, it uses Levenshtein distance to find the closest match in the term list to the unrecognized query. This easily corrects simple spelling mistakes, like missing letters. For example the missing "i" in "polynomal" is corrected to become "polynomial". And "Levenstein" becomes "levenshtein".

I received a pleasant surprise when I entered "Molar", expecting it to become "moler". Instead, I got "polar" because only one substitution is required to convert "Molar" to "polar", but two substitutions are required to turn "Molar" into "moler". (Microsoft Word spelling correction used to turn "MATLAB" into "Meatball".)

Multi-word queries

I'm not quite sure what to do with queries consisting of more than one term. What is the expected response to a query of "Wilkinson polynomial", for example? Is it documents that contain either "Wilkinson" or "polynomial"? This is what LSI would provide. But it is probably better to look for documents that contain both "Wilkinson" and "polynomial". I'm not sure how to do this.

Worse yet, I can't look for an exact match to the two-word string "Wilkinson polynomial" because the first thing the setup program does is to break text into individual words.

Stemming

This project is not finished. If I work on it any more, I am going to have learn about scraping, stemming and lemmatization of the source texts. This involves relatively simple tasks like removing possessives and plurals and more complicated tasks like combining all the words with the same root or lemma. The sentence

"the quick brown fox jumped over the lazy dog's back"

becomes

"the quick brown fox jump over the lazi dog' back"

Loren's guest blogger Toshi Takeuchi posted an article in 2015 about Latent Semantic Analysis with MATLAB. He references MATLAB code for stemming.

Parsing queries

I can imagine doing a better job of parsing queries, although I could never approach the sophistication of a system like Google or Siri.

Limitations

A significant fraction of what I have written is not prose -- it is mathematics or code. It cannot be parsed with the techniques of text analytics. For example, the source texts for the books NCM and EXM have hundreds of snippets of LaTeX like

\begin{eqnarray*} A V \eqs U \Sigma , \\ A^H U \eqs V \Sigma^H . \end{eqnarray*}

And earlier in this blog post I had

tic [U,S,V] = svd(full(A),'econ'); toc

My c5setup program now has to skip over everything like this. In doing so, it misses much the message.

Software

I had updated Cleve's Laboratory in the Central File Exchange to include c5.m and c5database.mat.

\n'); d.write(code_string); // Add copyright line at the bottom if specified. if (copyright.length > 0) { d.writeln(''); d.writeln('%%'); if (copyright.length > 0) { d.writeln('% _' + copyright + '_'); } } d.write('\n'); d.title = title + ' (MATLAB code)'; d.close(); } -->


Get the MATLAB code (requires JavaScript)

Published with MATLAB® R2017a

blog/easter_blog.m 5 15/29 0.553 18-Mar-2013 % > blog/lambertw_blog.m 3 18/29 0.183 02-Sep-2013 % > news/stiff_equations.txt 2 20/29 0.182 01-May-2003 % > blog/hilbert_blog.m 2 22/29 0.139 02-Feb-2013 % > blog/random_blog.m 2 24/29 0.139 17-Apr-2015 % > exmm/easter.m 1 25/29 0.112 2016 % > blog/gef.m 3 28/29 0.100 07-Jan-2013 % > ncmm/ode23tx.m 0 28/29 0.086 2016 % > news/normal_behavior.txt 0 28/29 0.070 01-May-2001 % > blog/magic_2.m 0 28/29 0.059 05-Nov-2012 % .......... % >> blog/denorms.m 1 29/29 0.010 21-Jul-2014 ......... %% % The second most relevant document, "easter_blog.m", is a post from 2013 % that describes an algorithm, popularized by Knuth, for computing the date % each year in the Western, or Gregorian Calendar that Easter Sunday is % celebrated. The term count is "5, 15/29", so the first two documents % account for slightly over half of the total appearances of the % search term. %% % The next six lines tell us that "knuth" appears in blog posts about % the Lambert W function, Hilbert matrices, random numbers, and % George Forsythe (gef), as well as a MATLAB News and Notes column in 2003 % about stiff differential equations, and the actual MATLAB program from % EXM for computing the date of Easter. %% % The following results with term counts of zero are blog posts that do % not contain "knuth", but which have LSI scores indicating they might be % relevant. Finally, the blog post named "denorms" is about denormal % floating point numbers. It is reached by right-clicking on the right % arrow to skip over documents with term counts of zero. %% c5setup % I don't know how to parse |.html| or |.pdf| files, so I have collected % the original source material for everything that I have written % that is now available on the web. There are |.m| files for the blog % and MATLAB programs, |.tex| files for the LaTeX of the book chapters, % and |.txt| files for the newsletter columns and transcripts of the % videos. % There are 458 files totaling about 3.24 megabytes of text. %% % I have a program, |c5setup|, that I run on my own laptop to extract % all the individual words and produce the term-document matrix. % This is a sparse matrix whose |(k,j)| -th entry is the number of times % that the |k| -th term appears in the |j| -th document. It is saved % in |c5database.mat| for use by the |c^5| app. %% % This setup processing eliminates frequently occurring English language % words, like "the", on a list of |stopwords|. length(stopwords) %% c5database clear load c5database whos %% % % * |A| is the term-document matrix. % * |D| is a string array of the file names in my personal repository % of the source documents. % * |L| is a struct containing string arrays used to % generate URLs of the documents on the web. % * |T| is a string array of key words or terms. %% Sparsity % The sparsity of the term-document matrix is a little over one percent. sparsity = nnz(A)/numel(A) %% Spy % Spy plot of the first 1000 rows of the term-document matrix. clf spy(A(1:1000,:)) %% Most frequent terms % The row sums are the total term counts. ttc = sum(A,2); %% % Find the terms that occur at least 1000 times. k = find(ttc >= 1000); fprintf('%-10s %6s\n',[T(k) num2str(ttc(k))]') %% % Surprise. I write a lot about MATLAB and matrices. %% Singular values % We might as well compute all the singular values of the full matrix. % It takes less than a second. It's important to use the economical % version of the SVD that produces a |U| the same size as |A|. % Otherwise we'd have a 16,315-by-16,315 |U|. tic [U,S,V] = svd(full(A),'econ'); toc %% % A logarithmic plot of the singular values shows that they do not % decrease very rapidly. clf semilogy(diag(S),'.','markersize',10) axis([-10 450 1 1000]) title('singular values') %% Reduced rank approximation % I wrote a post about % a month ago. LSI employs a reduced rank % approximation to the term-document matrix. |c^5| has a slider for % choosing the rank. The plot of the singular values shows that the % accuracy of the approximation is pretty much independent of the % chosen value. % Any value except very small values or large values near full rank gives % an approximation good to between one and ten percent. The power of LSI % does not derive from the approximation accuracy. I usually take the % rank to be about half the number of columns. n = size(A,2); k = n/2; Uk = U(:,1:k); Sk = S(1:k,1:k); Vk = V(:,1:k); relerr = norm(Uk*Sk*Vk'-A)/S(1,1) %% Arrow keys % The three arrow keys in the |c^5| app can be clicked with either the % left or right mouse button (or control-click on a one-button mouse). % % * left >: next document, any term count. % * right >: next document with nonzero term count. % * left <: previous document, any term count. % * right <: previous document with nonzero term count. % * left ^: use the root of the current document for the query. % * right ^: use a random term for the query. %% % Repeatedly clicking the up arrow with the right button (an alt click) % is a good way to browse the entire collection. %% Lothar Collatz % Let's see the logs for two more examples. Lothar Collatz has a % short log. c5 Collatz %% % collatz % % arrow document term counts lsi date % blog/threenplus1_blog.m 9 9/19 0.904 19-Jan-2015 % >> blog/collatz_inequality.m 4 13/19 0.108 16-Mar-2015 % >> blog/c5_blog.m 5 18/19 0.075 28-Aug-2017 % >> ncm/intro.tex 1 19/19 -0.003 2004 %% % Collatz appears in two posts from 2015, one on his |3n+1| problem and % one on an elegant inequality that produces a surprising graphic, % and in the section of this blog post about |c^5| that you % are now reading, He is also mentioned in the introduction to % the NCM book, but the LSI value of very small. The double arrow at % the beginning of each line signifies a right click, skipping over % documents that do not mention him. %% Blackjack % I have written a lot about the card game Blackjack. c5 blackjack %% % blackjack % arrow document term counts lsi date % news/simulating_blackjack.txt 19 19/68 0.536 01-Oct-2012 % >> ncmm/ncmgui.m 4 23/68 0.372 2016 % >> blog/random_blog2.m 4 27/68 0.266 04-May-2015 % >> ncmm/Contents.m 2 29/68 0.244 2016 % >> blog/c5_blog.m 5 34/68 0.206 28-Aug-2017 % >> ncm/random.tex 13 47/68 0.148 2004 % >> lab/thumbnails2.m 2 49/68 0.088 2017 % >> lab/lab2.m 1 50/68 0.061 2017 % >> news/numerical_computing.txt 1 51/68 0.025 01-Jun-2004 % >> blog/lab_blog.m 1 52/68 0.004 31-Oct-2016 % >> ncmm/blackjack.m 8 60/68 -0.023 2016 % >> lab/blackjack.m 8 68/68 -0.026 2017 %% % We can see two newsletter columns, three blogs, a portion of a book % chapter, several code segments, and two copies of the blackjack app. % Again, I am using right clicks. %% Levenshtein distance % I recently wrote a blog post about % . If |c^5| does not recognize % the key word in a query, it uses Levenshtein distance to find the % closest match in the term list to the % unrecognized query. This easily corrects simple spelling mistakes, % like missing letters. For example the missing "i" in "polynomal" is % corrected to become "polynomial". % And "Levenstein" becomes "levenshtein". %% % I received a pleasant surprise when I entered "Molar", expecting it % to become "moler". Instead, I got "polar" because only one substitution % is required to convert "Molar" to "polar", but two substitutions are % required to turn "Molar" into "moler". (Microsoft Word spelling % correction used to turn "MATLAB" into "Meatball".) %% Multi-word queries % I'm not quite sure what to do with queries consisting of more than % one term. What is the expected response to a query of "Wilkinson % polynomial", for example? Is it documents that contain _either_ % "Wilkinson" _or_ "polynomial"? This is what LSI would provide. % But it is probably better to look for documents that contain _both_ % "Wilkinson" _and_ "polynomial". I'm not sure how to do this. %% % Worse yet, I can't look for an exact match to the two-word string % "Wilkinson polynomial" because the first thing the setup program % does is to break text into individual words. %% Stemming % This project is not finished. If I work on it any more, I am going to % have learn about _scraping_, _stemming_ and _lemmatization_ of the % source texts. This involves relatively simple tasks like removing % possessives and plurals and more complicated tasks like combining % all the words with the same root or _lemma_. The sentence % % "the quick brown fox jumped over the lazy dog's back" % % becomes % % "the quick brown fox jump over the lazi dog' back" % %% % Loren's guest blogger Toshi Takeuchi % in 2015 about Latent Semantic Analysis with MATLAB. % He references % % for stemming. %% Parsing queries % I can imagine doing a better job of parsing queries, although I could % never approach the sophistication of a system like Google or Siri. %% Limitations % A significant fraction of what I have written is not prose REPLACE_WITH_DASH_DASH it is % mathematics or code. It cannot be parsed with the techniques of % text analytics. For example, the source texts for the books NCM and EXM % have hundreds of snippets of LaTeX like % % \begin{eqnarray*} % A V \eqs U \Sigma , \\ % A^H U \eqs V \Sigma^H . % \end{eqnarray*} %% % And earlier in this blog post I had % % tic % [U,S,V] = svd(full(A),'econ'); % toc %% % My |c5setup| program now has to skip over everything like this. % In doing so, it misses much the message. %% Software % I had updated % in the Central File Exchange to include |c5.m| % and |c5database.mat|. ##### SOURCE END ##### e3bb1eb330494ccb89b42865eb518eee -->