1st Edition

The Beauty of Mathematics in Computer Science

By Jun Wu Copyright 2019
    284 Pages
    by Chapman & Hall

    The Beauty of Mathematics in Computer Science explains the mathematical fundamentals of information technology products and services we use every day, from Google Web Search to GPS navigation, and from speech recognition to CDMA mobile services. The book was published in Chinese in 2011 and has sold more than 600,000 copies. Readers were surprised to find how closely many everyday technologies are tied to mathematical principles. For example, the automatic classification of news articles uses the law of cosines taught in high school.
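
    The cosine example above can be sketched in a few lines. This is only an illustration, not the book's implementation: it assumes each article is reduced to raw word counts (a production system would more likely use weighted terms such as TF-IDF), and the function names and sample sentences are invented for this sketch.

```python
import math
from collections import Counter


def term_vector(text):
    """Turn a text into a sparse vector of lowercase word counts (assumed representation)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty article is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical one-sentence "articles" used only to exercise the functions.
finance = term_vector("stocks fell as the bank raised interest rates")
markets = term_vector("the bank cut interest rates and stocks rose")
sports = term_vector("the team won the final match in extra time")

print(cosine_similarity(finance, markets))
print(cosine_similarity(finance, sports))
```

    A larger cosine means a smaller angle between the two vectors, so the two finance sentences score higher against each other than either does against the sports sentence. Articles can then be grouped with whichever category they are closest to.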

    The book covers many topics in computer applications and applied mathematics, including:

    Natural language processing

    Speech recognition and machine translation

    Statistical language modeling

    Quantitative measurement of information

    Graph theory and web crawlers

    PageRank for web search

    Matrix operations and document classification

    Mathematical background of big data

    Neural networks and Google’s deep learning

    Jun Wu was a staff research scientist at Google who invented Google’s Chinese, Japanese, and Korean web search algorithms and was responsible for many Google machine learning projects. From 2006 to 2010 he wrote official blogs that introduced the technologies behind Google's products in very simple language for Chinese Internet users. The blogs had more than 2 million followers. Wu received a PhD in computer science from Johns Hopkins University and has been working on speech recognition and natural language processing for more than 20 years. He was one of the earliest engineers at Google, managed many of the company's products, and was awarded 19 US patents during his 10-year tenure there. Wu became a full-time VC investor and co-founded Amino Capital in Palo Alto in 2014, and he is the author of eight books.

    1. Words and languages, numbers and information
       Information
       Words and numbers
       The mathematics behind language

    2. Natural language processing: From rules to statistics
       Machine intelligence
       From rules to statistics

    3. Statistical language model
       Describing language through mathematics
       Extended reading: Implementation caveats
       Higher order language models
       Training methods, zero-probability problems, and smoothing
       Corpus selection

    4. Word segmentation
       Evolution of Chinese word segmentation
       Extended reading: evaluating results
       Consistency
       Granularity

    5. Hidden Markov model
       Communication models
       Hidden Markov model
       Extended reading: HMM training

    6. Quantifying information
       Information entropy
       Role of information
       Mutual information
       Extended reading: Relative entropy

    7. Jelinek and modern language processing
       Early life
       From Watergate to Monica Lewinsky
       An old man's miracle

    8. Boolean algebra and search engines
       Boolean algebra
       Indexing

    9. Graph theory and web crawlers
       Graph theory
       Web crawlers
       Extended reading: two topics in graph theory
       Euler's proof of the Königsberg bridges
       The engineering of a web crawler

    10. PageRank: Google's democratic ranking technology
       The PageRank algorithm
       Extended reading: PageRank calculations

    11. Relevance in web search
       TF-IDF
       Extended reading: TF-IDF and information theory

    12. Finite state machines and dynamic programming: Navigation in Google Maps
       Address analysis and finite state machines
       Global navigation and dynamic programming
       Finite state transducer

    13. Google's AK-47 designer, Dr. Amit Singhal

    14. Cosines and news classification
       Feature vectors for news
       Vector distance
       Extended reading: The art of computing cosines
       Cosines in big data
       Positional weighting

    15. Solving classification problems in text processing with matrices
       Matrices of words and texts
       Extended reading: Singular value decomposition method and applications

    16. Information fingerprinting and its application
       Information fingerprint
       Applications of information fingerprint
       Determining identical sets
       Detecting similar sets
       YouTube's anti-piracy
       Extended reading: Information fingerprint's repeatability and SimHash
       Probability of repeated information fingerprint
       SimHash

    17. Thoughts inspired by the Chinese TV series "Plot": The mathematical principles of cryptography
       The spontaneous era of cryptography
       Cryptography in the information age

    18. Not all that glitters is gold: Search engine's anti-SPAM problem and search result authoritativeness question
       Search engine anti-SPAM
       Authoritativeness of search results
       Summary

    19. Discussion on the importance of mathematical models

    20. Don't put all your eggs in one basket: The principle of maximum entropy
       Principle of maximum entropy and maximum entropy model
       Extended reading: Maximum entropy model training

    21. Mathematical principles of pinyin input method
       Input method and coding
       How many keystrokes to type a Chinese character?
       Discussion on Shannon's First Theorem
       The algorithm of phonetic transcription
       Extended reading: Personalized language models

    22. Bloom filters
       The principle of Bloom filters
       Extended reading: The false alarm problem of Bloom filters

    23. Bayesian network: Extension of Markov chain
       Bayesian network
       Bayesian network's application in word classification
       Extended reading: Training a Bayesian network

    24. Conditional random fields, syntactic parsing, and more
       Syntactic parsing: The evolution of computer algorithms
       Conditional random fields
       Conditional random fields' applications in other fields

    25. Andrew Viterbi and the Viterbi algorithm
       The Viterbi algorithm
       CDMA technology: The foundation of 3G mobile communication

    26. God's algorithm: The expectation maximization algorithm
       Self-converged document classification
       Extended reading: Convergence of expectation-maximization algorithms

    27. Logistic regression and web search advertisement
       The evaluation of web search advertisement
       The logistic model

    28. Google Brain and artificial neural networks
       Artificial neural network
       Training an artificial neural network
       The relationship between artificial neural networks and Bayesian networks
       Extended reading: "Google Brain"

    29. The power of big data
       The importance of data
       Statistics and information technology
       Why we need big data

    Biography

    Jun Wu was a staff research scientist at Google who invented Google’s Chinese, Japanese, and Korean web search algorithms and was responsible for many Google machine learning projects. From 2006 to 2010 he wrote official blogs that introduced the technologies behind Google's products in very simple language for Chinese internet users. The blogs had more than two million followers. He received a Ph.D. in computer science from Johns Hopkins University and has been working on speech recognition and natural language processing for more than 20 years. He was one of the earliest engineers at Google, managed many of the company's products, and was awarded more than ten US patents during his ten-year tenure there. He became a full-time VC investor and co-founded Amino Capital in Palo Alto in 2014, and he is the author of eight books.

    "This volume originates from a series of blog articles by the author, who works as senior staff research scientist for Google China. The blog articles have been rewritten to make them more accessible to uninitiated readers. As a result, the book contains 29 chapters which may be read independently. The aim is to provide evidence for the beauty of mathematics and the wealth of its applications to the layman . . . The volume may be quite valuable for readers who want to get some insight into how enterprises like Google achieve their performance, and how much mathematics is at work in the background of many commonplace services . . . "

    ~Dieter Riebesehl (Lüneburg), zbMath