MAT 6932/21525 Special Topics in Applied Mathematics (Fall 2021)

 

Introduction to Deep Learning and Its Mathematical Foundation

Objective and Description of the Course:

    Deep learning is a novel methodology that is currently receiving much attention and has been successfully applied to a wide variety of fields, such as image classification, speech recognition, computer vision, natural language understanding, precision medicine, and computational biology. Its underlying mathematical and computational foundations are essential for understanding and further developing deep learning techniques.

    The aim of this course is to equip students with this modern computational skill set for their future research. The course will introduce (1) basic concepts of machine learning and statistical learning theory, (2) the basic structure of artificial and convolutional deep neural networks (DNNs), (3) algorithms for training DNNs, and (4) a number of popular and efficient DNNs for supervised regression and classification, with their applications as case studies.

    Moreover, to better understand deep learning from a mathematical perspective, the course will cover some necessary concepts of convex and nonconvex analysis, several first-order deterministic and stochastic optimization algorithms, and the framework of optimal control. Finally, we will discuss how to design DNNs with good learning ability from the optimization and optimal control perspectives. Applications to solving inverse problems and meta-learning problems will be discussed.

    This course covers one of the most rapidly developing fields, and no textbook is available; I will instead provide references (mostly recent papers). Student presentations, discussions, and projects are expected.

References:

  • LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
  • M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al. TensorFlow: A system for large-scale machine learning. In OSDI, volume 16, pages 265–283, 2016.

  • J. Adler and O. Öktem, “Learned primal-dual reconstruction,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1322–1332, 2018.
  • M. Borgerding, P. Schniter, and S. Rangan, “AMP-inspired deep networks for sparse linear inverse problems,” IEEE Transactions on Signal Processing, vol. 65, no. 16, pp. 4293–4308, 2017.
  • X. Chen, J. Liu, Z. Wang, and W. Yin. Theoretical linear convergence of unfolded ISTA and its practical weights and thresholds. In NIPS, pages 9061–9071, 2018.
  • G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In CVPR, pages 2261–2269, 2017.
  • K. Hammernik, T. Klatzer, E. Kobler, M. P. Recht, D. K. Sodickson, T. Pock, and F. Knoll. Learning a variational network for reconstruction of accelerated MRI data. Magnetic Resonance in Medicine, 79(6):3055–3071, 2018.
  • K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in The IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
  • K. He, X. Zhang, S. Ren, and J. Sun. Identity mappings in deep residual networks. In European conference on computer vision, pages 630–645, 2016.
  • Y. Li, M. Tofighi, J. Geng, V. Monga, and Y. C. Eldar. Efficient and interpretable deep blind image deblurring via algorithm unrolling. IEEE Trans. Comput. Imaging, vol. 6, pages 666–681, 2020.
  • T. Meinhardt, M. Moller, C. Hazirbas, and D. Cremers. Learning proximal operators: Using denoising networks for regularizing inverse imaging problems. In ICCV, pages 1781–1790, 2017.
  • S. Xie, R. B. Girshick, P. Dollár, Z. Tu, and K. He. Aggregated residual transformations for deep neural networks. In CVPR, pages 5987–5995, 2017.
  • Y. Yang, J. Sun, H. Li, and Z. Xu. Deep ADMM-Net for compressive sensing MRI. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, editors, NIPS 29, pages 10–18, 2016.
  • J. Zhang and B. Ghanem. ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing. In CVPR, 2018.
  • Z. Zhang, Y. Liu, J. Liu, F. Wen, and C. Zhu. AMP-Net: Denoising-based deep unfolding for compressive image sensing. arXiv:2004.10078, 2020.
  • S. Zhou, Y. He, Y. Liu, and C. Li. Multi-channel deep networks for block-based image compressive sensing. arXiv:1908.11221, 2019.
  • T. Hospedales, A. Antoniou, P. Micaelli, and A. Storkey. Meta-learning in neural networks: A survey. arXiv:2004.05439, 2020.
  • M. Huisman, J. van Rijn, and A. Plaat. A survey of deep meta-learning. Artificial Intelligence Review, 2021, pp. 1–59.

  • Lawrence C. Evans, An Introduction to Mathematical Optimal Control Theory, https://math.berkeley.edu/~evans/control.course.pdf
  • T. Q. Chen, Y. Rubanova, J. Bettencourt and D. K. Duvenaud, Neural ordinary differential equations, in Advances in Neural Information Processing Systems, 2018, 6572–6583.
  • Q. Li, L. Chen, C. Tai, and E. Weinan. Maximum principle based algorithms for deep learning. The Journal of Machine Learning Research, 18 (2017), 5998–6026.
  • M. Benning et al., Deep learning as optimal control problems: models and numerical methods, arXiv:1904.05657v3.

  • Amir Beck, First-Order Methods in Optimization, MOS-SIAM Series on Optimization, 2017.
  • L. I. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D, 60(1):259–268, 1992.
  • A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183–202, 2009.
  • A. Chambolle and T. Pock, A first-order primal-dual algorithm for convex problems with applications to imaging, J. Math. Imaging Vision, 40 (2011), 120–145.

     Meeting Time and Rooms:

    •   MWF 4 (Online Synchronous)
    • Office Hours: MWF 5 or by appointment

      Arrangement of the Course:

      • Unit 1: Basic Concepts of Deep Learning (Tentatively weeks 1–3)

      1.1. Overview of deep learning: what deep learning is, its main approaches, and its applications. (week 1)

      1.2. Basic machine learning theory: assessment of accuracy, generalization, underfitting and overfitting, the bias–variance decomposition of prediction error, and sample and computational complexity. (week 2)
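
      For reference, one standard form of this decomposition (in generic notation not fixed by the syllabus): at a test point x with y = f(x) + \varepsilon, E[\varepsilon] = 0, Var(\varepsilon) = \sigma^2, and a learned regressor \hat{f},

          E[(y - \hat{f}(x))^2] = (f(x) - E[\hat{f}(x)])^2 + Var(\hat{f}(x)) + \sigma^2,

      where the three terms are the squared bias, the variance, and the irreducible noise.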

      1.3. Regression and classification in supervised learning: maximum likelihood estimation, Bayes' rule and maximum a posteriori (MAP) estimation, and linear and logistic regression models and their computation. (week 3)
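
      As a preview of the computations in 1.3, the following is a minimal sketch (illustrative code, not material provided by the course; the synthetic data and step size are placeholders) of fitting logistic regression by gradient descent on the negative log-likelihood:

          import numpy as np

          def sigmoid(z):
              return 1.0 / (1.0 + np.exp(-z))

          def fit_logistic(X, y, lr=0.1, n_iters=1000):
              """Gradient descent on the average negative log-likelihood (cross-entropy)."""
              n, d = X.shape
              w = np.zeros(d)
              for _ in range(n_iters):
                  p = sigmoid(X @ w)             # predicted probabilities
                  w -= lr * (X.T @ (p - y) / n)  # gradient of the average loss
              return w

          # Illustrative synthetic data
          rng = np.random.default_rng(0)
          X = rng.normal(size=(200, 2))
          y = (X @ np.array([1.5, -2.0]) > 0).astype(float)
          w_hat = fit_logistic(X, y)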

      • Unit 2: Brief Introduction to Deep Neural Networks (DNNs) (Tentatively weeks 4–8)

      2.1. Artificial neural networks (ANNs): components and architecture of ANNs, universal approximation theory; (weeks 4–4.5)
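
      One classical statement of this universality theory (a Cybenko-type result, quoted informally here; the course may use a different version): for any continuous f on a compact set K in R^d, any sigmoidal activation \sigma, and any \varepsilon > 0, there exist N, coefficients a_i, b_i in R, and weights w_i in R^d such that

          \sup_{x \in K} \Big| f(x) - \sum_{i=1}^{N} a_i \, \sigma(w_i^T x + b_i) \Big| < \varepsilon.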

      2.2. Convolutional neural networks (CNNs): components and architecture of CNNs, convolution operations, dropout, batch normalization, and basic CNN architectures; (weeks 4.5–6)
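
      For reference, the discrete 2-D operation that most deep learning frameworks implement under the name "convolution" is the cross-correlation of a feature map x with a kernel k (indexing conventions vary):

          (x \star k)[i, j] = \sum_{u} \sum_{v} k[u, v] \, x[i + u, j + v].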

      2.3. DNN training: backpropagation, minimization of the loss function, the gradient descent algorithm for nonconvex optimization, the stochastic gradient descent (SGD) algorithm, and adaptive stochastic gradient descent algorithms; (week 7)
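
      A minimal sketch of the plain SGD loop covered in 2.3 (framework-agnostic Python; grad_loss is a hypothetical placeholder for the mini-batch gradient computed by backpropagation):

          import numpy as np

          def sgd(params, grad_loss, data, lr=0.01, epochs=10, batch_size=32, seed=0):
              """Plain SGD: step against the gradient on a random mini-batch."""
              rng = np.random.default_rng(seed)
              n = len(data)
              for _ in range(epochs):
                  idx = rng.permutation(n)
                  for start in range(0, n, batch_size):
                      batch = [data[i] for i in idx[start:start + batch_size]]
                      grads = grad_loss(params, batch)  # backpropagation happens here
                      params = [p - lr * g for p, g in zip(params, grads)]
              return params

      Adaptive variants (e.g., Adam) replace the fixed step lr with per-coordinate step sizes built from running moment estimates of the gradients.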

      2.4. Case study and applications. (week 8)

      • Unit 3: Basic Concepts of Convex/Nonconvex Analysis and Optimization-Algorithm-Inspired DNNs (Tentatively weeks 9–13)

      3.1. Basic concepts of convex/nonconvex and smooth/nonsmooth optimization: definitions and basic properties of convex functions, L-smooth functions, and convex conjugate functions; optimality conditions and global/local solutions. (week 9)
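
      Two of the definitions from 3.1, recorded here for reference: f : R^n -> R is convex if

          f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)   for all x, y and all \lambda \in [0, 1],

      and L-smooth if it is differentiable with an L-Lipschitz gradient:

          \| \nabla f(x) - \nabla f(y) \| \le L \| x - y \|   for all x, y.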

      3.2. Subgradients and the subdifferential: definition, properties, and calculation; (weeks 10–10.5)
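
      The basic definition and the standard example for 3.2: for a convex f, a vector g is a subgradient at x if f(y) \ge f(x) + g^T (y - x) for all y, and the subdifferential \partial f(x) is the set of all such g. For f(x) = |x| on R,

          \partial f(x) = \{-1\} if x < 0,   [-1, 1] if x = 0,   \{+1\} if x > 0.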

      3.3. Optimization-algorithm-inspired CNNs: advantages of algorithm-unrolled (or unfolded) DNNs, learnable variational models and algorithms, the proximal gradient algorithm and ISTA-Net, the primal-dual algorithm and primal-dual networks, the gradient descent algorithm and the Variational Network, and the alternating direction method of multipliers (ADMM) and ADMM-Net. (weeks 10.5–12)
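
      To make the unrolling idea concrete, below is a minimal sketch of ISTA for the LASSO problem min_x (1/2)||Ax - b||^2 + \lambda ||x||_1; an ISTA-Net-style network treats each iteration as a layer and learns the step sizes/thresholds (and possibly the operators) that are fixed here (illustrative code, not the course's implementation):

          import numpy as np

          def soft_threshold(v, tau):
              """Proximal operator of tau * ||.||_1."""
              return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

          def ista(A, b, lam, n_layers=20):
              """Each iteration plays the role of one layer of the unrolled network."""
              step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L with L = ||A||_2^2
              x = np.zeros(A.shape[1])
              for _ in range(n_layers):
                  grad = A.T @ (A @ x - b)             # gradient of the smooth term
                  x = soft_threshold(x - step * grad, step * lam)
              return x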

      3.4. Bi-level optimization in meta-learning: the goal and advantages of meta-learning, the general bi-level optimization framework for meta-learning, and methods for solving bi-level optimization problems. (week 13)
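
      The general bi-level framework of 3.4, in generic notation (\theta denotes the meta-parameters and w_i the task-level parameters; details vary across methods):

          \min_{\theta} \sum_{i} L_i^{val}(w_i^*(\theta), \theta)   subject to   w_i^*(\theta) \in \arg\min_{w} L_i^{train}(w, \theta),

      where the outer (meta) problem is evaluated on validation data and the inner problem is the per-task training problem.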

      • Unit 4: Deep Learning from the Optimal Control Perspective (Tentatively weeks 14–16)

      4.1. Brief introduction to optimal control: basic optimal control problems in discrete and continuous time, the Hamiltonian and the Pontryagin Maximum Principle (PMP) in discrete and continuous time; (week 14)
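
      For orientation, the discrete-time optimal control problem and its Hamiltonian in one common convention (following, e.g., the Li et al. reference above; sign conventions vary across texts):

          \min_{u_0, \dots, u_{T-1}} \Phi(x_T) + \sum_{t=0}^{T-1} \ell(x_t, u_t)   subject to   x_{t+1} = f(x_t, u_t),

          H(x, p, u) = p^T f(x, u) - \ell(x, u),

      and the PMP states that an optimal control maximizes H along the optimal state and costate trajectories. In the deep learning analogy, the states x_t are layer activations and the controls u_t are the network weights.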

      4.2. DNNs from the dynamical systems perspective: neural ordinary differential equations for supervised learning; case study: the Neural ODE model of Chen et al. (2018); (week 15)
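
      A minimal sketch of the idea in 4.2: the forward pass solves dx/dt = f_\theta(x, t), integrated below with a fixed-step Euler scheme for simplicity (Chen et al. (2018) use adaptive solvers with adjoint-based gradients; the toy vector field f_theta here is illustrative, not the paper's model):

          import numpy as np

          def f_theta(x, t, W, b):
              """Toy vector field: a single tanh layer."""
              return np.tanh(W @ x + b)

          def odenet_forward(x0, W, b, t0=0.0, t1=1.0, n_steps=100):
              """Euler discretization of dx/dt = f_theta(x, t); one Euler step
              corresponds to one residual block x <- x + h * f_theta(x, t)."""
              x, t = x0, t0
              h = (t1 - t0) / n_steps
              for _ in range(n_steps):
                  x = x + h * f_theta(x, t, W, b)
                  t += h
              return x

          rng = np.random.default_rng(0)
          W, b = 0.1 * rng.normal(size=(4, 4)), np.zeros(4)
          x1 = odenet_forward(rng.normal(size=4), W, b)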

      4.3. Deep learning from the optimal control perspective: first-order optimality conditions from the PMP, numerical methods, relationships among backpropagation, Hamiltonian maximization, and the method of Lagrange multipliers for network training, and applications in image analysis; (week 16)

     
    Additional Information:
     
    Grading:
    Students will be required to present one or two papers or projects related to the course content. The projects may address problems of particular interest to individual students. Grades will be assigned on the basis of the presentations or projects. Current UF grading policies can be found at https://catalog.ufl.edu/ugrad/current/regulations/info/grades.aspx.
     
    Honor Code: “UF students are bound by The Honor Pledge which states, “We, the members of the University of Florida community, pledge to hold ourselves and our peers to the highest standards of honor and integrity by abiding by the Honor Code. On all work submitted for credit by students at the University of Florida, the following pledge is either required or implied: “On my honor, I have neither given nor received unauthorized aid in doing this assignment.” The Honor Code specifies a number of behaviors that are in violation of this code and the possible sanctions. Furthermore, you are obligated to report any condition that facilitates academic misconduct to appropriate personnel. If you have any questions or concerns, please consult with the instructor or TAs in this class.”
     
    Class Attendance: “Requirements for class attendance and make-up exams, assignments, and other work in this course are consistent with university policies that can be found at: https://catalog.ufl.edu/ugrad/current/regulations/info/attendance.aspx.”
     
    Accommodations for Students with Disabilities: “Students with disabilities requesting accommodations should first register with the Disability Resource Center (352-392-8565, https://www.dso.ufl.edu/drc/) by providing appropriate documentation. Once registered, students will receive an accommodation letter which must be presented to the instructor when requesting accommodation. Students with disabilities should follow this procedure as early as possible in the semester.”
     
    Online Evaluations: “Students are expected to provide feedback on the quality of instruction in this course by completing online evaluations at https://evaluations.ufl.edu. Evaluations are typically open during the last two or three weeks of the semester, but students will be given specific times when they are open. Summary results of these assessments are available to students at https://evaluations.ufl.edu/results/.”
     
    Contact information for the Counseling and Wellness Center: https://counseling.ufl.edu/, 392-1575; and the University Police Department: 392-1111 or 9-1-1 for emergencies.

     
    Diversity:
    The Department of Mathematics and I are committed to the diversity and inclusion of all students in this course. I acknowledge, respect, and value the diverse natures, backgrounds, and perspectives of students and believe that diversity furthers academic achievement. It is our intent to present materials and activities that are respectful of diversity: race, color, creed, gender, gender identity, sexual orientation, age, religious status, national origin, ethnicity, disability, socioeconomic status, and any other distinguishing qualities.

     
    Class Recording:
    Our class sessions may be audio-visually recorded for students in the class to refer back to and for enrolled students who are unable to attend live. Students who participate with their camera engaged or utilize a profile image are agreeing to have their video or image recorded. If you are unwilling to consent to have your profile or video image recorded, be sure to keep your camera off and do not use a profile image. Likewise, students who un-mute during class and participate orally are agreeing to have their voice recorded. If you are not willing to consent to have your voice recorded during class, you will need to keep your mute button activated and communicate exclusively using the “chat” feature, which allows students to type questions and comments live. The chat will not be recorded or shared. As in all courses, unauthorized recording and unauthorized sharing of recorded materials by students or any other party is prohibited.