Deep Learning
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Chapters 7 & 8. In Deep learning. MIT Press. https://www.deeplearningbook.org
Read Section 7.12 to understand dropout, a regularization method that reduces overfitting in deep networks by randomly dropping units during training. Then read Chapter 8 (Sections 8.1–8.5) to explore the optimization algorithms used to train deep models, such as SGD, momentum, and Adam.