The Lindahl Letter
A machine learning literature review (ML syllabus edition 2/8)


Part 2 of 8 in the ML syllabus series

You can find a lot of quality explanations of the differences between the various flavors of machine learning [1]. This second lecture in the introduction to ML syllabus series pulls together the best literature I could find to share. The second part of this lecture covers general literature reviews and textbooks, and the third part covers the intersection with programming languages. Some rather high quality textbooks and manuscripts exist within the field of machine learning, and you can even find them for free on GitHub and elsewhere. Instead of starting with the obvious path of digging into some weighty tomes, I’m going to spend some time sharing readouts of some of the most highly cited machine learning papers. A lot of people jumping into the field are working on something in a different field of study and find a use case or a business-related adventure that could benefit from machine learning. Typically at this point they start digging into software and can get going very rapidly; that part of the journey requires no real deep dive into the relevant literature. It’s great that people can just jump in and find machine learning accessible. However (you knew that was coming), the next phase of the journey is when people start wondering about the why and how of what is happening, or they dig deep enough that they want to know about the foundations of the technology and techniques they are using. At that point, depending on what is being done, people will see a massive number of papers published and shared online. The vast majority are freely available to download and read.

Part 1: Highly cited machine learning papers

Within this section I’m going to build out a collection of 10 things you could read to start getting a sense of which papers within the machine learning space are highly cited. Citation count is not a measure of readability or of how solid a literature review a paper provides. You will find that most of them do not have really lengthy literature sections; the authors make the citations they need to make for related work and jump into the main subject pretty quickly. I’m guessing that is a key part of why they are highly cited publications. To begin with, from what I can tell, the most highly cited and widely shared paper of all time in the machine learning or deep learning space has 125,285 citations that Google Scholar is aware of and can index. That is the first paper in the list below.

1. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778). https://arxiv.org/abs/1512.03385?context=cs 

This paper is cited a ton and has a pretty solid references section. If you read it after following the link above, you will run into a bit of introduction on deep convolutional neural networks before it jumps into related-work sections on residual representations and shortcut connections, and finally the deep residual learning approach itself. While this paper is cited well over one hundred thousand times, it is not designed to be an introduction to machine learning. At 12 pages, it provides a solid explanation of using deep residual learning for image recognition. To that end, this paper is highly on point and easy to read, which is probably why so many people have cited it from 2016 to now.
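The core idea in that paper, a shortcut connection that lets the layers learn a residual F(x) instead of the full mapping, can be sketched in a few lines of NumPy. This is just an illustrative sketch (the names and shapes are mine, not the paper's code):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """A minimal residual block: relu(F(x) + x).

    F(x) is a small two-layer transform; the identity shortcut means
    the weights only need to learn the residual from x, which is what
    makes very deep networks trainable in the paper.
    """
    out = relu(x @ w1)    # first transform plus nonlinearity
    out = out @ w2        # second transform, no activation yet
    return relu(out + x)  # add the identity shortcut, then activate

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))          # a batch of 4 vectors of width 8
w1 = 0.1 * rng.normal(size=(8, 8))
w2 = 0.1 * rng.normal(size=(8, 8))
y = residual_block(x, w1, w2)
print(y.shape)  # (4, 8): the shortcut requires matching shapes
```

The real networks use convolutions and batch normalization inside F(x), but the shape of the trick is the same: the input skips around the transform and is added back in.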

2. Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255-260. https://www.science.org/doi/abs/10.1126/science.aaa8415

The opening of this review gives you much more of an introduction to what machine learning involves, and I’m not surprised this work is highly cited.

3. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015). https://doi.org/10.1038/nature14539 

This one is a very readable paper. It was certainly written to be widely read and is very consumable. Its reference list runs to 103 entries as well, which gives you plenty of follow-up reading.

4. Ioffe, S., & Szegedy, C. (2015, June). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International conference on machine learning (pp. 448-456). PMLR. https://arxiv.org/pdf/1502.03167.pdf 

5. Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks. Advances in neural information processing systems, 28. https://proceedings.neurips.cc/paper/2015/file/14bfa6bb14875e45bba028a21ed38046-Paper.pdf 

6. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30. https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf 

7. Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473. https://arxiv.org/pdf/1409.0473.pdf 

8. Mnih, V., Kavukcuoglu, K., Silver, D. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015). https://doi.org/10.1038/nature14236 

9. Y. Lecun, L. Bottou, Y. Bengio and P. Haffner, "Gradient-based learning applied to document recognition," in Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998, doi: 10.1109/5.726791. http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf 

10. Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. https://arxiv.org/pdf/1412.6980.pdf 
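The last paper on the list, Adam, is short enough that its whole update rule fits in a few lines. Here is a minimal NumPy sketch of that rule applied to a toy quadratic; the function names and the toy problem are mine, not the paper's:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba, 2014) written out in NumPy.

    m and v are exponential moving averages of the gradient and the
    squared gradient; m_hat and v_hat correct for their zero
    initialization (the bias correction terms in the paper).
    """
    m = beta1 * m + (1 - beta1) * grad        # first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = theta^2 starting from theta = 1.0
theta = np.array([1.0])
m = np.zeros(1)
v = np.zeros(1)
for t in range(1, 2001):
    grad = 2.0 * theta                        # gradient of theta^2
    theta, m, v = adam_step(theta, grad, m, v, t)
print(theta)  # close to 0 after enough steps
```

The whole point of the method is that the per-parameter step size adapts to the gradient history, which is why it became the default optimizer in so many of the other papers on this list.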

During part one of this lecture I covered 10 highly cited machine learning papers. My top 10 list might very well not be your top 10 list; if you have a different one, then feel free to share it. I'm open to criticism and alternative methods. You can work this paper-and-reference journey to start building a solid understanding of machine learning. That is one way to go about getting an introduction to the field: read key pieces of literature and, as you encounter footnotes and references that would help fill in your knowledge, take the time to work backward from anchor to anchor, completing a highly personalized literature review. For academics, or for people highly focused on a special area within the academic space, this is a tried and true method for learning. People are doing it all the time in business and in graduate schools all over the world. Another method exists as well, and we will explore that next.

Part 2: General literature reviews, text books, and manuscripts about machine learning

Sometimes you just want all the content packaged up and provided to you as a single-serving introduction to machine learning. I’m aware that within this lecture I did not elect to take that single-serving path. This field of study is large enough, and includes a diverse enough set of knowledge, that I think you need to approach it in a variety of different ways based on your specific learning needs. To that end I broke my machine learning literature review into two distinct parts. This second part is about where you could pick up one source and get started, though hopefully it won’t be the final destination in the lifelong learning journey that is understanding the ever-changing field of machine learning. For those of you who have been reading this series for some time, you know that my go-to introductory text is from the field of artificial intelligence: Stuart Russell and Peter Norvig’s classic “Artificial Intelligence: A Modern Approach,” which is in its 4th edition based on the Berkeley website [2]. I have the 3rd edition on my bookshelf that I picked up on eBay. The 4th edition has a whole section devoted to machine learning, including: learning from examples, probabilistic models, deep learning, and reinforcement learning. That is certainly a popular place to start for people who are digging into machine learning and, probably more importantly, want a solid foundation in artificial intelligence as well.

You could go with a classic from 1997 and start with a book literally called “Machine Learning” by Tom Mitchell, recently shared as a PDF by the author.

Mitchell, T. M. (1997). Machine learning. New York: McGraw-Hill.
http://www.cs.cmu.edu/~tom/files/MachineLearningTomMitchell.pdf

Maybe you were looking for something a little newer than 1997. You could jump over to the freely available Deep Learning book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville that was published back in 2016. 

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT press. https://www.deeplearningbook.org/

A lot more books exist that could help give you an introduction to machine learning, but I’m going to close out with the three that I happen to like the best. That does not mean they are the only way to learn the field.

Part 3: All the code based introduction to machine learning efforts

I’m going to let my TensorFlow bias run wild here for a moment and say that on my bookshelf right now are a few different works from Valliappa Lakshmanan. Within the TensorFlow community you will find a ton of well written and interesting videos, courses, and other content that will help you dig into the field of machine learning. Outside of the TensorFlow content and the myriad works by Lak, I have a few other books on my bookshelf worth mentioning. I’m not sure why they are all published by O’Reilly, but that appears to be a theme of what made it to my bookshelf in terms of coding books. Buying and subsequently keeping physical books is something that I do when I’m first learning something. I find it comforting to see them sitting next to me on my bookshelf in my office.

1. Grus, J. (2019). Data science from scratch: first principles with python. O'Reilly Media. https://www.oreilly.com/library/view/data-science-from/9781492041122/ 

2. Hope, T., Resheff, Y. S., & Lieder, I. (2017). Learning tensorflow: A guide to building deep learning systems. O'Reilly Media. https://www.oreilly.com/library/view/learning-tensorflow/9781491978504/ 

3. Graesser, L., & Keng, W. L. (2019). Foundations of deep reinforcement learning: theory and practice in Python. Addison-Wesley Professional. https://www.oreilly.com/library/view/foundations-of-deep/9780135172490/ 

Part 4: Super brief conclusion

Within this brief machine learning literature review we covered the top 10 articles I think you should start out reading, and then we dug into the textbooks and code-focused books that stood out to me. You might also remember that during the first lecture another book was recommended on forecasting and statistics. It had nothing to do with machine learning, but it's a solid foundational textbook for people interested in understanding the statistics of forecasting.

Armstrong, J. S. (Ed.). (2001). Principles of forecasting: a handbook for researchers and practitioners (Vol. 30). Boston, MA: Kluwer Academic.

Other introduction to statistical methods books exist and one of them might be right for you if you need to brush up on some of the mathematics that you will encounter within the machine learning space. Beyond that, hopefully this lecture has given you a brief introduction to the treasure trove of literature available to give you an introduction to machine learning. 

Part 5: Links and thoughts

You can spend hours and just scratch the surface of what they have posted in the Machine Learning Street Talk channel over on YouTube. Generally, I listen to the podcast version rather than watching it, based on how I tend to consume things. Most of these videos have around 10,000 views (some a bit more and some a bit less), which is reflective of the general sea of humanity that consumes machine learning content.

https://www.youtube.com/c/MachineLearningStreetTalk 

“MIT OpenCourseware: Introduction to Computational Thinking and Data Science, Fall 2016, Lecture 11: Introduction to Machine Learning”

“Deep Learning Basics: Introduction and Overview”

“Intro to Machine Learning (ML Zero to Hero - Part 1)”

Top 5 Tweets of the week:

Footnotes:

[1] https://www.ibm.com/cloud/blog/ai-vs-machine-learning-vs-deep-learning-vs-neural-networks 

[2] http://aima.cs.berkeley.edu/

Research Note:

You can find the files for the syllabus being built on GitHub. The latest version of the draft is shared via exports whenever changes are made. https://github.com/nelslindahlx/Introduction-to-machine-learning-syllabus-2022

What’s next for The Lindahl Letter?

  • Week 82: ML algorithms (ML syllabus edition 3/8)

  • Week 83: Machine learning approaches (ML syllabus edition 4/8)

  • Week 84: Neural networks (ML syllabus edition 5/8)

  • Week 85: Neuroscience (ML syllabus edition 6/8)

  • Week 86: Ethics, fairness, bias, and privacy (ML syllabus edition 7/8)

  • Week 87: MLOps (ML syllabus edition 8/8)

I’ll try to keep the what’s next list forward looking with at least five weeks of posts in planning or review. If you enjoyed this content, then please take a moment and share it with a friend. If you are new to The Lindahl Letter, then please consider subscribing. New editions arrive every Friday. Thank you and enjoy the week ahead.

Thoughts about technology (AI/ML) in newsletter form every Friday