Below you can see all the papers that our contributors have put together!
Want to help us? Add the papers that are missing
Year | Name | Authors | Topic | Description | Impact | Media | Link | More |
---|---|---|---|---|---|---|---|---|
1928 | On the Theory of Games of Strategy | John von Neumann | MATH | Introduced the concept of extensive-form games and the minimax theorem, establishing foundational principles for game theory. | Pioneering work that laid the groundwork for game theory, influencing diverse fields from economics to AI decision-making algorithms. | Contributions to the Theory of Games | Link to Paper | |
1948 | A Mathematical Theory of Communication | Claude Shannon | PROGRAMMING | Proposed the fundamental concepts of information theory, including entropy, channel capacity, and the source coding theorem. Revolutionized the understanding of communication and laid the groundwork for data compression and error correction. | Pioneering work that significantly influenced information theory, data science, and communication systems. | The Bell System Technical Journal | Link to Paper | |
1950 | Computing Machinery and Intelligence | Alan Turing | AI | Introduced the concept of the Turing Test, a benchmark for determining a machine’s ability to exhibit intelligent behavior indistinguishable from that of a human. | Pioneering work that laid the foundation for discussions on machine intelligence and artificial general intelligence (AGI). | Mind | Link to Paper | |
1956 | The logic theory machine. A complex information processing system. | Allen Newell and Herbert Simon | MATH | This system is capable of discovering proofs for theorems in symbolic logic. | Unlike usual algorithms, it relies heavily on heuristic methods similar to those observed in human problem-solving tasks. | IRE Transactions on Information Theory | Link to Paper | |
1984 | Classification and Regression Trees | Leo Breiman et al. | AI | Introduced the fundamental concepts of decision trees, a method widely used in data science. | Fundamental contribution that has shaped the approach to data analysis through decision trees. | Journal of the American Statistical Association | Link to paper | |
1989 | A tutorial on hidden Markov models and selected applications in speech recognition | Lawrence Rabiner | MATH | Introduces the basic theory of hidden Markov models and shows how to apply them to selected problems in speech recognition. | Crucial breakthrough in the application of Markov methods, cited in more than 14,000 papers. | Proceedings of the IEEE | Link to Paper | |
1989 | Multilayer feedforward networks are universal approximators | Kurt Hornik, Maxwell Stinchcombe, Halbert White | AI | Demonstrates that standard feedforward networks with as few as one hidden layer and an arbitrary squashing activation function can approximate any Borel measurable function from one finite-dimensional space to another to any desired degree of accuracy, provided enough hidden units are available. | Established that multilayer feedforward networks are a class of universal approximators. | Neural Networks | Link to Paper | |
1995 | Support-vector networks | Corinna Cortes, Vladimir Vapnik | AI | Introduces the support-vector network, a learning machine for two-group classification problems. | The underlying idea had previously been implemented only for the restricted case where the training data can be separated without errors; this paper extends it to non-separable training data. | Machine Learning | Link to Paper | |
1997 | Long Short-Term Memory | Sepp Hochreiter and Jürgen Schmidhuber | AI | Introduces Long Short-Term Memory (LSTM), a type of Recurrent Neural Network (RNN) architecture designed to overcome the vanishing gradient problem in traditional RNNs. | LSTMs address the challenge of learning dependencies over extended time periods, leading to more effective and efficient training on sequential tasks. | Neural Computation | Link to paper | |
1998 | Gradient-based learning applied to document recognition (CNN/GTN) | Yann LeCun, Léon Bottou, Yoshua Bengio and Patrick Haffner | AI | This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. | The graph transformer network it describes for reading bank cheques was deployed commercially, reading several million cheques per day. | Proceedings of the IEEE | Link to Paper | |
2009 | ImageNet Large Scale Visual Recognition Challenge | Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei | Visual Recognition, Challenge, Datasets | Provides a comprehensive overview of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), a benchmark in the field of object category classification and detection using large-scale datasets. The challenge has been conducted annually since 2010 and has garnered participation from over fifty institutions. | Advanced object recognition, standardized large-scale datasets for computer vision research, and drove the development of deep learning (mostly CNNs). | Springer | Link to Paper | |
2012 | ImageNet Classification with Deep Convolutional Neural Networks | Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton | AI | Proposed the use of a deep CNN architecture (also known as AlexNet) for image classification. Introduced important techniques such as the ReLU activation function, data augmentation and dropout to prevent overfitting, and training on multiple GPUs. The proposed architecture achieved a top-5 error rate of 16.4% on the ImageNet dataset and won the ILSVRC-2012 competition. | The introduction of deeper CNNs marked an important turning point in computer vision. AlexNet outperformed existing models at the time and established new standards in the evolution of CNN architectures. | Advances in neural information processing systems | Link to Paper | |
2013 | Efficient Estimation of Word Representations in Vector Space | Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean | Natural Language Processing | Proposed two novel model architectures (CBOW and Skip-gram) for computing dense vector representations of words (word embeddings) from a large corpus based on the surrounding words. These vectors provide state-of-the-art performance for measuring syntactic and semantic word similarities. | Word2Vec demonstrated the effectiveness of pre-trained word embeddings as input features for downstream NLP tasks such as sentiment analysis and semantic similarity. | ArXiv | Link to Paper | |
2014 | Very Deep Convolutional Networks For Large-Scale Image Recognition | Karen Simonyan, Andrew Zisserman | AI | This paper analysed the impact of increased CNN depth on image classification. VGGNet improved on the state of the art using architectures with up to 19 layers and 3x3 convolutional filters, which are smaller than the filters used in previous models. | This paper confirmed the importance of depth in visual representations and inspired the design of later models. | International Conference on Learning Representations (ICLR) | Link to Paper | |
2014 | Neural Machine Translation by Jointly Learning to Align and Translate | Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio | AI | The paper investigates neural machine translation. The authors conjectured that the use of a fixed-length vector was a bottleneck in improving the performance of the classic RNN-based encoder-decoder architecture, and proposed to extend it by allowing the model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. | Proposed the attention mechanism for the first time, paving the way for many later architectures. | International Conference on Learning Representations (ICLR) | Link to Paper | |
2015 | Deep Residual Learning for Image Recognition | Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun | AI, Deep Learning, Computer Vision | This paper introduced a new architecture for deep neural networks (ResNet). The key concepts were residual learning, which addressed the degradation problem in deep networks, and the use of skip connections. These innovations make the layers easier to optimize and improve their ability to find the most appropriate mappings between data. | This research has enhanced the potential of deep neural networks, allowing the training of architectures with many more layers (up to 152 layers). This has led to much higher accuracy in tasks such as image recognition (3.57% error on the ImageNet test set). | ArXiv | Link to paper | |
2016 | Mastering the Game of Go with Deep Neural Networks and Tree Search | David Silver et al. | AI | Presented AlphaGo, a model that defeated the European Go champion by 5 games to 0. It combines deep policy and value networks with Monte Carlo Tree Search (MCTS) to compute its next move, running simulations of possible outcomes. | Demonstrated that AI can tackle complex challenges and achieve excellence in strategic games. | Nature | Link to Paper | |
2017 | Attention is All You Need | Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin | AI | Introduced the Transformer, a neural network architecture based solely on attention mechanisms and a significant milestone in modern deep learning. Shaped the way we think about and approach NLP problems. | Has had a profound impact on NLP research and applications. | NIPS | Link to Paper | |
2021 | An Image is worth 16x16 words: Transformers for image recognition at scale | Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby | AI, Vision Transformers, Visual Recognition | This paper introduces ViT, a vision transformer that applies the Transformer architecture directly to sequences of image patches for image classification tasks. ViT achieves comparable or even superior performance to state-of-the-art convolutional networks while requiring fewer computational resources for training. This suggests that Transformers have the potential to revolutionize computer vision in the same way that they have revolutionized natural language processing. | ViTs are now used in applications ranging from image classification to text-to-image generators (DALL-E 2, Stable Diffusion). | ICLR | Link to Paper | |
2022 | Real-Time Big Data Processing and Analytics: Concepts, Technologies, and Domains | Uğur Kekevi, Ahmet Arif Aydin Gomez | DATA | The purpose of this paper is to provide researchers of real-time analysis and developers of data-intensive systems with a comparative perspective on real-time data processing by highlighting the key characteristics of real-time data processing technologies, NoSQL storage technologies, their application domains, and selected examples from previous studies. | Has had a profound impact on Data Science research and applications | Computer Science | Link to Paper | |
2023 | Muse: Text-To-Image Generation via Masked Generative Transformers | Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan | COMPUTER VISION | A text-to-image Transformer model trained on a masked modeling task in discrete token space. | Significantly more efficient than diffusion or autoregressive models. | Proceedings of Machine Learning Research | Link to Paper | |