10 MUST-READ ARTIFICIAL INTELLIGENCE RESEARCH PAPERS SO FAR
Source – https://www.analyticsinsight.net/
Artificial intelligence research is increasingly influencing the use of technology
From our smartphones to our cars and homes, artificial intelligence is touching every walk of life. Applications of artificial intelligence have already proved disruptive across diverse industries, including manufacturing, healthcare, and retail. Given this progress, it is fair to say that artificial intelligence has evolved impressively in recent years. Research around this technology has also surged and is changing the way individuals and businesses interact with AI technologies. Analytics Insight has listed 10 must-read artificial intelligence research papers published so far.
Adam: A Method for Stochastic Optimization
Author(s): Diederik P. Kingma, Jimmy Ba
Adam is an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, computationally efficient, invariant to a diagonal rescaling of the gradients, and has low memory requirements. It is well suited for problems that are large in terms of data and parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. Adam has since become a widely used default optimizer for training neural networks.
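To make the "adaptive estimates of lower-order moments" concrete, here is a minimal single-parameter sketch of the Adam update in plain Python. The function name `adam_minimize`, the toy objective, and the step count are illustrative assumptions; the default values for `beta1`, `beta2`, and `eps` follow those commonly quoted for Adam, while `lr` is picked for this toy problem.

```python
import math

def adam_minimize(grad, theta, steps=2000, lr=0.01,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a one-dimensional function given its gradient (illustrative sketch)."""
    m = 0.0  # running estimate of the first moment (mean of the gradients)
    v = 0.0  # running estimate of the second moment (mean of squared gradients)
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Both estimates start at zero, so correct their bias toward zero.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # The step is scaled per parameter by the second-moment estimate.
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta

# Hypothetical toy objective: f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
result = adam_minimize(lambda x: 2.0 * (x - 3.0), theta=0.0)
```

Because the gradient is divided by the square root of the second-moment estimate, multiplying the gradient by a constant leaves the step size almost unchanged, which is the rescaling invariance mentioned above.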
Towards a Human-like Open-Domain Chatbot
Author(s): Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, Quoc V. Le
This research paper presents Meena, a multi-turn open-domain chatbot that is trained end-to-end on data mined and filtered from public domain social media conversations. This 2.6B parameter neural network is simply trained to minimize the perplexity of the next token. The researchers also propose a new human evaluation metric to capture key elements of a human-like multi-turn conversation, dubbed Sensibleness and Specificity Average (SSA).
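The two quantities named above can be illustrated with a short sketch. Perplexity is the exponential of the average negative log-probability the model assigns to each token, and SSA averages the share of responses judged sensible with the share judged specific. The function names and the toy labels below are assumptions made for illustration, not data from the paper.

```python
import math

def perplexity(token_probs):
    """Exponential of the average negative log-probability per token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

def ssa(sensible_labels, specific_labels):
    """Average of the sensibleness rate and the specificity rate (labels are 0 or 1)."""
    sensibleness = sum(sensible_labels) / len(sensible_labels)
    specificity = sum(specific_labels) / len(specific_labels)
    return (sensibleness + specificity) / 2

# Hypothetical model that gives each of four tokens probability 0.25.
toy_perplexity = perplexity([0.25, 0.25, 0.25, 0.25])

# Hypothetical human judgments for four chatbot responses.
toy_ssa = ssa(sensible_labels=[1, 1, 0, 1], specific_labels=[1, 0, 0, 1])
```

A model that is equally unsure among four tokens at every step has a perplexity of 4, so lower perplexity means the model concentrates more probability on the token that actually comes next.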
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Author(s): Sergey Ioffe, Christian Szegedy
Training deep neural networks is complicated by the fact that the distribution of each layer’s inputs changes during training, as the parameters of the previous layers change. The researchers refer to this phenomenon as “internal covariate shift”, and address the problem by normalizing layer inputs. Batch Normalization allows the researchers to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps and surpasses the original model by a significant margin.
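Normalizing layer inputs can be sketched for a single feature as follows: subtract the mini-batch mean, divide by the mini-batch standard deviation, then apply a learnable scale and shift. This is a simplified training-time sketch in plain Python; the function name, the toy activations, and the fixed `gamma` and `beta` values are assumptions, and the running statistics used at inference time are left out.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift it."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased mini-batch variance
    # eps keeps the division stable when the variance is close to zero.
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

# Hypothetical activations of one unit for a mini-batch of four examples.
activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)
```

After this step the feature has roughly zero mean and unit variance within the batch, whatever scale the previous layer produced.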
Large-scale Video Classification with Convolutional Neural Networks
Author(s): Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, and Li Fei-Fei
Convolutional Neural Networks (CNNs) have been established as a powerful class of models for image recognition problems. Encouraged by these results, the researchers provide an extensive empirical evaluation of CNNs on large-scale video classification, using a new dataset of 1 million YouTube videos belonging to 487 classes. Published at the IEEE Conference on Computer Vision and Pattern Recognition, this research paper has been cited 865 times, with a HIC score of 24 and a CV of 239.
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
Author(s): Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, Sameer Singh
In this research paper, the authors point out the inadequacies of existing approaches to evaluating the performance of NLP models. Inspired by the principles of behavioural testing in software engineering, they introduce CheckList, a task-agnostic methodology for testing NLP models. It involves a matrix of general linguistic capabilities and test types that facilitate comprehensive test ideation, as well as a software tool to produce a large and diverse number of test cases quickly.
Generative Adversarial Nets
Author(s): Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
The authors in this AI research paper propose a new framework for estimating generative models via an adversarial process. They simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake.
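The two-player setup can be illustrated by evaluating the adversarial value V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which D tries to maximize and G tries to minimize. The sketch below only computes this value from discriminator outputs; the function name and the hard-coded probabilities are assumptions, and no networks are trained.

```python
import math

def gan_value(d_real, d_fake):
    """Estimate V(D, G) from discriminator outputs on real and generated samples."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# Hypothetical discriminator that separates real from generated samples well.
confident_value = gan_value(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])

# Hypothetical discriminator that cannot tell them apart and always outputs 0.5.
fooled_value = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

When G matches the data distribution so well that D can only output 0.5, the value drops to -log 4, which is the outcome G is pushing toward.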
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Author(s): Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun
Advances like SPPnet and Fast R-CNN have reduced the running time of state-of-the-art detection networks, exposing region proposal computation as a bottleneck. In this context, the authors introduce a Region Proposal Network (RPN), a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals.
A Review on Multi-Label Learning Algorithms
Author(s): Min-Ling Zhang, Zhi-Hua Zhou
Multi-label learning studies the problem where each example is represented by a single instance while being associated with a set of labels simultaneously. Significant progress has been made on this machine learning paradigm in the past decade, and this paper aims to provide a timely review of the area, with an emphasis on state-of-the-art multi-label learning algorithms.
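As a small illustration of what "a set of labels" means in practice, the sketch below stores each example's labels as a Python set and scores predictions with Hamming loss, a commonly used multi-label metric: the fraction of instance-label pairs that are wrong. The label names, the toy predictions, and the function name are assumptions made for this example.

```python
def hamming_loss(true_sets, pred_sets, num_labels):
    """Fraction of (example, label) pairs where the prediction disagrees with the truth."""
    # The symmetric difference counts labels that were missed or wrongly added.
    errors = sum(len(truth ^ pred) for truth, pred in zip(true_sets, pred_sets))
    return errors / (len(true_sets) * num_labels)

# Hypothetical image-tagging task with four possible labels.
true_sets = [{"beach", "sunset"}, {"city"}]
pred_sets = [{"beach"}, {"city", "people"}]
loss = hamming_loss(true_sets, pred_sets, num_labels=4)
```

Here the first prediction misses "sunset" and the second wrongly adds "people", so 2 of the 8 instance-label pairs are wrong.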
Neural Machine Translation by Jointly Learning to Align and Translate
Author(s): Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio
Neural machine translation is a recently proposed approach to machine translation. Unlike traditional statistical machine translation, neural machine translation aims at building a single neural network that can be jointly tuned to maximize translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders. These involve an encoder that encodes a source sentence into a fixed-length vector, from which a decoder generates a translation.
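The "align" in the paper's title refers to letting the decoder weight every encoder state when producing a target word, rather than relying only on a single fixed-length vector. The sketch below shows that soft alignment in plain Python. It assumes a simple dot-product score for brevity, whereas the paper learns the alignment scores with a small neural network; the function names and toy vectors are illustrative.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to one."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_align(decoder_state, encoder_states):
    """Return alignment weights and the resulting context vector."""
    # Assumption: dot-product scoring stands in for the paper's learned alignment model.
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Hypothetical encoder states for a two-word source sentence.
encoder_states = [[1.0, 0.0], [0.0, 1.0]]
weights, context = soft_align([0.0, 3.0], encoder_states)
```

The context vector is recomputed for every target word, so the decoder can focus on different source words at different steps.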
Mastering the game of Go with deep neural networks and tree search
Author(s): David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, and others
The paper introduces a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves in the game of Go. Go has been perceived as the most challenging of classic games for artificial intelligence. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play.