Blog | Rorobot

paper brief | Feb 25, 2026 | Mira Vale

Paper brief: Semi-Supervised Classification with Graph Convolutional Networks (arXiv:1609.02907)

This paper presents a scalable approach for semi-supervised learning on graph-structured data using an efficient variant of convolutional neural networks that operate directly on graphs. The authors motivate their convolutional architecture using a localized first-order approximation of spectral graph convolutions. The paper reports linear scaling in the number of graph edges and hidden representations that encode local graph structure and node features. Experiments on citation networks and a knowledge graph dataset show the approach outperforming related methods by a significant margin.
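In the paper, the first-order approximation leads to a layer of the form H' = σ(Â H W), where Â is the adjacency matrix with self-loops added and then symmetrically normalized by node degree. A minimal sketch of that normalization step in plain Python follows. The function name and the dense list-of-lists matrix format are illustrative assumptions, not the authors' code, and a practical implementation would use sparse matrices to get the linear scaling in edges that the paper reports.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense adjacency matrix.

    `adj` is a square list of lists with non-negative edge weights
    (hypothetical input format chosen for this sketch).
    """
    n = len(adj)
    # Add self-loops so each node keeps its own features when aggregating.
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    # Degree of each node in the self-loop graph (at least 1 for
    # non-negative weights, so the inverse square root is defined).
    degrees = [sum(row) for row in a_tilde]
    inv_sqrt = [1.0 / math.sqrt(d) for d in degrees]
    # Scale entry (i, j) by 1/sqrt(d_i) and 1/sqrt(d_j).
    return [[inv_sqrt[i] * a_tilde[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]
```

Multiplying the result by a feature matrix averages each node's features with those of its neighbors, which is the "local graph structure" the hidden representations encode.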

paper brief | Feb 25, 2026 | Mira Vale

Learning Transferable Visual Models From Natural Language Supervision (Paper Brief)

This brief summarizes a paper that trains image models from natural language supervision by predicting which caption matches which image at internet scale, then uses language to enable zero-shot transfer to many downstream vision tasks.

paper brief | Feb 25, 2026 | Mira Vale

Learning Transferable Visual Models From Natural Language Supervision — paper brief

This brief summarizes the paper “Learning Transferable Visual Models From Natural Language Supervision,” which trains vision models by predicting which caption matches which image on 400 million internet (image, text) pairs, then uses natural language to enable zero-shot transfer to many downstream tasks.

paper brief | Feb 25, 2026 | Mira Vale

Deep Inside Convolutional Networks (arXiv:1312.6034): gradient-based visualizations and saliency maps

This paper presents two gradient-based techniques for visualizing image classification models learned with deep convolutional networks: class score maximization for class “prototypes,” and per-image, per-class saliency maps that can be used for weakly supervised object segmentation, with a stated connection to deconvolutional networks.

paper brief | Feb 25, 2026 | Mira Vale

Paper brief: Playing Atari with Deep Reinforcement Learning (arXiv:1312.5602)

This paper reports a convolutional neural network trained with a variant of Q-learning that learns control policies directly from raw Atari 2600 pixels by outputting a value function over future rewards, and it presents results on seven Arcade Learning Environment games using the same architecture and learning algorithm across games.
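The Q-learning idea behind the network's training signal can be sketched as a one-step target: the observed reward plus the discounted value of the best next action. The function name, argument layout, and default discount below are assumptions for illustration and are not taken from the paper.

```python
def q_learning_target(reward, next_q_values, gamma=0.99, terminal=False):
    """One-step Q-learning target for a single transition.

    `next_q_values` holds the network's value estimate for every action
    in the next state. `gamma` is an assumed discount factor.
    """
    if terminal:
        # No future reward after the episode ends.
        return reward
    # Bootstrap from the best action available in the next state.
    return reward + gamma * max(next_q_values)
```

Training then pushes the network's estimate for the action actually taken toward this target.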

paper brief | Feb 25, 2026 | Mira Vale

EfficientNet (arXiv:1905.11946) — Compound Scaling for Convolutional Neural Networks

EfficientNet studies how to scale convolutional neural networks when more compute is available, rather than designing models only for a fixed resource budget. The paper reports that carefully balancing network depth, width, and resolution yields better performance, and it introduces a compound coefficient that uniformly scales these three dimensions.
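A rough sketch of how a single compound coefficient can drive all three dimensions: depth, width, and resolution are each a fixed base raised to the coefficient phi. The default bases used here (1.2, 1.1, 1.15) are the values the paper reports for its baseline network; the function name and return layout are assumptions made for this example.

```python
def compound_scale(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Return (depth, width, resolution, flops) multipliers for coefficient phi.

    alpha, beta, gamma are the per-dimension bases; the defaults are the
    values reported for the paper's baseline and would need a fresh search
    for a different architecture.
    """
    depth = alpha ** phi
    width = beta ** phi
    resolution = gamma ** phi
    # Convolution cost grows linearly with depth and quadratically with
    # width and input resolution.
    flops = depth * width ** 2 * resolution ** 2
    return depth, width, resolution, flops
```

With these bases, each unit increase in phi roughly doubles the estimated compute, which is what makes the coefficient a convenient knob for a growing resource budget.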

paper brief | Feb 25, 2026 | Mira Vale

Paragraph Vector (Doc2Vec): Distributed Representations of Sentences and Documents (arXiv:1405.4053)

This paper introduces Paragraph Vector, an unsupervised method that learns fixed-length vector representations for variable-length text such as sentences, paragraphs, and documents by training a dense document vector to predict words in the document.

paper brief | Feb 25, 2026 | Mira Vale

Communication-Efficient Federated Learning via Iterative Model Averaging (arXiv:1602.05629)

This paper advocates training deep networks directly from decentralized, privacy-sensitive mobile data by leaving data on devices and aggregating locally computed updates into a shared model. It names this decentralized approach Federated Learning and presents a practical method based on iterative model averaging, together with an extensive empirical evaluation across five model architectures and four datasets.
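The aggregation step of iterative model averaging can be sketched as a weighted mean of client parameters, with each client weighted by how many local examples it trained on. The flat-list parameter format and the function name are assumptions for this sketch; local training, client sampling, and communication are left out.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    `client_weights` is a list of equal-length parameter lists (one per
    client) and `client_sizes` the matching example counts; both formats
    are hypothetical choices for this example.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]
```

For example, `federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])` gives more influence to the second client because it holds three times as much data.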

paper brief | Feb 25, 2026 | Mira Vale

Natural Language Processing (almost) from Scratch (arXiv:1103.0398) — Paper Brief

This paper proposes a unified neural network architecture and learning algorithm for multiple NLP tasks, aiming to avoid task-specific feature engineering by learning internal representations from vast amounts of mostly unlabeled data.

paper brief | Feb 25, 2026 | Mira Vale

Paper brief: ADADELTA (arXiv:1212.5701)

The paper presents ADADELTA, a per-dimension learning rate method for gradient descent that adapts dynamically over time using only first-order information. It states that the method has minimal computational overhead beyond vanilla stochastic gradient descent, requires no manual tuning of a learning rate, and appears robust to noisy gradient information, different model architectures, data modalities, and hyperparameter choices. It reports promising results compared to other methods on MNIST using a single machine and on a large-scale voice dataset in a distributed cluster environment.
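A minimal sketch of the update for a single scalar parameter, assuming the commonly cited form of the rule: running averages of squared gradients and squared updates set the step size, so no learning rate is supplied. The function name and the defaults for `rho` and `eps` are illustrative assumptions.

```python
import math


def adadelta_minimize(grad_fn, x, steps, rho=0.95, eps=1e-6):
    """Run `steps` ADADELTA-style updates on a scalar `x`.

    `grad_fn` returns the gradient at x. `rho` (decay) and `eps`
    (numerical floor) are assumed defaults for this sketch.
    """
    avg_sq_grad = 0.0   # running average of squared gradients
    avg_sq_delta = 0.0  # running average of squared updates
    for _ in range(steps):
        g = grad_fn(x)
        avg_sq_grad = rho * avg_sq_grad + (1 - rho) * g * g
        # The ratio of the two RMS terms replaces a hand-tuned learning rate.
        delta = -math.sqrt(avg_sq_delta + eps) / math.sqrt(avg_sq_grad + eps) * g
        avg_sq_delta = rho * avg_sq_delta + (1 - rho) * delta * delta
        x += delta
    return x
```

Calling `adadelta_minimize(lambda x: 2 * x, 1.0, steps=20)` moves `x` toward the minimum of x² at zero; the early steps are small because the running average of squared updates starts at zero.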

paper brief | Feb 25, 2026 | Mira Vale

Prototypical Networks for Few-shot Learning (arXiv:1703.05175) — Paper Brief

This paper proposes prototypical networks for few-shot classification, where a classifier must generalize to new classes not seen in the training set given only a small number of examples of each new class. Prototypical networks learn a metric space where classification is performed by computing distances to prototype representations of each class, and the paper reports excellent results from this approach.
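The distance-to-prototype idea can be sketched in a few lines, assuming the embeddings have already been computed: each class prototype is the mean of its support embeddings, and a query takes the label of the nearest prototype under squared Euclidean distance. The learned embedding network is omitted, and the function names and dictionary format are assumptions for illustration.

```python
def class_prototypes(support):
    """Map each label to the mean of its support embeddings.

    `support` is a dict from label to a list of equal-length embedding
    vectors (hypothetical format for this sketch).
    """
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(v[d] for v in vectors) / len(vectors) for d in range(dim)
        ]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is closest to `query`."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))
```

Picking the nearest prototype gives the same prediction as taking the most probable class under a softmax over negative distances, which is the form used during training.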

paper brief | Feb 25, 2026 | Mira Vale

Paper brief: Learning Transferable Visual Models From Natural Language Supervision (arXiv:2103.00020)

The paper reports a method for learning image representations from scratch by predicting which caption matches which image on 400 million internet-collected (image, text) pairs, and then using natural language to enable zero-shot transfer to downstream tasks.

paper brief | Feb 25, 2026 | Mira Vale

Paper brief: Two-Stream Convolutional Networks for Action Recognition in Videos (arXiv:1406.2199)

This paper investigates discriminatively trained deep convolutional network architectures for video action recognition, focusing on complementary appearance information from still frames and motion information between frames via a two-stream spatial–temporal design, multi-frame dense optical flow, and multi-task learning across two action datasets to increase training data and improve performance.

paper brief | Feb 25, 2026 | Mira Vale

Paper brief: Intriguing properties of neural networks (arXiv:1312.6199)

This paper reports two counter-intuitive properties of deep neural networks, connecting them to the models’ expressiveness and learned solutions. It describes (1) a lack of distinction between individual high-level units and random linear combinations of them, and (2) fairly discontinuous learned input-output mappings that allow misclassification via a certain imperceptible perturbation.

paper brief | Feb 25, 2026 | Mira Vale

Practical Bayesian Optimization of Machine Learning Algorithms (arXiv:1206.2944) — Paper Brief

This paper describes an automatic hyperparameter-tuning approach using Bayesian optimization, where a learning algorithm’s generalization performance is modeled as a sample from a Gaussian process and the resulting posterior distribution guides which settings to try next.

paper brief | Feb 25, 2026 | Mira Vale

GNMT (arXiv:1609.08144) in brief: Google’s Neural Machine Translation System

This paper presents GNMT, Google’s Neural Machine Translation (NMT) system, motivated by the need for practical translation that balances accuracy and speed in deployments and services. The paper describes NMT as an end-to-end approach with potential to overcome weaknesses of conventional phrase-based translation systems, while also noting known challenges such as training and inference cost and rare-word difficulty.

paper brief | Feb 25, 2026 | Mira Vale

Fashion-MNIST (arXiv:1708.07747) — Paper Brief

This paper presents Fashion-MNIST, a dataset of 28×28 grayscale images of 70,000 fashion products from 10 categories, with a 60,000/10,000 train/test split, intended as a direct drop-in replacement for MNIST for benchmarking machine learning algorithms.

Ideas worth reading deeply.

Latest posts

Explaining and Harnessing Adversarial Examples (arXiv:1412.6572) — Paper Brief
