Visual Relationships as Functions: Enabling Few-Shot Scene Graph Prediction

Authors: Apoorva Dornadula, Austin Narcomey, Ranjay Krishna, Michael Bernstein, Li Fei-Fei

We introduce a scene graph approach that formulates predicates as learned functions, resulting in an object embedding space that is effective for few-shot prediction. Our formulation treats predicates as learned semantic and spatial functions, which are trained within a graph convolution network. First,

Scene Graph Prediction with Limited Labels

Authors: Vincent Chen, Paroma Varma, Ranjay Krishna, Michael Bernstein, Christopher Ré, Li Fei-Fei

Our semi-supervised method automatically generates probabilistic relationship labels to train any scene graph model.

Abstract: Visual knowledge bases such as Visual Genome power numerous applications in computer vision, like visual question answering and captioning, but suffer from sparse, incomplete relationships. All scene

AI-based Request Augmentation to Increase Crowdsourcing Participation

Authors: Junwon Park, Ranjay Krishna, Pranav Khadpe, Li Fei-Fei, Michael Bernstein

Abstract: To support the massive data requirements of modern supervised machine learning (ML) algorithms, crowdsourcing systems match volunteer contributors to appropriate tasks. Such systems learn “what” types of tasks contributors are interested in completing. In this paper, instead of focusing on “what” to ask,

Puppet Dubbing

Authors: O. Fried and M. Agrawala

Given an audio file and a puppet video, we produce a dubbed result in which the puppet is saying the new audio phrase with proper mouth articulation. Specifically, each syllable of the input audio matches a closed-open-closed mouth sequence in our dubbed result. We present two methods, one semi-automatic

Information Maximizing Visual Question Generation

Authors: Ranjay Krishna, Michael Bernstein, Li Fei-Fei

Figure: Example questions generated for a set of images and answer categories. Incorrect questions are shown in grey and occur when no relevant question can be generated for a given image and answer category.

Abstract: Though image-to-sequence generation models have become overwhelmingly popular in human-computer communications, they suffer from

A Glimpse Far into the Future: Understanding Long-term Crowd Worker Accuracy

Authors: Kenji Hata, Ranjay Krishna, Li Fei-Fei, Michael Bernstein

Figure: A selection of individual workers’ accuracy over time during the question answering task. Each worker remains relatively constant throughout his or her entire lifetime.

Abstract: Microtask crowdsourcing is increasingly critical to the creation of extremely large datasets. As a result, crowd workers spend weeks or months

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

Authors: Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David Ayman Shamma, Michael Bernstein, Li Fei-Fei

Figure: An overview of the data needed to move from perceptual awareness to cognitive understanding of images.

We present a dataset of images densely annotated with numerous region descriptions, objects,
