September 2018 — present

Mila, University of Montreal

PhD, Computer Science; Supervisor: Prof. Christopher Pal

My research interests are broadly in generative models for images, video, and 3D. I work on large-scale image and video generation using latent dynamics.

January 2020 — present

Visiting Researcher - Prof. Graham Taylor, University of Guelph

Guelph, Canada

I work in Prof. Graham Taylor’s lab on computer vision and deep learning.


2009 — 2014

Indian Institute of Technology (IIT), Kharagpur, India

Dual Degree (B.Tech. (H) + M.Tech.) in Electrical Engineering
with master’s specialization in Instrumentation and Signal Processing


March 2020 — September 2020

NextAI, Toronto - AI Scientist in Residence

Toronto, Canada

I was a mentor/consultant for multiple startups at NextAI. I assisted them in integrating artificial intelligence and machine learning into their product pipeline, and with long-term strategies in technology.

September 2019 — December 2019

Google - Research Intern

Mountain View, California, USA

I worked with the Google AI Perception team on deep models for large-scale video analysis for active speaker detection with Bryan Seybold and Sourish Chaudhuri.

September 9-13, 2019

4th IVADO / Mila Deep Learning School - Teaching Assistant

Montreal, Canada


September 2019

Fundamentals of Machine Learning (IFT 6390), by Ioannis Mitliagkas - Teaching Assistant

University of Montreal, Montreal, Canada

April 2019 — August 2019

NextAI, Montreal - AI Scientist in Residence

Montreal, Canada

I was a mentor/consultant for 6 startups at NextAI. I assisted them in integrating artificial intelligence and machine learning into their product pipeline, and with long-term strategies in technology.

May 2017 — August 2018

International Institute of Information Technology (IIIT) - Hyderabad - Research Fellow

Hyderabad, India — with Prof. C. V. Jawahar, Centre for Visual Information Technology, IIIT Hyderabad

Conference paper: Abhishek Jha, Vikram Voleti, Vinay P. Namboodiri, C. V. Jawahar, “Cross-Language Speech Dependent Lip-Synchronization”, in ICASSP 2019 [IEEE]
Workshop paper: Abhishek Jha, Vikram Voleti, Vinay P. Namboodiri, C. V. Jawahar, “Lip-Synchronization for Dubbed Instructional Videos”, in CVPR Workshop (FIVER), 2018

January 2018 — June 2018

Playment - Computer Vision Consultant

Bengaluru, India

Playment is a startup that offers annotation services for various computer vision tasks.

I was a consultant for the computer vision work at Playment. We focused on making semantic segmentation for autonomous driving more exhaustive and comprehensive using deep learning. We also used classical computer vision as well as deep learning to solve various industrial problems, including facial recognition, facial landmark detection, and pedestrian detection.

January 2018 — May 2018

IIIT-Hyderabad & TalentSprint - Mentor for Foundations of AI/ML

Hyderabad, India

I was a mentor for the Foundations of Artificial Intelligence and Machine Learning certificate program by the IIIT-H Machine Learning Lab and TalentSprint. I assisted in creating tutorials on machine learning, and mentored participants during lab sessions.

Feb 2016 — May 2017

GreyOrange Robotics - Image Processing Engineer

Gurgaon, India

GreyOrange Robotics is a multinational firm that designs, manufactures and deploys advanced robotics systems for automation at warehouses, distribution and fulfillment centres.

I was part of the Embedded Systems team. My job was to develop a computer vision module to perform video processing in real time for warehouse automation. We made an “Empty Carriage Detection System” (ECDS) for the “Cross-Belt Sorter” (CBS) that detects in real time whether a carriage on a conveyor belt has a packet on it or not, and relays the information to the server and mechanical systems. I also helped develop the embedded vision module in automated guided robots for warehouses, called “Butlers”.

A research paper based on some of the work has been accepted at the International Conference on Industrial Design Engineering, ICIDE 2017.

Research paper: V. Voleti, P. Mohan, S. Gupta, J. Iqbal, “Simple Real-Time Pattern Recognition for Industrial Automation”, in Proc. International Conference on Industrial Design Engineering, 2017

July 2014 — Feb 2016

Airbus, India - Associate Engineer

Bengaluru, India

Airbus is a commercial aircraft manufacturer, and the largest aeronautics & space company in Europe. I worked in the Bangalore (India) office as part of the Avionics Software and Systems Testing group. I was involved in the development and integration of avionics systems in the Flight Warning Computer (FWC) for aircraft in the long-range family.

My job was to simulate signal-level changes in the Flight Warning Computer, such as adding new signals for new functionalities and re-routing signals through different paths. This was followed by rigorous testing of the FWC for correct operation. We designed the re-routing paths, as well as the tests required to ensure that all the functionalities of the FWC ran correctly. For all development, standard avionics coding guidelines (DO-178B) were followed.


2013 — 2014

Image De-fencing using Microsoft Kinect — M.Tech. Thesis

IIT Kharagpur, India — under Prof. Rajiv Ranjan Sahay, Electrical Engineering

I worked on de-fencing of images using RGB-D data from Microsoft Kinect. We recorded images of scenes with fence-like occlusions and were successful in removing the fences from the scenes. We first recorded multiple images of the same scene with slight spatial variation of the camera, and computed the approximate global shift among them. We then used loopy belief propagation to inpaint the occluded regions. We compared our technique with the standard methods of the time and demonstrated that ours performed better.

A research paper based on this work has been published in IEEE Xplore in the proceedings of the International Conference on Advances in Pattern Recognition, ICAPR 2015. A journal paper based on this work was submitted to the International Journal of Computer Vision (IJCV).

Research paper: S. Jonna, V. S. Voleti, R. R. Sahay, and M. S. Kankanhalli, “A Multimodal Approach for Image De-fencing and Depth Inpainting”, in Proc. Int. Conf. Advances in Pattern Recognition, 2015, pp. 1-6
Journal paper: S. Jonna, S. Satapathy, V. S. Voleti, R. R. Sahay, “Unveiling the scene: A Multimodal Framework for Simultaneous Image Disocclusion and Depth Map Completion using Computational Cameras”, International Journal of Computer Vision, 2017 (rejected)
Links: Thesis; Presentation; GitHub repository containing thesis, presentation, code files, and results

2012 — 2013

Identification of Bilabial Consonants in Audio and Lip Closures in Video — B.Tech. Thesis

IIT Kharagpur, India — under Prof. Rajiv Ranjan Sahay, Electrical Engineering

I worked on the identification of bilabial consonants in video and audio. The goal was to measure the time offset between the two modes using corresponding time points where bilabials occur. I learnt C++ and the OpenCV library, and detected lip closures in video using the standard Viola-Jones face detector and a novel algorithm for lip closure detection. I trained a Gaussian Mixture Model in MATLAB on the MFCC features of bilabials in the speech signals of different speakers. A correlation was drawn between the time points of bilabials in audio and video.

Links: Thesis; Presentation; GitHub repository containing thesis, presentation, code files, and results


Summer 2013

KU Leuven, Belgium

Supervisor: Prof. Ingrid Verbauwhede, Computer Security & Industrial Applications, ESAT

Implementation of Carry-Free Arithmetic Operations in FPGA

I worked on carry-free implementations of the arithmetic operations of addition, subtraction and multiplication. Binary numbers are first converted to a recoded digit format that eliminates carry propagation. I designed the truth tables for this conversion, as well as for the subsequent addition, subtraction and multiplication. I then simplified the circuits into Product-of-Sums form, and coded them in Verilog. The time taken by these circuits was compared with that of the standard implementations.

A single-author research paper based on this work has been written.

Research paper: V. Voleti, “Carry-Free Implementations of Arithmetic Operations in FPGA”
Links: Report; Presentation; GitHub repository containing report and presentation

Summer 2012

IIT Kharagpur, India

Supervisor: Prof. Aurobinda Routray, Electrical Engineering

Fingertip Gesture Recognizer using HMMs

I first implemented Hidden Markov Models (HMMs) in MATLAB from scratch, and verified the outputs of my implementation against those of a standard implementation. I then made a simple gesture recognizer in MATLAB using HMMs.

Links: Report; Presentation; GitHub repository containing report, presentation, code files, and results

Summer 2011

Imperial College, London, UK

Supervisor: Prof. Peter Cheung, Electrical and Electronic Engineering

Measurement of Intra-die Power Variation in Sub-nm FPGAs

I experimented with an FPGA, and measured the variation in power consumption among the Look-Up Tables (LUTs) within it. I built an automated workflow for measuring power across the FPGA: implementing a circuit in each LUT, measuring the power on an oscilloscope using the JTAG terminals on the FPGA, recording the oscilloscope’s readings in MATLAB, and plotting graphs from MATLAB.

Links: Presentation; GitHub repository containing presentation, certificate, and recommendation letter from Prof. Peter Cheung