V4V Workshop & Challenge

In conjunction with ICCV 2021, Montreal, Canada


Telehealth has the potential to offset the high demand for help during public health emergencies, such as the ongoing COVID pandemic, and in rural locations where a shortage of health services and qualified treatment providers makes care difficult if not impossible to obtain. Besides communication, the use of existing sensor infrastructure within modern smart devices for medical tests is compelling. Remote photoplethysmography (rPPG), the problem of non-invasively estimating blood volume variations in the microvascular tissue from video, would be well suited for these situations.

Over the past few years a number of research groups have made rapid advances in remote PPG methods for estimating heart rate from digital video and obtained impressive results. How these various methods compare in naturalistic conditions, where spontaneous movements, facial expressions, or illumination changes are present, is relatively unknown. Most previous benchmarking efforts focused on posed situations. No commonly accepted evaluation protocol exists for estimating vital signs in spontaneous behavior with which to compare them.

To enable comparisons among alternative methods, we present the 1st Vision for Vitals Workshop & Challenge (V4V 2021). This topic is germane to both the computer vision and multimedia communities. For computer vision, it is an exciting approach to longstanding limitations of vital signs estimation approaches. For multimedia, remote vital signs estimation would enable more powerful applications.

Date: TBA

#1. Keynote title TBA

Dr. Conrad Tucker is an Arthur Hamerschlag Career Development Professor of Mechanical Engineering and holds courtesy appointments in Machine Learning, Robotics, Biomedical Engineering and CyLab Security & Privacy Institute at Carnegie Mellon University. His research focuses on the design and optimization of systems through the acquisition, integration and mining of large scale, disparate data.

Dr. Tucker has served as PI/Co-PI on federally/non-federally funded grants from the National Science Foundation (NSF), the Air Force Office of Scientific Research (AFOSR), the Defense Advanced Research Projects Agency (DARPA), the Army Research Laboratory (ARL), the Office of Naval Research (ONR) via the NSF Center for eDesign, and the Bill and Melinda Gates Foundation (BMGF). In February 2016, he was invited by National Academy of Engineering (NAE) President Dr. Dan Mote, to serve as a member of the Advisory Committee for the NAE Frontiers of Engineering Education (FOEE) Symposium. He received his Ph.D., M.S. (Industrial Engineering), and MBA degrees from the University of Illinois at Urbana-Champaign, and his B.S. in Mechanical Engineering from Rose-Hulman Institute of Technology.

#2. Seeing Inside Out: Camera-based Physiological Sensing with Applications in Telehealth and Wellbeing

Daniel McDuff is a Principal Researcher at Microsoft. Daniel completed his PhD at the MIT Media Lab in 2014 and has a B.A. and Masters from Cambridge University. Daniel's work on non-contact physiological measurement helped to popularize a new field of low-cost health monitoring using webcams. Previously, Daniel worked at the UK MoD, was Director of Research at MIT Media Lab spin-out Affectiva and a post-doctoral research affiliate at MIT. His work has received nominations and awards from Popular Science magazine as one of the top inventions in 2011, South-by-South-West Interactive (SXSWi), The Webby Awards, ESOMAR and the Center for Integrated Medicine and Innovative Technology (CIMIT). His projects have been reported in many publications including The Times, the New York Times, The Wall Street Journal, BBC News, New Scientist, Scientific American and Forbes magazine. Daniel was named a 2015 WIRED Innovation Fellow, an ACM Future of Computing Academy member and has spoken at TEDx and SXSW. Daniel has published over 100 peer-reviewed papers on machine learning (NeurIPS, ICLR, ICCV, ECCV, ACM TOG), human-computer interaction (CHI, CSCW, IUI) and biomedical engineering (TBME, EMBC).

The main track is intended to bring together computer vision researchers whose work is related to vision based vital signs estimation. We are soliciting original contributions which address a wide range of theoretical and application issues of remote vital signs estimation, including but not limited to:

  • Methods for extracting vital signals from videos, including pulse rate, respiration rate, blood oxygen, and body temperature.
  • Vision-based methods to support and augment vital signs monitoring systems, such as face/skin detection, motion tracking, video segmentation, and optimization.
  • Vision-based vital signs measurement for affective, emotional, or cognitive states.
  • Vision-based vital signs measurement to assist video surveillance in-the-wild.
  • Vision-based vital signs measurement to detect human liveness or manipulated images (deep fake detection).
  • Applications of vision-based vital signs monitoring
  • User interfaces employing vision-based vital signs estimation

    The V4V Challenge evaluates remote PPG methods for vital signs estimation on a new large corpus of face videos annotated with corresponding high-resolution videos and vital signs from contact sensors. The goal of the challenge is to reconstruct the vital signs of the subjects from the video sources. The participants will receive an annotated training set and a test set without annotations.

    There are two subtracks in the challenge: (1) heart rate (HR) estimation and (2) respiration rate (RR) estimation. All participants are required to submit predictions to the HR sub-challenge. Although participation in the RR sub-challenge is optional, we encourage all participants to take part in both sub-challenges. To learn more, please head over to this page on Codalab.

    The datasets may be used for the V4V Challenge of ICCV 2021 only. The recipient of the datasets must be a full-time faculty member, researcher, or employee of an organization (not a student) and must agree to the terms and conditions listed on Codalab.

    If you are interested in downloading the V4V dataset, please download and sign the EULA and email the scanned copy back to lijun(at)cs(dot)binghamton(dot)edu, zli191(at)binghamton(dot)edu and laszlojeni(at)cmu(dot)edu.

    ** Frequently Asked Questions (FAQs) **

    Q1: How are the V4V ground truth physiological signals collected?

    A: Details regarding the acquisition of data can be found in Section 2.2.3 of the paper: Multimodal Spontaneous Emotion Corpus for Human Behavior Analysis (link). The ground truth device gives us a continuous blood pressure (BP) signal. Following the signal acquisition, HR is calculated with off-the-shelf software by computing the peak-to-peak intervals in the BP signal. (If you would like to read more about the off-the-shelf software, please visit this PDF.)
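
    As a rough illustration of the peak-to-peak idea, the sketch below estimates HR from a pulsatile waveform by locating successive peaks and averaging the intervals between them. It is not the off-the-shelf software used for the V4V ground truth, whose algorithm is not described here. The function names (find_peaks, heart_rate_bpm), the max_bpm limit, and the synthetic sine-wave signal are assumptions made purely for illustration.

```python
import math


def find_peaks(signal, min_distance):
    """Return indices of local maxima at least `min_distance` samples apart.

    Hypothetical helper: a simple greedy detector, not a clinical-grade one.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] < signal[i] >= signal[i + 1]:
            if not peaks or i - peaks[-1] >= min_distance:
                peaks.append(i)
    return peaks


def heart_rate_bpm(signal, fs, max_bpm=240.0):
    """Estimate heart rate (beats per minute) from peak-to-peak intervals.

    signal  : sequence of waveform samples (e.g. a BP trace)
    fs      : sampling rate in Hz
    max_bpm : assumed upper bound on HR, used to reject peaks that are too close
    """
    min_distance = max(1, int(fs * 60.0 / max_bpm))
    peaks = find_peaks(signal, min_distance)
    if len(peaks) < 2:
        raise ValueError("need at least two peaks to measure an interval")
    # Peak-to-peak intervals in seconds.
    intervals = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / mean_interval


# Synthetic stand-in for a BP waveform: a 72 bpm sine sampled at 100 Hz for 10 s.
fs = 100.0
true_bpm = 72.0
signal = [
    math.sin(2.0 * math.pi * (true_bpm / 60.0) * (n / fs))
    for n in range(int(10 * fs))
]
estimated = heart_rate_bpm(signal, fs)
print(f"estimated HR: {estimated:.1f} bpm")  # close to the 72 bpm used above
```

    Real BP recordings contain noise and motion artifacts, so a practical pipeline would typically filter the signal and validate the detected peaks before computing intervals.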

    Q2: Can we use external data?

    A: You may use external data for training your model as long as you document it in your accompanying paper.

    Please visit the Codalab page where the competition is hosted. Additionally, the evaluation code can be downloaded here for local use by participants. The requirements file for the local environment setup can be downloaded from the same repository. Please report any bugs in the evaluation code in the issues of the repository.

    Please note that, in order to be included in the leaderboard, you are required to submit a short paper describing your method in addition to your submission on the Codalab competition page. For paper submission to the workshop, visit our CMT page. When submitting your paper to CMT, please also email arevanur(at)andrew(dot)cmu(dot)edu with your username or team name on Codalab and your workshop paper title, to make it easier to link your paper to the Codalab submissions.

    Challenge paper submissions must be written in English and must be sent in PDF format. Please refer to the ICCV submission guidelines for instructions regarding formatting, templates, and policies. The submissions will be reviewed by the program committee and selected papers will be published in ICCV Workshop proceedings.

    Challenge Track

    • May 21st: Challenge site opens, training data available
    • July 17th: Testing phase begins
    • July 30th: Competition ends (challenge paper submission - optional)

    Workshop Track

    • July 31st: Paper submission deadline
    • August 10th: Notification of acceptance
    • August 16th: Camera ready submission

    Data chairs

    • Sergio Escalera, University of Barcelona
    • Shaun Canavan, University of South Florida
    • Vitomir Struc, University of Ljubljana
    • Itir Onal Ertugrul, Tilburg University
    • Michel Valstar, University of Nottingham
    • Abhinav Dhall, Monash University
    • Saurabh Hinduja, University of Pittsburgh
    • Tim K Marks, Mitsubishi Electric Research Labs (MERL)