3D eye gaze estimation has emerged as an interesting and challenging task in recent years. 3D gaze estimation methods can be divided into two categories: (1) appearance-based gaze estimation from images/videos; and (2) 3D eye model recovery and model-based gaze estimation. As an attractive alternative to appearance-based models, 3D model-based methods take a different strategy: they recover the anatomical structure of a person’s eyeball. This can be powerful because eye anatomy and geometry act as a general prior built into the 3D model, so these methods can adapt well to varying head poses and illumination conditions. Based on the devices and data they require, 3D model-based methods can be further divided into two types: (a) personalized 3D eye model recovery from IR camera systems and (b) 3D eye shape estimation from image features using a pre-constructed deformable eye basis.
The personal 3D eye model is defined as a two-sphere system, where the larger sphere represents the 3D eyeball with center Oe and radius re and the smaller one represents the cornea with center Oc and radius rc. The two spheres intersect in a circle (the iris circle) whose center is defined as the iris center Oi. The pupil is assumed to be a circle concentric with the iris circle; hence the pupil center coincides with the iris center. Geometrically, Oe, Oc and Oi are collinear, and the line through them forms the optical axis.
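To make the two-sphere geometry concrete, the following is a minimal sketch of how the iris center Oi and iris radius follow from the two spheres. It uses the standard sphere-sphere intersection formula; the function name `iris_from_two_spheres` and the numeric values are illustrative assumptions, not taken from the original work.

```python
import math


def iris_from_two_spheres(o_e, r_e, o_c, r_c):
    """Return (iris_center, iris_radius) of the circle where the
    eyeball sphere (o_e, r_e) and the cornea sphere (o_c, r_c) intersect."""
    axis = [c - e for c, e in zip(o_c, o_e)]      # direction Oe -> Oc (optical axis)
    d = math.sqrt(sum(a * a for a in axis))       # distance between the two centers
    if d == 0 or d >= r_e + r_c or d <= abs(r_e - r_c):
        raise ValueError("spheres do not intersect in a circle")
    # Distance from Oe to the intersection plane, measured along the optical axis.
    a = (d * d + r_e * r_e - r_c * r_c) / (2.0 * d)
    r_i = math.sqrt(r_e * r_e - a * a)            # radius of the intersection circle
    o_i = [e + a * ax / d for e, ax in zip(o_e, axis)]
    return o_i, r_i


# Illustrative values in millimetres (assumed, not measurements from the source).
o_i, r_i = iris_from_two_spheres((0.0, 0.0, 0.0), 12.0, (0.0, 0.0, 5.3), 7.8)
```

Because Oi is constructed along the line from Oe to Oc, the three centers are collinear by construction, which matches the definition of the optical axis above.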
Figure 1. Anatomy and geometry of human eyeball.
Tobii Pro Glasses 2 consists of a head unit, a recording unit and controller software. The head unit contains four eye tracking sensors (two for each eye) that capture infrared images of the eye region from different angles for gaze analysis, and one high-resolution scene camera that records HD video of what is in front of the wearer. Additionally, there are six IR illuminators on each side that generate glints in the eye images through corneal reflection. For each participant, a pre-calibration before recording is required to ensure that the glasses are properly worn and that the sensors successfully capture the pupil centers of both eyes.
Figure 2. Data collection from Tobii Pro Glasses 2 and pre-processing.
3D pupil center. We first detect the pupil ellipse in the eye camera images. The 3D pupil center Oi (relative to the reference camera) is then recovered through stereo rectification.
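As a rough illustration of this step, the sketch below recovers a 3D point from a rectified stereo pair using the usual depth-from-disparity relation. It assumes ideal pinhole cameras that are already rectified, with the left camera as the reference; the function name and all parameters (`focal_px`, `baseline`, `cx`, `cy`) are hypothetical and not part of the original pipeline.

```python
def triangulate_rectified(x_left, x_right, y, focal_px, baseline, cx, cy):
    """Return (X, Y, Z) in the left (reference) camera frame.

    x_left, x_right: horizontal pixel coordinate of the pupil ellipse center
                     in the rectified left and right images.
    y:               shared vertical pixel coordinate after rectification.
    focal_px:        focal length in pixels; baseline: camera separation.
    cx, cy:          principal point in pixels.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline / disparity   # depth from disparity
    x = (x_left - cx) * z / focal_px      # back-project the pixel at depth z
    y3 = (y - cy) * z / focal_px
    return (x, y3, z)
```

The returned coordinates are in the same unit as `baseline`, so a baseline given in millimetres yields a pupil center in millimetres.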
3D cornea center. With pre-calibrated IR illuminators, we first detect the glints g(1,1), g(1,2) caused by light I1 in the two images and use them to calculate the 3D virtual glint v1; similarly, we obtain another virtual glint v2 caused by light I2. The intersection of the two light rays L1, L2 is the 3D cornea center Oc.
3D eyeball center. We assume that a user's head movement does not shift the position of the glasses, hence Oe can be treated as constant over a whole recording. As the rotation center of the eyeball, Oe can be estimated by solving for the intersection of the lines OiOc (the line connecting Oi and Oc in each frame) across multiple frames.
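With noisy measurements the per-frame lines through Oi and Oc will not meet in exactly one point, so one common choice is to take the point with the smallest summed squared distance to all lines. The sketch below shows that least-squares formulation with NumPy; it is an assumption about how the intersection could be solved, not necessarily the solver used in the original work, and the function name is hypothetical.

```python
import numpy as np


def estimate_eyeball_center(iris_centers, cornea_centers):
    """Least-squares intersection of the lines through (Oi, Oc) over N frames.

    iris_centers, cornea_centers: arrays of shape (N, 3), one row per frame.
    Returns the 3D point minimizing the sum of squared distances to all lines.
    """
    iris = np.asarray(iris_centers, dtype=float)
    cornea = np.asarray(cornea_centers, dtype=float)
    dirs = iris - cornea
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # unit optical-axis directions

    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p, d in zip(cornea, dirs):
        # Projector onto the plane orthogonal to the line direction d.
        P = np.eye(3) - np.outer(d, d)
        A += P
        b += P @ p
    # A is singular if all lines are parallel, i.e. the eye never rotated.
    return np.linalg.solve(A, b)
```

The estimate is only well conditioned when the recording contains sufficiently different gaze directions, since near-parallel lines constrain the point poorly along their common direction.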
Eyeball, cornea and iris radius. Using an ellipse reconstruction method, we recover the 3D iris circle, i.e. its normal vector and radius ri. Since the 3D eye geometry defines the iris as the intersection of the eyeball sphere and the cornea sphere, the radii re and rc can then be computed from ri and the recovered centers.
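One way to read this last step: every point on the iris circle lies on both spheres, and the iris plane is perpendicular to the optical axis, so each sphere radius follows from the Pythagorean theorem. The sketch below encodes that reading; it is an interpretation of the geometry described here, and the function name is hypothetical.

```python
import math


def sphere_radii(o_e, o_c, o_i, r_i):
    """Return (r_e, r_c) given the eyeball center, cornea center,
    iris center and iris radius.

    A point on the iris circle is at distance r_i from Oi within the iris
    plane, and Oi is offset from each sphere center along the optical axis,
    so r^2 = |O - Oi|^2 + r_i^2 for both spheres.
    """
    r_e = math.hypot(math.dist(o_e, o_i), r_i)
    r_c = math.hypot(math.dist(o_c, o_i), r_i)
    return r_e, r_c
```

This relation assumes Oe, Oc and Oi are exactly collinear and that the iris normal is aligned with the optical axis; with noisy estimates the two assumptions only hold approximately.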
Figure 2. IR videos from Tobii Pro Glasses 2.
Figure 3. 3D head pose & eye pose fitting.
Figure 3. A two-phase framework for single-frame based 3D gaze estimation using the 3D eye shape basis.
Chenyi Kuang, Jeffrey Kephart, and Qiang Ji. Towards an accurate 3D deformable eye model for gaze estimation. Towards a Complete Analysis of People: From Face and Body to Clothes (T-CAP Workshop at ICPR), 2022.