Overview:

Intelligence is determined by how efficiently a machine learns and stores knowledge about the world, enabling it to handle unanticipated tasks and new environments, learn rapidly without supervision, explain its decisions, deduce the unobserved, and anticipate likely outcomes. A key limitation of current machine learning approaches is the need for specialized training for each task, environment, and context to achieve acceptable accuracy. Such an approach does not scale to a visual assistant system that faces an ever-growing variety of tasks, contexts, and environments over the lifetime of the assisted individual. In contrast, the system envisioned in this project will leverage key principles embodied in human intelligence, such as feedback from a continuously evolving memory and higher cognitive processes including attention, knowledge models, and decision making. Through coordinated hardware-software co-design inspired by neuroscience principles, the project envisions a new energy-efficient visual analytics paradigm that can provide persons with visual impairments assistance comparable to that of a human assistant, across a variety of tasks and in a portable form factor. The envisioned AI cognitive assistant will understand the entire visual field, reason about relationships between objects, and deduce how they relate to the person's goals (e.g., navigating to a specific location, accomplishing a specific task, or summarizing what is going on in the surroundings). These advances will be a significant leap beyond current assistive systems, which are limited primarily to object detection and navigation in controlled settings, with no support for personalized adaptation or continuous learning.

Biological brains, under evolutionary and environmental pressure to survive, cannot afford the luxury of lengthy retraining for every new combination of environment, task, goal, and context. Instead, mechanisms such as attention have evolved to dynamically prioritize information based on these combined factors. This project envisions enhancing the efficiency and accuracy of traditional AI systems by embedding top-down attention mechanisms that selectively process the spatial and temporal regions of the input that are most relevant. By exploiting the hierarchical structure of knowledge, the envisioned system will support continuous lifelong learning with selective model interleaving at different scales, dynamically adapting the effort spent on learning and inference to enable energy-proportional computing commensurate with task and environmental complexity. Some of these neuroscience principles will be embedded in Field Programmable Gate Array (FPGA) based hardware fabrics and custom hybrid CMOS-ferroelectric hardware fabrics to accentuate the efficiency benefits through the co-design of hardware and algorithms. The project will evaluate these innovations in an engineered system that serves as a cognitive visual assistant for persons with visual impairments. This work will enable a new generation of cognitive vision systems that can abstract, learn, adapt, and reason akin to sighted human assistants, transforming the landscape of current visual assistance systems, which remain rudimentary by comparison. While machine learning approaches are widely used in many applications, they are not easy to adapt to new environments or tasks, and training new models consumes substantial energy, placing growing demands on global power consumption. In contrast, the proposed neuroscience-inspired visual analytics system can adapt to new environments and tasks at a fraction of the cost of traditional systems by leveraging attention and lifelong learning principles. The resulting energy-efficient visual analytics will enhance the independence of persons with visual impairments, transforming current smart assistive gadgets into virtual companions and greatly broadening the range of activities for which assistance can be provided. Educational, outreach, and broadening-participation initiatives are integral to the various components of this project.
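
To make the top-down attention idea concrete, the sketch below shows one plausible way a task or goal embedding could gate which spatial regions of a visual feature map are processed further. It is a minimal PyTorch sketch under assumed module names, shapes, and dimensions; it illustrates the general mechanism only and is not the project's actual architecture.

```python
# Illustrative sketch of task-conditioned, top-down spatial attention:
# a goal/task embedding re-weights bottom-up visual features so that only
# task-relevant regions are emphasized for further processing.
# All names and dimensions here are assumptions for illustration.
import torch
import torch.nn as nn

class TopDownSpatialAttention(nn.Module):
    def __init__(self, feat_channels: int, task_dim: int):
        super().__init__()
        # Project the task/goal embedding into the feature-channel space
        # so it can be compared against every spatial location.
        self.task_proj = nn.Linear(task_dim, feat_channels)

    def forward(self, feats: torch.Tensor, task_emb: torch.Tensor) -> torch.Tensor:
        # feats:    (B, C, H, W) bottom-up visual features
        # task_emb: (B, T)       top-down task/goal embedding
        query = self.task_proj(task_emb)                          # (B, C)
        # Relevance score of each spatial location to the current task.
        scores = torch.einsum("bchw,bc->bhw", feats, query)       # (B, H, W)
        attn = torch.softmax(scores.flatten(1), dim=1).view_as(scores)
        # Gate the features: irrelevant regions are suppressed, so later
        # stages can process them cheaply or skip them entirely
        # (the energy-proportional computing idea described above).
        return feats * attn.unsqueeze(1)

# Usage: the same backbone features are re-weighted differently for
# different goals (e.g., "find the door" vs. "read the sign")
# without retraining the backbone.
feats = torch.randn(2, 64, 16, 16)
task = torch.randn(2, 32)
out = TopDownSpatialAttention(64, 32)(feats, task)
print(out.shape)  # torch.Size([2, 64, 16, 16])
```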

Funding:

National Science Foundation

Collaborators:

  • Laurent Itti – University of Southern California
  • Anand Raghunathan – Purdue University
  • Abhronil Sengupta – Penn State University
  • Mehrdad Mahdavi – Penn State University