OPPO has seven papers selected and wins eight challenge prizes at CVPR 2022
- Seven papers submitted by OPPO were selected for presentation at the 2022 Computer Vision and Pattern Recognition Conference, setting a new record for the company. The selected papers cover OPPO's R&D breakthroughs across a range of artificial intelligence disciplines
- OPPO received a total of eight prizes in the CVPR challenges, including three first-place, one second-place, and four third-place prizes
SHENZHEN, CHINA – Media OutReach – 23 June 2022 – The annual Computer Vision and Pattern Recognition Conference (CVPR) came to an end in New Orleans today, with globally leading technology company OPPO having seven of its submitted papers selected for the conference, putting it among the most successful technology companies at the event. OPPO also placed in eight of the widely watched competition events at the conference, taking home three first-place, one second-place, and four third-place prizes.
As deep learning technology has developed over the years, artificial intelligence has shifted from perceptual intelligence to cognitive intelligence. In addition to being able to 'see' or 'hear' like humans, modern AI technology can now demonstrate cognitive abilities comparable to those of humans. Multimodal fusion, 3D visual intelligence technology, and automated machine learning are becoming key research topics in the field of AI, and areas in which OPPO has achieved several theoretical and technological breakthroughs of its own.
"In 2012, deep neural networks designed for image recognition tasks re-energized the research and application of artificial intelligence. Since then, AI technology has seen a decade of rapid development," said Guo Yandong, Chief Scientist in Intelligent Perception at OPPO. "OPPO continues to push artificial intelligence to accomplish complex perceptual and cognitive behaviors. For example, AI can learn from massive amounts of unlabeled data and transfer that knowledge to downstream tasks, and it can reconstruct 3D information from a limited number of viewpoints. We also empower AI with higher cognitive abilities to understand and create beauty, and develop embodied AI with autonomous behavior. I'm delighted to see that seven of our papers have been selected for this year's conference. Building on this success, we will continue to explore both fundamental AI and cutting-edge AI technology, as well as the commercial applications that will enable us to bring the benefits of AI to more people."
The seven papers accepted by CVPR 2022 showcase OPPO's progress toward humanizing AI
Seven papers submitted by OPPO for this year's CVPR were selected for presentation at the conference. Their research areas include multimodal information interaction, 3D human body reconstruction, personalized image aesthetics assessment, and knowledge distillation, among others.
Cross-modal technology is seen as the key to 'humanizing' artificial intelligence. Data from different modalities have different characteristics: text information often features a high level of generality, whereas visual image information contains a large amount of specific contextual detail. Establishing effective interaction between multimodal data is a major challenge. OPPO researchers proposed a new framework, CRIS, based on the CLIP model, which enables AI to gain a more fine-grained understanding of text and image data. Given a complex text description, the model can accurately locate the corresponding visual information within an image.
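The core idea behind CLIP-style cross-modal matching can be illustrated with a minimal sketch: text and image regions are embedded into a shared vector space, and the region whose embedding points in the most similar direction to the text embedding is selected. The vectors and the `best_region` helper below are purely hypothetical toy stand-ins; the actual CRIS and CLIP models use deep neural encoders, not hand-written vectors.

```python
import numpy as np

# Illustrative sketch only: in CLIP-style matching, a text embedding is
# compared against candidate image-region embeddings, and the region with
# the highest cosine similarity is taken as the match. The toy vectors
# below are invented for demonstration; real encoders are deep networks.

def normalize(v):
    """Scale a vector to unit length so dot products act as cosine similarity."""
    return v / np.linalg.norm(v)

def best_region(text_emb, region_embs):
    """Return the index of the region embedding most similar to the text embedding."""
    text = normalize(text_emb)
    sims = [normalize(r) @ text for r in region_embs]
    return int(np.argmax(sims))

# Toy example: three candidate regions; the second aligns with the text.
text = np.array([0.9, 0.1, 0.2])
regions = [np.array([0.1, 0.9, 0.0]),
           np.array([0.8, 0.2, 0.1]),
           np.array([0.0, 0.3, 0.9])]
print(best_region(text, regions))  # prints 1
```

Normalizing before the dot product makes the comparison depend only on direction, not magnitude, which is why cosine similarity is the standard choice for matching embeddings from different modalities.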
The biggest difference between human and artificial intelligence today lies in multimodality. Human beings can easily understand information in both words and images and draw associations between the two. AI, on the other hand, has yet to move past the recognition stage and struggles to accurately match information across modalities. The novel method proposed by OPPO improves multimodal intelligence, which could eventually enable artificial intelligence to truly understand and interpret the world through multiple forms of information such as language, hearing, and vision, bringing the robots and digital assistants of sci-fi movies closer to reality.