| Mood: | amused |
Performance on Bongard problems
Limitations: Research shows that even advanced models like Gemini 3 have significant difficulty with the classical set of synthetic Bongard problems.
Real-world vs. synthetic: Performance improves on Bongard problems built from real-world images (e.g., Bongard-HOI), but models still struggle with tasks that require them to refine their predictions or make effective use of dialogue context.
General limitations: The difficulty in solving classical synthetic Bongard problems suggests that the issue is not just domain-specific but reflects more general limitations in the models' ability to perform abstract visual reasoning.
Then again, most humans also struggle with them.