the AI system responds to the user's question based on images sourced from the Microsoft COCO dataset. In Figs. 2–11 of the full text, the expected standard answers are provided in parentheses, ...