This 1-day course focuses on building intelligent applications that can see, interpret, and reason over images and documents using different multimodal models and agent-based tools. Learners explore how visual and document inputs can be combined with language models to enable structured extraction, analysis, and decision-making workflows. The course emphasizes practical patterns for extracting information, orchestrating tools, and grounding model responses in visual data.
This course is designed for developers, AI engineers, and technical professionals who want to build applications that work with images and documents using multimodal, agent-driven approaches. It’s best suited for learners with basic programming experience and a general understanding of cloud or AI concepts.
Class times are listed in Eastern Time.
This is a 1-day class
Class dates are not listed. Please contact us for available dates and times.
The classes listed are available to all United Training customers and do not reflect, in any way, the availability or support of technology within the University of Maine System.