- ChatRex Demo: Visual Prompt Interaction Guide
- Contents
- 1. Introduction 📖
- 2. Workflow 🚀
- 3. Tips and Support 💡
Welcome to the ChatRex Demo! This tool demonstrates interactive visual prompt methods for AI-powered image understanding and question answering. This document provides detailed instructions on the workflow, interface components, and how to utilize the visual prompts effectively.
We also provide a Gradio demo for ChatRex. Before using it, we highly recommend watching the following video to learn how the demo works:
Gradio_Demo_v2.mp4
- Choose a Visual Prompt Method
  - Select either Interactive Visual Prompt or Proposal Visual Prompt to define your region of interest within the image.
- Provide a Question Input
  - Enter a valid question in the Raw Question Input field or use a Pre-defined Question Template. Ensure input accuracy to achieve relevant results.
- Run the Demo
  - Click on the Run ChatRex button to process the image and display the results, including answers and visualizations.
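The checks implied by the workflow above can be pictured as a small pre-flight validation. This is only a sketch: `DemoRequest`, `check_request`, and `PROMPT_METHODS` are hypothetical names that mirror the UI labels, and nothing here reflects how the demo is actually implemented.

```python
from dataclasses import dataclass

# Hypothetical names: these mirror the UI labels, not ChatRex's internal API.
PROMPT_METHODS = ("Interactive Visual Prompt", "Proposal Visual Prompt")


@dataclass
class DemoRequest:
    prompt_method: str  # which visual prompt method was chosen
    question: str       # raw question text or a filled-in template


def check_request(req: DemoRequest) -> list[str]:
    """Return a list of problems; an empty list means the request looks ready to run."""
    problems = []
    if req.prompt_method not in PROMPT_METHODS:
        problems.append(f"unknown prompt method: {req.prompt_method!r}")
    if not req.question.strip():
        problems.append("question is empty")
    return problems


if __name__ == "__main__":
    req = DemoRequest("Interactive Visual Prompt", "What objects are present in this image?")
    print(check_request(req))
```

An empty result corresponds to the point where clicking Run ChatRex makes sense; any listed problem maps back to one of the first two steps.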
- Overview: The Interactive Visual Prompt mode allows you to manually annotate regions of interest by either:
  - Clicking on the image to add a point, or
  - Drawing a bounding box over specific areas.
- Display Visualization: Once the annotations are complete, click on Display Visual Prompt to visualize the selected regions.
- Important Notes:
  - Ensure that neither the Fine Grained Proposal nor the Coarse Grained Proposal checkbox is selected when using this mode.
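To make the two annotation types concrete, here is a minimal sketch of how a clicked point and a drawn box might be represented. The `Point` and `Box` types and both helper functions are illustrative assumptions; the demo's own data format may differ.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float


class Box(NamedTuple):
    x0: float  # left edge
    y0: float  # top edge
    x1: float  # right edge
    y1: float  # bottom edge


def box_from_drag(start: Point, end: Point) -> Box:
    """Build a box from two drag corners, whichever direction the drag went."""
    return Box(
        min(start.x, end.x),
        min(start.y, end.y),
        max(start.x, end.x),
        max(start.y, end.y),
    )


def clamp_box(box: Box, width: float, height: float) -> Box:
    """Clip a box so that it stays inside an image of the given size."""
    return Box(
        max(0, box.x0),
        max(0, box.y0),
        min(width, box.x1),
        min(height, box.y1),
    )


if __name__ == "__main__":
    # Dragging from bottom-right to top-left still yields a well-ordered box.
    print(box_from_drag(Point(120, 80), Point(40, 200)))
```

Ordering the corners up front means later steps never have to care which way the box was drawn.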
- Overview: The Proposal Visual Prompt mode automatically generates bounding boxes based on the granularity of the proposal:
  - Fine Grained Proposal: Produces a detailed set of bounding boxes for smaller components (e.g., noses, eyes, or body parts).
  - Coarse Grained Proposal: Generates fewer bounding boxes for larger objects or overall entities (e.g., a person, a dog, or another whole entity).
- Display Visualization: Click Display UPN Proposal to view the generated bounding boxes.
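The relationship between the two proposal checkboxes and the active mode can be summarized in a few lines. This sketch uses a hypothetical `resolve_prompt_mode` helper; rejecting the case where both boxes are ticked is an assumption made here for clarity, not documented demo behavior.

```python
def resolve_prompt_mode(fine_grained: bool, coarse_grained: bool) -> str:
    """Map the two proposal checkboxes to the visual prompt mode in effect."""
    if fine_grained and coarse_grained:
        # Assumption: the guide does not say what happens when both are ticked,
        # so this sketch treats that state as ambiguous.
        raise ValueError("select only one proposal granularity")
    if fine_grained:
        return "proposal:fine"
    if coarse_grained:
        return "proposal:coarse"
    # Neither checkbox selected: manual points and boxes are used instead.
    return "interactive"


if __name__ == "__main__":
    print(resolve_prompt_mode(fine_grained=False, coarse_grained=True))
```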
- Enter your question in natural language. For example:
  - What objects are present in this image?
  - What is the color of the dog's collar?
  - Who painted the sculpture?
- Select from a list of predefined templates to simplify the question input process.
- If you need to specify object categories (e.g., dog or cat -> dog,cat), enter their names or IDs in the <Object ids> field, following the provided hints.
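The "dog or cat -> dog,cat" convention is easy to get wrong by hand, so a small helper can normalize free-form text before it is pasted into the field. `format_object_ids` is a hypothetical function, and lowercasing and de-duplicating entries are assumptions based only on the example hint.

```python
import re


def format_object_ids(text: str) -> str:
    """Turn text like 'dog or cat' into the comma-separated form 'dog,cat'."""
    # Split on commas and on the standalone words "or" / "and".
    parts = re.split(r"\s*(?:,|\bor\b|\band\b)\s*", text.strip(), flags=re.IGNORECASE)
    names = []
    for part in parts:
        name = part.strip().lower()  # assumption: the field expects lowercase names
        if name and name not in names:
            names.append(name)
    return ",".join(names)


if __name__ == "__main__":
    print(format_object_ids("dog or cat"))
```

Multi-word names such as "traffic light" pass through unchanged, because only commas and the whole words "or" and "and" act as separators.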
- If you're unsure how to interact with the application, refer to the tutorial video or browse the solved issues for additional guidance.
- For any further questions or feedback, feel free to contact us through the Issues page.
Enjoy exploring ChatRex's multimodal capabilities for seamless visual and language interaction!