Commit
update websites
boyugou committed Oct 7, 2024
1 parent cffc279 commit 83a7372
Showing 1 changed file with 1 addition and 1 deletion.
index.html (1 addition, 1 deletion)
@@ -192,7 +192,7 @@ <h2 class="subtitle is-3 publication-subtitle">
 <p>
 <b>UGround</b> is a universal <b>visual grounding</b> model that locates the target element of an action by pixel coordinates on the screen. It is trained on 10M elements from 1.3M screenshots and substantially outperforms previous SOTA GUI visual grounding models.
 <br>
-We propose a generic framework, <b>SeeAct-V</b>, that perceives GUIs entirely visually and performs pixel-level operations on screens. SeeAct-V agents with UGround achieve SOTA prompt-only performance on six benchmarks, spanning <b>GUI Grounding</b> (web, mobile, desktop), <b>offline agent evaluation</b> (web, mobile, desktop), and <b>online agent evaluation</b> (web, mobile):
+Unlike prevalent approaches that rely on HTML/A11y trees for observation or grounding, we propose a generic framework, <b>SeeAct-V</b>, that <b>perceives GUIs entirely visually</b> and <b>performs pixel-level operations</b> on screens. SeeAct-V agents with UGround achieve SOTA performance on six benchmarks, spanning <b>GUI Grounding</b> (web, mobile, desktop), <b>offline agent evaluation</b> (web, mobile, desktop), and <b>online agent evaluation</b> (web, mobile):
 <!-- <ul>-->
 <!--&lt;!&ndash; <li>🌐 📱 💻 ScreenSpot (GUI Grounding)</li>&ndash;&gt;-->
 <!--&lt;!&ndash; <li>🌐 Multimodal-Mind2Web </li>&ndash;&gt;-->
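The paragraph in this diff describes a pipeline in which a grounding model maps a screenshot plus a natural-language element description to pixel coordinates, and the agent then acts at those coordinates rather than on an HTML/A11y tree. A minimal sketch of that loop, with hypothetical `ground_element` and `click_at` helpers (illustrative names, not UGround's actual API):

```python
from dataclasses import dataclass

@dataclass
class Point:
    x: int  # pixel column on the screenshot
    y: int  # pixel row on the screenshot

def ground_element(screenshot_png: bytes, description: str) -> Point:
    """Hypothetical stand-in for a visual grounding model such as UGround:
    given a screenshot and a textual description of the target element,
    return its pixel coordinates. Model inference would go here."""
    raise NotImplementedError

def click_at(point: Point) -> None:
    """Hypothetical pixel-level executor: issue an OS- or browser-level
    click at absolute screen coordinates."""
    raise NotImplementedError

def act(screenshot_png: bytes, description: str) -> None:
    # Perceive entirely visually: no HTML or A11y tree is consulted.
    target = ground_element(screenshot_png, description)
    click_at(target)
```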
