address comments on server start instructions

triton-inference-server · Nov 7, 2023 · 5318f2f
1 parent aaf3fab
commit 5318f2f
Showing 1 changed file with 2 additions and 5 deletions.
diff --git a/Popular_Models_Guide/Llama2/trtllm_guide.md b/Popular_Models_Guide/Llama2/trtllm_guide.md
@@ -120,12 +120,9 @@ To run our Llama2-7B model, you will need to:
 
 3.  Launch Tritonserver
 
+    Use the [launch_triton_server.py](https://github.com/triton-inference-server/tensorrtllm_backend/blob/release/0.5.0/scripts/launch_triton_server.py) script. This launches multiple instances of `tritonserver` with MPI.
     ```bash
-    tritonserver --model-repository=/opt/tritonserver/inflight_batcher_llm
-    ```
-    Note if you built the engine with `--world_size X` where `X` is greater than 1, you will need to use the [launch_triton_server.py](https://github.com/triton-inference-server/tensorrtllm_backend/blob/release/0.5.0/scripts/launch_triton_server.py) script.
-    ```bash
-    python3 /tensorrtllm_backend/scripts/launch_triton_server.py --world_size=X --model_repo=/opt/tritonserver/inflight_batcher_llm
+    python3 /tensorrtllm_backend/scripts/launch_triton_server.py --world_size=<the engine's world size> --model_repo=/opt/tritonserver/inflight_batcher_llm
     ```
 
 ## Client