Prompt flow is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation, prototyping, testing, evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.
With prompt flow, you will be able to:
- Create flows that link LLMs, prompts, Python code and other tools together in an executable workflow.
- Debug and iterate your flows, especially the interaction with LLMs, with ease.
- Evaluate your flows and calculate quality and performance metrics with larger datasets.
- Integrate the testing and evaluation into your CI/CD system to ensure the quality of your flow.
- Deploy your flows to the serving platform of your choice, or integrate them into your app's code base easily.
- (Optional but highly recommended) Collaborate with your team by leveraging the cloud version of Prompt flow in Azure AI.
Note: If you have not completed the environment installation, please visit Lab 0 - Installations.
- Open the Prompt flow extension in Visual Studio Code and create an empty flow project.
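If you prefer the terminal, the promptflow CLI can also scaffold a flow. A minimal sketch, assuming the pf CLI from the installation lab is available (phi3_flow is just an example folder name):

```bash
# Scaffold a new flow in ./phi3_flow (alternative to the VS Code extension)
pf flow init --flow ./phi3_flow
```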
- Add input and output parameters, and add Python code as a new flow node.
You can refer to this structure (flow.dag.yaml) to construct your flow:

```yaml
inputs:
  prompt:
    type: string
    default: Write python code for Fibonacci series. Please use markdown as output
outputs:
  result:
    type: string
    reference: ${gen_code_by_phi3.output}
nodes:
- name: gen_code_by_phi3
  type: python
  source:
    type: code
    path: gen_code_by_phi3.py
  inputs:
    prompt: ${inputs.prompt}
```
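Before running, you can ask the CLI to check the flow definition for errors. A quick sketch, assuming a recent promptflow version that provides the validate sub-command:

```bash
# Validate flow.dag.yaml from inside the flow folder
pf flow validate --source ./
```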
- Quantize Phi-3-mini
We want to run the SLM well on local devices, so we generally quantize the model (INT4, FP16, FP32):
```bash
python -m mlx_lm.convert --hf-path microsoft/Phi-3-mini-4k-instruct -q --mlx-path ./mlx_model_phi3_mini
```
Note: the -q flag quantizes the model (4-bit by default). Without --mlx-path, the converted model is written to the default mlx_model folder; here we write to ./mlx_model_phi3_mini, which is the path the flow code below expects.
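To confirm the converted model loads and generates before wiring it into the flow, you can use the mlx_lm.generate CLI. A minimal check, assuming the model was converted with --mlx-path ./mlx_model_phi3_mini as above:

```bash
# Quick sanity check of the quantized model
python -m mlx_lm.generate --model ./mlx_model_phi3_mini --prompt "Hello" --max-tokens 64
```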
- Add code in gen_code_by_phi3.py (the file referenced by the node path in flow.dag.yaml):
```python
from promptflow import tool
from mlx_lm import load, generate

# The inputs section will change based on the arguments of the tool function, after you save the code
# Adding type to arguments and return value will help the system show the types properly
# Please update the function name/signature per need
@tool
def my_python_tool(prompt: str) -> str:
    # Load the quantized model produced by mlx_lm.convert
    model_id = './mlx_model_phi3_mini'
    model, tokenizer = load(model_id)
    # Wrap the user prompt in the Phi-3 chat template, e.g.
    # <|user|>\nWrite python code for Fibonacci series. Please use markdown as output<|end|>\n<|assistant|>
    response = generate(
        model,
        tokenizer,
        prompt="<|user|>\n" + prompt + "<|end|>\n<|assistant|>",
        max_tokens=2048,
        verbose=True,
    )
    return response
```
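To sanity-check the tool outside of prompt flow, you can call the function directly. A minimal sketch; run it from the flow folder so the relative model path resolves:

```python
# Ad-hoc smoke test: run this file directly to exercise the tool function
if __name__ == "__main__":
    print(my_python_tool("Write python code for Fibonacci series. Please use markdown as output"))
```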
- You can test the flow from Debug or Run to check whether the code generation works (you can also test from the terminal, as shown below).
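The same test can be run from the terminal with the pf CLI:

```bash
# Run the flow once with an explicit input value
pf flow test --flow ./ --inputs prompt="Write python code for Fibonacci series. Please use markdown as output"
```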
- Run the flow as a development API in the terminal:

```bash
pf flow serve --source ./ --port 8080 --host localhost
```
You can test it in Postman / Thunder Client, or with curl as shown below.
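For example, with curl (assuming the default /score route exposed by pf flow serve):

```bash
# POST the flow inputs as JSON to the local serving endpoint
curl -X POST http://localhost:8080/score \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Write python code for Fibonacci series. Please use markdown as output"}'
```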
- The first run takes a long time, so it is recommended to download the Phi-3 model with the Hugging Face CLI first (see the sketch after these notes).
- Considering the limited computing power of the Intel NPU, it is recommended to use Phi-3-mini-4k-instruct.
- We use the Intel NPU Acceleration Library for the INT4 quantized conversion; if you re-run the service, you need to delete the cache and nc_workshop folders first.
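A sketch of the pre-download step with the Hugging Face CLI (assuming the huggingface_hub CLI is installed; the target folder is just an example):

```bash
# Pre-download the model weights so the first flow run does not stall
huggingface-cli download microsoft/Phi-3-mini-4k-instruct --local-dir ./Phi-3-mini-4k-instruct
```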
- Learn Prompt flow: https://microsoft.github.io/promptflow/
- Learn Intel NPU Acceleration: https://github.com/intel/intel-npu-acceleration-library
- Sample Code: download the Local NPU Agent Sample Code