Lab 2 - Run Prompt flow with Phi-3-mini on Apple Silicon

What's Prompt flow

Prompt flow is a suite of development tools designed to streamline the end-to-end development cycle of LLM-based AI applications, from ideation and prototyping through testing and evaluation to production deployment and monitoring. It makes prompt engineering much easier and enables you to build LLM apps with production quality.

With prompt flow, you will be able to:

  • Create flows that link LLMs, prompts, Python code and other tools together in an executable workflow.

  • Debug and iterate on your flows, especially the interaction with LLMs, with ease.

  • Evaluate your flows and calculate quality and performance metrics on larger datasets.

  • Integrate testing and evaluation into your CI/CD system to ensure the quality of your flow.

  • Deploy your flows to the serving platform of your choice, or integrate them into your app's code base easily.

  • (Optional but highly recommended) Collaborate with your team by leveraging the cloud version of Prompt flow in Azure AI.

Building code-generation flows on Apple Silicon

Note: If you have not completed the environment installation, please visit Lab 0 - Installations.

  1. Open the Prompt flow extension in Visual Studio Code and create an empty flow project.

(Screenshot: creating an empty flow project)

  2. Add input and output parameters, and add a Python code node to the flow.

(Screenshot: the flow with inputs, outputs, and a Python node)

You can refer to this structure (flow.dag.yaml) to construct your flow:

inputs:
  prompt:
    type: string
    default: Write python code for Fibonacci series. Please use markdown as output
outputs:
  result:
    type: string
    reference: ${gen_code_by_phi3.output}
nodes:
- name: gen_code_by_phi3
  type: python
  source:
    type: code
    path: gen_code_by_phi3.py
  inputs:
    prompt: ${inputs.prompt}

  3. Quantize Phi-3-mini

We want the SLM to run well on local devices, so we generally quantize the model or reduce its precision (INT4, FP16 rather than FP32):

python -m mlx_lm.convert --hf-path microsoft/Phi-3-mini-4k-instruct

Note: the default output folder is mlx_model
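The command above produces FP16 weights by default. As a sketch, you can also quantize to INT4 and choose the output folder explicitly with mlx_lm.convert's -q and --mlx-path options; the folder name ./mlx_model_phi3_mini here is simply the path that the Python code below assumes:

python -m mlx_lm.convert --hf-path microsoft/Phi-3-mini-4k-instruct -q --mlx-path ./mlx_model_phi3_mini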

  4. Add code in gen_code_by_phi3.py (the script referenced by flow.dag.yaml above)
from promptflow import tool

from mlx_lm import load, generate


# The inputs section will change based on the arguments of the tool function after you save the code.
# Adding types to the arguments and the return value helps the system show the types properly.
# Please update the function name/signature as needed.
@tool
def my_python_tool(prompt: str) -> str:

    # Path to the converted MLX model; it must match the output folder of mlx_lm.convert.
    model_id = './mlx_model_phi3_mini'

    # Note: loading the model on every call keeps the example simple, but it is slow;
    # in a real flow you would load it once and reuse it.
    model, tokenizer = load(model_id)

    # Wrap the user prompt in the Phi-3 chat template, e.g.:
    # <|user|>\nWrite python code for Fibonacci series. Please use markdown as output<|end|>\n<|assistant|>
    response = generate(
        model,
        tokenizer,
        prompt="<|user|>\n" + prompt + "<|end|>\n<|assistant|>",
        max_tokens=2048,
        verbose=True,
    )

    return response
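Before wiring the node into a served flow, you can sanity-check the function directly. A minimal sketch, assuming the model has already been converted into ./mlx_model_phi3_mini:

# Quick local smoke test for the tool function above (run it from the flow folder).
if __name__ == "__main__":
    print(my_python_tool("Write python code for Fibonacci series. Please use markdown as output"))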
  5. You can test the flow from Debug or Run to check whether the generated code is okay; a CLI alternative is sketched after the screenshot below.

(Screenshot: running the flow in Visual Studio Code)
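If you prefer the terminal, a minimal sketch using the Prompt flow CLI (run from the flow folder):

pf flow test --flow . --inputs prompt="Write python code for Fibonacci series. Please use markdown as output"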

  6. Run the flow as a development API in the terminal

pf flow serve --source ./ --port 8080 --host localhost   

You can test it with Postman or Thunder Client.
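As a sketch, you can also call the served flow from Python; this assumes Prompt flow's default /score endpoint and the result output defined in flow.dag.yaml above:

import requests

# Post a prompt to the locally served flow and print the generated code.
resp = requests.post(
    "http://localhost:8080/score",
    json={"prompt": "Write python code for Fibonacci series. Please use markdown as output"},
)
resp.raise_for_status()
print(resp.json()["result"])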

Note

  1. The first run takes a long time. It is recommended to download the Phi-3 model with the Hugging Face CLI beforehand (see the sketch after this list).

  2. Considering the limited compute and memory of local devices, it is recommended to use Phi-3-mini-4k-instruct.

  3. The INT4 conversion is done with MLX's quantization. If you re-run the conversion, delete the previous output folder first.
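A sketch of the model download mentioned in note 1, assuming the huggingface_hub CLI is installed (the files land in the local Hugging Face cache):

huggingface-cli download microsoft/Phi-3-mini-4k-instruct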

Resources

  1. Learn Prompt flow: https://microsoft.github.io/promptflow/

  2. Learn Intel NPU Acceleration: https://github.com/intel/intel-npu-acceleration-library

  3. Sample code: download the Local NPU Agent Sample Code