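Creating and Using a Custom Model with OpenAI
The previous article introduced ChatGPT, the AI chatbot that is currently drawing so much attention. Its front end is a conversational interface, and its back end is a set of trainable ML (machine learning) models and systems built and provided by OpenAI. These models can be trained and improved continually from training data samples, so chatting with ChatGPT is in effect also supplying training data. This article shows how to use the OpenAI API to create a custom model and train it to identify spam comments.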
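I. Create an OpenAI virtual environment with Conda
This example uses Miniconda. First download the installer that matches your Python version from the official site and run it. This example uses the latest Conda release together with a Python 3.10 built from source.
1. Build and configure Python 3.10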
$mkdir work
$cd work
Download and extract the Python source code, change into the extracted directory, and create an output subdirectory.
$./configure --prefix=/home/user/work/Python-3.10.2/output/
$make
$make install
$update-alternatives --install /usr/bin/python python /home/user/work/Python-3.10.2/output/bin/python3.10 5
If a different Python version is currently in use, run the command below and select the version built above.
$update-alternatives --config python
$python --version
Python 3.10.2
2. Install and configure Miniconda
Download the Miniconda install script, run it, and follow the prompts.
$wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$bash Miniconda3-latest-Linux-x86_64.sh
…
Miniconda3 will now be installed into this location:
/home/user/miniconda3
[/home/user/miniconda3] >>> /home/user/work/miniconda3
…
3. Install and configure the OpenAI command-line tool
Create an OpenAI virtual environment and install the OpenAI command-line tool in it.
$conda create -n openai python=3.10 -y
$conda activate openai
$conda info --envs
conda environments:
base /home/user/work/miniconda3
openai * /home/user/work/miniconda3/envs/openai
$pip install --upgrade openai
$export OPENAI_API_KEY="sk-…" # See this article for how to obtain an API key.
II. Prepare the model training data
Prepare the training data in the following format:
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
...
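Before uploading anything, it can help to check that every line follows this format. The following is a minimal sketch, not part of the OpenAI tooling; the function name `validate_jsonl_line` is hypothetical.

```python
import json


def validate_jsonl_line(line: str) -> bool:
    """Return True if the line is a JSON object whose 'prompt' and
    'completion' fields are both strings (hypothetical helper)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    if not isinstance(record, dict):
        return False
    return all(isinstance(record.get(key), str) for key in ("prompt", "completion"))
```

Calling it on each line of the data file and counting the failures gives a quick sanity check before any training cost is incurred.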
For example, data for identifying spam comments:
{"prompt":"Good stuff, will do.", "completion":" ham"}
{"prompt":"Today's Offer! Claim ur $250 worth of discount vouchers! Text YES to 86839609 now! ", "completion":" spam"}
...
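Lines like these can also be generated from labelled text rather than written by hand. The sketch below assumes the labelled comments are available as (text, label) pairs; `samples`, `to_jsonl`, and `write_dataset` are hypothetical names, and the leading space on each completion simply mirrors the two example lines above.

```python
import json

# Hypothetical sample data; the two texts are the ones shown above.
samples = [
    ("Good stuff, will do.", "ham"),
    ("Today's Offer! Claim ur $250 worth of discount vouchers! "
     "Text YES to 86839609 now! ", "spam"),
]


def to_jsonl(pairs):
    """Serialize (text, label) pairs as one JSON object per line."""
    lines = []
    for text, label in pairs:
        # Leading space on the completion, matching the examples above.
        record = {"prompt": text, "completion": " " + label}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


def write_dataset(pairs, path="openai_datasets.json"):
    """Write the serialized pairs to a file (the default path is an assumption)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_jsonl(pairs))
```

`json.dumps` takes care of escaping quotes and other special characters inside the comment text, which is easy to get wrong when building the lines manually.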
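Save the prepared training data to a file named openai_datasets.json. The openai command-line tool also accepts data in many other file formats (CSV, TSV, XLSX, and so on). The tool then converts the training data into the format it uses internally.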
$openai tools fine_tunes.prepare_data -f openai_datasets.json
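The article does not show what the preparation step changes. As a rough illustration only, the sketch below assumes that it appends a ` ->` separator to each prompt, makes each completion start with a single space, and splits the records into a training set and a validation set, which would be consistent with the ` ->` seen in later prompts and with the `_prepared_train.jsonl` and `_prepared_valid.jsonl` files used in the next section. The function name, the 80/20 split, and the fixed seed are assumptions, not the tool's documented behaviour.

```python
import random

SEPARATOR = " ->"  # assumed prompt suffix


def prepare(records, valid_fraction=0.2, seed=0):
    """Normalize records and split them into (train, valid) lists.

    A hypothetical approximation of the preparation step, not the
    actual implementation of the openai tool.
    """
    prepared = []
    for rec in records:
        prompt = rec["prompt"].rstrip()
        if not prompt.endswith("->"):
            prompt += SEPARATOR
        # Exactly one leading space before the label.
        completion = " " + rec["completion"].strip()
        prepared.append({"prompt": prompt, "completion": completion})

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(prepared)
    n_valid = int(len(prepared) * valid_fraction)
    return prepared[n_valid:], prepared[:n_valid]
```

Holding out a validation set is what later allows classification metrics to be computed on examples the model was not trained on.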
III. Create and train the model
$openai api fine_tunes.create -t "openai_datasets_prepared_train.jsonl" -v "openai_datasets_prepared_valid.jsonl" --compute_classification_metrics --classification_positive_class " ham" -m curie --suffix "comment_antispam"
Upload progress: 100%|████████████████████████████████████████████████████████████| 10.7k/10.7k [00:00<00:00, 20.5Mit/s]
Uploaded file from openai_datasets_prepared_train.jsonl: file-P0kdvokbu0QqmGOpvTDhtSrB
Upload progress: 100%|████████████████████████████████████████████████████████████| 2.96k/2.96k [00:00<00:00, 8.13Mit/s]
Uploaded file from openai_datasets_prepared_valid.jsonl: file-IyNl2ZKie6jMzgW63hloNFCo
Created fine-tune: ft-Zely1DFHUhOQ6rRboNLl7QZ6
Streaming events until fine-tuning is complete…
(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2022-12-21 16:52:26] Created fine-tune: ft-Zely1DFHUhOQ6rRboNLl7QZ6
[2022-12-21 16:52:33] Fine-tune costs $0.03
[2022-12-21 16:52:34] Fine-tune enqueued. Queue number: 0
[2022-12-21 16:52:35] Fine-tune started
[2022-12-21 16:53:44] Completed epoch 1/4
[2022-12-21 16:54:06] Completed epoch 2/4
[2022-12-21 16:54:26] Completed epoch 3/4
[2022-12-21 16:54:46] Completed epoch 4/4
[2022-12-21 16:55:14] Uploaded model: curie:ft-personal:comment-antispam-2022-12-21-08-55-08
[2022-12-21 16:55:29] Uploaded result file: file-oFvX1Oxvs7MYwJYlP09YAY3j
[2022-12-21 16:55:31] Fine-tune succeeded
Job complete! Status: succeeded 🎉
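The command above asks for classification metrics with " ham" as the positive class, but the article does not show the contents of the uploaded result file. As a hedged illustration of what such metrics mean, the sketch below computes accuracy, precision, recall, and F1 from lists of actual and predicted labels. The function name and the returned dictionary layout are hypothetical and are not the format of the result file.

```python
def classification_metrics(actual, predicted, positive=" ham"):
    """Compute binary classification metrics relative to a positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

With " ham" as the positive class, precision answers "of the comments predicted as ham, how many really were ham", so a low value means spam is slipping through.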
IV. Use the trained model to identify spam comments
$openai api completions.create -m curie:ft-personal:comment-antispam-2022-12-21-08-55-08 -p "R u here yet? I'm wearing blue shirt n black pants."
R u here yet? I’m wearing blue shirt n black pants. -> ham tshirt -> ham ham ham -> ham ham ham -> ham ham ham(openai)
$openai api completions.create -m curie:ft-personal:comment-antispam-2022-12-21-08-55-08 -p "REMINDER FROM O2: To get 2.50 pounds free call credit and details of great offers pls reply 2 this text with your valid name, house no and postcode"
REMINDER FROM O2: To get 2.50 pounds free call credit and details of great offers pls reply 2 this text with your valid name, house no and postcode -> spamfilter spamhaus spamcop spamcop spamcop spam -> spam_out(openai)
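In both outputs the model keeps generating text after the label, so only the text immediately after the first `->` is meaningful. The sketch below shows one way to pull the label out of such raw output; `first_label` and `LABELS` are hypothetical names, and matching by prefix is an assumption made so that a run-on such as "spamfilter" still counts as "spam".

```python
LABELS = ("ham", "spam")


def first_label(output: str, separator: str = "->"):
    """Return the label that follows the first separator, or None."""
    _, found, tail = output.partition(separator)
    if not found:
        return None  # no separator, so there is no label to read
    tail = tail.lstrip()
    for label in LABELS:
        if tail.startswith(label):
            return label
    return None
```

Applied to the two outputs above, this yields "ham" for the first message and "spam" for the second.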
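V. Use the model in a Python program
import openai

openai.api_key = "sk-..."  # replace with your own API key
completion = openai.Completion.create(
    model="curie:ft-personal:comment-antispam-2022-12-21-08-55-08",
    # The prompt ends with the " ->" separator so the model answers with a label.
    prompt="R u here yet? I'm wearing blue shirt n black pants. ->",
    max_tokens=1,   # return only the first token, i.e. the label
    temperature=0,  # always pick the most likely token
)
print(completion.choices[0].text)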
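For use inside a larger program, the call can be wrapped in a small helper. The following is a sketch under stated assumptions: it relies on the same legacy `openai.Completion` interface used above (available in versions of the openai Python package before 1.0), reads the key from the `OPENAI_API_KEY` environment variable set earlier, and uses hypothetical names `is_spam` and `_openai_complete`. The `complete` parameter exists only so that the helper can be exercised without network access.

```python
import os

MODEL = "curie:ft-personal:comment-antispam-2022-12-21-08-55-08"


def _openai_complete(prompt: str) -> str:
    """Send the prompt to the fine-tuned model and return the raw text."""
    import openai  # imported lazily so is_spam can be tested without it

    openai.api_key = os.environ["OPENAI_API_KEY"]
    completion = openai.Completion.create(
        model=MODEL, prompt=prompt, max_tokens=1, temperature=0
    )
    return completion.choices[0].text


def is_spam(comment: str, complete=_openai_complete) -> bool:
    """Classify a comment; raise ValueError on an unrecognized label."""
    # Append the same " ->" separator used in the prompt above.
    label = complete(comment.rstrip() + " ->").strip()
    if label not in ("ham", "spam"):
        raise ValueError(f"unexpected label: {label!r}")
    return label == "spam"
```

A caller would then write something like `if is_spam(text): reject(text)`, where `reject` stands for whatever moderation action the application takes.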