Compare commits

...

10 Commits

Author      SHA1        Message                                                               Date
Vinlic      2f12a5daef  Release 0.0.35                                                        2024-12-31 12:47:47 +08:00
Vinlic      2ce738b1fc  Fix handling of unexpected model names                                2024-12-31 12:47:16 +08:00
Vinlic      719e3b682f  Support GLM-4-Plus and the Zero thinking/reasoning model              2024-12-31 11:32:25 +08:00
Vinlic科技  57b042d187  Update README.md                                                      2024-12-14 01:58:46 +08:00
Vinlic      05ecba5cc2  Fix code-generation call output                                       2024-12-14 01:54:49 +08:00
Vinlic科技  a969acd6fb  Merge pull request #43 from Alex-Yanggg/master (Update the Doc)       2024-12-12 13:48:16 +08:00
Alex        54805f2475  Update README_EN.md                                                   2024-12-12 16:45:11 +11:00
Alex        b402f99960  Update README.md (adjust directory levels; add English entry point)  2024-12-12 15:52:43 +11:00
Alex        c1f5e9ae78  Create README_EN.md                                                   2024-12-12 15:51:19 +11:00
Vinlic科技  fe3f0784c8  Update README.md                                                      2024-12-04 17:05:38 +08:00
6 changed files with 714 additions and 42 deletions

README.md

@@ -1,15 +1,19 @@
 # GLM AI Free 服务
+<hr>
+<span>[ 中文 | <a href="README_EN.md">English</a> ]</span>
 [![](https://img.shields.io/github/license/llm-red-team/glm-free-api.svg)](LICENSE)
 ![](https://img.shields.io/github/stars/llm-red-team/glm-free-api.svg)
 ![](https://img.shields.io/github/forks/llm-red-team/glm-free-api.svg)
 ![](https://img.shields.io/docker/pulls/vinlic/glm-free-api.svg)
-支持高速流式输出、支持多轮对话、支持智能体对话、支持视频生成、支持AI绘图、支持联网搜索、支持长文档解读、支持图像解析,零配置部署,多路token支持,自动清理会话痕迹。
+支持GLM-4-Plus高速流式输出、支持多轮对话、支持智能体对话、支持Zero思考推理模型、支持视频生成、支持AI绘图、支持联网搜索、支持长文档解读、支持图像解析,零配置部署,多路token支持,自动清理会话痕迹。
 与ChatGPT接口完全兼容。
 还有以下free-api,欢迎关注:
 Moonshot AI(Kimi.ai)接口转API [kimi-free-api](https://github.com/LLM-Red-Team/kimi-free-api)
@@ -19,26 +23,29 @@ Moonshot AI(Kimi.ai)接口转API [kimi-free-api](https://github.com/LLM-Red-
 秘塔AI (Metaso) 接口转API [metaso-free-api](https://github.com/LLM-Red-Team/metaso-free-api)
+字节跳动(豆包)接口转API [doubao-free-api](https://github.com/LLM-Red-Team/doubao-free-api)
+字节跳动(即梦AI)接口转API [jimeng-free-api](https://github.com/LLM-Red-Team/jimeng-free-api)
 讯飞星火(Spark)接口转API [spark-free-api](https://github.com/LLM-Red-Team/spark-free-api)
 MiniMax(海螺AI)接口转API [hailuo-free-api](https://github.com/LLM-Red-Team/hailuo-free-api)
 深度求索(DeepSeek)接口转API [deepseek-free-api](https://github.com/LLM-Red-Team/deepseek-free-api)
-聆心智能 (Emohaa) 接口转API [emohaa-free-api](https://github.com/LLM-Red-Team/emohaa-free-api)
+聆心智能 (Emohaa) 接口转API [emohaa-free-api](https://github.com/LLM-Red-Team/emohaa-free-api)(当前不可用)
 ## 目录
 * [免责声明](#免责声明)
-* [在线体验](#在线体验)
 * [效果示例](#效果示例)
 * [接入准备](#接入准备)
 * [智能体接入](#智能体接入)
 * [多账号接入](#多账号接入)
 * [Docker部署](#Docker部署)
 * [Docker-compose部署](#Docker-compose部署)
 * [Render部署](#Render部署)
 * [Vercel部署](#Vercel部署)
 * [原生部署](#原生部署)
 * [推荐使用客户端](#推荐使用客户端)
 * [接口列表](#接口列表)
@@ -65,12 +72,6 @@ MiniMax(海螺AI)接口转API [hailuo-free-api](https://github.com/LLM-Red-T
 **仅限自用,禁止对外提供服务或商用,避免对官方造成服务压力,否则风险自担!**
-## 在线体验
-此链接仅临时测试功能,只有一路并发,如果遇到异常请稍后重试,建议自行部署使用。
-https://udify.app/chat/Pe89TtaX3rKXM8NS
 ## 效果示例
 ### 验明正身Demo
@@ -285,8 +286,10 @@ Authorization: Bearer [refresh_token]
 请求数据:
 ```json
 {
-    // 如果使用智能体请填写智能体ID到此处,否则可以乱填
-    "model": "glm4",
+    // 默认模型:glm-4-plus
+    // zero思考推理模型:glm-4-zero / glm-4-think
+    // 如果使用智能体请填写智能体ID到此处
+    "model": "glm-4-plus",
     // 目前多轮对话基于消息合并实现,某些场景可能导致能力下降且受单轮最大token数限制
     // 如果您想获得原生的多轮对话体验,可以传入首轮消息获得的id来接续上下文
     // "conversation_id": "65f6c28546bae1f0fbb532de",
@@ -306,7 +309,7 @@ Authorization: Bearer [refresh_token]
 {
     // 如果想获得原生多轮对话体验,此id你可以传入到下一轮对话的conversation_id来接续上下文
     "id": "65f6c28546bae1f0fbb532de",
-    "model": "glm4",
+    "model": "glm-4",
     "object": "chat.completion",
     "choices": [
         {
@@ -431,7 +434,7 @@ Authorization: Bearer [refresh_token]
 ```json
 {
     // 如果使用智能体请填写智能体ID到此处,否则可以乱填
-    "model": "glm4",
+    "model": "glm-4",
     "messages": [
         {
             "role": "user",
@@ -458,7 +461,7 @@ Authorization: Bearer [refresh_token]
 ```json
 {
     "id": "cnmuo7mcp7f9hjcmihn0",
-    "model": "glm4",
+    "model": "glm-4",
     "object": "chat.completion",
     "choices": [
         {

README_EN.md (new file, 596 lines)
# GLM AI Free Service
[![](https://img.shields.io/github/license/llm-red-team/glm-free-api.svg)](LICENSE)
![](https://img.shields.io/github/stars/llm-red-team/glm-free-api.svg)
![](https://img.shields.io/github/forks/llm-red-team/glm-free-api.svg)
![](https://img.shields.io/docker/pulls/vinlic/glm-free-api.svg)
Supports high-speed streaming output, multi-turn dialogue, internet search, long-document reading, image analysis, zero-configuration deployment, multi-token support, and automatic cleanup of session traces.
Fully compatible with the ChatGPT interface.
You may also be interested in the following free-api projects:
Moonshot AI (Kimi.ai) API to API [kimi-free-api](https://github.com/LLM-Red-Team/kimi-free-api/tree/master)
StepFun (StepChat) API to API [step-free-api](https://github.com/LLM-Red-Team/step-free-api)
Ali Tongyi (Qwen) API to API [qwen-free-api](https://github.com/LLM-Red-Team/qwen-free-api)
ZhipuAI (ChatGLM) API to API [glm-free-api](https://github.com/LLM-Red-Team/glm-free-api)
ByteDance (Doubao) API to API [doubao-free-api](https://github.com/LLM-Red-Team/doubao-free-api)
Metaso AI (Metaso) API to API [metaso-free-api](https://github.com/LLM-Red-Team/metaso-free-api)
iFlytek Spark API to API [spark-free-api](https://github.com/LLM-Red-Team/spark-free-api)
MiniMax (Hailuo AI) API to API [hailuo-free-api](https://github.com/LLM-Red-Team/hailuo-free-api)
DeepSeek API to API [deepseek-free-api](https://github.com/LLM-Red-Team/deepseek-free-api)
Lingxin Intelligence (Emohaa) API to API [emohaa-free-api](https://github.com/LLM-Red-Team/emohaa-free-api) (currently unavailable)
## Table of Contents
* [Announcement](#Announcement)
* [Online Experience](#Online-Experience)
* [Effect Examples](#Effect-Examples)
* [Access Preparation](#Access-Preparation)
* [Agent Access](#Agent-Access)
* [Multiple Account Access](#Multiple-Account-Access)
* [Docker Deployment](#Docker-Deployment)
* [Docker-compose Deployment](#Docker-compose-Deployment)
* [Render Deployment](#Render-Deployment)
* [Vercel Deployment](#Vercel-Deployment)
* [Native Deployment](#Native-Deployment)
* [Recommended Clients](#Recommended-Clients)
* [Interface List](#Interface-List)
* [Conversation Completion](#Conversation-Completion)
* [Video Generation](#Video-Generation)
* [AI Drawing](#AI-Drawing)
* [Document Interpretation](#Document-Interpretation)
* [Image Analysis](#Image-Analysis)
* [Refresh_token Liveness Check](#Refresh_token-Liveness-Check)
* [Notification](#Notification)
* [Nginx Reverse Proxy Optimization](#Nginx-Reverse-Proxy-Optimization)
* [Token Statistics](#Token-Statistics)
* [Star History](#star-history)
## Announcement
**This API is unstable, so we strongly recommend using the official API at [Zhipu](https://open.bigmodel.cn/) instead, to avoid the risk of being banned.**
**This organization and its individuals do not accept any financial donations or transactions. This project is purely for research, communication, and learning purposes!**
**For personal use only; providing the service to others or using it commercially is forbidden, to avoid putting pressure on the official service. Otherwise, you bear the risk yourself!**
**For personal use only; providing the service to others or using it commercially is forbidden, to avoid putting pressure on the official service. Otherwise, you bear the risk yourself!**
**For personal use only; providing the service to others or using it commercially is forbidden, to avoid putting pressure on the official service. Otherwise, you bear the risk yourself!**
## Online Experience
This link is only for temporary feature testing and is not suitable for long-term use; for that, please deploy the service yourself.
https://udify.app/chat/Pe89TtaX3rKXM8NS
## Effect Examples
### Identity Verification
![Identity Verification](./doc/example-1.png)
### AI-Agent
Agent link: [Comments Generator](https://chatglm.cn/main/gdetail/65c046a531d3fcb034918abe)
![AI-Agent](./doc/example-9.png)
### Combined with Dify workflow
Experience link: https://udify.app/chat/m46YgeVLNzFh4zRs
<img width="390" alt="image" src="https://github.com/LLM-Red-Team/glm-free-api/assets/20235341/4773b9f6-b1ca-460c-b3a7-c56bdb1f0659">
### Multi-turn Dialogue
![Multi-turn Dialogue](./doc/example-6.png)
### Video Generation
[View](https://sfile.chatglm.cn/testpath/video/c1f59468-32fa-58c3-bd9d-ab4230cfe3ca_0.mp4)
### AI Drawing
![AI Drawing](./doc/example-10.png)
### Internet Search
![Internet Search](./doc/example-2.png)
### Long Document Reading
![Long Document Reading](./doc/example-5.png)
### Using Code
![Using Code](./doc/example-12.png)
### Image Analysis
![Image Analysis](./doc/example-3.png)
## Access Preparation
Obtain a `refresh_token` from [Zhipu Qingyan](https://chatglm.cn/):
Enter Zhipu Qingyan and start any conversation, then press F12 to open the developer tools. Find the value of `chatglm_refresh_token` under Application > Cookies; it will be used as the Bearer token for Authorization: `Authorization: Bearer TOKEN`
![example0](./doc/example-0.png)
### Agent Access
Open an Agent's chat window; the ID in the URL is the Agent ID, which is what you pass as the `model` parameter.
![example11](./doc/example-11.png)
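For example, the Comments Generator Agent shown above can be called by passing its ID as the `model` value (a minimal sketch, assuming the service is already deployed and listening locally on port 8000; `$CHATGLM_REFRESH_TOKEN` is a placeholder for your own token):

```shell
# Hypothetical local call: the Agent ID from the chat URL replaces the model name.
curl -s http://127.0.0.1:8000/v1/chat/completions \
  -H "Authorization: Bearer $CHATGLM_REFRESH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "65c046a531d3fcb034918abe",
    "messages": [{ "role": "user", "content": "Hello" }]
  }'
```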
### Multiple Account Access
You can provide multiple chatglm_refresh_tokens and join them with `,`:
`Authorization: Bearer TOKEN1,TOKEN2,TOKEN3`
One of them is picked by the service each time a request is made.
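For example (a minimal sketch; `TOKEN1`/`TOKEN2`/`TOKEN3` are placeholders, and the service is assumed to be running locally on port 8000):

```shell
# Join several refresh_tokens with commas; one is picked per request.
curl -s http://127.0.0.1:8000/v1/chat/completions \
  -H "Authorization: Bearer TOKEN1,TOKEN2,TOKEN3" \
  -H "Content-Type: application/json" \
  -d '{ "model": "glm-4", "messages": [{ "role": "user", "content": "Hi" }] }'
```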
## Docker Deployment
Please prepare a server with a public IP and open port 8000.
Pull the image and start the service
```shell
docker run -it -d --init --name glm-free-api -p 8000:8000 -e TZ=Asia/Shanghai vinlic/glm-free-api:latest
```
Check real-time service logs
```shell
docker logs -f glm-free-api
```
Restart service
```shell
docker restart glm-free-api
```
Shut down service
```shell
docker stop glm-free-api
```
### Docker-compose Deployment
```yaml
version: '3'
services:
glm-free-api:
container_name: glm-free-api
image: vinlic/glm-free-api:latest
restart: always
ports:
- "8000:8000"
environment:
- TZ=Asia/Shanghai
```
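Save the above as `docker-compose.yml` and start the service (assuming a Docker installation with the Compose plugin; with the standalone binary, use `docker-compose up -d` instead):

```shell
docker compose up -d
```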
### Render Deployment
**Attention: Some deployment regions may not be able to reach GLM. If container logs show request timeouts or connection failures (Singapore has been tested and found unavailable), please switch to another deployment region!**
**Attention: Container instances for free accounts will automatically stop after a period of inactivity, which may result in a 50-second or longer delay during the next request. It is recommended to check [Render Container Keepalive](https://github.com/LLM-Red-Team/free-api-hub/#Render%E5%AE%B9%E5%99%A8%E4%BF%9D%E6%B4%BB)**
1. Fork this project to your GitHub account.
2. Visit [Render](https://dashboard.render.com/) and log in with your GitHub account.
3. Build your Web Service (`New+` -> `Build and deploy from a Git repository` -> `Connect your forked project` -> `Select deployment region` -> `Choose instance type as Free` -> `Create Web Service`).
4. After the build is complete, copy the assigned domain and append your API path to it to access the service.
### Vercel Deployment
**Note: Vercel free accounts have a request response timeout of 10 seconds, but interface responses are usually longer, which may result in a 504 timeout error from Vercel!**
Please make sure the Node.js environment is installed first.
```shell
npm i -g vercel --registry http://registry.npmmirror.com
vercel login
git clone https://github.com/LLM-Red-Team/glm-free-api
cd glm-free-api
vercel --prod
```
## Native Deployment
Please prepare a server with a public IP and open port 8000.
Please install the Node.js environment and configure the environment variables first, and confirm that the node command is available.
Install dependencies
```shell
npm i
```
Install PM2 to daemonize the process
```shell
npm i -g pm2
```
Compile and build. When you see the dist directory, the build is complete.
```shell
npm run build
```
Start service
```shell
pm2 start dist/index.js --name "glm-free-api"
```
View real-time service logs
```shell
pm2 logs glm-free-api
```
Restart service
```shell
pm2 reload glm-free-api
```
Shut down service
```shell
pm2 stop glm-free-api
```
## Recommended Clients
The following clients, developed on top of the free-api series projects, are faster and easier to use, and support document/image uploads!
[Clivia](https://github.com/Yanyutin753/lobe-chat)'s modified LobeChat [https://github.com/Yanyutin753/lobe-chat](https://github.com/Yanyutin753/lobe-chat)
[Time@](https://github.com/SuYxh)'s modified ChatGPT Web [https://github.com/SuYxh/chatgpt-web-sea](https://github.com/SuYxh/chatgpt-web-sea)
## Interface List
Currently the OpenAI-compatible `/v1/chat/completions` interface is supported. You can use an OpenAI-compatible client to access it, or an online service such as [dify](https://dify.ai/).
### Conversation Completion
Conversation completion interface, compatible with OpenAI's [chat-completions-api](https://platform.openai.com/docs/guides/text-generation/chat-completions-api).
**POST /v1/chat/completions**
The request must include the `Authorization` header:
```
Authorization: Bearer [refresh_token]
```
Request data:
```json
{
    // Default model: glm-4-plus
    // Zero thinking/reasoning models: glm-4-zero / glm-4-think
    // If using an Agent, fill in the Agent ID here
    "model": "glm-4",
    // Multi-turn conversation is currently implemented by merging messages, which may degrade capability in some scenarios and is limited by the maximum token count of a single turn
    // For a native multi-turn experience, pass the id returned by the previous round as conversation_id to continue the context
// "conversation_id": "65f6c28546bae1f0fbb532de",
"messages": [
{
"role": "user",
"content": "Who RU"
}
],
    // If using SSE stream, set this to true; the default is false
"stream": false
}
```
Response data:
```json
{
"id": "65f6c28546bae1f0fbb532de",
"model": "glm-4",
"object": "chat.completion",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "My name is Zhipu Qingyan."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 1,
"completion_tokens": 1,
"total_tokens": 2
},
"created": 1710152062
}
```
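For example, the request above can be issued with curl (a minimal sketch, assuming the service is running locally on port 8000 and `$CHATGLM_REFRESH_TOKEN` holds your refresh_token):

```shell
curl -s http://127.0.0.1:8000/v1/chat/completions \
  -H "Authorization: Bearer $CHATGLM_REFRESH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4",
    "messages": [{ "role": "user", "content": "Who are you?" }],
    "stream": false
  }'
```

With `"stream": true` the endpoint returns an SSE stream (`data: {...}` chunks) instead of a single JSON object.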
### Video Generation
Video generation interface.
**If you are not a VIP, you may have to wait in the queue for a long time.**
**POST /v1/videos/generations**
The request must include the `Authorization` header:
```
Authorization: Bearer [refresh_token]
```
Request data:
```json
{
    // Model name:
    // cogvideox: the default official video model
    // cogvideox-pro: generates an image first and uses it as the first-frame reference to guide the video; better results, but slower
    "model": "cogvideox",
    // Video generation prompt
    "prompt": "一只可爱的猫走在花丛中",
    // An image URL or BASE64_URL can be supplied as the first-frame reference image (ignored when using cogvideox-pro)
    // "image_url": "https://sfile.chatglm.cn/testpath/b5341945-3839-522c-b4ab-a6268cb131d5_0.png",
    // Optional video style: 卡通3D (3D cartoon) / 黑白老照片 (black-and-white vintage photo) / 油画 (oil painting) / 电影感 (cinematic)
    // "video_style": "油画",
    // Optional emotional atmosphere: 温馨和谐 (warm and harmonious) / 生动活泼 (lively) / 紧张刺激 (tense and thrilling) / 凄凉寂寞 (desolate and lonely)
    // "emotional_atmosphere": "生动活泼",
    // Optional camera movement: 水平 (horizontal) / 垂直 (vertical) / 推近 (zoom in) / 拉远 (zoom out)
    // "mirror_mode": "水平"
}
```
Response data:
```json
{
"created": 1722103836,
"data": [
{
            // Conversation ID (not of much use at the moment)
            "conversation_id": "66a537ec0603e53bccb8900a",
            // Cover image URL
            "cover_url": "https://sfile.chatglm.cn/testpath/video_cover/c1f59468-32fa-58c3-bd9d-ab4230cfe3ca_cover_0.png",
            // Video URL
            "video_url": "https://sfile.chatglm.cn/testpath/video/c1f59468-32fa-58c3-bd9d-ab4230cfe3ca_0.mp4",
            // Video duration
            "video_duration": "6s",
            // Video resolution
            "resolution": "1440×960"
}
]
}
```
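For example, to request a video and pull out the resulting URL (a sketch assuming a local deployment on port 8000; `jq` is used here only for convenience):

```shell
curl -s http://127.0.0.1:8000/v1/videos/generations \
  -H "Authorization: Bearer $CHATGLM_REFRESH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "model": "cogvideox", "prompt": "一只可爱的猫走在花丛中" }' \
  | jq -r '.data[0].video_url'
```

Generation can take a while (see the queueing note above), so allow a generous client timeout.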
### AI Drawing
Image generation interface, compatible with OpenAI's [images-generations-api](https://platform.openai.com/docs/api-reference/images/create) format.
**POST /v1/images/generations**
The request must include the `Authorization` header:
```
Authorization: Bearer [refresh_token]
```
Request data:
```json
{
    // If using an Agent, fill in the Agent ID here; otherwise any value works
"model": "cogview-3",
"prompt": "A cute cat"
}
```
Response data:
```json
{
"created": 1711507449,
"data": [
{
"url": "https://sfile.chatglm.cn/testpath/5e56234b-34ae-593c-ba4e-3f7ba77b5768_0.png"
}
]
}
```
### Document Interpretation
Provide an accessible file URL or BASE64_URL to parse.
**POST /v1/chat/completions**
The request must include the `Authorization` header:
```
Authorization: Bearer [refresh_token]
```
Request data:
```json
{
    // If using an Agent, fill in the Agent ID here; otherwise any value works
"model": "glm-4",
"messages": [
{
"role": "user",
"content": [
{
"type": "file",
"file_url": {
"url": "https://mj101-1317487292.cos.ap-shanghai.myqcloud.com/ai/test.pdf"
}
},
{
"type": "text",
"text": "文档里说了什么?"
}
]
}
],
    // If using SSE stream, set this to true; the default is false
"stream": false
}
```
Response data:
```json
{
"id": "cnmuo7mcp7f9hjcmihn0",
"model": "glm-4",
"object": "chat.completion",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "根据文档内容,我总结如下:\n\n这是一份关于希腊罗马时期的魔法咒语和仪式的文本包含几个魔法仪式\n\n1. 一个涉及面包、仪式场所和特定咒语的仪式,用于使某人爱上你。\n\n2. 一个针对女神赫卡忒的召唤仪式,用来折磨某人直到她自愿来到你身边。\n\n3. 一个通过念诵爱神阿芙罗狄蒂的秘密名字,连续七天进行仪式,来赢得一个美丽女子的心。\n\n4. 一个通过燃烧没药并念诵咒语,让一个女子对你产生强烈欲望的仪式。\n\n这些仪式都带有魔法和迷信色彩使用各种咒语和象征性行为来影响人的感情和意愿。"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 1,
"completion_tokens": 1,
"total_tokens": 2
},
"created": 100920
}
```
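The multi-part `content` array is sent like any other chat completion request, for example (a sketch assuming a local deployment on port 8000):

```shell
curl -s http://127.0.0.1:8000/v1/chat/completions \
  -H "Authorization: Bearer $CHATGLM_REFRESH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4",
    "messages": [{
      "role": "user",
      "content": [
        { "type": "file", "file_url": { "url": "https://mj101-1317487292.cos.ap-shanghai.myqcloud.com/ai/test.pdf" } },
        { "type": "text", "text": "文档里说了什么?" }
      ]
    }]
  }'
```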
### Image Analysis
Provide an accessible image URL or BASE64_URL to parse.
This format is compatible with the [gpt-4-vision-preview](https://platform.openai.com/docs/guides/vision) API format. You can also use this format to transmit documents for parsing.
**POST /v1/chat/completions**
The request must include the `Authorization` header:
```
Authorization: Bearer [refresh_token]
```
Request data:
```json
{
"model": "65c046a531d3fcb034918abe",
"messages": [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "http://1255881664.vod2.myqcloud.com/6a0cd388vodbj1255881664/7b97ce1d3270835009240537095/uSfDwh6ZpB0A.png"
}
},
{
"type": "text",
"text": "图像描述了什么?"
}
]
}
],
"stream": false
}
```
Response data:
```json
{
"id": "65f6c28546bae1f0fbb532de",
"model": "glm",
"object": "chat.completion",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "图片中展示的是一个蓝色背景下的logo具体地左边是一个由多个蓝色的圆点组成的圆形图案右边是“智谱·AI”四个字字体颜色为蓝色。"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 1,
"completion_tokens": 1,
"total_tokens": 2
},
"created": 1710670469
}
```
### Refresh_token Liveness Check
Checks whether a refresh_token is still alive: `live` is true if the token is valid and false otherwise. Please do not call this interface frequently (no more than about once every 10 minutes).
**POST /token/check**
Request data:
```json
{
"token": "eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9..."
}
```
Response data:
```json
{
"live": true
}
```
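For example (a sketch assuming a local deployment on port 8000; the token value is a placeholder):

```shell
curl -s http://127.0.0.1:8000/token/check \
  -H "Content-Type: application/json" \
  -d '{ "token": "eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9..." }'
```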
## Notification
### Nginx Reverse Proxy Optimization
If you are reverse-proxying `glm-free-api` with Nginx, add the following configuration items to optimize streaming output and improve the experience.
```nginx
# Turn off proxy buffering. When set to off, Nginx will immediately send client requests to the backend server and immediately send responses received from the backend server back to the client.
proxy_buffering off;
# Enable chunked transfer encoding. Chunked transfer encoding allows servers to send data in chunks for dynamically generated content without knowing the size of the content in advance.
chunked_transfer_encoding on;
# Turn on TCP_NOPUSH, which tells Nginx to send as much data as possible before sending the packet to the client. This is usually used in conjunction with sendfile to improve network efficiency.
tcp_nopush on;
# Turn on TCP_NODELAY, which tells Nginx not to delay sending data and to send small data packets immediately. In some cases, this can reduce network latency.
tcp_nodelay on;
# Set the keep-alive timeout; here it is set to 120 seconds. If there is no further communication between client and server within this time, the connection is closed.
keepalive_timeout 120;
```
### Token Statistics
Since inference does not happen inside glm-free-api, token usage cannot be counted and is returned as fixed placeholder numbers!
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=LLM-Red-Team/glm-free-api&type=Date)](https://star-history.com/#LLM-Red-Team/glm-free-api&Date)

package.json

@@ -1,6 +1,6 @@
 {
   "name": "glm-free-api",
-  "version": "0.0.32",
+  "version": "0.0.35",
   "description": "GLM Free API Server",
   "type": "module",
   "main": "dist/index.js",
src/api/controllers/chat.ts

@@ -17,6 +17,8 @@ import util from "@/lib/util.ts";
 const MODEL_NAME = "glm";
 // 默认的智能体ID,即GLM4
 const DEFAULT_ASSISTANT_ID = "65940acff94777010aa6b796";
+// zero推理模型智能体ID
+const ZERO_ASSISTANT_ID = "676411c38945bbc58a905d31";
 // access_token有效期
 const ACCESS_TOKEN_EXPIRES = 3600;
 // 最大重试次数
@@ -165,13 +167,13 @@ async function removeConversation(
  *
  * @param messages gpt系列消息格式
  * @param refreshToken access_token的refresh_token
- * @param assistantId 智能体ID,默认使用GLM4原版
+ * @param model 智能体ID,默认使用GLM4原版
  * @param retryCount
  */
 async function createCompletion(
   messages: any[],
   refreshToken: string,
-  assistantId = DEFAULT_ASSISTANT_ID,
+  model = MODEL_NAME,
   refConvId = "",
   retryCount = 0
 ) {
@@ -189,6 +191,13 @@ async function createCompletion(
     // 如果引用对话ID不正确则重置引用
     if (!/[0-9a-zA-Z]{24}/.test(refConvId)) refConvId = "";
+    let assistantId = /^[a-z0-9]{24,}$/.test(model) ? model : DEFAULT_ASSISTANT_ID;
+    if(model.indexOf('think') != -1 || model.indexOf('zero') != -1) {
+      assistantId = ZERO_ASSISTANT_ID;
+      logger.info('使用思考模型');
+    }
     // 请求流
     const token = await acquireToken(refreshToken);
     const result = await axios.post(
@@ -200,8 +209,11 @@ async function createCompletion(
       meta_data: {
         channel: "",
         draft_id: "",
+        if_plus_model: true,
         input_question_type: "xxxx",
         is_test: false,
+        platform: "pc",
+        quote_log_id: ""
       },
     },
     {
@@ -231,7 +243,7 @@ async function createCompletion(
     const streamStartTime = util.timestamp();
     // 接收流为输出文本
-    const answer = await receiveStream(result.data);
+    const answer = await receiveStream(model, result.data);
     logger.success(
       `Stream has completed transfer ${util.timestamp() - streamStartTime}ms`
     );
@@ -251,7 +263,7 @@ async function createCompletion(
     return createCompletion(
       messages,
       refreshToken,
-      assistantId,
+      model,
       refConvId,
       retryCount + 1
     );
@@ -266,13 +278,13 @@ async function createCompletion(
  *
  * @param messages gpt系列消息格式
  * @param refreshToken access_token的refresh_token
- * @param assistantId 智能体ID,默认使用GLM4原版
+ * @param model 智能体ID,默认使用GLM4原版
  * @param retryCount
  */
 async function createCompletionStream(
   messages: any[],
   refreshToken: string,
-  assistantId = DEFAULT_ASSISTANT_ID,
+  model = MODEL_NAME,
   refConvId = "",
   retryCount = 0
 ) {
@@ -290,6 +302,13 @@ async function createCompletionStream(
     // 如果引用对话ID不正确则重置引用
     if (!/[0-9a-zA-Z]{24}/.test(refConvId)) refConvId = "";
+    let assistantId = /^[a-z0-9]{24,}$/.test(model) ? model : DEFAULT_ASSISTANT_ID;
+    if(model.indexOf('think') != -1 || model.indexOf('zero') != -1) {
+      assistantId = ZERO_ASSISTANT_ID;
+      logger.info('使用思考模型');
+    }
     // 请求流
     const token = await acquireToken(refreshToken);
     const result = await axios.post(
@@ -301,8 +320,11 @@ async function createCompletionStream(
       meta_data: {
         channel: "",
         draft_id: "",
+        if_plus_model: true,
         input_question_type: "xxxx",
         is_test: false,
+        platform: "pc",
+        quote_log_id: ""
       },
     },
     {
@@ -354,7 +376,7 @@ async function createCompletionStream(
     const streamStartTime = util.timestamp();
     // 创建转换流将消息格式转换为gpt兼容格式
-    return createTransStream(result.data, (convId: string) => {
+    return createTransStream(model, result.data, (convId: string) => {
       logger.success(
         `Stream has completed transfer ${util.timestamp() - streamStartTime}ms`
       );
@@ -372,7 +394,7 @@ async function createCompletionStream(
     return createCompletionStream(
       messages,
       refreshToken,
-      assistantId,
+      model,
       refConvId,
       retryCount + 1
     );
@@ -407,8 +429,11 @@ async function generateImages(
       meta_data: {
         channel: "",
         draft_id: "",
+        if_plus_model: true,
         input_question_type: "xxxx",
         is_test: false,
+        platform: "pc",
+        quote_log_id: ""
       },
     },
     {
@@ -904,14 +929,15 @@ function checkResult(result: AxiosResponse, refreshToken: string) {
 /**
  * 从流接收完整的消息内容
  *
+ * @param model
  * @param stream
  */
-async function receiveStream(stream: any): Promise<any> {
+async function receiveStream(model: string, stream: any): Promise<any> {
   return new Promise((resolve, reject) => {
     // 消息初始化
     const data = {
       id: "",
-      model: MODEL_NAME,
+      model,
       object: "chat.completion",
       choices: [
         {
@@ -923,6 +949,8 @@ async function receiveStream(stream: any): Promise<any> {
       usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 },
       created: util.unixTimestamp(),
     };
+    const isSilentModel = model.indexOf('silent') != -1;
+    let thinkingText = "";
     let toolCall = false;
     let codeGenerating = false;
     let textChunkLength = 0;
@@ -930,6 +958,7 @@ async function receiveStream(stream: any): Promise<any> {
     let lastExecutionOutput = "";
     let textOffset = 0;
     let refContent = "";
+    logger.info(`是否静默模型: ${isSilentModel}`);
     const parser = createParser((event) => {
       try {
         if (event.type !== "event") return;
@@ -957,6 +986,7 @@ async function receiveStream(stream: any): Promise<any> {
             textChunkLength = 0;
             innerStr += "\n";
           }
+
           if (type == "text") {
             if (toolCall) {
               innerStr += "\n";
@@ -965,11 +995,20 @@ async function receiveStream(stream: any): Promise<any> {
             }
             if (partStatus == "finish") textChunkLength = text.length;
             return innerStr + text;
-          } else if (
+          } else if (type == "text_thinking" && !isSilentModel) {
+            if (toolCall) {
+              innerStr += "\n";
+              textOffset++;
+              toolCall = false;
+            }
+            thinkingText = text;
+            return innerStr;
+          }else if (
             type == "quote_result" &&
             status == "finish" &&
             meta_data &&
-            _.isArray(meta_data.metadata_list)
+            _.isArray(meta_data.metadata_list) &&
+            !isSilentModel
           ) {
             refContent = meta_data.metadata_list.reduce((meta, v) => {
               return meta + `${v.title} - ${v.url}\n`;
@@ -991,7 +1030,7 @@ async function receiveStream(stream: any): Promise<any> {
             textOffset += imageText.length;
             toolCall = true;
             return innerStr + imageText;
-          } else if (type == "code" && partStatus == "init") {
+          } else if (type == "code" && status == "init") {
             let codeHead = "";
             if (!codeGenerating) {
               codeGenerating = true;
@@ -1003,7 +1042,7 @@ async function receiveStream(stream: any): Promise<any> {
             return innerStr + codeHead + chunk;
           } else if (
             type == "code" &&
-            partStatus == "finish" &&
+            status == "finish" &&
             codeGenerating
           ) {
             const codeFooter = "\n```\n";
@@ -1014,7 +1053,7 @@ async function receiveStream(stream: any): Promise<any> {
           } else if (
             type == "execution_output" &&
             _.isString(content) &&
-            partStatus == "done" &&
+            status == "finish" &&
             lastExecutionOutput != content
           ) {
             lastExecutionOutput = content;
@@ -1032,6 +1071,8 @@ async function receiveStream(stream: any): Promise<any> {
           );
           data.choices[0].message.content += chunk;
         } else {
+          if(thinkingText)
+            data.choices[0].message.content = `[思考开始]\n${thinkingText}[思考结束]\n\n${data.choices[0].message.content}`;
           data.choices[0].message.content =
             data.choices[0].message.content.replace(
               /【\d+†(来源|源|source)】/g,
@@ -1059,18 +1100,22 @@
  *
  * 将消息格式转换为gpt兼容流格式
  *
+ * @param model
  * @param stream
  * @param endCallback
  */
-function createTransStream(stream: any, endCallback?: Function) {
+function createTransStream(model: string, stream: any, endCallback?: Function) {
   // 消息创建时间
   const created = util.unixTimestamp();
   // 创建转换流
   const transStream = new PassThrough();
+  const isSilentModel = model.indexOf('silent') != -1;
   let content = "";
+  let thinking = false;
   let toolCall = false;
   let codeGenerating = false;
   let textChunkLength = 0;
+  let thinkingText = "";
   let codeTemp = "";
   let lastExecutionOutput = "";
   let textOffset = 0;
@@ -1078,7 +1123,7 @@ function createTransStream(stream: any, endCallback?: Function) {
   transStream.write(
     `data: ${JSON.stringify({
       id: "",
-      model: MODEL_NAME,
+      model,
       object: "chat.completion.chunk",
       choices: [
         {
@@ -1116,6 +1161,11 @@ function createTransStream(stream: any, endCallback?: Function) {
             innerStr += "\n";
           }
           if (type == "text") {
+            if(thinking) {
+              innerStr += "[思考结束]\n\n"
+              textOffset = thinkingText.length + 8;
+              thinking = false;
+            }
             if (toolCall) {
               innerStr += "\n";
               textOffset++;
@@ -1123,11 +1173,26 @@ function createTransStream(stream: any, endCallback?: Function) {
             }
             if (partStatus == "finish") textChunkLength = text.length;
             return innerStr + text;
+          } else if (type == "text_thinking" && !isSilentModel) {
+            if(!thinking) {
+              innerStr += "[思考开始]\n";
+              textOffset = 7;
+              thinking = true;
+            }
+            if (toolCall) {
+              innerStr += "\n";
+              textOffset++;
+              toolCall = false;
+            }
+            if (partStatus == "finish") textChunkLength = text.length;
+            thinkingText += text.substring(thinkingText.length, text.length);
+            return innerStr + text;
           } else if (
             type == "quote_result" &&
             status == "finish" &&
             meta_data &&
-            _.isArray(meta_data.metadata_list)
+            _.isArray(meta_data.metadata_list) &&
+            !isSilentModel
           ) {
             const searchText =
               meta_data.metadata_list.reduce(
@@ -1154,7 +1219,7 @@ function createTransStream(stream: any, endCallback?: Function) {
             textOffset += imageText.length;
             toolCall = true;
             return innerStr + imageText;
-          } else if (type == "code" && partStatus == "init") {
+          } else if (type == "code" && status == "init") {
             let codeHead = "";
             if (!codeGenerating) {
               codeGenerating = true;
@@ -1166,7 +1231,7 @@ function createTransStream(stream: any, endCallback?: Function) {
             return innerStr + codeHead + chunk;
           } else if (
             type == "code" &&
-            partStatus == "finish" &&
+            status == "finish" &&
             codeGenerating
           ) {
             const codeFooter = "\n```\n";
@@ -1177,7 +1242,7 @@ function createTransStream(stream: any, endCallback?: Function) {
           } else if (
             type == "execution_output" &&
             _.isString(content) &&
-            partStatus == "done" &&
+            status == "finish" &&
             lastExecutionOutput != content
           ) {
             lastExecutionOutput = content;

src/api/routes/chat.ts

@@ -5,6 +5,9 @@ import Response from '@/lib/response/Response.ts';
 import chat from '@/api/controllers/chat.ts';
 import logger from '@/lib/logger.ts';
+// zero推理模型智能体ID
+const ZERO_ASSISTANT_ID = "676411c38945bbc58a905d31";
 export default {
     prefix: '/v1/chat',
@@ -21,15 +24,15 @@ export default {
         // 随机挑选一个refresh_token
         const token = _.sample(tokens);
         const { model, conversation_id: convId, messages, stream } = request.body;
-        const assistantId = /^[a-z0-9]{24,}$/.test(model) ? model : undefined
         if (stream) {
-            const stream = await chat.createCompletionStream(messages, token, assistantId, convId);
+            const stream = await chat.createCompletionStream(messages, token, model, convId);
             return new Response(stream, {
                 type: "text/event-stream"
             });
         }
         else
-            return await chat.createCompletion(messages, token, assistantId, convId);
+            return await chat.createCompletion(messages, token, model, convId);
     }
 }

src/api/routes/models.ts

@@ -18,6 +18,11 @@ export default {
         "object": "model",
         "owned_by": "glm-free-api"
     },
+    {
+        "id": "glm-4-plus",
+        "object": "model",
+        "owned_by": "glm-free-api"
+    },
     {
         "id": "glm-4v",
         "object": "model",