Files

Upload File

Note: Each user can upload up to 1,000 files, and each file must not exceed 100 MB. The total size of all uploaded files must not exceed 10 GB. Once you reach a limit, delete files you no longer need before uploading more. The file parsing service is currently free, but the platform may apply rate limiting during peak request periods.
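The limits in the note can be checked locally before an upload is attempted. The following is a minimal sketch, not part of the API: the function name `check_upload_limits` is hypothetical, and because the note does not say whether MB and GB are decimal or binary, decimal units are assumed.

```python
from typing import Iterable, List

# Limits quoted in the note above. Decimal units are an assumption.
MAX_FILES = 1_000
MAX_FILE_BYTES = 100 * 1000**2
MAX_TOTAL_BYTES = 10 * 1000**3


def check_upload_limits(existing_sizes: Iterable[int], new_size: int) -> List[str]:
    """Return the limits a planned upload would violate (empty list if none).

    existing_sizes: sizes in bytes of the files already uploaded.
    new_size: size in bytes of the file about to be uploaded.
    """
    sizes = list(existing_sizes)
    problems = []
    if new_size > MAX_FILE_BYTES:
        problems.append("file exceeds the 100 MB per-file limit")
    if len(sizes) + 1 > MAX_FILES:
        problems.append("upload would exceed the 1,000-file limit")
    if sum(sizes) + new_size > MAX_TOTAL_BYTES:
        problems.append("upload would exceed the 10 GB total-size limit")
    return problems
```

The sizes of existing files can be taken from the `bytes` field of each file returned by the List Files request.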

Request Endpoint

POST https://api.moonshot.ai/v1/files

After a successful upload, the file is processed according to the purpose you specify.

Example Request

Python Example

# The file can be of various types
# The purpose currently supports "file-extract", "image", and "video" types
file_object = client.files.create(file=Path("xlnet.pdf"), purpose="file-extract")

For purpose="file-extract", the file contents will be extracted. Additionally, you can use purpose="image" or purpose="video" to upload images and videos respectively for visual understanding.

The file interface is the same as the one used in the Kimi intelligent assistant for uploading files, and it supports the same file formats. These include .pdf, .txt, .csv, .doc, .docx, .xls, .xlsx, .ppt, .pptx, .md, .jpeg, .png, .bmp, .gif, .svg, .svgz, .webp, .ico, .xbm, .dib, .pjp, .tif, .pjpeg, .avif, .dot, .apng, .epub, .tiff, .jfif, .html, .json, .mobi, .log, .go, .h, .c, .cpp, .cxx, .cc, .cs, .java, .js, .css, .jsp, .php, .py, .py3, .asp, .yaml, .yml, .ini, .conf, .ts, .tsx, etc.
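A client can screen file names against the extensions listed above before uploading. This is only a sketch: the helper name `is_listed_extension` is hypothetical, the list above ends with "etc." so the set below is not exhaustive, and matching extensions case-insensitively is an assumption about how the service behaves.

```python
from pathlib import Path

# Extensions copied from the list above. The list is not exhaustive,
# so a False result means "not listed", not "definitely rejected".
LISTED_EXTENSIONS = frozenset("""
.pdf .txt .csv .doc .docx .xls .xlsx .ppt .pptx .md .jpeg .png .bmp .gif
.svg .svgz .webp .ico .xbm .dib .pjp .tif .pjpeg .avif .dot .apng .epub
.tiff .jfif .html .json .mobi .log .go .h .c .cpp .cxx .cc .cs .java .js
.css .jsp .php .py .py3 .asp .yaml .yml .ini .conf .ts .tsx
""".split())


def is_listed_extension(filename: str) -> bool:
    """Return True if the file's extension appears in the documented list."""
    # Lower-casing the suffix is an assumption, not documented behavior.
    return Path(filename).suffix.lower() in LISTED_EXTENSIONS
```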

For File Content Extraction

When uploading a file, selecting purpose="file-extract" allows the model to obtain information from the file as context.

Example Request

from pathlib import Path
from openai import OpenAI
 
client = OpenAI(
    api_key="$MOONSHOT_API_KEY",
    base_url="https://api.moonshot.ai/v1",
)
 
# xlnet.pdf is an example file. PDF, DOC, and image formats are supported; OCR is available for images and PDF files.
file_object = client.files.create(file=Path("xlnet.pdf"), purpose="file-extract")
 
# Retrieve the extraction result.
# Note: retrieve_content is marked as deprecated in recent versions of the openai package; use content(...).text instead.
# On older versions, you can still call:
# file_content = client.files.retrieve_content(file_id=file_object.id)
file_content = client.files.content(file_id=file_object.id).text
 
# Include it in the request
messages = [
    {
        "role": "system",
        "content": "You are Kimi, an AI assistant provided by Moonshot AI. You are particularly skilled in Chinese and English conversations. You provide users with safe, helpful, and accurate answers. You will refuse to answer any questions involving terrorism, racism, pornography, or violence. Moonshot AI is a proper noun and should not be translated into other languages.",
    },
    {
        "role": "system",
        "content": file_content,
    },
    {"role": "user", "content": "Please give a brief introduction of what xlnet.pdf is about"},
]
 
# Then call the chat completions API to get Kimi's response
completion = client.chat.completions.create(
    model="kimi-k2-turbo-preview",
    messages=messages,
    temperature=0.6,
)
 
print(completion.choices[0].message)

Replace $MOONSHOT_API_KEY with your own API key. Alternatively, set the key as an environment variable and read it from there before making the call.
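Reading the key from the environment might look like the sketch below. The helper name `load_api_key` is hypothetical, and the default variable name `MOONSHOT_API_KEY` is an assumption taken from the placeholder above; use whatever name you actually export.

```python
import os


def load_api_key(env_var: str = "MOONSHOT_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError with a readable message if the variable is unset or
    empty, instead of failing later with an authentication error.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return api_key
```

The client can then be created with `OpenAI(api_key=load_api_key(), base_url="https://api.moonshot.ai/v1")`.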

Multi-File Chat Example

If you want to upload multiple files at once and have a conversation with Kimi based on these files, you can refer to the following example:

from typing import Any, Dict, List
 
import os
import json
from pathlib import Path
 
from openai import OpenAI
 
client = OpenAI(
    base_url="https://api.moonshot.ai/v1",
    # We will get the value of MOONSHOT_DEMO_API_KEY from the environment variable as the API Key.
    # Please make sure you have correctly set the value of MOONSHOT_DEMO_API_KEY in the environment variable.
    api_key=os.environ["MOONSHOT_DEMO_API_KEY"],
)
 
 
def upload_files(files: List[str]) -> List[Dict[str, Any]]:
    """
    upload_files will upload all the files (paths) through the file upload interface '/v1/files' and get the uploaded file content to generate file messages.
    Each file will be an independent message, and the role of these messages will be system. The Kimi large language model will correctly identify the file content in these system messages.
 
    :param files: A list containing the paths of the files to be uploaded. The paths can be absolute or relative, and please pass the file paths in the form of strings.
    :return: A list of messages containing the file content. Please add these messages to the context, i.e., the messages parameter when requesting the `/v1/chat/completions` interface.
    """
    messages = []
 
    # For each file path, we will upload the file, extract the file content, and finally generate a message with the role of system, and add it to the final list of messages to be returned.
    for file in files:
        file_object = client.files.create(file=Path(file), purpose="file-extract")
        file_content = client.files.content(file_id=file_object.id).text
        messages.append({
            "role": "system",
            "content": file_content,
        })
 
    return messages
 
 
def main():
    file_messages = upload_files(files=["upload_files.py"])
 
    messages = [
        # The * syntax unpacks file_messages so that they become the first N messages in the messages list.
        *file_messages,
        {
            "role": "system",
            "content": "You are Kimi, an AI assistant provided by Moonshot AI. You are more proficient in Chinese and English conversations. You provide users with safe, helpful, and accurate answers. You will refuse to answer any questions related to terrorism, racism, pornography, or violence. Moonshot AI is a proper noun and should not be translated into other languages.",
        },
        {
            "role": "user",
            "content": "Summarize the content of these files.",
        },
    ]
 
    print(json.dumps(messages, indent=2, ensure_ascii=False))
 
    completion = client.chat.completions.create(
        model="kimi-k2-turbo-preview",
        messages=messages,
    )
 
    print(completion.choices[0].message.content)
 
 
if __name__ == '__main__':
    main()

For Image or Video Understanding

When uploading a file, select purpose="image" or purpose="video". The model can then natively understand the uploaded images and videos. For details, see Using Vision Models.

List Files

This feature is used to list all the files that a user has uploaded.

Request Address

GET https://api.moonshot.ai/v1/files

Example Request

Python Request

file_list = client.files.list()
 
for file in file_list.data:
    print(file)  # Print the information for each file
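The listing can also be used to see how close an account is to the upload limits. A minimal sketch, assuming each listed file exposes the `bytes` field shown in the Get File Information output; the helper name `summarize_usage` is hypothetical, and the sample objects are local stand-ins rather than real API responses.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Tuple


def summarize_usage(files: Iterable[Any]) -> Tuple[int, int]:
    """Return (file_count, total_bytes) for objects exposing a `bytes` attribute."""
    count = 0
    total = 0
    for f in files:
        count += 1
        total += f.bytes
    return count, total


# Stand-in objects for illustration only. With a real client, pass
# client.files.list().data instead.
sample = [SimpleNamespace(bytes=761790), SimpleNamespace(bytes=1024)]
print(summarize_usage(sample))  # (2, 762814)
```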

Delete File

This feature can be used to delete files that are no longer needed.

Request Address

DELETE https://api.moonshot.ai/v1/files/{file_id}

Example Request

Python Request

client.files.delete(file_id=file_id)
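When files must be deleted to make room for new uploads, one possible policy is to remove the oldest files first. The sketch below only selects which file IDs to delete. The helper name `pick_oldest_files` and the oldest-first policy are assumptions, and the `id`, `bytes`, and `created_at` fields are those shown in the Get File Information output.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def pick_oldest_files(files: Iterable[Any], bytes_to_free: int) -> List[str]:
    """Select file IDs, oldest first, until at least bytes_to_free is covered."""
    selected = []
    freed = 0
    for f in sorted(files, key=lambda item: item.created_at):
        if freed >= bytes_to_free:
            break
        selected.append(f.id)
        freed += f.bytes
    return selected


# Stand-in objects for illustration only, not real API responses.
sample = [
    SimpleNamespace(id="a", bytes=100, created_at=1),
    SimpleNamespace(id="b", bytes=50, created_at=3),
    SimpleNamespace(id="c", bytes=200, created_at=2),
]
print(pick_oldest_files(sample, bytes_to_free=250))  # ['a', 'c']
```

With a real client, each selected ID would then be passed to `client.files.delete(file_id=...)`.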

Get File Information

This feature is used to obtain the basic information of a specified file.

Request Address

GET https://api.moonshot.ai/v1/files/{file_id}

Example Request

Python Request

client.files.retrieve(file_id=file_id)
# FileObject(
#     id='clg681objj8g9m7n4je0',
#     bytes=761790,
#     created_at=1700815879,
#     filename='xlnet.pdf',
#     object='file',
#     purpose='file-extract',
#     status='ok', status_details='') # If status is error, extraction has failed
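Since a status of error means extraction has failed, it can be worth checking the status before requesting the file content. A minimal sketch: the helper name `ensure_extracted` is hypothetical, and only the `error` value is treated as a failure because no other status values are documented here.

```python
from types import SimpleNamespace
from typing import Any


def ensure_extracted(file_info: Any) -> Any:
    """Raise RuntimeError if the file's status reports a failed extraction."""
    if file_info.status == "error":
        details = file_info.status_details or "no details provided"
        raise RuntimeError(f"Extraction failed for {file_info.filename}: {details}")
    return file_info


# Stand-in object mirroring the fields shown above, for illustration only.
# With a real client, pass client.files.retrieve(file_id=file_id) instead.
ok_file = SimpleNamespace(filename="xlnet.pdf", status="ok", status_details="")
ensure_extracted(ok_file)
```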

Get File Content

This feature retrieves the extraction result for files uploaded with purpose="file-extract". The result is typically a valid JSON-formatted string that follows our recommended format. To use multiple files, you can concatenate their extraction results into one large string separated by newline characters (\n), place that string in a single message, and add the message to the history with the role set to system.

Request Address

GET https://api.moonshot.ai/v1/files/{file_id}/content

Example Request

# Retrieve the extraction result; file_content is a `str`.
# Note: retrieve_content triggers a deprecation warning in recent versions of the openai package; use content(...).text instead.
# On older versions, you can still call:
# file_content = client.files.retrieve_content(file_id=file_object.id)
file_content = client.files.content(file_id=file_object.id).text
# The output is currently JSON in an internally agreed format, but it should be placed in the message as text.
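The multi-file approach described at the start of this section, one system message holding all extraction results joined by newlines, can be sketched as follows. The helper name `build_file_context_message` is hypothetical.

```python
from typing import Dict, Iterable


def build_file_context_message(file_contents: Iterable[str]) -> Dict[str, str]:
    """Join extraction results with newlines into a single system message."""
    return {"role": "system", "content": "\n".join(file_contents)}
```

Each element passed in would be the `.text` of a `client.files.content(...)` call, and the returned message is added to the `messages` list ahead of the user's question.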
