POST /parse/async
Python
import json
import requests

url = "https://somark.tech/api/v1/parse/async"

data = {
    "output_formats": ["markdown", "json"],
    "api_key": "sk-***",
    "element_formats": json.dumps({
        "image": "url",
        "formula": "latex",
        "table": "html",
        "cs": "image",
    }),
    "feature_config": json.dumps({
        "enable_text_cross_page": False,
        "enable_table_cross_page": False,
        "enable_title_level_recognition": False,
        "enable_inline_image": True,
        "enable_table_image": True,
        "enable_image_understanding": True,
        "keep_header_footer": False,
    }),
}

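# Open the file in a context manager so the handle is closed after the upload.
with open("example.pdf", "rb") as f:
    files = {"file": ("example.pdf", f)}
    response = requests.post(url, data=data, files=files)

response.raise_for_status()  # surface HTTP-level errors before parsing the body
task_id = response.json()["data"]["task_id"]
print(f"Task submitted, task_id: {task_id}")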
{
  "code": 0,
  "message": "任务已提交",
  "data": {
    "task_id": "c5e6c983f28a4e6eb5d6c061343a8642",
    "status": "queuing"
  }
}
Path change: This endpoint path has been changed from /extract/async to /parse/async. The old path will be discontinued on December 31, 2026. Please migrate to the new path before then.
Async parsing requires two endpoints: this submit endpoint and the result query endpoint. Calling the submit endpoint alone does not return the final parsing result.
  1. Call this endpoint to submit the task. It immediately returns a task_id.
  2. Use that task_id to poll the result query endpoint.
  3. Once the task status indicates success, read the parsing result from the result query endpoint. The recommended polling interval is 3 to 5 seconds.
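The three steps above can be sketched as a small polling loop. This is a minimal sketch, not the official client. The result query call is passed in as a hypothetical `fetch_status` callable because its URL and parameters are not given on this page. The status names `"success"` and `"failed"` and the `data.status` response shape are assumptions modeled on the submit response; check them against the result query endpoint's reference.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    task_id: str,
    fetch_status: Callable[[str], Dict[str, Any]],  # hypothetical: wraps the result query endpoint
    interval: float = 4.0,          # within the recommended 3 to 5 second range
    timeout: float = 600.0,         # assumed upper bound; tune for your documents
    success_status: str = "success",  # assumed status name
    failure_status: str = "failed",   # assumed status name
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the task succeeds, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status(task_id)
        status = payload["data"]["status"]
        if status == success_status:
            return payload
        if status == failure_status:
            raise RuntimeError(f"Task {task_id} failed: {payload.get('message')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task {task_id} still '{status}' after {timeout}s")
        # Any other status (e.g. "queuing") means the task is not finished yet.
        sleep(interval)
```

In real use, `fetch_status` would issue the HTTP request to the result query endpoint and return its parsed JSON body. Injecting it, along with `sleep`, keeps the loop testable without network access.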
The parameter definitions for output_formats, element_formats, and feature_config are the same as in Sync parsing.

Body

multipart/form-data
file
file
required

The file to parse. Supports PDF, image, and Office formats.

api_key
string
required

API key, in the format sk-***

output_formats
enum<string>[]

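Output formats; multiple values may be selected. Defaults to ["markdown", "json"] when omitted. Supports json / markdown / zip, where zip packages all output files into a single archive.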

Available options: json, markdown, zip
element_formats
object

Element format configuration. Controls the output format of each element type.

feature_config
object

Parsing behavior configuration.

Response

200 - application/json

Task submitted successfully.

code
integer

Status code. 0 indicates success; any non-zero value is an error code.

Example:

0

message
string
Example:

"任务已提交"

data
object
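Because a non-zero `code` signals an error, it can help to check `code` before reading `data.task_id`. The helper below is a minimal sketch based only on the response fields documented above. The function name `extract_task_id` is hypothetical, and the shape of an error response (for example, whether `data` is present) is not specified here, so the helper only relies on `code` and `message` in that case.

```python
from typing import Any, Dict


def extract_task_id(payload: Dict[str, Any]) -> str:
    """Return data.task_id from a submit response, or raise if code is non-zero."""
    code = payload.get("code")
    if code != 0:
        # Non-zero code is an error code; include the server message for context.
        raise RuntimeError(f"Submit failed: code={code}, message={payload.get('message')}")
    return payload["data"]["task_id"]
```

For example, `extract_task_id(response.json())` could replace the direct `response.json()["data"]["task_id"]` lookup in the request sample.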