feat(ocr): add accurate-recognition OCR feature and update related configuration #1

xiaomizhoubaobei · 2025-08-29T14:55:29Z

User description

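Change the publish-to-PyPI condition so it runs only on push or when a pull request is merged
Update the condition for syncing code to multiple repositories so it runs only on push or when a pull request is merged
Fix the regular expression that extracts the version number
Improve the GitHub Release creation step
Unify how Git user information is configured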

PR Type

Enhancement, Other

Description

Add GeneralAccurateOCR class for precise OCR recognition
Update version from 0.0.4 to 0.0.5
Optimize CI/CD workflows with conditional execution
Fix configuration and deployment settings

Diagram Walkthrough

flowchart LR
  A["New GeneralAccurateOCR"] --> B["Enhanced OCR Module"]
  C["Version Update"] --> D["Package Release"]
  E["CI/CD Optimization"] --> F["Conditional Workflows"]
  G["Config Updates"] --> H["Deployment Settings"]

File Walkthrough

Relevant files

Enhancement

4 files

__init__.py `Update version and export new OCR class`	+2/-2
GeneralAccurateOCR.py `Add new precise OCR recognition class`	+110/-0
GeneralBasicOCR.py `Optimize logging configuration and parameters`	+16/-16
__init__.py `Import new GeneralAccurateOCR module`	+2/-1

Configuration changes

5 files

setup.py `Bump version to 0.0.5`	+1/-1
.cnb.yml `Update repository URL configuration`	+1/-1
publish-to-pypi.yml `Add conditional execution and remove Aliyun upload`	+4/-9
release.yml `Add conditional execution and fix version extraction`	+6/-4
sync-to-coding.yml `Add conditional execution for all sync jobs`	+12/-4

Bug fix

1 files

Dockerfile `Fix typo in package installation command`	+1/-1

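- Add the GeneralAccurateOCR class, providing more accurate OCR recognition
- Update the artifact repository URL in .cnb.yml
- Modify mzapi/tencent/ocr/__init__.py to import the new OCR module
- Update mzapi/__init__.py to add the new feature and change the version number
- Adjust the install command in the Dockerfile
- Remove the Aliyun PyPI upload step from GitHub Actions
- Update the version number in setup.py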

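- Change the publish-to-PyPI condition so it runs only on push or when a pull request is merged
- Update the condition for syncing code to multiple repositories so it runs only on push or when a pull request is merged
- Fix the regular expression that extracts the version number
- Improve the GitHub Release creation step
- Unify how Git user information is configured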


sourcery-ai · 2025-08-29T14:55:34Z

Reviewer's Guide

This PR implements precise OCR capabilities by adding a new GeneralAccurateOCR client, refactors the existing Basic OCR logging logic, overhauls CI workflows to trigger only on push or merged PRs with improved release steps, bumps the package version to 0.0.5, and corrects auxiliary configuration typos.

Entity relationship diagram for all exports in mzapi package

erDiagram
    mzapi {
        string __version__
        string __author__
        string __email__
        list __all__
    }
    mzapi ||--o| GeneralBasicOCR : exports
    mzapi ||--o| GeneralAccurateOCR : exports

Class diagram for new and updated OCR clients

classDiagram
    class GeneralBasicOCR {
        - logger
        - cred
        - client
        __init__(secret_id, secret_key, token, log_level=None)
        recognize(ImageBase64, ImageUrl, Scene, LanguageType, IsPdf, PdfPageNumber, IsWords)
    }
    class GeneralAccurateOCR {
        - logger
        - cred
        - client
        - validate_url
        __init__(secret_id, secret_key, token, log_level=None)
        recognize(ImageBase64, ImageUrl, IsWords, EnableDetectSplit, IsPdf, PdfPageNumber, EnableDetectText, ConfigID)
    }
    GeneralAccurateOCR --> ImageValidator

Class diagram for ImageValidator usage in GeneralAccurateOCR

classDiagram
    class ImageValidator {
        validate_url(url, formats)
    }
    GeneralAccurateOCR --> ImageValidator

File-Level Changes

Change	Details	Files
Add precise OCR client (GeneralAccurateOCR) and expose it in the package API	New GeneralAccurateOCR class with full `__init__` and `recognize` implementation; expose GeneralAccurateOCR in the package `__all__`	`mzapi/tencent/ocr/GeneralAccurateOCR.py` `mzapi/__init__.py`
Refactor Basic OCR client logging behavior	Set default log_level to None to disable logging by default; guard logger setup behind a log_level check; remove unconditional debug logging of request parameters	`mzapi/tencent/ocr/GeneralBasicOCR.py`
Revise CI workflows to run on push or merged PR and streamline release process	Add conditions to run sync and publish jobs only on push or merged PRs; restrict pull_request triggers to 'closed' events; fix version extraction regex in release workflow; remove outdated Aliyun PyPI upload step; standardize Git user/email env variables across sync jobs	`.github/workflows/sync-to-coding.yml` `.github/workflows/publish-to-pypi.yml` `.github/workflows/release.yml`
Bump package version to 0.0.5	Update version in `mzapi/__init__.py`; update version in `setup.py`	`mzapi/__init__.py` `setup.py`
Correct auxiliary configurations	Fix typo 'sduo' to 'sudo' in Dockerfile; update TWINE_REPOSITORY_URL in .cnb.yml	`.ide/Dockerfile` `.cnb.yml`

reviewabot

Issues in the PR:

Missing Newline at End of File:
- .cnb.yml
- .github/workflows/publish-to-pypi.yml
- .github/workflows/release.yml
- mzapi/tencent/ocr/GeneralAccurateOCR.py
- mzapi/tencent/ocr/__init__.py
Files should always end with a newline.
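
One way to check the listed files locally is a short script. This is only a sketch: the helper name `missing_final_newline` is hypothetical and not part of the PR, and it treats empty files as acceptable.

```python
from pathlib import Path


def missing_final_newline(paths):
    """Return the paths whose last byte is not a newline (empty files are skipped)."""
    offenders = []
    for path in map(Path, paths):
        data = path.read_bytes()
        if data and not data.endswith(b"\n"):
            offenders.append(str(path))
    return offenders
```

Calling it with the five paths listed above would return whichever of them still lack the trailing newline.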

Typo in Dockerfile:

.ide/Dockerfile

-RUN apt-get update && apt-get install -y curl wget unzip openssh-server sduo
+RUN apt-get update && apt-get install -y curl wget unzip openssh-server sudo

Inconsistent Indentation and Formatting:

mzapi/tencent/ocr/GeneralAccurateOCR.py

def recognize(self,ImageBase64,ImageUrl,IsWords,EnableDetectSplit,IsPdf,PdfPageNumber,EnableDetectText,ConfigID):

The parameters should be spaced properly for better readability:

def recognize(self, ImageBase64, ImageUrl, IsWords, EnableDetectSplit, IsPdf, PdfPageNumber, EnableDetectText, ConfigID):

Unnecessary Commented Code:

.github/workflows/publish-to-pypi.yml

-    - name: Upload package to Aliyun PyPI repository
-      run: |
-        twine upload --repository-url https://packages.aliyun.com/686a57a36024b2147d89fbc0/pypi/repo-ssctu -u ${{secrets.USERNAME }} -p ${{ secrets.PASSWORD }} dist/mzapi_python-${{ env.version }}-py3-none-any.whl
-      env:
-        ALIYUN_PYPI_USERNAME: ${{ secrets.USERNAME }}
-        ALIYUN_PYPI_PASSWORD: ${{ secrets.PASSWORD }}

If this code is no longer needed, it should be removed instead of being commented out.

Inconsistent Logging Levels:
- mzapi/tencent/ocr/GeneralBasicOCR.py
- mzapi/tencent/ocr/GeneralAccurateOCR.py
Ensure that the logging levels and their usage are consistent across both files.

Hardcoded Secrets:
- Ensure that no secrets are hardcoded in the code. Although it seems like secrets are being used from GitHub secrets, double-check to ensure no secrets are exposed.

Recommendations:

Ensure all files end with a newline.
Fix the typo in the Dockerfile.
Improve the readability of function parameters by adding spaces.
Remove any unnecessary commented-out code.
Ensure consistent logging practices across the codebase.
Double-check for any hardcoded secrets and ensure they are securely managed.

gemini-code-assist

Summary of Changes

Hello @xiaomizhoubaobei, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

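This pull request mainly introduces the accurate-recognition feature of the Tencent Cloud OCR service, extending the SDK's capabilities. It also includes maintenance updates, such as adjusting the PyPI publishing configuration, fixing a typo in the Dockerfile, and improving the logging configuration, with the aim of improving the project's usability and stability.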

Highlights

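New accurate-recognition OCR feature: Introduces the GeneralAccurateOCR class, which provides more precise recognition for the Tencent Cloud OCR service, and integrates it into the SDK.
Updated PyPI publishing configuration: Changes the PyPI artifact repository URL in .cnb.yml to point to the new publishing path.
Fixed a typo in the Dockerfile: Corrects sduo to sudo in .ide/Dockerfile so the container environment is configured correctly.
Version bump: Upgrades the project version from 0.0.4 to 0.0.5, reflecting the new feature and the improvements to existing code.
Improved log-level setup: Adjusts the initialization logic of the GeneralBasicOCR class so that log_level now defaults to None, meaning no log output by default, which gives more flexible logging control.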

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

xiaomizhoubaobei

🎉 Thank you for submitting your first pull request! We appreciate your contribution, and the team will review your code as soon as possible.

qodo-code-review · 2025-08-29T14:56:04Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 Security concerns

Sensitive information exposure: Logging of raw request parameters in `GeneralAccurateOCR.recognize` (and similar logging patterns) can leak image URLs, Base64 image content, and configuration IDs into logs. Recommend masking `ImageBase64` entirely, truncating long fields, and avoiding logging secrets or personally identifiable content.

⚡ Recommended focus areas for review

Possible Issue

The method `recognize` logs full input parameters, which may include large Base64 payloads or sensitive URLs; this can bloat logs and risk leaking data. Consider redacting or truncating `ImageBase64` and sanitizing URLs in logs.

    def recognize(self,ImageBase64,ImageUrl,IsWords,EnableDetectSplit,IsPdf,PdfPageNumber,EnableDetectText,ConfigID):
        """'
        :param ImageBase64: Base64 value of the image/PDF. The image must not exceed 10 MB after Base64 encoding; a resolution of at least 600*800 is recommended; PNG, JPG, JPEG, BMP, and PDF are supported. Either ImageUrl or ImageBase64 must be provided; if both are provided, only ImageUrl is used.
        :param ImageUrl: URL of the image/PDF. The image must not exceed 10 MB after Base64 encoding; a resolution of at least 600*800 is recommended; PNG, JPG, JPEG, BMP, and PDF are supported. The image must download within 3 seconds. URLs hosted on Tencent Cloud give faster, more stable downloads, so storing images on Tencent Cloud is recommended. URLs hosted elsewhere may be slower and less stable.
        :param IsWords: Whether to return per-character information; off by default.
        :param EnableDetectSplit: Whether to enable split detection on the original image, which can improve recognition when the image is large but each character covers a small area (for example, exam papers); off by default.
        :param IsPdf: Whether to enable PDF recognition; defaults to false. When enabled, both images and PDFs are supported.
        :param PdfPageNumber: Page number of the PDF page to recognize. Only single-page PDF recognition is supported; takes effect when the uploaded file is a PDF and IsPdf is true; defaults to 1.
        :param EnableDetectText: Text detection switch; defaults to true. Set to false to run single-line recognition directly, which suits images containing only upright single-line text.
        :param ConfigID: Supported configuration IDs: OCR (general scenario), MulOCR (multilingual scenario).
        """
        try:
            self.logger.info("开始执行OCR识别")
            self.logger.debug(f"输入参数: ImageBase64={ImageBase64}, ImageUrl={ImageUrl}, IsWords={IsWords}, EnableDetectSplit={EnableDetectSplit}, IsPdf={IsPdf}, PdfPageNumber={PdfPageNumber}, EnableDetectText={EnableDetectText}, ConfigID={ConfigID}")

Export Issue

`__all__` is set to a list of names without quoting them; in a module context it should contain strings. Also ensure those symbols are imported into the package namespace before exporting.

    __all__ = [GeneralBasicOCR,GeneralAccurateOCR]

Workflow Condition

The job uses a merged condition on pull_request events; ensure the workflow is triggered on the 'closed' type for PRs, otherwise `github.event.pull_request.merged` may be undefined. Double-check the trigger and condition alignment.

    runs-on: ubuntu-latest
    needs: release-build
    if: ${{ github.event_name == 'push' || (github.event_name == 'pull_request' && github.event.pull_request.merged) }}
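
The export issue can be reproduced without the real package. The sketch below builds a throwaway module (the name `demo_pkg` and the stand-in class are hypothetical) and runs a star import against it, once with string names in `__all__` and once with the class objects themselves, which is the form flagged above.

```python
import sys
import types


def star_import(use_strings: bool) -> dict:
    """Run `from demo_pkg import *` against a throwaway module."""
    mod = types.ModuleType("demo_pkg")  # hypothetical package name

    class GeneralBasicOCR:  # stand-in for the real client class
        pass

    mod.GeneralBasicOCR = GeneralBasicOCR
    # The correct form lists names as strings; the flagged form lists objects.
    mod.__all__ = ["GeneralBasicOCR"] if use_strings else [GeneralBasicOCR]

    sys.modules["demo_pkg"] = mod
    namespace: dict = {}
    try:
        exec("from demo_pkg import *", namespace)
    finally:
        del sys.modules["demo_pkg"]
    return namespace
```

With `use_strings=True` the class lands in the importing namespace. With `use_strings=False` the star import raises `TypeError`, so an unquoted `__all__` only surfaces as a problem when a caller writes `from mzapi import *`.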

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

Consider extracting the common logging initialization logic in both OCR clients into a shared helper to avoid code duplication.
The GeneralAccurateOCR.recognize signature has many positional parameters—refactor to use keyword-only args or a config object for better readability and maintainability.
Your GitHub Actions workflows repeat the same if: github.event.pull_request.merged conditions across multiple jobs—consider using reusable workflows or YAML anchors to DRY up the CI configuration.
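
A rough illustration of the second point, keyword-only arguments combined with a small options object. Everything here is hypothetical: `RecognizeOptions` and `build_params` are not names from the PR, and the field types are guesses based on the docstring.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class RecognizeOptions:
    """Hypothetical options object; fields mirror the optional parameters in the PR."""
    IsWords: Optional[bool] = None
    EnableDetectSplit: Optional[bool] = None
    IsPdf: Optional[bool] = None
    PdfPageNumber: Optional[int] = None
    EnableDetectText: Optional[bool] = None
    ConfigID: Optional[str] = None


def build_params(*, image_base64: Optional[str] = None,
                 image_url: Optional[str] = None,
                 options: RecognizeOptions = RecognizeOptions()) -> dict:
    """Keyword-only entry point: every argument must be named by the caller."""
    if not image_base64 and not image_url:
        raise ValueError("either image_base64 or image_url must be provided")
    # Merge the image source with the optional switches into one request dict.
    return {"ImageBase64": image_base64, "ImageUrl": image_url, **asdict(options)}
```

A call such as `build_params(image_url="https://example.com/a.png", options=RecognizeOptions(IsWords=True))` reads unambiguously, while a positional call like `build_params("abc")` fails immediately with `TypeError`.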

Prompt for AI Agents

Please address the comments from this code review:
## Overall Comments
- Consider extracting the common logging initialization logic in both OCR clients into a shared helper to avoid code duplication.
- The GeneralAccurateOCR.recognize signature has many positional parameters—refactor to use keyword-only args or a config object for better readability and maintainability.
- Your GitHub Actions workflows repeat the same `if: github.event.pull_request.merged` conditions across multiple jobs—consider using reusable workflows or YAML anchors to DRY up the CI configuration.

## Individual Comments

### Comment 1
<location> `mzapi/tencent/ocr/GeneralAccurateOCR.py:78` </location>
<code_context>
+            self.logger.info("开始执行OCR识别")
+            self.logger.debug(f"输入参数: ImageBase64={ImageBase64}, ImageUrl={ImageUrl}, IsWords={IsWords}, EnableDetectSplit={EnableDetectSplit}, IsPdf={IsPdf}, PdfPageNumber={PdfPageNumber}, EnableDetectText={EnableDetectText}, ConfigID={ConfigID}")
+
+            if ImageBase64 is None and ImageUrl is None:
+                error_msg = "ImageBase64和ImageUrl必须提供一个"
+                self.logger.error(error_msg)
</code_context>

<issue_to_address>
The check for both ImageBase64 and ImageUrl being None is appropriate, but does not handle empty strings.

Update the condition to also check for empty strings to ensure a valid image source is provided.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
            if ImageBase64 is None and ImageUrl is None:
                error_msg = "ImageBase64和ImageUrl必须提供一个"
                self.logger.error(error_msg)
                raise ValueError(error_msg)
=======
            if (ImageBase64 is None or str(ImageBase64).strip() == "") and (ImageUrl is None or str(ImageUrl).strip() == ""):
                error_msg = "ImageBase64和ImageUrl必须提供一个"
                self.logger.error(error_msg)
                raise ValueError(error_msg)
>>>>>>> REPLACE

</suggested_fix>

### Comment 2
<location> `mzapi/tencent/ocr/GeneralAccurateOCR.py:83` </location>
<code_context>
+                self.logger.error(error_msg)
+                raise ValueError(error_msg)
+
+            if ImageUrl:
+                self.logger.debug(f"验证图片URL: {ImageUrl}")
+                self.validate_url.validate_url(ImageUrl, ["png", "jpg", "jpeg", "bmp", "pdf"])
+                self.logger.debug("图片URL验证通过")
+            req = models.GeneralAccurateOCRRequest()
+            params = {
</code_context>

<issue_to_address>
ImageValidator is only used for ImageUrl, not for ImageBase64.

If ImageBase64 has specific format or size requirements, add validation to ensure compliance with API expectations and improve user feedback.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
            if ImageUrl:
                self.logger.debug(f"验证图片URL: {ImageUrl}")
                self.validate_url.validate_url(ImageUrl, ["png", "jpg", "jpeg", "bmp", "pdf"])
                self.logger.debug("图片URL验证通过")
=======
            if ImageBase64:
                self.logger.debug("验证图片Base64格式")
                import base64
                try:
                    # Check that this is a valid Base64 string
                    base64_bytes = base64.b64decode(ImageBase64, validate=True)
                    # Check the size limit (for example, a Tencent OCR API maximum of 4 MB)
                    max_size_bytes = 4 * 1024 * 1024
                    if len(base64_bytes) > max_size_bytes:
                        error_msg = "ImageBase64图片大小超过4MB限制"
                        self.logger.error(error_msg)
                        raise ValueError(error_msg)
                except Exception as e:
                    error_msg = f"ImageBase64格式无效: {str(e)}"
                    self.logger.error(error_msg)
                    raise ValueError(error_msg)
                self.logger.debug("图片Base64验证通过")

            if ImageUrl:
                self.logger.debug(f"验证图片URL: {ImageUrl}")
                self.validate_url.validate_url(ImageUrl, ["png", "jpg", "jpeg", "bmp", "pdf"])
                self.logger.debug("图片URL验证通过")
>>>>>>> REPLACE

</suggested_fix>
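
Two details of the suggested fix above are worth checking before applying it. The size `ValueError` is raised inside the `try`, so the `except Exception` would catch it and re-label it as a format error. The 4 MB figure is also given only as an example, while the method's docstring states a 10M limit on the Base64-encoded form. A possible variant that keeps the two failure modes apart, as a sketch only (the helper name and the 10 MiB constant are assumptions, not the library's API):

```python
import base64

# Assumption: limit taken from the docstring's "10M after Base64 encoding".
MAX_BASE64_CHARS = 10 * 1024 * 1024


def validate_image_base64(image_base64: str, max_chars: int = MAX_BASE64_CHARS) -> bytes:
    """Check size first, then decode; each failure gets its own message."""
    if len(image_base64) > max_chars:
        raise ValueError(f"ImageBase64 exceeds the {max_chars}-character limit")
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        return base64.b64decode(image_base64, validate=True)
    except ValueError as exc:  # binascii.Error is a subclass of ValueError
        raise ValueError(f"ImageBase64 is not valid Base64: {exc}") from exc
```

Because the size check sits outside the `try`, an oversize payload is reported as oversize rather than as malformed.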

### Comment 3
<location> `mzapi/tencent/ocr/GeneralAccurateOCR.py:87` </location>
<code_context>
+                self.logger.debug(f"验证图片URL: {ImageUrl}")
+                self.validate_url.validate_url(ImageUrl, ["png", "jpg", "jpeg", "bmp", "pdf"])
+                self.logger.debug("图片URL验证通过")
+            req = models.GeneralAccurateOCRRequest()
+            params = {
+                "ImageBase64": ImageBase64,
</code_context>

<issue_to_address>
No validation is performed for other parameters such as IsWords, EnableDetectSplit, etc.

Validate the types and acceptable values for parameters like IsWords and EnableDetectSplit before the API call to improve error handling.
</issue_to_address>


sourcery-ai · 2025-08-29T14:56:40Z

mzapi/tencent/ocr/GeneralAccurateOCR.py

+            if ImageBase64 is None and ImageUrl is None:
+                error_msg = "ImageBase64和ImageUrl必须提供一个"
+                self.logger.error(error_msg)
+                raise ValueError(error_msg)


suggestion: The check for both ImageBase64 and ImageUrl being None is appropriate, but does not handle empty strings.

Update the condition to also check for empty strings to ensure a valid image source is provided.

Suggested change

-            if ImageBase64 is None and ImageUrl is None:
-                error_msg = "ImageBase64和ImageUrl必须提供一个"
-                self.logger.error(error_msg)
-                raise ValueError(error_msg)
+            if (ImageBase64 is None or str(ImageBase64).strip() == "") and (ImageUrl is None or str(ImageUrl).strip() == ""):
+                error_msg = "ImageBase64和ImageUrl必须提供一个"
+                self.logger.error(error_msg)
+                raise ValueError(error_msg)

mzapi/tencent/ocr/GeneralAccurateOCR.py

sourcery-ai · 2025-08-29T14:56:40Z

mzapi/tencent/ocr/GeneralAccurateOCR.py

+                self.logger.debug(f"验证图片URL: {ImageUrl}")
+                self.validate_url.validate_url(ImageUrl, ["png", "jpg", "jpeg", "bmp", "pdf"])
+                self.logger.debug("图片URL验证通过")
+            req = models.GeneralAccurateOCRRequest()


suggestion: No validation is performed for other parameters such as IsWords, EnableDetectSplit, etc.

Validate the types and acceptable values for parameters like IsWords and EnableDetectSplit before the API call to improve error handling.
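
A minimal sketch of what such validation could look like. The expected types are assumptions inferred from the docstring (boolean switches, an integer page number, a string config ID), and `validate_options` is a hypothetical helper rather than anything in the PR or the Tencent SDK.

```python
# Assumed types, inferred from the parameter descriptions in the docstring.
EXPECTED_TYPES = {
    "IsWords": bool,
    "EnableDetectSplit": bool,
    "IsPdf": bool,
    "PdfPageNumber": int,
    "EnableDetectText": bool,
    "ConfigID": str,
}


def validate_options(**options):
    """Raise TypeError/ValueError for options the API is unlikely to accept."""
    for name, value in options.items():
        if name not in EXPECTED_TYPES:
            raise TypeError(f"unknown option: {name}")
        if value is None:
            continue  # None means "not set" and is allowed everywhere
        expected = EXPECTED_TYPES[name]
        # bool is a subclass of int, so reject True/False where an int is expected.
        if isinstance(value, bool) and expected is not bool:
            raise TypeError(f"{name} must be {expected.__name__}, got bool")
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be {expected.__name__}, got {type(value).__name__}")
    page = options.get("PdfPageNumber")
    # Assumption: pages are 1-based, since the documented default is 1.
    if page is not None and page < 1:
        raise ValueError("PdfPageNumber must be >= 1")
```

Calling `validate_options(IsWords="yes")` fails before any network request is made, which is the earlier and clearer error the comment asks for.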

sourcery-ai · 2025-08-29T14:56:40Z

mzapi/tencent/ocr/GeneralAccurateOCR.py

+            self.logger.info("OCR客户端初始化完成")
+        except Exception as e:
+            self.logger.error(f"初始化失败: {str(e)}")
+            raise TencentCloudSDKException("初始化失败", str(e))


suggestion (code-quality): Explicitly raise from a previous error (raise-from-previous-error)

Suggested change

-            raise TencentCloudSDKException("初始化失败", str(e))
+            raise TencentCloudSDKException("初始化失败", str(e)) from e

sourcery-ai · 2025-08-29T14:56:40Z

mzapi/tencent/ocr/GeneralAccurateOCR.py

+            raise err
+        except Exception as e:
+            self.logger.error(f"处理OCR请求时发生意外错误: {str(e)}", exc_info=True)
+            raise TencentCloudSDKException("OCR处理错误", str(e))


suggestion (code-quality): Explicitly raise from a previous error (raise-from-previous-error)

Suggested change

-            raise TencentCloudSDKException("OCR处理错误", str(e))
+            raise TencentCloudSDKException("OCR处理错误", str(e)) from e
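
For context on what `from e` changes: it records the original exception on `__cause__`, so the traceback shows it as the direct cause. The sketch below uses a stand-in exception class (`OCRClientError`, hypothetical) so it runs without the Tencent SDK installed.

```python
class OCRClientError(Exception):
    """Stand-in for TencentCloudSDKException so the sketch runs without the SDK."""


def init_client():
    try:
        raise KeyError("secret_id")  # simulated low-level failure
    except Exception as e:
        # `from e` chains the original exception explicitly.
        raise OCRClientError("init failed", str(e)) from e


def capture_cause():
    """Return the exception recorded as the cause of the wrapped error."""
    try:
        init_client()
    except OCRClientError as err:
        return err.__cause__
```

`capture_cause()` returns the original `KeyError`, which is what debuggers and log formatters use to print the "direct cause" section of the traceback.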

gemini-code-assist

Code Review

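This code review focuses on the newly added accurate-recognition OCR feature. Overall the feature is fully implemented, but the new code has room for improvement in following Python coding conventions (PEP 8), log handling, and code efficiency. I have provided specific code suggestions to improve naming, streamline request construction, and make the logging configuration more robust and free of side effects. I also point out that the logging configuration logic is duplicated across classes and suggest refactoring it. These changes will help improve the code's readability, performance, and maintainability.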

gemini-code-assist · 2025-08-29T14:57:46Z

mzapi/tencent/ocr/GeneralAccurateOCR.py

+    def recognize(self,ImageBase64,ImageUrl,IsWords,EnableDetectSplit,IsPdf,PdfPageNumber,EnableDetectText,ConfigID):
+        """'
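+        :param ImageBase64: Base64 value of the image/PDF. The image must not exceed 10 MB after Base64 encoding; a resolution of at least 600*800 is recommended; PNG, JPG, JPEG, BMP, and PDF are supported. Either ImageUrl or ImageBase64 must be provided; if both are provided, only ImageUrl is used.
+        :param ImageUrl: URL of the image/PDF. The image must not exceed 10 MB after Base64 encoding; a resolution of at least 600*800 is recommended; PNG, JPG, JPEG, BMP, and PDF are supported. The image must download within 3 seconds. URLs hosted on Tencent Cloud give faster, more stable downloads, so storing images on Tencent Cloud is recommended. URLs hosted elsewhere may be slower and less stable.
+        :param IsWords: Whether to return per-character information; off by default.
+        :param EnableDetectSplit: Whether to enable split detection on the original image, which can improve recognition when the image is large but each character covers a small area (for example, exam papers); off by default.
+        :param IsPdf: Whether to enable PDF recognition; defaults to false. When enabled, both images and PDFs are supported.
+        :param PdfPageNumber: Page number of the PDF page to recognize. Only single-page PDF recognition is supported; takes effect when the uploaded file is a PDF and IsPdf is true; defaults to 1.
+        :param EnableDetectText: Text detection switch; defaults to true. Set to false to run single-line recognition directly, which suits images containing only upright single-line text.
+        :param ConfigID: Supported configuration IDs: OCR (general scenario), MulOCR (multilingual scenario).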
+        """
+        try:
+            self.logger.info("开始执行OCR识别")
+            self.logger.debug(f"输入参数: ImageBase64={ImageBase64}, ImageUrl={ImageUrl}, IsWords={IsWords}, EnableDetectSplit={EnableDetectSplit}, IsPdf={IsPdf}, PdfPageNumber={PdfPageNumber}, EnableDetectText={EnableDetectText}, ConfigID={ConfigID}")
+
+            if ImageBase64 is None and ImageUrl is None:
+                error_msg = "ImageBase64和ImageUrl必须提供一个"
+                self.logger.error(error_msg)
+                raise ValueError(error_msg)
+
+            if ImageUrl:
+                self.logger.debug(f"验证图片URL: {ImageUrl}")
+                self.validate_url.validate_url(ImageUrl, ["png", "jpg", "jpeg", "bmp", "pdf"])
+                self.logger.debug("图片URL验证通过")
+            req = models.GeneralAccurateOCRRequest()
+            params = {
+                "ImageBase64": ImageBase64,
+                "ImageUrl":ImageUrl,
+                "IsWords":IsWords,
+                "EnableDetectSplit":EnableDetectSplit,
+                "IsPdf":IsPdf,
+                "PdfPageNumber":PdfPageNumber,
+                "EnableDetectText":EnableDetectText,
+                "ConfigID":ConfigID
+            }
+            req.from_json_string(json.dumps(params))
+            self.logger.info("正在向腾讯云OCR API发送请求...")
+            resp = self.client.GeneralAccurateOCR(req)
+            self.logger.info("OCR识别请求成功完成")
+            self.logger.debug(f"响应数据: {resp.to_json_string()}")
+            return resp.to_json_string()
+
+        except TencentCloudSDKException as err:
+            self.logger.error(f"OCR识别失败: {str(err)}", exc_info=True)
+            raise err
+        except Exception as e:
+            self.logger.error(f"处理OCR请求时发生意外错误: {str(e)}", exc_info=True)
+            raise TencentCloudSDKException("OCR处理错误", str(e))


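This method has several areas that could be improved to raise code quality and maintainability:

Naming conventions: Under PEP 8, Python function parameters and variable names should use snake_case (for example, image_base64) rather than PascalCase (for example, ImageBase64). This improves readability and consistency with the Python ecosystem.¹

Docstring format: There is a stray single quote after the opening """ of the docstring; it should be removed.

Logging: Writing the full ImageBase64 string to the debug log can make log files very large and may expose sensitive data. It is better to log only whether it is present, or its length.

Request object construction: Building the request object through json.dumps and req.from_json_string is less efficient and less direct than setting the attributes.

Below is a refactored version that combines the suggestions above. It also gives the parameters a default value of None, consistent with the other classes in the project.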

    def recognize(self, image_base64=None, image_url=None, is_words=None, enable_detect_split=None, is_pdf=None, pdf_page_number=None, enable_detect_text=None, config_id=None):
        """
        :param image_base64: Base64 value of the image/PDF. The image must not exceed 10 MB after Base64 encoding; a resolution of at least 600*800 is recommended; PNG, JPG, JPEG, BMP, and PDF are supported. Either ImageUrl or ImageBase64 must be provided; if both are provided, only ImageUrl is used.
        :param image_url: URL of the image/PDF. The image must not exceed 10 MB after Base64 encoding; a resolution of at least 600*800 is recommended; PNG, JPG, JPEG, BMP, and PDF are supported. The image must download within 3 seconds. URLs hosted on Tencent Cloud give faster, more stable downloads, so storing images on Tencent Cloud is recommended. URLs hosted elsewhere may be slower and less stable.
        :param is_words: Whether to return per-character information; off by default.
        :param enable_detect_split: Whether to enable split detection on the original image, which can improve recognition when the image is large but each character covers a small area (for example, exam papers); off by default.
        :param is_pdf: Whether to enable PDF recognition; defaults to false. When enabled, both images and PDFs are supported.
        :param pdf_page_number: Page number of the PDF page to recognize. Only single-page PDF recognition is supported; takes effect when the uploaded file is a PDF and IsPdf is true; defaults to 1.
        :param enable_detect_text: Text detection switch; defaults to true. Set to false to run single-line recognition directly, which suits images containing only upright single-line text.
        :param config_id: Supported configuration IDs: OCR (general scenario), MulOCR (multilingual scenario).
        """
        try:
            self.logger.info("开始执行OCR识别")
            self.logger.debug(f"输入参数: image_base64 is set: {image_base64 is not None}, image_url={image_url}, is_words={is_words}, enable_detect_split={enable_detect_split}, is_pdf={is_pdf}, pdf_page_number={pdf_page_number}, enable_detect_text={enable_detect_text}, config_id={config_id}")

            if image_base64 is None and image_url is None:
                error_msg = "image_base64 和 image_url 必须提供一个"
                self.logger.error(error_msg)
                raise ValueError(error_msg)

            if image_url:
                self.logger.debug(f"验证图片URL: {image_url}")
                self.validate_url.validate_url(image_url, ["png", "jpg", "jpeg", "bmp", "pdf"])
                self.logger.debug("图片URL验证通过")

            req = models.GeneralAccurateOCRRequest()
            req.ImageBase64 = image_base64
            req.ImageUrl = image_url
            req.IsWords = is_words
            req.EnableDetectSplit = enable_detect_split
            req.IsPdf = is_pdf
            req.PdfPageNumber = pdf_page_number
            req.EnableDetectText = enable_detect_text
            req.ConfigID = config_id

            self.logger.info("正在向腾讯云OCR API发送请求...")
            resp = self.client.GeneralAccurateOCR(req)
            self.logger.info("OCR识别请求成功完成")
            self.logger.debug(f"响应数据: {resp.to_json_string()}")
            return resp.to_json_string()

        except TencentCloudSDKException as err:
            self.logger.error(f"OCR识别失败: {str(err)}", exc_info=True)
            raise err
        except Exception as e:
            self.logger.error(f"处理OCR请求时发生意外错误: {str(e)}", exc_info=True)
            raise TencentCloudSDKException("OCR处理错误", str(e))

Style Guide References

Footnotes

According to the PEP 8 guide, function and variable names should use snake_case (lowercase letters and underscores). ↩
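
The logging point in this review (record presence or length instead of the full payload) could be handled by a small helper that produces a log-safe summary. This is a sketch only; `describe_inputs` is a hypothetical name, and dropping the URL path and query is one possible redaction policy among several.

```python
from urllib.parse import urlsplit


def describe_inputs(image_base64=None, image_url=None):
    """Return a log-safe summary: payload length and URL host only."""
    summary = {"ImageBase64": None, "ImageUrl": None}
    if image_base64:
        summary["ImageBase64"] = f"<{len(image_base64)} chars>"
    if image_url:
        parts = urlsplit(image_url)
        # Drop the path and query string, which may carry signed tokens.
        summary["ImageUrl"] = f"{parts.scheme}://{parts.hostname}/<redacted>"
    return summary
```

The debug line would then log `describe_inputs(...)` rather than interpolating the raw arguments, keeping both log size and exposure bounded.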

gemini-code-assist · 2025-08-29T14:57:46Z

mzapi/tencent/ocr/GeneralBasicOCR.py

+                self.logger.setLevel(log_level)
+                # Add a handler only when none exists
+                if not self.logger.handlers:
+                    handler = logging.StreamHandler()
+                    handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
+                    self.logger.addHandler(handler)
+                else:
+                    # Make sure existing handlers use the same format
+                    for h in self.logger.handlers:
+                        h.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
+                self.logger.info("初始化腾讯云OCR客户端，日志级别: %s", logging.getLevelName(log_level))


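The current logging configuration logic has some potential problems, and it is duplicated in the GeneralAccurateOCR class.

Modifying existing handlers: Iterating over self.logger.handlers and resetting their formatters (setFormatter) can have unexpected side effects. If the application using this library has its own logging configuration, this forcibly overrides it and breaks modularity. Library code should avoid modifying configuration it does not own.

Code duplication: This logic is almost identical to the __init__ method of the newly added GeneralAccurateOCR class. Duplicated code increases future maintenance cost.

I suggest extracting this logging setup into a helper function and changing it so that it adds a new handler only when the logger has no handlers at all, rather than modifying existing ones. A better practice still is for the library to only obtain a logger instance and use it, leaving all configuration (such as setting levels and adding handlers) to the library's user, that is, the application.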

                self.logger.setLevel(log_level)
                # Add a handler only when none exists, to avoid modifying existing handlers
                if not self.logger.handlers:
                    handler = logging.StreamHandler()
                    handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
                    self.logger.addHandler(handler)
                self.logger.info("初始化腾讯云OCR客户端，日志级别: %s", logging.getLevelName(log_level))
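
The helper-function idea might look roughly like this. It is a sketch under stated assumptions: `get_client_logger` is a hypothetical name, and treating `log_level=None` as "leave everything to the application" is one reading of the PR's default.

```python
import logging

LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"


def get_client_logger(name: str, log_level=None) -> logging.Logger:
    """Shared setup for both OCR clients; never touches handlers it did not add."""
    logger = logging.getLogger(name)
    if log_level is None:
        return logger  # leave all configuration to the application
    logger.setLevel(log_level)
    if not logger.handlers:
        # Only add a default handler when the logger has none of its own.
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(LOG_FORMAT))
        logger.addHandler(handler)
    return logger
```

Both `GeneralBasicOCR.__init__` and `GeneralAccurateOCR.__init__` could then call this one function, and repeated instantiation would not stack handlers or overwrite an application's formatter.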

qodo-code-review · 2025-08-29T14:57:59Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
High-level	Restore PyPI upload step The publish-to-PyPI workflow removes the package upload step (Aliyun upload was deleted and no PyPI upload was added), so it now only creates a GitHub release and never publishes to PyPI—contradicting the PR’s goal. Add an explicit publishing step (e.g., pypa/gh-action-pypi-publish with OIDC or twine upload with a PyPI token) to the pypi-publish job and ensure it runs under your existing push/merged conditions with id-token permissions enabled. Examples: .github/workflows/publish-to-pypi.yml [38-72] pypi-publish: runs-on: ubuntu-latest needs: release-build if: ${{ github.event_name == 'push' \|\| (github.event_name == 'pull_request' && github.event.pull_request.merged) }} environment: name: pypi url: https://pypi.org/project/MZAPI steps: ... (clipped 25 lines) Solution Walkthrough: Before: # .github/workflows/publish-to-pypi.yml jobs: pypi-publish: runs-on: ubuntu-latest needs: release-build if: ${{ github.event_name == 'push' \|\| (github.event.pull_request.merged) }} steps: - uses: actions/download-artifact@v4 ... - name: Install dependencies run: \| python -m pip install twine # The step to upload to a repository was removed, and no replacement was added. - name: Create GitHub release uses: softprops/action-gh-release@v2 ... After: # .github/workflows/publish-to-pypi.yml permissions: id-token: write # Add permission for trusted publishing jobs: pypi-publish: runs-on: ubuntu-latest needs: release-build if: ${{ github.event_name == 'push' \|\| (github.event.pull_request.merged) }} steps: - uses: actions/download-artifact@v4 ... - name: Publish package to PyPI uses: pypa/gh-action-pypi-publish@release/v1 - name: Create GitHub release uses: softprops/action-gh-release@v2 ... Suggestion importance[1-10]: 10 __ Why: This suggestion correctly identifies a critical flaw where the `publish-to-pypi.yml` workflow no longer publishes the package, completely defeating the workflow's purpose.	High
Security	Avoid logging sensitive image data Avoid logging full `ImageBase64` and raw URLs to prevent leaking sensitive data and large payloads. Log only presence/length and redact the actual content. mzapi/tencent/ocr/GeneralAccurateOCR.py [76] -self.logger.debug(f"输入参数: ImageBase64={ImageBase64}, ImageUrl={ImageUrl}, IsWords={IsWords}, EnableDetectSplit={EnableDetectSplit}, IsPdf={IsPdf}, PdfPageNumber={PdfPageNumber}, EnableDetectText={EnableDetectText}, ConfigID={ConfigID}") +b64_info = f"[len={len(ImageBase64)}]" if ImageBase64 else None +url_info = "[provided]" if ImageUrl else None +self.logger.debug(f"输入参数: ImageBase64={b64_info}, ImageUrl={url_info}, IsWords={IsWords}, EnableDetectSplit={EnableDetectSplit}, IsPdf={IsPdf}, PdfPageNumber={PdfPageNumber}, EnableDetectText={EnableDetectText}, ConfigID={ConfigID}") Apply / Chat Suggestion importance[1-10]: 9 __ Why: This is a critical security suggestion that prevents logging potentially large and sensitive data like `ImageBase64`, which is essential for protecting user data and avoiding excessive log sizes.	High
Category: Possible issue
Make params optional and filtered
Make method parameters optional and avoid sending keys with null values to the API. Filter out `None` entries (and prefer `ImageUrl` when both are provided) to prevent SDK/API validation errors.

mzapi/tencent/ocr/GeneralAccurateOCR.py [63-98]
    -def recognize(self,ImageBase64,ImageUrl,IsWords,EnableDetectSplit,IsPdf,PdfPageNumber,EnableDetectText,ConfigID):
    -    """'
    -    :param ImageBase64: 图片/PDF的 Base64 值。要求图片经Base64编码后不超过 10M，分辨率建议600800以上，支持PNG、JPG、JPEG、BMP、PDF格式。图片的 ImageUrl、ImageBase64 必须提供一个，如果都提供，只使用 ImageUrl。
    -    :param ImageUrl: 图片/PDF的 Url 地址。要求图片经Base64编码后不超过10M，分辨率建议600800以上，支持PNG、JPG、JPEG、BMP、PDF格式。图片下载时间不超过 3 秒。图片存储于腾讯云的 Url 可保障更高的下载速度和稳定性，建议图片存储于腾讯云。非腾讯云存储的 Url 速度和稳定性可能受一定影响。
    -    :param IsWords: 是否返回单字信息，默认关
    -    :param EnableDetectSplit: 是否开启原图切图检测功能，开启后可提升“整图面积大，但单字符占比面积小”（例如：试卷）场景下的识别效果，默认关
    -    :param IsPdf: 是否开启PDF识别，默认值为false，开启后可同时支持图片和PDF的识别。
    -    :param PdfPageNumber: 需要识别的PDF页面的对应页码，仅支持PDF单页识别，当上传文件为PDF且IsPdf参数值为true时有效，默认值为1。
    -    :param EnableDetectText: 文本检测开关，默认为true。设置为false可直接进行单行识别，适用于仅包含正向单行文本的图片场景。
    -    :param ConfigID: 配置ID支持： OCR -- 通用场景 MulOCR--多语种场景
    -    """
    +def recognize(self, ImageBase64=None, ImageUrl=None, IsWords=None, EnableDetectSplit=None, IsPdf=None, PdfPageNumber=None, EnableDetectText=None, ConfigID=None):
    +    """执行通用印刷体高精度OCR识别"""
         try:
             self.logger.info("开始执行OCR识别")
    -        self.logger.debug(f"输入参数: ImageBase64={ImageBase64}, ImageUrl={ImageUrl}, IsWords={IsWords}, EnableDetectSplit={EnableDetectSplit}, IsPdf={IsPdf}, PdfPageNumber={PdfPageNumber}, EnableDetectText={EnableDetectText}, ConfigID={ConfigID}")
    -...
    +        if ImageBase64 is None and ImageUrl is None:
    +            error_msg = "ImageBase64和ImageUrl必须提供一个"
    +            self.logger.error(error_msg)
    +            raise ValueError(error_msg)
    +        if ImageUrl:
    +            # 当提供了URL时，忽略Base64
    +            ImageBase64 = None
    +            self.logger.debug(f"验证图片URL: {ImageUrl}")
    +            self.validate_url.validate_url(ImageUrl, ["png", "jpg", "jpeg", "bmp", "pdf"])
    +            self.logger.debug("图片URL验证通过")
    +
    +        req = models.GeneralAccurateOCRRequest()
             params = {
                 "ImageBase64": ImageBase64,
    -            "ImageUrl":ImageUrl,
    -            "IsWords":IsWords,
    -            "EnableDetectSplit":EnableDetectSplit,
    -            "IsPdf":IsPdf,
    -            "PdfPageNumber":PdfPageNumber,
    -            "EnableDetectText":EnableDetectText,
    -            "ConfigID":ConfigID
    +            "ImageUrl": ImageUrl,
    +            "IsWords": IsWords,
    +            "EnableDetectSplit": EnableDetectSplit,
    +            "IsPdf": IsPdf,
    +            "PdfPageNumber": PdfPageNumber,
    +            "EnableDetectText": EnableDetectText,
    +            "ConfigID": ConfigID
             }
    +        # 过滤掉为None的参数，避免传递null
    +        params = {k: v for k, v in params.items() if v is not None}
             req.from_json_string(json.dumps(params))

`[To ensure code accuracy, apply this suggestion manually]`

Suggestion importance[1-10]: 8
Why: This suggestion significantly improves the robustness of the new `recognize` method by making parameters optional and filtering out `None` values, which prevents potential API errors and makes the method more flexible.
Impact: Medium
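The parameter handling proposed above can be sketched without the Tencent SDK. The helper below is an illustration only: `build_request_params` is a hypothetical name, and it stops at producing the JSON string, leaving out URL validation and the actual request object.

```python
import json


def build_request_params(ImageBase64=None, ImageUrl=None, **options):
    """Build the request dict, dropping every key whose value is None.

    Hypothetical helper: follows the rules stated in the suggestion
    (one of the two image inputs is required, ImageUrl wins if both
    are given, None values are never sent).
    """
    if ImageBase64 is None and ImageUrl is None:
        raise ValueError("either ImageBase64 or ImageUrl must be provided")
    if ImageUrl:
        # When a URL is supplied, the Base64 payload is ignored.
        ImageBase64 = None
    params = {"ImageBase64": ImageBase64, "ImageUrl": ImageUrl, **options}
    # Filter on "is not None" so that explicit False / 0 values survive.
    return {key: value for key, value in params.items() if value is not None}


payload = json.dumps(
    build_request_params(ImageUrl="https://example.com/a.png", IsPdf=False, IsWords=None)
)
```

Filtering with `is not None` rather than truthiness matters here: a caller who passes `IsPdf=False` or `EnableDetectText=False` presumably wants that value sent, not silently dropped.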
Category: Possible issue
Fix __all__ to string names
The `__all__` variable must contain string names, not objects. Using objects can break tools relying on `__all__` and cause import/export issues.

mzapi/__init__.py [28]
    -__all__ = [GeneralBasicOCR,GeneralAccurateOCR]
    +__all__ = ["GeneralBasicOCR", "GeneralAccurateOCR"]

Suggestion importance[1-10]: 6
Why: The suggestion correctly identifies that `__all__` should contain a list of strings, not objects, which is a Python best practice that improves code quality and compatibility with static analysis tools.
Impact: Low
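To see why the entries of `__all__` need to be strings, the sketch below builds a throwaway in-memory module and runs a star import against it. The module name `demo_pkg` and the helper `star_import_names` are hypothetical; the class name is borrowed from the PR purely as a label.

```python
import sys
import types


def star_import_names(use_strings):
    """Run `from demo_pkg import *` against a synthetic module.

    Returns the public names that ended up in the importing namespace.
    """
    module = types.ModuleType("demo_pkg")  # hypothetical module name

    class GeneralBasicOCR:
        pass

    module.GeneralBasicOCR = GeneralBasicOCR
    module.__all__ = ["GeneralBasicOCR"] if use_strings else [GeneralBasicOCR]

    sys.modules["demo_pkg"] = module
    namespace = {}
    try:
        exec("from demo_pkg import *", namespace)
    finally:
        del sys.modules["demo_pkg"]
    # exec() injects __builtins__; keep only the imported names.
    return sorted(name for name in namespace if not name.startswith("__"))
```

With string entries the star import exposes the class as expected; with class objects in `__all__`, CPython raises `TypeError` during the star import, so the problem goes beyond static analysis tooling.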

xiaomizhoubaobei added 2 commits August 29, 2025 22:40

github-actions bot added core/核心代码 dependencies/依赖 ci/持续集成 build/构建 tencent/腾讯 labels Aug 29, 2025

reviewabot bot approved these changes Aug 29, 2025

gemini-code-assist bot reviewed Aug 29, 2025

xiaomizhoubaobei commented Aug 29, 2025

qodo-code-review bot added Review effort 3/5 Possible security concern labels Aug 29, 2025

sourcery-ai bot reviewed Aug 29, 2025

gemini-code-assist bot reviewed Aug 29, 2025

xiaomizhoubaobei merged commit a5a105f into master Aug 29, 2025
5 checks passed

xiaomizhoubaobei temporarily deployed to pypi August 29, 2025 15:03 — with GitHub Actions Inactive

	raise TencentCloudSDKException("初始化失败", str(e))
	raise TencentCloudSDKException("初始化失败", str(e)) from e

	raise TencentCloudSDKException("OCR处理错误", str(e))
	raise TencentCloudSDKException("OCR处理错误", str(e)) from e
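Each pair of lines above differs only in the trailing `from e`, which turns an implicit exception context into an explicit cause. A small sketch of the difference, using a hypothetical `OCRClientError` class as a stand-in for the SDK exception so the example runs without the Tencent SDK installed:

```python
class OCRClientError(Exception):
    """Hypothetical stand-in for the SDK exception type used in the PR."""


def load_secret(config, chain):
    """Look up a credential and wrap a missing key in OCRClientError."""
    try:
        return config["SecretId"]
    except KeyError as e:
        if chain:
            # Explicit chaining: the KeyError is recorded as __cause__.
            raise OCRClientError("initialization failed", str(e)) from e
        # Implicit chaining: the KeyError is only kept as __context__.
        raise OCRClientError("initialization failed", str(e))


def capture(chain):
    """Return the OCRClientError raised for an empty config."""
    try:
        load_secret({}, chain)
    except OCRClientError as err:
        return err


chained = capture(chain=True)
unchained = capture(chain=False)
```

In both cases the original `KeyError` remains reachable, but with `from e` the traceback reads "The above exception was the direct cause of the following exception", which signals a deliberate wrap rather than a second failure inside the handler.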
