Batch OCR saved to Excel: how do I fix garbled text when opening the file? #237

Closed
ErichTao opened this issue Nov 26, 2023 · 1 comment

Comments

@ErichTao
ErichTao commented Nov 26, 2023

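I ran a batch OCR with the language set to Simplified Chinese and the output file type set to Excel. The txt output displays correctly, but when the file is opened in Excel the Chinese text is garbled while the digits are fine. I tried changing the encoding and the font in Excel, with no effect. How can this be fixed?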
image

@hiroi-sora
Owner

Method 1: Open the csv file in a text editor (such as Notepad), choose Save As, and change the encoding to ANSI, as shown below.

image
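
Method 1 can also be scripted when many files need converting. The following is a minimal sketch, not part of Umi-OCR: `reencode_csv` is a hypothetical helper, and it assumes a Simplified Chinese Windows system, where the "ANSI" code page is 936 (GBK). On other locales the target encoding would need to change.

```python
from pathlib import Path


def reencode_csv(src, dst, target_encoding="gbk"):
    """Re-save a UTF-8 CSV in another encoding (hypothetical helper).

    Assumption: the target machine's ANSI code page is GBK (code page 936).
    """
    # Decode from bytes so the original line endings are preserved;
    # "utf-8-sig" also drops a leading BOM if one is present.
    text = Path(src).read_bytes().decode("utf-8-sig")
    # Characters that GBK cannot represent are replaced with "?"
    # instead of raising UnicodeEncodeError.
    Path(dst).write_bytes(text.encode(target_encoding, errors="replace"))
```

For example, `reencode_csv("result.csv", "result_ansi.csv")` would write a GBK copy next to the original; both file names here are placeholders.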

Method 2: In Excel, go to Data → From Text/CSV, as shown below.

image

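Patching the program

You can modify Umi-OCR's code with the following steps so that it writes ANSI-encoded csv files from now on, for compatibility with Office:

  1. Open UmiOCR-data/py_src/ocr/output/output_csv.py in Notepad.
  2. Near the end of the file, find the line with open(self.outputPath, "a", encoding="utf-8", newline="") as f: # 追加写入本地文件 (the trailing comment means "append to the local file").
  3. Change utf-8 to ansi. (Take care not to add or remove any of the existing whitespace.) The line should then read:
            with open(self.outputPath, "a", encoding="ansi", newline="") as f:
  4. Save and close the file.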
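
One caveat with the patch above: in Python, `ansi` is a Windows-only alias for the `mbcs` codec, so it follows the system's ANSI code page, and text outside that code page cannot be encoded. A possible alternative, shown here only as a sketch and not as Umi-OCR's actual code, is to keep UTF-8 but write a byte order mark (BOM) at the start of the file, which Excel generally uses to detect UTF-8. The `append_rows` function and its use of the `csv` module are assumptions for illustration.

```python
import csv
import os


def append_rows(path, rows):
    """Append rows to a CSV file as UTF-8 (hypothetical helper).

    The BOM is written only when the file is new or empty, so repeated
    appends do not insert extra BOMs in the middle of the file.
    """
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    # "utf-8-sig" emits the BOM on the first write; plain "utf-8" does not.
    encoding = "utf-8-sig" if is_new else "utf-8"
    with open(path, "a", encoding=encoding, newline="") as f:
        csv.writer(f).writerows(rows)
```

The trade-off is that the file stays valid UTF-8 for other tools, at the cost of a three-byte prefix that some older programs may display as stray characters.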
