Commit 0809fa8

kforcodeai and lolipopshock authored Feb 2, 2022
fix to issue #94 (#95)
* Fix for #94 (comment): all text is now inferred as a string, and the user can convert it to their desired data type.
* Maybe a simpler solution

Co-authored-by: lolipopshock <[email protected]>
1 parent cd295de commit 0809fa8

1 file changed, +5 -1 lines changed

src/layoutparser/ocr/tesseract_agent.py

@@ -91,7 +91,11 @@ def _detect(self, img_content):
         )
         _data = pytesseract.image_to_data(img_content, lang=self.lang, **self.configs)
         res["data"] = pd.read_csv(
-            io.StringIO(_data), quoting=csv.QUOTE_NONE, encoding="utf-8", sep="\t"
+            io.StringIO(_data),
+            quoting=csv.QUOTE_NONE,
+            encoding="utf-8",
+            sep="\t",
+            converters={"text": str},
         )
         return res
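
The effect of the added `converters={"text": str}` argument can be illustrated with a minimal sketch. The TSV string below is a hypothetical stand-in for the output of `pytesseract.image_to_data` (it is not real OCR output, and only three of the columns are shown), and `parse` is an illustrative helper, not part of layoutparser. Without the converter, pandas infers a numeric dtype for a `text` column whose tokens all look like numbers; with it, each token is kept as a string.

```python
import csv
import io

import pandas as pd

# Hypothetical, abbreviated stand-in for pytesseract.image_to_data output.
# Both recognized tokens happen to look numeric.
_data = "level\tconf\ttext\n5\t96\t2022\n5\t95\t007\n"


def parse(tsv, **kwargs):
    """Illustrative helper mirroring the read_csv call in the diff."""
    return pd.read_csv(
        io.StringIO(tsv),
        quoting=csv.QUOTE_NONE,
        encoding="utf-8",
        sep="\t",
        **kwargs,
    )


# Old call: pandas infers integers, so the leading zeros in "007" are lost.
before = parse(_data)

# New call: every value in the "text" column is passed through str().
after = parse(_data, converters={"text": str})

print(before["text"].tolist())
print(after["text"].tolist())
```

Because the column now always holds strings, callers that want numbers can convert explicitly afterwards, for example with `pd.to_numeric(after["text"], errors="coerce")`.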
