3 Best Online OCR Tools To Extract Text From Images
Transcribing text from images can be a real pain. When text is presented as an image or some other non-selectable format, school and work become difficult. The only solution is to put those eyes and fingers to work and start typing. Or is it?
Optical Character Recognition, or OCR, is the process of converting typed or handwritten text from media such as scanned documents or photos into plain text.
Although OCR is subject to mistakes, depending on the clarity of the text, using it to extract text from images can save you hours of monotonous work. One use case: you're a college student who needs a particular page from a textbook. If a friend sends you a photo of the page, you can use OCR to extract all of the text from the image so you can easily read and copy it.
In this article, let's explore three of the best online OCR tools for extracting text from images, none of which require you to download any OCR software or plugins.
OnlineOCR is one of the simplest and quickest ways to convert an image or PDF file into multiple different text formats.
Without an account, OnlineOCR.net will allow you to convert up to 15 files to text per hour. Registering for an account gives you access to additional features, such as converting multi-page PDF documents.
OnlineOCR.net supports converting from the PDF, JPG, BMP, TIFF, and GIF formats, outputting them as DOCX, XLSX, or TXT.
OnlineOCR.net can recognize text in English, Afrikaans, Albanian, Basque, Brazilian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, Esperanto, Estonian, Finnish, French, Galician, German, Greek, Hungarian, Icelandic, Indonesian, Italian, Japanese, Korean, Latin, Latvian, Lithuanian, Macedonian, Malay, Moldavian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swedish, Tagalog, Turkish, and Ukrainian.
The conversion process requires three simple steps. You upload a file, capped at 15 MB, select your language and output format, and click the Convert button.
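If you batch up files before visiting the site, a quick local check can catch the ones that would be turned away. The following is a minimal sketch that only encodes the limits described in this article (15 MB cap, PDF/JPG/BMP/TIFF/GIF input). The function name `upload_problems`, the extra `.jpeg` and `.tif` spellings, and the choice of 1 MB = 1024 × 1024 bytes are assumptions made for illustration, and the site may have changed its limits since.

```python
from pathlib import Path

# Limits as described in this article; the site may enforce different ones today.
MAX_UPLOAD_BYTES = 15 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
# ".jpeg" and ".tif" are assumed aliases of the JPG and TIFF formats listed above.
SUPPORTED_INPUT_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff", ".gif"}


def upload_problems(filename: str, size_in_bytes: int) -> list[str]:
    """Return the reasons a file would likely be rejected (empty list if none)."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_INPUT_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_in_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_in_bytes} bytes, over the 15 MB cap")
    return problems


if __name__ == "__main__":
    # A PNG is not in the supported input list, so this reports one problem.
    print(upload_problems("chapter3.png", 2_000_000))
    # A 20 MB PDF has a supported format but exceeds the size cap.
    print(upload_problems("scan.pdf", 20 * 1024 * 1024))
```

To check a real file, pass `Path(p).stat().st_size` as the size.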
Regardless of the output format you select, a plain text preview of the conversion will appear in a field below a link to download the file in your selected format. This helps prevent users from wasting a download on an extraction that may be inaccurate.
NewOCR currently only offers text extraction from image files, but it supports a few other interesting features that many online OCR providers don’t.
To begin using NewOCR, simply click the Choose File button, select the image you wish to extract text from, and then click on the blue Preview button. This will then bring up a preview of your image and present several additional options.
Unlike most other online image-to-text converters, NewOCR will actually allow you to set multiple recognition languages. This can be quite helpful if you're unsure which language the text in an image is written in but have a good guess, and you want accurate plain text that you can then translate.
If your image is skewed to one side, you can also dynamically rotate it. When you’ve applied the necessary options, you can click the blue OCR button to extract the image’s text.
From here, you can download the extracted text in TXT, DOC, or PDF format, or send it straight to Google Translate or Google Docs for further editing.
Last but not least, OCR.space is definitely one of the most robust options we’ve found, and it should have you covered for just about any image-to-text operation.
OCR.space is one of the best OCR tools that support the WEBP file format. PNG, JPG, and PDF are also supported. Additionally, you don't have to upload a file: you can link to it remotely if it's available somewhere online.
Other niche features include auto-rotation, receipt scanning, table recognition, and auto-scaling. OCR.space is one of the few online OCR tools that support outputting files as searchable PDFs (with visible or invisible text), and you can even choose between two different OCR engines for the best possible extraction.
All you have to do is upload or link a file, click the Start OCR! button, and then a preview of your results will dynamically load on the same page. If you’ve selected your output as a searchable PDF, the Download and Show Overlay buttons will also be available.
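For coders, the same options can in principle be set without the web page. The sketch below only builds the form fields for such a request and does not send anything. The endpoint and the parameter names (`apikey`, `url`, `OCREngine`, `detectOrientation`, `scale`, `isTable`, `isCreateSearchablePdf`, `isSearchablePdfHideTextLayer`) are assumptions based on OCR.space's public API documentation rather than on anything in this article, so check them against the current docs before relying on them. `build_ocr_request` and `encode_form` are hypothetical helper names.

```python
from urllib.parse import urlencode

# Assumed endpoint, taken from OCR.space's public API docs (not from this article).
API_ENDPOINT = "https://api.ocr.space/parse/image"


def build_ocr_request(api_key, image_url, engine=1, table=False,
                      searchable_pdf=False, hide_text_layer=False):
    """Build form fields for a remote-link OCR request (nothing is sent here)."""
    if engine not in (1, 2):
        raise ValueError("expected engine 1 or 2")

    def flag(value):
        # The form fields are plain strings, so booleans become "true"/"false".
        return "true" if value else "false"

    return {
        "apikey": api_key,
        "url": image_url,                  # link to the file instead of uploading it
        "OCREngine": str(engine),          # choose between the two engines
        "detectOrientation": "true",       # auto-rotation
        "scale": "true",                   # auto-scaling
        "isTable": flag(table),            # table recognition
        "isCreateSearchablePdf": flag(searchable_pdf),
        "isSearchablePdfHideTextLayer": flag(hide_text_layer),
    }


def encode_form(fields):
    """Encode the fields as an application/x-www-form-urlencoded request body."""
    return urlencode(fields).encode("ascii")


if __name__ == "__main__":
    fields = build_ocr_request("YOUR_API_KEY", "https://example.com/page.png",
                               engine=2, searchable_pdf=True)
    print(encode_form(fields))
    # To actually send it, one option is urllib.request.urlopen(
    #     urllib.request.Request(API_ENDPOINT, data=encode_form(fields)))
```

Keeping the payload builder separate from the network call makes it easy to inspect exactly what would be sent before spending an API request on it.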
One of the most interesting and unique features of OCR.space is that it can output your extraction as JSON. This JSON includes fields for each word in the text and its coordinates on the image itself. This is a very welcome feature if you're a coder trying to programmatically extract text from images.
With the three web tools above, extracting the text from just about any clear and legible image should be a piece of cake. Even if you're a fast typist with multiple monitors, there's no need to suffer through transcribing text images yourself. OCR was made for a reason, and these websites help you make the best use of it!
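As a rough illustration of what working with that JSON might look like, here is a minimal sketch that pulls each word and its bounding box out of a response. The field names (`ParsedResults`, `TextOverlay`, `Lines`, `Words`, `WordText`, `Left`, `Top`, `Width`, `Height`, `IsErroredOnProcessing`) are assumptions based on OCR.space's public API documentation, and `SAMPLE_RESPONSE` is a hand-written stand-in, not real output from the service.

```python
import json

# Hand-written sample in the assumed response shape; not real service output.
SAMPLE_RESPONSE = """
{
  "ParsedResults": [
    {
      "TextOverlay": {
        "Lines": [
          {
            "Words": [
              {"WordText": "Hello", "Left": 10, "Top": 12, "Height": 20, "Width": 48},
              {"WordText": "world", "Left": 64, "Top": 12, "Height": 20, "Width": 52}
            ]
          }
        ]
      },
      "ParsedText": "Hello world"
    }
  ],
  "IsErroredOnProcessing": false
}
"""


def words_with_boxes(response_text):
    """Return (text, left, top, width, height) for every word in the response."""
    data = json.loads(response_text)
    if data.get("IsErroredOnProcessing"):
        raise RuntimeError("the OCR service reported an error")

    words = []
    # "or []" / "or {}" guard against keys that are missing or set to null.
    for result in data.get("ParsedResults") or []:
        overlay = result.get("TextOverlay") or {}
        for line in overlay.get("Lines") or []:
            for word in line.get("Words") or []:
                words.append((word["WordText"], word["Left"], word["Top"],
                              word["Width"], word["Height"]))
    return words


if __name__ == "__main__":
    for text, left, top, width, height in words_with_boxes(SAMPLE_RESPONSE):
        print(f"{text!r} at x={left}, y={top}, size={width}x{height}")
```

With coordinates like these, you could highlight matched words on the original image or crop out a region of interest.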
If you have any other tips for the best OCR tools or services you’d like to share, or you’d like help with using one of the above, feel free to drop us a message in the comments below.