
[InetBib] Freidok. Google. OCRopus

Date: Tue, 30 Jun 2009 12:50:10 +0200
From: Karl Dietz <karl.dietz@xxxxxxxxx>
Subject: [InetBib] Freidok. Google. OCRopus

Klaus Graf wrote:

As ample examples show, Google extracts
the text from these Freidok documents


Just for the sake of completeness:
Google also indexes images from freidok

[...] of my Vener article in the
slightest; it was merely an example.

KG


The example was quite OK, though, and it brought together a fair amount of 
information in the context of freidok, metager, base, kpdf, ...


One more note on the software Google uses to scan and index PDFs:

"

OCRopus(tm) is a state-of-the-art document analysis and OCR system, 
featuring pluggable layout analysis, pluggable character recognition, 
statistical natural language modeling, and multi-lingual capabilities.

The OCRopus engine is based on two research projects: a high-performance 
handwriting recognizer developed in the mid-90's and deployed by the US 
Census bureau, and novel high-performance layout analysis methods.

OCRopus development is sponsored by Google and is initially intended 
for high-throughput, high-volume document conversion efforts. We expect 
that it will also be an excellent OCR system for many other applications.

"

via google


Best regards,
K.

max140 for a chuckle; anyone who likes can adapt it ...
Customer in a bookshop: "Good day, I'm looking for 'Der Mann, das überlegene 
Wesen' (Man, the Superior Being)" - Bookseller: "Have a look in the fairy-tale section"
via twitter

-- 
http://www.inetbib.de
