graphai.core.image.ocr module
- graphai.core.image.ocr.is_valid_latex(text)
- graphai.core.image.ocr.cleanup_json(text)
- class graphai.core.image.ocr.AbstractOCRModel(api_key, model_class, model_name)
Bases: ABC
- establish_connection()
Lazily connects to the OCR API.
- Returns:
True if a connection already exists or if a new connection is successfully established, False otherwise
- abstract perform_ocr(input_filename_with_path)
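The lazy-connection pattern described above can be illustrated with a minimal sketch. This is not the library's implementation: the class names `LazyOCRModel` and `EchoOCRModel`, the `client` attribute, and the `_connect` hook are all hypothetical, and the constructor takes only `api_key` for brevity. The sketch only shows one way `establish_connection()` could return True for an existing or newly created connection and False otherwise.

```python
from abc import ABC, abstractmethod


class LazyOCRModel(ABC):
    """Hypothetical stand-in for AbstractOCRModel; not the library's code."""

    def __init__(self, api_key):
        self.api_key = api_key
        self.client = None  # no connection is made until it is first needed

    def establish_connection(self):
        # Reuse the existing connection if there is one.
        if self.client is not None:
            return True
        try:
            self.client = self._connect()
        except Exception:
            self.client = None
            return False
        return self.client is not None

    @abstractmethod
    def _connect(self):
        """Create and return a client object (hypothetical hook)."""

    @abstractmethod
    def perform_ocr(self, input_filename_with_path):
        """Run OCR on the image at the given path."""


class EchoOCRModel(LazyOCRModel):
    """Toy subclass that echoes the file name instead of calling an API."""

    def __init__(self, api_key):
        super().__init__(api_key)
        self.connect_calls = 0

    def _connect(self):
        self.connect_calls += 1
        if not self.api_key:
            raise ValueError("missing API key")
        return object()

    def perform_ocr(self, input_filename_with_path):
        # Connect on first use; give up if no connection can be made.
        if not self.establish_connection():
            return None
        return "text from " + input_filename_with_path
```

In this sketch, calling `perform_ocr` repeatedly on one `EchoOCRModel` instance triggers `_connect` only once, because later calls find `client` already set.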
- class graphai.core.image.ocr.GoogleOCRModel(api_key)
Bases: AbstractOCRModel
- perform_ocr(input_filename_with_path)
Performs OCR with two methods (text_detection and document_text_detection).
- Parameters:
input_filename_with_path: Full path of the input image file
- Returns:
Text results of the two OCR methods
- wait_for_ocr_results(image_object, method='dtd', retries=6)
Makes a call to the Google OCR API and waits for the results.
- Parameters:
image_object: Image object for the Google API
method: Method to use, ‘td’ for text detection and ‘dtd’ for document text detection
retries: Number of retries to perform in case of failure
- Returns:
OCR results
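The retry behaviour controlled by `retries` can be sketched generically. The helper below is hypothetical and does not call the Google API. It assumes that `retries` is the total number of attempts and that a failed run returns None; the real method may count attempts or signal failure differently. The `delay` parameter is also an assumption.

```python
import time


def wait_for_results(call, retries=6, delay=0.0):
    """Call `call()` until it succeeds or `retries` attempts are used up.

    Hypothetical helper: returns the result, or None if every attempt failed.
    """
    for attempt in range(retries):
        try:
            return call()
        except Exception:
            # Pause before the next attempt (skipped after the last one).
            if attempt < retries - 1:
                time.sleep(delay)
    return None


# A fake API call that fails twice before succeeding.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "recognized text"


result = wait_for_results(flaky_call, retries=6)
```

Here `flaky_call` stands in for the remote request; the loop stops as soon as one attempt returns without raising.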
- class graphai.core.image.ocr.OpenAIOCRModel(api_key)
Bases: AbstractOCRModel
- perform_ocr(input_filename_with_path, validate_latex=True, model_type=None)
- class graphai.core.image.ocr.GeminiOCRModel(api_key)
Bases: AbstractOCRModel
- perform_ocr(input_filename_with_path, model_type=None)
- graphai.core.image.ocr.get_ocr_colnames(method)
- graphai.core.image.ocr.perform_tesseract_ocr_on_pdf(pdf_path, language=None, in_pages=True)
Performs OCR on a PDF file using Tesseract.
- Parameters:
pdf_path: Path to the PDF file
language: Language of the PDF file
in_pages: Whether to return the results as separate pages (in a JSON string) or as a single string
- Returns:
String containing the entire PDF file’s extracted contents
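The two return shapes selected by `in_pages` can be illustrated with a small sketch. The helper name `combine_page_texts`, the JSON layout (a list of objects with `page` and `content` keys), and the newline separator are all assumptions made for illustration; the actual format produced by `perform_tesseract_ocr_on_pdf` is not documented here and may differ.

```python
import json


def combine_page_texts(page_texts, in_pages=True):
    """Hypothetical helper: merge per-page OCR text into one return value."""
    if in_pages:
        # One entry per page, serialized as a JSON string (assumed layout).
        pages = [
            {"page": number, "content": text}
            for number, text in enumerate(page_texts, start=1)
        ]
        return json.dumps(pages, ensure_ascii=False)
    # Otherwise join all pages into a single string.
    return "\n".join(page_texts)
```

Under these assumptions, a caller that asked for pages would decode the result with `json.loads`, while a caller that passed `in_pages=False` would use the string directly.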