I believe you need to run OCR (optical character recognition) on it. Tesseract and OpenCV can be used to do that. The result is typically a structured document, though occasionally it can even be a database that you import back into Python.
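As a rough sketch of what that could look like in Python, assuming the `opencv-python` and `pytesseract` packages plus the Tesseract binary are installed: OpenCV cleans up the image, Tesseract returns per-word data, and a small helper groups the words back into text lines. The function names `ocr_words` and `group_into_lines` are made up for this example, and the preprocessing (grayscale plus Otsu thresholding) is just one common choice, not the only one.

```python
from collections import defaultdict


def ocr_words(image_path):
    """Run Tesseract on an image file and return its per-word output as a dict."""
    # Imported here so the grouping helper below can be used without the OCR packages.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    # Grayscale + Otsu threshold: a simple cleanup step that often helps OCR.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Output.DICT gives parallel lists: "text", "line_num", "block_num", "conf", ...
    return pytesseract.image_to_data(binary, output_type=pytesseract.Output.DICT)


def group_into_lines(data):
    """Group Tesseract's word-level output into a list of text lines."""
    lines = defaultdict(list)
    for i, word in enumerate(data["text"]):
        if word.strip():  # skip the empty entries Tesseract emits for layout levels
            key = (
                data["page_num"][i],
                data["block_num"][i],
                data["par_num"][i],
                data["line_num"][i],
            )
            lines[key].append(word)
    # Sorting by (page, block, paragraph, line) keeps the reading order.
    return [" ".join(words) for _, words in sorted(lines.items())]
```

You would call it as `group_into_lines(ocr_words("scan.png"))`, where `scan.png` is a placeholder file name. From there the lines (or the raw per-word dict) can be written to whatever structured format you need.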