I have a large corpus of notes that humans wrote to summarize articles. As they are, they give you the gist but are not very readable. I would like to use a generative model, ask it “please write a short sentence that will be nice to read describing the following facts”, and feed it the notes to obtain a brief, readable summary.
Language is Italian.
Suggestions on models or workflows?
Thanks
You can try taking a picture of the notes and having a multimodal model read them and extract the text. You can either use ChatGPT-4 (paid, but probably more accurate) or run llama.cpp's llava multimodal function with a llava model locally (free, but it might hallucinate).
Scanning your notes into PDF format and trying a RAG approach might yield some results too. You can upload the PDF to GPT/Claude or run a local RAG project like h2oGPT or privategpt and see how well they can transcribe your notes.
I doubt GPT-4V will be perfect at reading detailed handwritten notes, especially in Italian. For proper results, Google Lens is good at structured handwritten OCR, but it needs manual work.
I apologize for my misleading English: the notes are not handwritten. They are regular text stored in a MongoDB array of strings.
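Since the notes are already plain strings, no OCR step should be needed; the workflow reduces to building one prompt per document and passing it to whatever model you pick. Here is a minimal sketch of that idea, under several assumptions: the field name `notes`, the Italian wording of the instruction, and the `generate` callable are all hypothetical placeholders, not part of any particular library. `generate` stands in for your model call (a local model or a hosted API), and `docs` can be any iterable of dicts, such as what a MongoDB query returns.

```python
from typing import Any, Callable, Dict, Iterable, List

# Assumed Italian version of the instruction from the question; adjust the wording freely.
INSTRUCTION = (
    "Scrivi una breve frase, piacevole da leggere, "
    "che descriva i seguenti fatti:"
)


def build_prompt(notes: Iterable[str], instruction: str = INSTRUCTION) -> str:
    """Turn one document's array of note strings into a single prompt."""
    # Drop empty or whitespace-only notes and trim the rest.
    cleaned = [n.strip() for n in notes if n and n.strip()]
    bullets = "\n".join(f"- {n}" for n in cleaned)
    return f"{instruction}\n{bullets}"


def summarize_all(
    docs: Iterable[Dict[str, Any]],
    generate: Callable[[str], str],
    field: str = "notes",  # hypothetical field name holding the array of strings
) -> List[Dict[str, Any]]:
    """Return one {"_id", "summary"} record per document that has usable notes."""
    results = []
    for doc in docs:
        notes = doc.get(field) or []
        if not any(n and n.strip() for n in notes):
            continue  # nothing to summarize for this document
        summary = generate(build_prompt(notes)).strip()
        results.append({"_id": doc.get("_id"), "summary": summary})
    return results
```

Keeping the model call behind a plain function makes it easy to try several models on the same handful of documents and compare how natural the Italian output reads before running the whole corpus.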