davejonescue
Puritan Board Junior
Hello all. Just wanted to share a transcription/editing workflow I have found useful with the new ChatGPT 4o model. It goes as follows:
1. Select and cut single pages from the facsimile. I prefer the "Windows Button+Shift+S" method.
2. Paste the cut page into ChatGPT 4o and, in the same request, ask: "Can you transcribe this page for me?"
3. Copy the transcription it outputs.
4. Paste it, along with however many pages you want to do, into Microsoft Word.
5. Download and install WordTalk (or a similar program) so that it appears in Word as an add-in.
6. Adjust the text-to-speech speed so that you can easily follow along while reading a PDF of the facsimile.
7. Make real-time corrections to the transcribed text: listen to the transcription being read aloud while you read the PDF facsimile, and fix each discrepancy as you notice it.
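If you save each page's transcription as you go, steps 3 and 4 can be tidied up with a few lines of Python before pasting into Word. This is only a sketch: the function name `join_pages` and the `[Page N]` marker format are my own assumptions, not anything ChatGPT or Word requires. The markers just make it easier to find your place in the facsimile while listening.

```python
def join_pages(pages, first_page_number=1):
    """Join per-page transcriptions into one text, with a marker line before each page.

    `pages` is a list of strings, one per transcribed page, in reading order.
    The "[Page N]" marker format is an arbitrary choice for this sketch.
    """
    chunks = []
    for offset, text in enumerate(pages):
        page_number = first_page_number + offset
        # strip() removes stray blank lines left over from copying out of the chat
        chunks.append(f"[Page {page_number}]\n{text.strip()}")
    return "\n\n".join(chunks)


if __name__ == "__main__":
    combined = join_pages(["  First page text.\n", "Second page text."], first_page_number=12)
    print(combined)
```

The markers can be deleted with find-and-replace once the proofreading pass is done.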
I find this method much easier than manually typing out large portions of text. I have been working on Ambrose's "Prima Media et Ultima," and found that in later editions Ambrose added a few sections to Media that are not in the Project Puritas editions. I spent about 2 days typing and only got through around 10 pages. Then I remembered a discussion I had with a brother on here a while back, and decided to try this method.
--- Take-Aways.
While this model of ChatGPT (4o) is probably the best OCR to date for archaic facsimiles, it still makes mistakes. So you cannot yet trust it to simply transcribe and move on. You do have to go over the text and make corrections. But by listening to the audio of the Chat transcription while reading the facsimile PDF at the same time, it is easy to notice the errors and edit accordingly. This saves the time of tedious visual comparison.
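As a possible complement to the listening pass (not a replacement for it), you could transcribe the same page twice and let Python point out where the two attempts disagree. This is a technique I am swapping in for illustration, not part of the workflow above, and it assumes you have two transcriptions of the same page to compare. The function name `differing_words` is hypothetical. It uses `difflib` from Python's standard library.

```python
import difflib


def differing_words(pass_a, pass_b):
    """Return (words_from_a, words_from_b) pairs where two transcriptions disagree.

    Both arguments are plain strings. Comparison is word by word, so
    differences in line breaks or spacing are ignored.
    """
    a_words = pass_a.split()
    b_words = pass_b.split()
    matcher = difflib.SequenceMatcher(None, a_words, b_words, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Either side may be an empty string if a word was dropped or added
            diffs.append((" ".join(a_words[i1:i2]), " ".join(b_words[j1:j2])))
    return diffs


if __name__ == "__main__":
    for left, right in differing_words("the fame of the Lord", "the same of the Lord"):
        print(f"check the facsimile: {left!r} vs {right!r}")
```

Places where both attempts agree can still be wrong, so this only narrows down where to look first. The facsimile remains the authority.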
The one downside to page-by-page Chat transcription is that it tends to format each page as a standalone page; so if you have numbered sequences (1., 2., 3., etc.) that run across pages, you will have to adjust them to match the original facsimile. But all in all this does save a lot of transcribing time and effort, and it is not bad at all for the present $20 a month subscription price. Considering that hiring someone to type things out can be very expensive, and typing by hand can take a lot of time, this method has the potential to save both time and money. Lastly, I am sure it is like all OCR in that the quality of the output depends heavily on the quality of the input; but this model of Chat has a capability that earlier OCR did not have: it can handle a text in which various fonts are intermingled, as they are in most Puritan texts.
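For the simple case where one numbered list restarts at "1." on each new page, the renumbering can be sketched in Python. This assumes a single continuous sequence and that each item starts a line as a number, a period, and a space; the function name `renumber` is hypothetical. Nested or multiple separate sequences, which are common in Puritan texts, would still need to be fixed by hand against the facsimile.

```python
import re

# A list item here means: optional indent, digits, a period, then whitespace.
ITEM = re.compile(r"^(\s*)(\d+)\.(\s+)")


def renumber(lines, start=1):
    """Renumber lines that begin with '<number>.' as one continuous sequence.

    Lines that do not look like list items are passed through unchanged.
    """
    counter = start
    result = []
    for line in lines:
        match = ITEM.match(line)
        if match:
            # Keep the original indent and spacing, replace only the number
            line = f"{match.group(1)}{counter}.{match.group(3)}{line[match.end():]}"
            counter += 1
        result.append(line)
    return result


if __name__ == "__main__":
    page_text = ["1. Faith", "2. Hope", "1. Charity", "Some prose.", "2. Patience"]
    print("\n".join(renumber(page_text)))
```

Always check the result against the original numbering, since the facsimile may itself skip or repeat a number.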
The exciting thing about this breakthrough in OCR tech is that it will make it that much easier to transcribe and edit Puritan and Reformed texts that have yet to be done.
Hope this helps someone, God bless.