Hi, I have a task at hand (see details below) and I’m wondering whether Clause9 or ClauseBuddy has the features and AI capability to assist here?
Task details:
File A is an Excel workbook containing date entries and corresponding item descriptions.
File B is a PDF document containing date periods and generic item descriptions.
I want to be able to compare File A against File B in order to (1) identify whether the date entries in File A match any date period entries in File B, and (2) identify whether the item descriptions in File A relate or correspond to the generic item descriptions in File B.
In September (estimate) we will release a new feature in Clause9 that allows you to deeply integrate Excel files and formulas into Clause9 clauses. This would allow you to, for example, do lookups in your Excel workbook, e.g. to get an item description based on some date entry. However, that won’t help your use case here.
If this is a one-time job that you need to do, then I would take the following approach:
Convert the Excel file to a text-based table (e.g., Markdown) that can be digested by an LLM.
Convert the PDF file into text using OCR or some PDF text extraction tool.
Send the contents of both to an LLM (e.g., ChatGPT) with a prompt asking it to do the comparisons and return a table-based answer. Do take into account that some minor mistakes will likely be made: in our experience, LLMs can do a reasonably good job, but they still make mistakes (similar to how humans make mistakes when they have to compare hundreds of items).
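To make the first step of the one-time approach concrete, here is a minimal sketch of turning spreadsheet rows into a Markdown table. It assumes the rows have already been read from the workbook (for example with a library such as openpyxl, or by exporting to CSV first); the function name `rows_to_markdown`, the column headers, and the sample data are hypothetical and only there for illustration.

```python
def rows_to_markdown(rows):
    """Render a list of rows (first row = header) as a Markdown table."""

    def cell(value):
        # Escape pipes and flatten newlines so each record stays on one line.
        text = "" if value is None else str(value)
        return text.replace("|", "\\|").replace("\n", " ").strip()

    header, *body = rows
    lines = [
        "| " + " | ".join(cell(v) for v in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in body:
        # Pad short rows (and trim long ones) to the header width.
        padded = (list(row) + [None] * len(header))[: len(header)]
        lines.append("| " + " | ".join(cell(v) for v in padded) + " |")
    return "\n".join(lines)


# Hypothetical sample data standing in for File A.
rows = [
    ["Date", "Item description"],
    ["2024-03-14", "Laptop repair"],
    ["2024-05-02", "Office chairs | set of 4"],
]
print(rows_to_markdown(rows))
```

The resulting text can be pasted into the chat window together with the text extracted from the PDF.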
If this is a repeating task, then I would instead suggest the following:
Convert the Excel workbook to JSON (= something that LLMs happily process)
Convert the PDF into text
Submit a prompt to the LLM API in which you basically say “I’ll give you a JSON file with an array with the following fields: …; I’ll also give you a text file with the following content: … ; match the dates and extract the descriptions, and return me a JSON answer”.
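The repeating flow above could be sketched roughly as follows. This is only an illustration under assumptions: the field names `date` and `description`, the requested answer fields, and the helper names `build_prompt` and `parse_answer` are all hypothetical, and the actual call to the LLM API is left out because it depends on the provider you choose.

```python
import json
from datetime import date


def build_prompt(records, pdf_text):
    """Combine File A (as JSON) and File B (as text) into a single prompt."""
    # default=str turns date objects into ISO strings such as "2024-03-14".
    payload = json.dumps(records, indent=2, default=str)
    return (
        "I'll give you a JSON array in which each object has the fields "
        '"date" and "description". I\'ll also give you the text of a '
        "document that lists date periods with generic item descriptions.\n"
        "For every JSON object, find the date period that contains its date "
        "and say whether its description relates to the generic description "
        "of that period.\n"
        'Answer with a JSON array of objects with the fields "date", '
        '"matched_period" and "related".\n\n'
        f"JSON:\n{payload}\n\nDOCUMENT TEXT:\n{pdf_text}\n"
    )


def parse_answer(raw):
    """Extract the JSON array from the model's reply, ignoring surrounding text."""
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in the reply")
    return json.loads(raw[start : end + 1])


# Hypothetical sample input standing in for File A and File B.
records = [{"date": date(2024, 3, 14), "description": "Laptop repair"}]
prompt = build_prompt(records, "Q1 2024: IT services")
print(prompt)
```

Once the reply comes back as a string, `parse_answer` turns it into Python objects that can be checked or written back to a spreadsheet. Given the mistakes LLMs make, spot-checking a sample of the returned matches is advisable.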
As you can see, Clause9/ClauseBuddy isn’t really involved in this flow.