I’m really not sure where to start with this one, which might ultimately involve hiring a developer, but thought I’d check here first. My client is an apparel company and has a shop management program. The program supports EDP and accepts text files containing order data. Customers currently send in PDF files that are always formatted the same way.

Right now, a user has to sit there and copy and paste the data from the PDF into the shop management software, which can take an hour or more for a large order. I’d say CSRs spend an average of 2 to 4 hours a day just copying and pasting data from PDFs into the shop management software.

I’m looking for a way to extract the text from a PDF and then format it for the program’s EDP. Even if we could get half of it in there, it would still be a tremendous time savings.

This is an example of the EDP format, which looks pretty straightforward.

---- Start Product ----

PartNumber: 5180
PartColorRange: Whites
PartColor: White
cur_UnitPriceUserEntered: 8.77
OrderInstructions: Make sure this is printed properly. Last time we had errors with this.

Size01_Req:
Size02_Req:
Size03_Req: 22
Size04_Req: 13
Size05_Req:
Size06_Req:

sts_Prod_Product_Override: 0
PartDescription:
cur_UnitCost:
sts_EnableCommission:
id_ProductClass:

---- End Product ----

Is there software that will allow me to read a PDF and teach it how to create the output file?

Here is what the PO looks like that the customer submits.

Thanks so much for the help!

Yes, there are a lot of programs out there to extract text from PDFs.

If copy/paste works, the PDF already has a real text layer, so the OCR mess isn’t required.

How you come at this depends on your skills. I’d be tempted to do PDF → text, then massage the results in Python or PowerShell.
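To give a feel for the "massage the results" step, here is a minimal Python sketch. It assumes the PDF has already been converted to plain text (for example with Poppler's `pdftotext -layout`), and it assumes a made-up line layout of part number, color, unit price, then one quantity per size column. I can't see the actual PO, so the regex and the function names `parse_po_line` and `to_edp` are purely hypothetical and would need adjusting to the real document. It also only writes a few of the fields from the example; whether the program needs the rest is something to check.

```python
import re

# Hypothetical layout of one PO line after PDF-to-text conversion:
#   <part number> <color> <unit price> <qty size 1> <qty size 2> ...
# The real PO will differ; adjust this pattern to match it.
LINE_RE = re.compile(
    r"^(?P<part>\S+)\s+(?P<color>[A-Za-z]+)\s+"
    r"(?P<price>\d+\.\d{2})\s+(?P<qtys>\d+(?:\s+\d+)*)$"
)


def parse_po_line(line):
    """Return a dict for a line that looks like an order line, else None."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "PartNumber": m.group("part"),
        "PartColor": m.group("color"),
        "cur_UnitPriceUserEntered": m.group("price"),
        "sizes": [int(q) for q in m.group("qtys").split()],
    }


def to_edp(item, size_slots=6):
    """Render one parsed item as a Start/End Product block."""
    out = ["---- Start Product ----", ""]
    for key in ("PartNumber", "PartColor", "cur_UnitPriceUserEntered"):
        out.append(f"{key}: {item[key]}")
    out.append("")
    for slot in range(1, size_slots + 1):
        # Missing or zero quantities are left blank, as in the example.
        qty = item["sizes"][slot - 1] if slot <= len(item["sizes"]) else 0
        label = f"Size{slot:02d}_Req:"
        out.append(f"{label} {qty}" if qty else label)
    out += ["", "---- End Product ----"]
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "5180 White 8.77 0 0 22 13 0 0"  # invented sample line
    item = parse_po_line(sample)
    if item is not None:
        print(to_edp(item))
```

Lines that don't match the pattern come back as `None`, so headers, totals and other noise can be skipped and anything unexpected can be flagged for a human to key in by hand.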

If you’d like a more end-user-friendly process, you could look into something like Paperless-ngx (https://docs.paperless-ngx.com). Depending on the time you have to invest, you may be able to automate an awful lot of that process…

I’m sure there are quite a few commercial options in the space too, but I have no experience with any of them.