How to process PDF documents?

I need some suggestions on processing PDF documents. These files are yearly declarations and include quantities and dollar figures that I have to fix up.

I saw some suggestions on:

1) iTextSharp
2) PDFBox (IKVM)
3) PDFSharp
4) PDFEdit API (from Adobe)

Which of these would you recommend, and are there any limitations that I should know about? Open source is fine, but I do not mind paying for a commercial product as long as it is well supported and fully featured.

The PDFs are all produced by the same third-party vendor, but they do not all share the same structure: there are about 10 different layouts (design templates).

I do not need to write PDFs, only read them.

You might also look at PDFText. We use it in many cases to extract raw data from PDF files. The developer also offers other inexpensive libraries that help with other aspects of working with PDFs.

This assumes that the document is not scanned and contains text that can be extracted.

We chose it for another project that had to pull data from thousands of PDF files from various sources. It ended up being the only library that handled all of the files, in particular the ones from Oracle XML Publisher, which were all malformed. Since it worked so well, we rely on it whenever we need PDF text extraction, and we have written a whole set of wrappers around it to pull text from different zones and so on. For the price, we find it extremely useful. Support from the developer has been great too.

Have a look at http://www.pdftron.com/. We use it to both read and write PDF files, and it has been very reliable.

I am trying to open PDF files in Adobe Reader using C#'s Process.Start().

However, paths and PDF file names containing spaces do not open. When I supply a path without spaces, it works fine.

This is my code:
Button btn = (Button)sender;
ProcessStartInfo info = new ProcessStartInfo();
info.FileName = "AcroRd32";
string s = btn.Tag.ToString();
//btn.Tag Contains the full file path
info.Arguments = s;
Process.Start(info);

I have gone through several questions related to this topic on SO, but none of the answers work for me, as I cannot work out how to use the @ prefix with my string s.

Just a little trick: if there is a default PDF reader set on the client, simply use the file name as the FileName of the process. Usually you do not care which program is used, so this solution just works.
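A minimal sketch of that approach, assuming a Windows client with a PDF file association. The class and method names (PdfLauncher, BuildShellOpen) are made up for illustration. UseShellExecute is set explicitly because its default is true on .NET Framework but false on .NET Core and later.

```csharp
using System.Diagnostics;

public static class PdfLauncher
{
    // Hypothetical helper: builds start info that asks the OS shell to open
    // the file with whatever application is registered for its extension.
    public static ProcessStartInfo BuildShellOpen(string pdfPath)
    {
        return new ProcessStartInfo
        {
            // The document path itself is the "program"; no quoting is
            // needed here because it is not parsed as a command line.
            FileName = pdfPath,
            UseShellExecute = true
        };
    }
}
```

In the click handler from the question this would be used as Process.Start(PdfLauncher.BuildShellOpen(btn.Tag.ToString()));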

You should wrap the path supplied in the argument list in quotes. This causes it to be treated as a single argument instead of several space-separated arguments.
info.Arguments = "\"" + s + "\"";
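On the @ prefix mentioned in the question: it only changes how a string literal is written in source code (backslashes are not treated as escapes), so it has no effect on a value that arrives at runtime from btn.Tag. A small sketch to illustrate; the sample path and the QuoteArg helper name are assumptions, not part of the original code.

```csharp
using System;

public static class PathArgs
{
    // Hypothetical helper: wraps a runtime path in double quotes so the
    // launched program receives it as one argument.
    public static string QuoteArg(string path)
    {
        return "\"" + path + "\"";
    }

    public static void Main()
    {
        // Two ways of writing the same literal; the runtime strings are identical.
        string verbatim = @"C:\My Docs\report 2010.pdf";
        string escaped = "C:\\My Docs\\report 2010.pdf";
        Console.WriteLine(verbatim == escaped);   // prints True

        // The quoting, not the @ prefix, is what keeps the spaces together.
        Console.WriteLine(QuoteArg(verbatim));    // prints "C:\My Docs\report 2010.pdf" including the quotes
    }
}
```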
