View Full Version : How do you stop pdf docs being uploaded?


smitho
04-03-2008, 11:55 AM
Is there a way to tell if a pdf has more than one page and if so stop it from being uploaded?

davidj
04-03-2008, 02:23 PM
This is really difficult to do, as the actual file is parsed when it is opened, which creates the (virtual) pages.

There may be a marker in the file which denotes pages, but I'm not 100% sure.
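A rough sketch of the "marker" idea: in a PDF, each page is normally described by an object carrying a /Type /Page entry, so counting those entries gives an estimate of the page count. This is only a heuristic. It assumes the page objects are stored uncompressed (newer PDFs can pack them into compressed object streams, in which case the count comes out too low), and it needs a grep that supports -a and -o (GNU and BSD grep do). The function name count_page_markers is made up for this example.

```shell
# Heuristic sketch: count "/Type /Page" markers in a PDF file.
# Assumption: page objects are not hidden inside compressed object streams.
count_page_markers() {
    # -a treats the binary file as text, -o prints every match on its own line.
    # ([^s]|$) skips the "/Type /Pages" page-tree nodes, which are not pages.
    grep -a -o -E '/Type[[:space:]]*/Page([^s]|$)' "$1" | wc -l | tr -d ' '
}

# Hypothetical usage (upload.pdf is a placeholder name):
#   count_page_markers upload.pdf
```

A result above 1 suggests a multi-page document; a result of 0 most likely means the markers are compressed and the file needs a real PDF tool instead.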

student101
04-08-2008, 08:46 PM
smitho wrote: "Is there a way to tell if a pdf has more than one page and if so stop it from being uploaded?"

When you say uploaded, do you mean viewed or opened in the browser?
I may not be on the same level as you and DavidJ but I don't understand.

Cheers

smitho
04-09-2008, 01:46 AM
Give davidj's tut a go and you'll see what I mean.

When uploading a file to the server there is some information available to you. You can get info on the name, type etc...

Array
(
    [uploadedfile] => Array
        (
            [name] => Test.pdf
            [type] => application/pdf
            [tmp_name] => /private/var/tmp/php8aPrc2
            [error] => 0
            [size] => 1855806
        )

)

http://www.dreamweaverclub.com/vtm/uploading-files.php

student101
04-09-2008, 07:02 AM
I did it ages ago.
Here are the results, on page 3 of the thread:

http://www.dreamweaverclub.com/forum/showthread.php?t=27547&page=3

I am not sure that it's possible to find out how many pages are in a pdf file,
as it's a compiled file - hence the name "portable document format".

Cheers

smitho
04-09-2008, 12:53 PM
Thanks student101,

I had a feeling it might be asking a bit much at the upload stage. I know that once the file is uploaded I could probably do a couple of checks and tell if it's a multi-page doc or a single page, but it would have been nice to stop the user there and then.

Cheers.

student101
04-09-2008, 01:07 PM
I never thought there would be a question like yours though.
Why do you want it to stop on one page?

You could write your own PDF file checker / creator:
have a form where users input the text, and your form converts it to a PDF file.

Cheers

pete
04-21-2008, 02:38 PM
There is a way to find the number of pages, but only after the file is uploaded. ImageMagick can do it; this will return the number of pages:

identify -format %n mypdf.pdf

Unfortunately, on larger pdfs it takes a long time and gobbles up all the CPU. On a 5 page pdf it is quite quick, but on a 200 page pdf I had to quit because my machine was heating up and the fan kicked in.

student101
04-21-2008, 02:45 PM
This ImageMagick seems quite interesting.
I must check it out.

Cheers

pete
04-21-2008, 02:55 PM
You can hide secret messages (comments) in images that nobody can see unless they know how, most people don't.

As an example, I have an image called mytest.jpg. To add a comment, you do this:


convert -comment "This is secret text for student101" mytest.jpg mytest.jpg

Then to see that in the image do this:


identify -verbose mytest.jpg
which outputs:


mytest.jpg JPEG 155x200 DirectClass 21kb 0.010u 0:01
Image: mytest.jpg
  Format: JPEG (Joint Photographic Experts Group JFIF format)
  Geometry: 155x200
  Class: DirectClass
  Type: TrueColor
  Endianess: Undefined
  Colorspace: RGB
  Depth: 8 bits
  Channel depth:
    Red: 8-bits
    Green: 8-bits
    Blue: 8-bits
  Channel statistics:
    Red:
      Min: 0 (0)
      Max: 255 (1)
      Mean: 139.77 (0.548118)
      Standard deviation: 89.2771 (0.350106)
    Green:
      Min: 0 (0)
      Max: 255 (1)
      Mean: 124.339 (0.487605)
      Standard deviation: 81.4909 (0.319572)
    Blue:
      Min: 0 (0)
      Max: 255 (1)
      Mean: 108.416 (0.425159)
      Standard deviation: 83.0061 (0.325514)
  Colors: 7287
  Rendering-intent: Undefined
  Resolution: 72x72
  Units: PixelsPerInch
  Filesize: 21kb
  Interlace: None
  Background Color: white
  Border Color: #DFDFDF
  Matte Color: grey74
  Dispose: Undefined
  Iterations: 0
  Compression: JPEG
  Quality: 100
  Orientation: Undefined
  Comment: This is secret text for student101
  JPEG-Colorspace: 2
  JPEG-Sampling-factors: 2x2,1x1,1x1
  Signature: 837b42b5d05a77f11976c5a18f841f730e08cfe70e15c2838d1d1727c4c894bf
  Tainted: False
  User Time: 0.010u
  Elapsed Time: 0:01
  Version: ImageMagick 6.1.8 03/22/06 Q16 http://www.imagemagick.org

So you can run through a whole directory of images and add "Copyright mysite.co.uk". Then, if someone steals an image, you can check whether it was yours by finding that comment in the image. Obviously people can overwrite it, but you get the idea.

pete
04-21-2008, 03:35 PM
Just found another method. If you do something like this:

convert -thumbnail 200\>x200\> smarty.pdf[2000] mytest.jpg

It returns:

Requested FirstPage is greater than the number of pages in the file: 190
No pages will be processed (FirstPage > LastPage).
convert: Postscript delegate failed `smarty.pdf'.

So that gives the number of pages in the file (e.g. 190) without frying the machine. It is therefore possible to write code to extract the number, although I have probably missed a command that returns the number of pages directly. I'm sure there is a way; the above clearly shows ImageMagick can spew out the number of pages quickly.
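As a rough sketch of "code to extract the number", the error text can be piped through sed to pull out the digits after "number of pages in the file:". The function name pages_from_error is made up for this example, and the approach assumes the message wording stays exactly as shown above, which may differ between versions. It also only works when the requested page index is larger than the real page count.

```shell
# Sketch: read the captured error text on stdin and print the page count.
# Prints nothing if the expected message is not present.
pages_from_error() {
    sed -n 's/.*number of pages in the file: \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical usage (needs ImageMagick with its PostScript delegate installed):
#   convert -thumbnail 200\>x200\> 'smarty.pdf[2000]' mytest.jpg 2>&1 | pages_from_error
```

The 2>&1 is there because the message may arrive on stderr rather than stdout.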

smitho
04-22-2008, 01:15 AM
Thanks pete,

When you mentioned that ImageMagick will convert the first page of a pdf, I thought there might be a way to tell if the pdf has multiple pages. Could you try to output page 2 of the pdf, and if that works, it would mean it is a multi-page pdf? If you tried to output the second page of a single-page pdf, would you just get a white page or an error?

pete
04-22-2008, 09:51 AM
smitho wrote:
"Thanks pete,

When you mentioned that ImageMagick will convert the first page of a pdf, I thought there might be a way to tell if the pdf has multiple pages. Could you try to output page 2 of the pdf, and if that works, it would mean it is a multi-page pdf? If you tried to output the second page of a single-page pdf, would you just get a white page or an error?"

It would return this error:

Requested FirstPage is greater than the number of pages in the file: 190
No pages will be processed (FirstPage > LastPage).
convert: Postscript delegate failed `smarty.pdf'.



So modify your flow based on the above output.
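One way that flow might look, as a sketch only: request the second page and treat the "FirstPage is greater" message as the sign of a single-page pdf. The function name classify_probe and the file names in the usage comment are made up for this example, and it assumes ImageMagick page indexes start at 0, so [1] means the second page. Any other failure (a corrupt file, for instance) is classed as "multi" here, which errs on the side of rejecting the upload.

```shell
# Sketch: read the output of a "give me page 2" request on stdin.
# Prints "single" if the delegate says page 2 does not exist, otherwise "multi".
classify_probe() {
    if grep -q 'FirstPage is greater than the number of pages'; then
        echo single
    else
        echo multi
    fi
}

# Hypothetical usage on an already uploaded file:
#   result=$(convert 'upload.pdf[1]' probe.jpg 2>&1 | classify_probe)
#   [ "$result" = single ] || rm -f upload.pdf    # reject multi-page uploads
```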