VeraPdf does not remove Temporary Files #1048

Closed
manuelpache opened this issue Jan 7, 2020 · 4 comments

manuelpache commented Jan 7, 2020

Hi,
I have a problem with the veraPDF PDFAParser: it does not delete the temp files of the validated PDF/A document. Closing the parser manually does not help. The problem occurs both locally on macOS and on our Linux servers.

The Dependency:

<dependency>
  <groupId>org.verapdf</groupId>
  <artifactId>validation-model</artifactId>
  <version>1.14.104</version>
</dependency>

The Code:

VeraGreenfieldFoundryProvider.initialise();
PDFValidatorAnswer answer = new PDFValidatorAnswer();
PDFAFlavour flavour = PDFAFlavour.valueOf(pdfNorm);
PDFAValidator validator = Foundries.defaultInstance().createValidator(flavour, false);

// The parser is opened in try-with-resources, so it is closed automatically.
try (PDFAParser parser = Foundries.defaultInstance().createParser(new ByteArrayInputStream(Base64.getDecoder().decode(base64String)), flavour)) {
  ValidationResult result = validator.validate(parser);

  answer.setIsCompliant(result.isCompliant());
  answer.setMessages(result.getTestAssertions().stream().map(TestAssertion::getMessage).collect(Collectors.toList()));
} catch (ValidationException | EncryptedPdfException | ModelParsingException | IOException e) {
  answer.addMessagesItem(e.getMessage());
}

If you require any further information, let me know.
Thanks in advance

**I was able to narrow the problem down a bit. The temp file is deleted correctly if the validation ends without problems. While debugging the code, I found that the InternalInputStream::closeResource() method is the problem. If there are problems during validation, the numOfFileUsers variable never reaches 0, so the file is never deleted. I hope this helps a little.**
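To illustrate the mechanism described above, here is a minimal sketch of a reference-counted temp file. It is not veraPDF's actual code: the class `RefCountedTempFile` and its methods are hypothetical, and only the counter name `numOfFileUsers` is borrowed from the description. It shows why the file survives when one user never releases it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration only; NOT veraPDF's InternalInputStream. */
public class RefCountedTempFile {
    private final Path file;
    // Counter name borrowed from the issue text; everything else is assumed.
    private int numOfFileUsers = 0;

    public RefCountedTempFile(String prefix) throws IOException {
        this.file = Files.createTempFile(prefix, ".tmp");
    }

    /** Registers one more user of the backing file. */
    public synchronized void acquire() {
        numOfFileUsers++;
    }

    /** Unregisters a user; the file is deleted only when the count hits 0. */
    public synchronized void release() throws IOException {
        if (numOfFileUsers > 0 && --numOfFileUsers == 0) {
            Files.deleteIfExists(file);
        }
    }

    public boolean exists() {
        return Files.exists(file);
    }

    public Path path() {
        return file;
    }

    public static void main(String[] args) throws IOException {
        // Balanced acquire/release: the file is removed.
        RefCountedTempFile balanced = new RefCountedTempFile("balanced");
        balanced.acquire();
        balanced.acquire();
        balanced.release();
        balanced.release();
        System.out.println("balanced still exists: " + balanced.exists());

        // One user never releases: the counter stays at 1 and the file remains.
        RefCountedTempFile leaked = new RefCountedTempFile("leaked");
        leaked.acquire();
        leaked.acquire();
        leaked.release();
        System.out.println("leaked still exists: " + leaked.exists());

        Files.deleteIfExists(leaked.path()); // manual cleanup for the demo
    }
}
```

In this sketch, a single skipped `release()` (for example, a stream whose `close()` returns early on an error path) is enough to keep the file on disk.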

@bdoubrov
Contributor

We were not able to reproduce the issue with your code and a random set of documents. As soon as the parser is closed, all temp files are correctly removed.

Would you please provide more details:

  • if the issue is reproduced only on some of the documents, it would be great if you could share any of these
  • if the issue is reproduced on your side for any PDF document, then please provide more info on your runtime environment
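One way to collect such details is to compare the contents of the JVM temp directory before and after a validation run. The following is a rough sketch under stated assumptions: the class `TempDirDiff` and its methods are hypothetical helpers, not part of veraPDF, and the validation call itself is left as a placeholder comment.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for spotting files left behind in a directory. */
public class TempDirDiff {

    /** Returns the current direct children of the given directory. */
    public static Set<Path> snapshot(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.collect(Collectors.toSet());
        }
    }

    /** Returns entries present in 'after' but not in 'before', sorted. */
    public static Set<Path> newEntries(Set<Path> before, Set<Path> after) {
        Set<Path> diff = new TreeSet<>(after);
        diff.removeAll(before);
        return diff;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        Set<Path> before = snapshot(tmp);

        // Placeholder: run the validation code under test here.

        Set<Path> after = snapshot(tmp);
        for (Path p : newEntries(before, after)) {
            System.out.println("left behind: " + p);
        }
    }
}
```

Other processes may also write to the temp directory, so the result can contain unrelated files; running on an otherwise idle machine reduces that noise.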

@bdoubrov
Contributor

Could you provide a PDF document with which this issue can be reproduced?

nico1a commented Feb 14, 2020

Attached are two test documents. One passes validation, the other does not. Temp files remain with both documents.

As far as I understand the problem, the buffers in the class ASMemoryInStream are not closed correctly (buffer = zero; in the close() method is never reached). I think this causes the variable numOfFileUsers not to be decremented correctly and the tmp-files remain.

@bdoubrov
Copy link
Contributor

The issue is fixed in the current dev version (1.15.33 - see https://software.verapdf.org/develop/1.15/verapdf-greenfield-1.15.33-installer.zip) and will be included in the upcoming stable version 1.16.
