Invalid files from FileUpload fields stored in PersistentResource table

Hi there,

I recently noticed that the PersistentResource table of one Neos installation conaints over 100000 rows. Some of these files contain SQL commands, which look like an attempted SQL injection. Luckily it does not work, because the SQL commands are never executed, but the files are kept anyway.

I think the reason for these files is a public form with a FileUpload field. The field has a FileTypeValidator with some allowedExtensions and the form also has a captcha field.

The problem is: If I select an invalid file (i.e. evil-sql.txt file, where txt is not in the list of allowed extensions) and submit the form, the file is added to the PersistentResource table and only after that validated. So if any validator adds an error (regardless if it is the captcha validator or the file validator) the file evil-sql.txt is still saved as a PersistentResource.

I looked at the FileTypeValidator class and I am wondering how I could avoid creating the PersistentResources if the validation fails. Since the isValid method already expects a PersistentResource, I think that the PersistentResource is already persisted before it has been validated.

Is this a “by design” issue?
Can I work around this problem?

I thought about deleting all PersistentResources which are not referenced by any Asset or Thumbnail, but what if another extension references PersistentResources? I might delete files that are still referenced (depending on the foreign key definition).

Regards
Leif

Hi Leif

We had exactly the same problem. There is an ongoing discussion in Slack containing references to pull requests.

I created a Gist with my temporary workaround. As the Image Upload field (if used by you) is also affected, it also includes the relevant code to adjust.

1 Like

Regarding cleanup: In the installation affected, a lot of files had the same filename, therefore our approach was to identify the files to delete by their name:

SELECT filename, COUNT(filename) AS counter
FROM neos_flow_resourcemanagement_persistentresource
GROUP BY filename
ORDER BY counter DESC;

From the result, we extracted a list of affected filenames and evaluated which files need to be deleted in the file system:

SELECT sha1, COUNT(*) AS counter
FROM neos_flow_resourcemanagement_persistentresource
WHERE filename IN ('1', 'e''''e', '1''', '"', '(select convert(int,CHAR(65)))', '?''?', 'JyI=', 'Applet5692.jar', 'Applet9016.class', 'Applet5038.class', 'Applet7975.jar')
OR filename LIKE "applet%"
GROUP BY sha1
ORDER BY counter DESC;

We saved this information (the affected SHA1 hashes) in a file called infected_resources.txt, and then executed a script that copied possibly “infected” resources to an own folder (for backup purposes) while deleting them (and their symlinks) in the Persistent Resources directory:

#!/bin/bash

resourcesFolder=./Data/Persistent/Resources/
webFolder=./Web/_Resources/Persistent/
backupFolder=./Data/InfectedResources/

mkdir ${backupFolder}

cat infected_resources.txt | while read R; do
    echo "Deleting ${R}"
    resource=${resourcesFolder}${R:0:1}/${R:1:1}/${R:2:1}/${R:3:1}/${R}
    cp ${resource} ${backupFolder}
    rm ${resource}
    rm -rf ${webFolder}${R}
done

Then we ran ./flow resource:clean to remove all database entries that don’t have any associated files anymore (those we deleted).

I’m posting this just as an inspiration - maybe your scenario is different but it might be useful.

1 Like

Hi Lorenz,

thanks for all the detailed information. I was thinking about the same workaround, but I do not like the idea of a validator deleting files. Also I would have to delete the files in other validators aswell. If the captcha is invalid, I need to discard the files too.

In the meantime I came up with an idea to delete bad files in the future. I am going to create a separate resource collection for file uploads and a resource target, that does not make the files public (by not linking anything into the public directory):

Neos:
  Flow:
    resource:
      storages:
        myUploadsStorage:
          storage: 'Neos\Flow\ResourceManagement\Storage\WritableFileSystemStorage'
          storageOptions:
            path: '%FLOW_PATH_DATA%Persistent/MyUploads/'

      targets:
        myNullTarget:
          target: 'My\Site\Resource\NullTarget'

      collections:
        myUploads:
          storage: 'myUploadsStorage'
          target: 'myNullTarget'

The NullTarget simply doesn’t do anything:

<?php

namespace Teha\Site\Resource;

use Neos\Flow\ResourceManagement\CollectionInterface;
use Neos\Flow\ResourceManagement\PersistentResource;
use Neos\Flow\ResourceManagement\Target\TargetInterface;

class NullTarget implements TargetInterface
{

    public function getName()
    {
        return 'Null';
    }

    public function publishCollection(CollectionInterface $collection)
    {
    }

    public function publishResource(PersistentResource $resource, CollectionInterface $collection)
    {
    }

    public function unpublishResource(PersistentResource $resource)
    {
    }

    public function getPublicStaticResourceUri($relativePathAndFilename)
    {
        return null;
    }

    public function getPublicPersistentResourceUri(PersistentResource $resource)
    {
        return null;
    }
}

The NullTarget is useful for my usecase, because we just want to send the uploaded files as email attachments and explicitly do not want them to be saved and not be made public via a URL.

I can now use the new collection myUploads in the FormFactory:

$upload = $page->createElement("file_0", 'Neos.Form:FileUpload');
$upload->setProperty('resourceCollection', 'tehaUploads');

I guess I will create a Cronjob to cleanup all files older than x hours from the myUploads collection. This seems to be safer, than to delete all files without references.

As for the one time cleanup, I see the same thing: Lots of duplicate files. I can identify them easily, because all files I need are currently references by the media assets table.

Anyway, thanks for you info.
I think we could use some alternative to the FileUpload field, that won’t create persistent resources, but only temporary PHP uploads, until we really save the files explicity.

Regards
Leif

A totylly different approach would be to simply not import uploaded files as a resource but store them in a cache instead. Played with that some time ago but did not finish back then. I now took some time to dig out my old experiments and wrap it up.

https://github.com/PackageFactory/PackageFactory.CachedFileUploads … totally exprimental

Consists of:

  • A serializable UploadedFile Object
  • An UploadedFileCache (persistent, 1 day)
  • A type converter from FlowUploadedFile to the serializable UploadedFile … that uses the cache internally for resubmissions
  • A UploadedFileValidator
  • Some example code for Fusion.Form in the readme … would have to be adapted for classic Forms