File galleries still lack several desirable features, including alternate storage engines, the use of file galleries for attachments across Tiki, and customizable views. Some features are partially available, such as uploading files to public locations to avoid the overhead of permission checking for files that are part of the site itself.
To see what was possible and to explore the current code, some low-level refactoring was performed. It included:
- Conversion of simple SQL queries to helper functions
- Replacement of low-level file access with the higher-level functionality available in PHP5
- Extraction of common patterns to functions to remove duplication
- Extraction of file gallery related functions from tikilib
- Extraction of some logic from root PHP files to the libraries
These changes left the code in substantially better condition, with the library at just over 3000 lines of code. However, the process highlighted several flaws:
- Incorrect handling of file upload
  - Depending on which entry point is used, different validation and handling may be used
  - File validation relies mostly on what is provided by the browser (mime type, extension) rather than inspecting the file
  - Exceptions are hard-coded (and arbitrary) rather than configured
- Lack of design
  - Some valid handling was coded for a narrow use case rather than being generalized
  - Confusion between view and storage
  - Confusion between file and metadata
Together, these factors lead to inconsistencies in how Tiki behaves, breaking user expectations. Cleaner rules are needed to identify which files are valid and where they should be stored. One of the major issues during refactoring was the widely duplicated conditions used to decide where podcast gallery files should be stored. While this is mostly resolved at this time, there are other cases where such duplication occurs, and other places where the check should be made but is not.
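As an illustration of validating by inspecting the file rather than trusting the browser, here is a minimal sketch in Python. It is not Tiki code: the signature table, the function names and the idea of passing the allowed types in from configuration are all assumptions made for the example. In PHP, the fileinfo extension would normally play the role of `detect_type`.

```python
# Illustrative only: the signature table and function names are assumptions,
# not Tiki's actual validation rules.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
]


def detect_type(content: bytes) -> str:
    """Guess the type from the leading bytes of the file itself."""
    for signature, mime in MAGIC_SIGNATURES:
        if content.startswith(signature):
            return mime
    return "application/octet-stream"


def validate_upload(content: bytes, allowed_types: set) -> str:
    """Return the detected type, or raise ValueError if it is not allowed.

    The browser-supplied mime type and extension are deliberately not
    consulted, and allowed_types is expected to come from configuration
    rather than being hard-coded at each entry point.
    """
    detected = detect_type(content)
    if detected not in allowed_types:
        raise ValueError(f"type {detected} is not allowed")
    return detected
```

Because every entry point would call the same function with the same configured set, a file renamed to `photo.png` but containing something else is rejected identically everywhere.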
There are multiple entry points, which are unavoidable and desired, including file upload, batch upload, attachment upload, WebDAV upload, batch import from a directory, and so on. Right now, each of them handles the full path through to storage on its own. The initial refactoring introduced some helpers to handle the common parts, but the differences in handling could not be unified without breaking some existing behavior, however wrong that behavior is.
The code needs to transition to a state where only the input is identified and then moved along to a single code path to handle validation:
- Identify input as a wrapper
- Identify destination
- Collect file information (using fileinfo when available)
- Perform validation
- Store the file and reference
- Store meta-data (optional)
- Fill missing meta-data (default names and such)
- Index content
- Handle various notifications, watches, ...
This transition is a significant refactoring effort that will cause some validation rules to change.
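The single code path listed above could look roughly like the following Python sketch. Every name in it (`UploadInput`, `FilePipeline`, the `from_...` helpers, the size limit) is hypothetical, and the in-memory dictionary merely stands in for real storage. The point is only the shape: entry points build a wrapper, and everything after that is shared.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class UploadInput:
    """Wrapper produced by every entry point (hypothetical structure)."""
    filename: str
    content: bytes
    gallery_id: int  # the destination, identified by the entry point
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class StoredFile:
    file_id: int
    gallery_id: int
    filename: str
    size: int
    metadata: Dict[str, str]


class FilePipeline:
    def __init__(self, max_size: int,
                 on_stored: Optional[List[Callable[[StoredFile], None]]] = None):
        self.max_size = max_size
        self.on_stored = on_stored or []  # indexing, notifications, watches
        self.files: Dict[int, StoredFile] = {}  # stand-in for real storage

    def handle(self, upload: UploadInput) -> StoredFile:
        # Collect file information from the content itself.
        size = len(upload.content)
        # Perform validation in exactly one place.
        if size == 0:
            raise ValueError("empty file")
        if size > self.max_size:
            raise ValueError("file too large")
        # Keep optional metadata and fill the gaps with defaults.
        metadata = dict(upload.metadata)
        metadata.setdefault("name", upload.filename)
        # Store the file and its reference.
        file_id = len(self.files) + 1
        stored = StoredFile(file_id, upload.gallery_id, upload.filename,
                            size, metadata)
        self.files[file_id] = stored
        # Index content and send notifications through registered hooks.
        for hook in self.on_stored:
            hook(stored)
        return stored


def from_form_upload(filename, content, gallery_id, name=None):
    """Entry point: single upload through a form that may carry a name."""
    metadata = {"name": name} if name else {}
    return UploadInput(filename, content, gallery_id, metadata)


def from_directory_import(path, content, gallery_id):
    """Entry point: batch import, where no form data exists."""
    return UploadInput(path.rsplit("/", 1)[-1], content, gallery_id)
```

In this arrangement, adding a new entry point means writing one more small `from_...` function; validation, default names and notifications cannot drift apart because they exist only once.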
File galleries serve multiple purposes at this time:
- They define physical properties of the galleries, such as the number of files that can be contained, disk quotas, whether revisions will be kept and, depending on the gallery type, where files will be stored.
- They define view properties such as the thumbnail size, displayed information and templates.
- They serve as the primary means of navigation through the hierarchy, replicating a typical filesystem.
All of these concepts serve different users and could be maintained separately. By using categories as the means for organization and navigation, a regular filesystem could be replicated, except that files could naturally live in multiple locations to serve different navigation requirements. The user would simply select the desired view, like a thumbnail view, just like it is done in operating systems. These views could be configured by administrators, but would remain independent from the individual galleries.
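To make the idea concrete, the following Python sketch shows files reachable through several categories, with the view chosen independently of where the file sits. The class, the category paths and the two views are invented for the example and do not describe Tiki's category API.

```python
from collections import defaultdict


class CategoryIndex:
    """Maps category paths to file ids; a file may appear under several."""

    def __init__(self):
        self._files_by_category = defaultdict(set)

    def categorize(self, file_id: int, category: str) -> None:
        self._files_by_category[category].add(file_id)

    def list_files(self, category: str) -> list:
        return sorted(self._files_by_category.get(category, set()))


# Views are configured once and apply to any category listing
# (hypothetical view names and output formats).
VIEWS = {
    "list": lambda names: "\n".join(names),
    "thumbnails": lambda names: " ".join(f"[{name}]" for name in names),
}
```

A logo categorized under both `Projects/Website` and `Marketing/Logos` shows up in both listings without being copied, and the user can switch either listing between `list` and `thumbnails`.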
This would leave the physical properties to file galleries, leaving them as mere partitions determining where files are stored and the available capacity, a tool for administrators to handle system requirements. Files could be migrated between galleries over time without affecting the navigation. For example, old files that are not accessed frequently could be stored on a remote SAN with slower access times, but higher storage capacity. Public files could be stored directly in a gallery where the directory is web-accessible, avoiding the PHP overhead for images on the site. By detecting this at link creation, the correct link could be built when using the appropriate plugin.
This model is a significant change from the current implementation and is not without migration challenges. One of the earliest changes that could be made is to remove the view properties from the galleries and introduce a separate view concept.
In line with the effort to reduce the number of code paths, the file gallery should probably be able to define, optionally, the location where its files are stored, removing the condition that is currently based on the file type.
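A gallery that optionally carries its own storage location, and that knows whether that location is web-accessible, could be sketched as follows in Python. The `Gallery` fields, the default directory and the `download.php` URL are placeholders chosen for the example, not Tiki's schema or entry points.

```python
from dataclasses import dataclass
from typing import Optional
import posixpath

DEFAULT_STORAGE_DIR = "/var/tiki/files"  # hypothetical default location


@dataclass
class Gallery:
    gallery_id: int
    storage_dir: Optional[str] = None  # None means "use the default store"
    public_url: Optional[str] = None   # set only if storage_dir is web-accessible


def storage_path(gallery: Gallery, stored_name: str) -> str:
    """Single rule for where a file lives, driven by the gallery alone."""
    base = gallery.storage_dir or DEFAULT_STORAGE_DIR
    return posixpath.join(base, stored_name)


def file_link(gallery: Gallery, file_id: int, stored_name: str) -> str:
    """Build a direct link for public galleries, a script link otherwise."""
    if gallery.public_url:
        return gallery.public_url.rstrip("/") + "/" + stored_name
    # Placeholder for a permission-checking download script.
    return f"download.php?fileId={file_id}"
```

With this shape, moving old files to slower storage or serving public images directly becomes a matter of gallery configuration, and no caller needs to test the file or gallery type.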
One aspect where the code must make different decisions concerns the meta-data surrounding files. When uploading a single file, adding a name and a description field comes naturally, but when multiple files are uploaded, or files are uploaded without a form, those fields are left empty and different behaviors may apply. More broadly, a user may question the usefulness of having a separate name and description at all. Once the content is indexed, those serve a much smaller purpose. Depending on the usage, alternate meta-data may be preferable altogether.
Files should only carry the information relevant to the file itself. Other properties could be deferred to trackers, perhaps using specific values per file type.
If tracker attachments were stored in file galleries and benefited from the complete indexing capabilities, the result with separated concerns would not be much different. Files that do not need meta-data could simply use file galleries; those that do would go through trackers. Some user interface could be added to attach meta-data to an existing file, essentially creating a new tracker item and automatically adding the attachment.
Similarly, other attachments could be treated the same way. In order to make the attachments available through WebDAV, some category synchronization between the container object and the attachment would be required, or at least user interfaces to allow selecting them.