November 19th, Strasbourg

Louis-Philippe Huberdeau Thursday 18 November, 2010

Still in a hotel lobby. I hope this won't become a recurring theme. On the road from Berlin to Strasbourg, I figured out a way to reuse an existing piece of abstraction to achieve a design objective I was struggling with.

The objective was quite simple. The indexer gathers a whole lot of information and stores it in various fields to be indexed. Depending on the type, some of those fields can be retrieved, some cannot. You can configure the index to store a copy of all the data, but that has a huge impact on the index size and memory requirements. Zend_Search_Lucene is a PHP implementation and comes with several limitations. Some fields are transformed to allow indexing anyway and cannot be reverted to their original form.

The objective was to be able to retrieve the information from the database on the fly for the results. Essentially, it's the same work the content sources and global sources do. The issue was that the format was not quite right. The sources return the data encapsulated in objects indicating their type, to allow different indexes to index them optimally. For example, multi-value fields like those used to index categories become a string of individual tokens generated by hashing values and replacing numbers, because Lucene does not seem to like those.
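To make the multi-value transformation concrete, here is a minimal sketch of the idea in Python rather than PHP. It is not the actual indexer code: the function names, the choice of MD5 and the digit-to-letter mapping are all assumptions made for illustration. It does show why the original values cannot be read back out of the index.

```python
import hashlib

# Hypothetical mapping: digits 0-9 become letters g-p, so that a token
# contains letters only. The real implementation may differ.
DIGIT_TO_LETTER = str.maketrans("0123456789", "ghijklmnop")


def value_to_token(value):
    """Hash one value, then replace every digit in the hex digest."""
    digest = hashlib.md5(str(value).encode("utf-8")).hexdigest()
    return digest.translate(DIGIT_TO_LETTER)


def multivalue_to_text(values):
    """Turn a list of values (e.g. category ids) into one indexable string."""
    return " ".join(value_to_token(v) for v in values)


# One letters-only token per category id, separated by spaces.
print(multivalue_to_text([1, 4, 12]))
```

Searching for a category means hashing the id the same way and looking for the resulting token. Going from the token back to the id is not possible, which is why the results have to come from somewhere else.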

The solution was quite simple. Sources already used a factory to select the proper implementation to use based on a reduced list of supported value types. Really all that was needed was to provide a different factory that would only provide pass-through implementations and retrieve the value. Simple, but not that obvious. I was really scared I would have to duplicate code for this, but it turns out the sources did not require any change to retrieve the data.
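A rough sketch of the two factories, again in Python and with made-up names (none of these classes or methods are the real ones), assuming only two value types for brevity. The point is that the source is written once and never knows which factory it was handed.

```python
class PlainText:
    """Typed wrapper: tells an index to treat the value as plain text."""

    def __init__(self, value):
        self.value = value


class MultiValue:
    """Typed wrapper: tells an index the field holds several values."""

    def __init__(self, values):
        self.values = list(values)


class IndexingFactory:
    """Used while indexing: wraps each value in an object carrying its type."""

    def plaintext(self, value):
        return PlainText(value)

    def multivalue(self, values):
        return MultiValue(values)


class PassThroughFactory:
    """Used while rendering results: hands the raw value straight back."""

    def plaintext(self, value):
        return value

    def multivalue(self, values):
        return list(values)


def wiki_page_source(factory, row):
    """Hypothetical content source; `row` stands in for a database record."""
    return {
        "title": factory.plaintext(row["title"]),
        "categories": factory.multivalue(row["categories"]),
    }


row = {"title": "HomePage", "categories": [1, 4]}
for_index = wiki_page_source(IndexingFactory(), row)       # typed objects
for_display = wiki_page_source(PassThroughFactory(), row)  # raw values
print(for_display)
```

Swapping the factory is the only difference between the two calls, so nothing in the source itself gets duplicated.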

I implemented the design yesterday. Or maybe it was the day before. Can't tell. Now the unified search can display any information it indexes, allowing for really powerful formatting that does not require knowing where the information actually comes from.

I also added a value formatter to render any value as a link to the object. There wasn't really a way to link to the object before that. It did make the thing unusable, but it wasn't really critical. Anyway, going through there made me realize the way URLs are generated really is inconsistent. There are two Smarty plugins: a function that is useless, and a modifier that is really the only thing anyone should use. The category library also attempts to do it, but entirely ignores any sefurl configuration you may have. It should use the modifier's implementation. But later. There are more fun issues to deal with.

Next step is to write a generic table formatter for the unified search, then I think it will be ready for massive implementation.