
Rubix ML

Please help test the code (see Machine Learning) and provide use cases for Machine Learning for Email and Machine Learning for SEO.



What

Integrate Rubix ML (https://github.com/RubixML/RubixML) into Tiki. Rubix ML is a "high-level machine learning and deep learning library".

What is machine learning? Please see: https://docs.rubixml.com/en/latest/what-is-machine-learning.html

Both Tiki and Rubix ML are written in PHP, which will facilitate the integration. This matters: most of the alternatives to Rubix ML are written in Java or Python. We could use them, but since they would not be built in, only a tiny fraction of the Tiki community would have access to them.

Tiki already has mature data management tools. Started in 2002, its source code has been contributed to by more than 350 individuals from multiple organizations (see https://info.tiki.org/Benefits). Now, in close collaboration with the Rubix ML community, we will add the necessary tools and interfaces for Tiki to become a complete machine learning platform (managing data, choosing a model, training, evaluation, etc.) accessible to power users, like the rest of Tiki. We will contribute to Rubix ML and make it easier for all other PHP Open Source projects to also integrate with Rubix ML.

Where

Why

  • Permits multiple new features.
    • See the "Related" section below for some examples. Many of these features have been desired for years, but we didn't have a clean solution. Both Rubix ML and Tiki have a large feature set and a "one stop shop" philosophy.
    • See various sections at https://github.com/RubixML/RubixML/tree/master/src like Anomaly Detection, Classification, Clustering, etc.


On what types of data?

  • On Tiki system data (e.g. login logs), so it will provide insight for all Tiki instances
    • Spammy registrations
  • On standard features like forums, wiki page, comments, email, etc.
    • Email classification, Spam detection
  • On ad hoc data structures made with https://tikitrackers.org/

Who

  • Marc (instigator)
  • Andrew (lead dev of Rubix ML) is providing guidance
  • Roberto (developer) will coordinate the project
  • Victor (Back-end code) will do initial integration
  • Jonny (Front-end code)
  • Alain Désilets (advisor)
  • Ricardo Melo (advisor)
  • Simon (junior dev)
  • Kevin (junior dev)
  • Michael I. (tester/requirements for a multilingual project)

How

We'll start with some simple use cases, like reproducing some of the "Project Spotlight" on https://rubixml.com/, but directly within Tiki.

About performance

Performance is very important when training a model. Here is Andrew DalPino, the founder of Rubix ML:

Slides: https://docs.google.com/presentation/d/1a08XvUzA_9RHtBf5S-FOv1XBLgMN7u9dY-Z_EI3VXPE/edit?usp=sharing

For developers

If you are new to Rubix ML

If you are new to Tiki

Reading all the documentation, or even quickly scanning all the source code, is an unrealistic goal because the project is huge. So focus at first on Tiki Trackers:

  1. Join
  2. Read all the content and watch all the videos at https://tikitrackers.org/
  3. Install Tiki
  4. Explore Tiki features for a few hours
  5. Build a simple tracker for yourself
  6. Contribute code to Tiki: Git Workflow

Once you know both Rubix ML and Tiki Trackers

  • Think about how we could add a graphical user interface (GUI) to Tiki to leverage Rubix ML.
  • Think about how the Rubix ML demos could be handled within Tiki
  • Think about how we can have something like MLT ("More Like This") without Elasticsearch: https://github.com/RubixML/RubixML/issues/75
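
To make the MLT idea above concrete, here is a minimal sketch of one common way to rank similar documents without a search engine: TF-IDF weighting plus cosine similarity. It is written in plain Python purely to illustrate the technique. It does not use Rubix ML's API, and an actual Tiki integration would be written in PHP. The function names (`tokenize`, `tfidf_vectors`, `more_like_this`) and the sample pages are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            # Smoothed inverse document frequency keeps weights positive.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def more_like_this(index, docs, top_k=3):
    """Return indices of the top_k documents most similar to docs[index]."""
    vectors = tfidf_vectors(docs)
    scored = [
        (cosine(vectors[index], vec), i)
        for i, vec in enumerate(vectors)
        if i != index
    ]
    scored.sort(reverse=True)
    return [i for _, i in scored[:top_k]]


# Hypothetical wiki page titles, for illustration only.
pages = [
    "How to install Tiki on a Linux server",
    "Installing Tiki on Linux step by step",
    "Recipe for chocolate cake",
]
print(more_like_this(0, pages, top_k=1))
```

In this sketch the vectors are rebuilt on every call. A real implementation would probably compute them once and update them as content changes.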

Other Ideas For Rubix ML in Tiki

These are really more like questions: could Rubix ML do things like this?

  • Duplicate content prevention?
    (to display a warning when posting something covered by other content, e.g. in Forums or Trackers)
  • Automatic forum moderation that goes beyond blocking banned vocabulary?
  • Reading/scraping PDFs, e.g. resumes? First, each person formats their resume differently; second, the terminology can differ (e.g. 'work experience' vs. 'employment history'). Maybe in combination with OCR?
  • Mobile receipt capture application
    • similar to: "Sensibill Capture is a mobile receipt capture application. The Sensibill Receipt Data Extraction API uses machine learning and Optical Character Recognition (OCR) to extract pertinent data from receipts and other documents. The API enables users to extract over 150 data points, including merchant, total, items, taxes, and more from a given receipt. Developers need to contact the provider for API access and documentation. This API is listed in the OCR category of the ProgrammableWeb API directory." (source: programmableweb.com)
  • Reading merchandiser quotations, similar to the 'Mobile receipt capture application' above
    • A merchandiser receives quotations for products that contain product names and/or product codes and quantities. The quotations come as plain email text, Word documents or spreadsheets. Each quotation has explanatory labels and descriptive sentences, but each one presents the information in its own way. Traditionally, a person would go through the sources manually and add the information to a database. ML would need to open and read these various formats, then populate the database.
  • Farming Automation: When deploying ground bots for mechanical weed elimination, use ML to differentiate all weeds from the particular crop being grown. Example video from the University of Illinois program:
  • Farming Automation - a very advanced use case (but it's good to set lofty goals): while a group of drones (100+) is completing an assignment to spray crop protection products (herbicides, pesticides, etc.) on a specific field, several of them will fail. Rubix ML should recognize when they fail, send out other drones to pick them up, reorganize the workload to account for the sudden changes and send out replacement drones (all controlled by Trackers / List Execute / Scheduler / ML)
  • ...
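
As a rough illustration of the duplicate content idea in the list above, the following sketch flags near-duplicate posts by comparing word shingles with Jaccard similarity. This is a simple similarity baseline, not a trained model, and it is not Rubix ML code: it is plain Python used only to show the technique, while a Tiki integration would be written in PHP. The function names, the shingle size, the 0.5 threshold and the sample posts are all assumptions chosen for the example.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_near_duplicates(new_post, existing_posts, threshold=0.5):
    """Return (index, score) pairs for existing posts similar to new_post."""
    new_set = shingles(new_post)
    hits = []
    for idx, post in enumerate(existing_posts):
        score = jaccard(new_set, shingles(post))
        if score >= threshold:
            hits.append((idx, score))
    # Most similar posts first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical forum posts, for illustration only.
existing = [
    "How do I reset my admin password in Tiki?",
    "Best theme for a community site",
]
draft = "How do I reset my admin password in Tiki"
print(find_near_duplicates(draft, existing))
```

A warning could be shown whenever the returned list is not empty. The threshold would need tuning on real forum or tracker content.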

Stats

Projects

With Drones

https://www.sciencedirect.com/science/article/pii/S2095809918308130

Machine Learning vs Artificial Intelligence

In French

Video in French about Machine Learning in PHP

Slides: https://docs.google.com/presentation/d/1XuxgQtIcXuSnLRxSJNPaZBjEPyzg8tj-RzE4wOykyDg/edit?usp=sharing

Other examples
