Supervised classification in Machine Learning: how can it be automated?

The Editorial Staff
Content Intelligence Network

Within the macro-category of AI sits Machine Learning, and within Machine Learning sits Deep Learning. It's almost like Russian dolls.

The term Machine Learning refers to those techniques that allow computers to extract information from data, a precondition for intelligence. By receiving input from the outside world, machines should be able to learn continuously, giving meaning to the data they collect.

This comes naturally to animals: thanks to the evolution of their brains, they can detect anomalies within their range of action. It is harder for a computer, however, as Prof. Riccardo Zecchina of Bocconi University explained during a TEDx event.

Imagine you have to write pseudocode that tells an algorithm how to classify a chair correctly. If an object has four "legs", a flat surface to sit on, and a backrest, it is identified as a "chair". This works for "standard" chairs, but what about a swivel chair on 5 wheels? You have to update the code. And how do we describe a chair with an unusual shape, or with no legs at all?
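To make the problem concrete, here is a minimal Python sketch of such a hand-written rule. The `FurnitureItem` fields and the `is_chair` function are hypothetical names chosen for illustration, and the swivel chair's wheeled base is modelled, as an assumption, as five supports.

```python
from dataclasses import dataclass


@dataclass
class FurnitureItem:
    """Hypothetical description of an object; the fields are illustrative."""
    legs: int
    has_flat_seat: bool
    has_backrest: bool


def is_chair(item: FurnitureItem) -> bool:
    """Hand-written rule: four legs, a flat seat, and a backrest."""
    return item.legs == 4 and item.has_flat_seat and item.has_backrest


standard_chair = FurnitureItem(legs=4, has_flat_seat=True, has_backrest=True)
# Assumption: the five-wheeled base of a swivel chair counts as five supports.
swivel_chair = FurnitureItem(legs=5, has_flat_seat=True, has_backrest=True)
legless_chair = FurnitureItem(legs=0, has_flat_seat=True, has_backrest=True)

print(is_chair(standard_chair))  # True
print(is_chair(swivel_chair))    # False: the rule misses a real chair
print(is_chair(legless_chair))   # False: another real chair is missed
```

Every new kind of chair forces another change to `is_chair`, which is exactly the maintenance burden described above.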


New challenges ahead

As Zecchina pointed out, a completely different strategy is required: it is better for the system to learn from experience than to be given rules in advance. Deep neural networks are made up of many layers, each formed of several nodes, or artificial neurons (a stylized model of biological neurons). The connections between these neurons are adjusted through an iterative process until the network properly classifies the examples provided.
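The idea of adjusting connections step by step can be sketched with a single artificial neuron (a perceptron). This is a deliberate simplification, not a deep network and not the method Zecchina described in detail. The two features (flat seat, backrest), the toy examples, and the function names are all assumptions made for illustration.

```python
def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Adjust the weights a little after every misclassified example.

    examples: list of (features, label) pairs, where label is 0 or 1.
    """
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            prediction = predict(weights, bias, features)
            error = label - prediction  # 0 when the prediction is already right
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, features):
    """Fire (return 1) only if the weighted sum of the inputs is positive."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


# Toy data, features are (has_flat_seat, has_backrest); label 1 means "chair".
examples = [
    ((1, 1), 1),  # seat and backrest: chair
    ((1, 0), 0),  # seat only, e.g. a stool
    ((0, 1), 0),  # backrest only
    ((0, 0), 0),  # neither
]

weights, bias = train_perceptron(examples)
for features, label in examples:
    print(features, "->", predict(weights, bias, features), "expected", label)
```

No rule about chairs is written anywhere in the code: the weights start at zero and are corrected only by the examples.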

With this training, the network gradually learns to extract the fundamental characteristics (patterns) of the input and manages to find them in images it has never seen before: for example, it can recognize the same person in a photo from 25 years ago.

According to Zecchina, there are two challenges on the horizon: the first is creating algorithms capable of extracting information to identify patterns without any human supervision; the second is making them capable of recognizing causality, that is, the relationships between cause and effect.

A similar drive is also shaping AI solutions for business, such as those for Content Management. As corporate content proliferates, it is increasingly necessary to sort and classify it so that it can be easily found and reused over time.


The THRON case

An Italian SaaS DAM (Digital Asset Management) platform is working in this direction. We're talking about THRON: one of its goals is to make content classification increasingly automatic, minimizing the manual work needed to enrich content.

This approach was taken to eliminate training maintenance costs, an unsustainable expense for companies. What does this mean? THRON was previously equipped with a learning library, a sort of standard vocabulary, which the user had to review manually through the Tag Center to adapt it to the brand's specific taxonomy. The new AI engines, due to be implemented towards the end of 2019, will employ a "learning by doing" method instead.

This means that as an editor tags content in alignment with the brand's taxonomy, AI engines will automatically learn the right tags and how to apply them. At first THRON will not know whether a tag suits a certain piece of content, but with just a few examples it will be ready to classify content on its own.
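As a rough illustration of "learning by doing", the sketch below updates a simple word-overlap model every time an editor tags a piece of content. It is not THRON's implementation or API: the `IncrementalTagger` class, its methods, and the `min_overlap` threshold are hypothetical, and a real engine would use far richer signals than shared words.

```python
from collections import Counter, defaultdict


class IncrementalTagger:
    """Hypothetical tagger that learns from each editor-tagged item."""

    def __init__(self):
        # For each tag, count the words seen in content carrying that tag.
        self.word_counts_by_tag = defaultdict(Counter)

    def learn(self, text, tags):
        """Record one editor decision; no separate training phase is needed."""
        words = text.lower().split()
        for tag in tags:
            self.word_counts_by_tag[tag].update(words)

    def suggest(self, text, min_overlap=2):
        """Suggest tags whose known words overlap enough with the new text."""
        words = set(text.lower().split())
        scored = []
        for tag, counts in self.word_counts_by_tag.items():
            overlap = sum(1 for word in words if word in counts)
            if overlap >= min_overlap:
                scored.append((overlap, tag))
        # Highest overlap first.
        return [tag for overlap, tag in sorted(scored, reverse=True)]


tagger = IncrementalTagger()
print(tagger.suggest("blue running shoes for summer"))  # [] (nothing learned yet)

# The editor tags two items while working; the tagger learns as a side effect.
tagger.learn("red running shoes spring collection", ["footwear"])
tagger.learn("leather office chair with wheels", ["furniture"])

print(tagger.suggest("blue running shoes for summer"))  # ['footwear']
```

The point of the design is that `learn` is called as part of the editor's normal tagging work, so the model improves without a dedicated training step.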

Up-front tagging work is eliminated, because the engines no longer require expensive prior training; learning becomes continuous, happening as editors work. Semantic information will not disappear, however: it will remain "under the hood" and be used for search purposes.