Scaling in ML/AI media workflows

Some weeks back, I was invited to a DPP panel discussion where one of the topics was how to scale your machine-learning (ML) workflows. I'm a woman of many hats: in addition to my work at Umeå University (find Stockholm on a map and let your eyes follow the coast upwards; we're just below the Arctic Circle), I'm also acting CTO at Adlede AB and Codemill AB, two media tech companies with a strong emphasis on AI/ML in their product offerings. So when I was later asked to write a short blog on the same topic, I had many good people to call on, and there are a few points we wanted to share.

Adlede, which was part of the DPP and Digital Catapult's AI in Media event in 2020, offers contextual programmatic advertising. This means we use the programmatic ecosystem to place ads in the best possible media context. A sample scenario is an international furniture retailer that wants to target online articles about major life changes. If computing resources were free and infinitely elastic, we'd classify everything that is published online the instant it appears. In the real world, that is not going to happen. Our solution is to focus on high-quality publications in popular demand. Anyone who has tried their hand at natural language processing knows that it is easier to get good results from clean data, and analysing a piece of content is only a meaningful investment if the result is used sufficiently many times. In short: when you are searching for a target class in a vast, heterogeneous dataset, start with the part of the dataset where you would be happiest to find positive instances. If you are lucky, you find enough material for your purposes without having to traverse into the murkier parts of the data.
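To make the prioritisation idea concrete, here is a minimal sketch in Python. The article names, scores, and budget are invented for illustration, not Adlede's actual data or ranking function; the point is simply to rank content by how much a classification result would be reused, weighted by how likely clean input is to give a usable result, and spend a fixed budget from the top.

```python
# Hypothetical sketch: spend a fixed classification budget on the content
# where positive hits would be most valuable, instead of classifying
# everything that is published.
from dataclasses import dataclass

@dataclass
class Article:
    url: str
    expected_views: int    # proxy for how often a classification result is reused
    source_quality: float  # 0..1; higher = cleaner text, better NLP results

def prioritise(articles, budget):
    """Return the articles worth classifying under a fixed budget."""
    # Rank by expected payoff: reuse of the result times the chance that
    # clean input gives a usable classification.
    ranked = sorted(
        articles,
        key=lambda a: a.expected_views * a.source_quality,
        reverse=True,
    )
    return ranked[:budget]

articles = [
    Article("https://example.com/moving-house", 50_000, 0.9),
    Article("https://example.com/obscure-forum-post", 200, 0.3),
    Article("https://example.com/new-baby-guide", 30_000, 0.8),
]
for a in prioritise(articles, budget=2):
    print(a.url)
```

With a budget of two, the sketch classifies the two widely read, high-quality articles and never touches the murkier forum post at all.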

Codemill, the sister company of Adlede, helps major Media & Entertainment customers remaster their video workflows, moving hardware-demanding on-premises workflows to the cloud. This also gives access to a built-out ML/AI machinery that helps increase the speed and quality of content production by improving searchability, automating editing tasks, and adding an additional layer of security in compliance checking. A simple example is to use ML to locate the start and end of an intro, and add the metadata needed for users to skip through it. This tagging would otherwise be manual, with a human annotator having to step back and forth through the video frames to find the exact time points.
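Once a model has located the intro, the deliverable is just a small piece of metadata a player can act on. The sketch below shows one way such a sidecar record might look; the field names and structure are invented for illustration, not a standard format or Codemill's actual schema.

```python
# Hypothetical sketch: package detected intro boundaries as a sidecar
# metadata record that a video player could use to offer "skip intro".
# Field names are illustrative, not a standard.
import json

def skip_intro_metadata(asset_id, intro_start_s, intro_end_s):
    """Build a sidecar metadata record marking the detected intro."""
    return {
        "asset": asset_id,
        "markers": [
            {
                "type": "intro",
                "start": intro_start_s,  # seconds from start of asset
                "end": intro_end_s,
                "action": "skippable",
            },
        ],
    }

record = skip_intro_metadata("ep-101", 12.0, 57.5)
print(json.dumps(record, indent=2))
```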

Another typical use case for ML is to detect explicit content, that is, violence, guns, and rock 'n' roll. This is difficult to solve exactly, but a common way forward is to use over-sensitive classifiers that flag everything that seems remotely problematic, and then have a human annotator check the flagged parts and identify the true occurrences. This hybrid solution is still not perfect, but it may be the only option when the data stream moves so rapidly that a completely manual inspection is not possible. To improve performance, we can recognise that when it comes to training data, a little gold is better than a lot of garbage - correcting a small percentage of misclassifications in the data can have the same effect on classifier accuracy as doubling the size of the dataset. Many of the solutions that we build on top of AWS Rekognition include a feedback loop, allowing the user to correct erroneous metadata so that the system can improve over time.
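The hybrid pattern above can be sketched in a few lines of Python. Everything here is illustrative: the scores, the threshold, and the helper names are made up, and a real system would use actual model confidences (for example AWS Rekognition moderation scores) rather than hand-written numbers. The key ideas are that the threshold is deliberately low to favour recall, the human only sees the flagged segments, and the confirmed or rejected flags become gold training data for the feedback loop.

```python
# Hypothetical sketch of the hybrid pattern: an over-sensitive classifier
# flags anything above a deliberately low threshold, a human reviews only
# the flags, and the confirmed labels are collected for retraining.

FLAG_THRESHOLD = 0.2  # low on purpose: favour recall over precision

def flag_segments(scored_segments):
    """Keep every segment whose model score clears the low threshold."""
    return [
        (segment, score)
        for segment, score in scored_segments
        if score >= FLAG_THRESHOLD
    ]

def review(flagged, human_judgement):
    """Human confirms or rejects each flag; corrections become gold data."""
    gold = []
    for segment, _score in flagged:
        label = human_judgement(segment)  # True = genuinely explicit
        gold.append((segment, label))     # feed back into the training set
    return gold

# Invented model scores for three video segments.
scored = [("00:01-00:05", 0.9), ("00:05-00:10", 0.25), ("00:10-00:15", 0.05)]
flagged = flag_segments(scored)  # two flags; the human never sees the third
gold = review(flagged, lambda segment: segment == "00:01-00:05")
print(gold)  # one confirmed, one rejected - both useful as training data
```

Note that the rejected flag is just as valuable as the confirmed one: it is exactly the kind of correction that, fed back into training, buys accuracy more cheaply than simply collecting more unlabelled data.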

When it comes to training data, a little gold is better than a lot of garbage

Finally, we can scale not only the technology, but also the team. We think that distributed production is the future, and have put together Accurate Video, a suite of tools for solving everyday use cases with video. Tasks like preparing content, tagging, viewing and even editing can be done in the cloud through a standard web browser, instead of with dedicated equipment, installed software and large storage servers. This makes scaling teams easy, and enables remote working that otherwise would not have been possible. Like operating out of Umeå.

Get involved

To learn more about the DPP's work in AI and ML, contact Rowan.

If your company is not a DPP member, you can learn more about the benefits of membership, or contact Michelle to discuss joining.

Rowan de Pomerai

CTO
