News Organisations Collaborate Through the DPP

For a number of years we have been working with news organisations to facilitate collaboration and information exchange. In 2018 we released the DPP Metadata Exchange for News framework, which provides guidance on using IPTC NewsML-G2 to exchange an agreed set of minimum metadata between systems and parties that share news content.

In November, the DPP hosted a small group of news leaders and specialists to share information on their progress in the quest for metadata enrichment and automation. In attendance were Al Jazeera, CBC/Radio-Canada, the BBC, and Reuters News Agency.

Al Jazeera started the session with an overview of their various systems. “Metadata enrichment” is the term they coined to describe their programme of activities, which is now well advanced, with work commencing to augment the metadata of their content. Their automated systems have evolved and now undertake contextual analysis of content. They can add contextual information that goes beyond traditional object detection; for example, a crowd scene could be classified not just as a crowd but by the type of crowd it is, e.g. a protest or a group of sports fans. Further still, the systems can identify a protest’s political context – whether it is left or right leaning, for example.

Also discussed was Al Jazeera’s partnership with the Qatar Computer Research Institute, which has developed an automated news aggregator that classifies sources and stories by stance, bias and propaganda metrics, including the ability to detect the propagation of fake news.

Jeremy Tarling, Lead Information Architect, Curation, Authoring and Metadata at the BBC, then talked about content tagging as the first step to automation. The BBC content catalogue spans not only news but also online, TV and radio. Their teams face challenges because each area takes a different approach to tagging content, which makes search and discovery extremely difficult.

In BBC News, they have been manually tagging content for well over six years, and have built up tens of thousands of terms that describe the content. These terms are clustered into broader topics, which are used to surface related and recommended content to the user. Meanwhile, the approaches used by other areas of the BBC vary; non-news television programmes may simply be tagged based on genre and format.

This manual approach presents challenges due to the human effort required, issues of consistency and accuracy, and scale. Jeremy highlighted how the same tag can mean different things in different contexts, creating problems. He shared an example of a crime story which could easily be recommended as related to an Olympic sports story, owing to both being tagged with “shooting”. Viewers may not be so happy if they’re recommended content about murder alongside their favourite coverage of clay pigeon shooting.
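The “shooting” problem can be illustrated with a minimal Python sketch. It assumes a naive recommender that treats two stories as related whenever they share a tag; the `Story` class, the `related` function, the story titles and the tag names are all hypothetical and are not taken from any BBC system.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    title: str
    tags: frozenset[str]


def related(story: Story, catalogue: list[Story]) -> list[str]:
    """Return titles of other stories that share at least one tag with `story`."""
    return [s.title for s in catalogue if s is not story and story.tags & s.tags]


# Free-text tags: the single word "shooting" covers two unrelated meanings.
olympic = Story("Olympic clay pigeon final", frozenset({"shooting", "olympics"}))
crime = Story("Police investigate city centre incident", frozenset({"shooting", "crime"}))
print(related(olympic, [olympic, crime]))  # the crime story is recommended

# Hypothetical disambiguated tags: each sense has its own identifier.
olympic_d = Story(olympic.title, frozenset({"sport/shooting", "olympics"}))
crime_d = Story(crime.title, frozenset({"crime/shooting", "crime"}))
print(related(olympic_d, [olympic_d, crime_d]))  # no false match
```

Giving each sense of a word its own identifier is only one possible way to avoid the collision.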

In order to solve this, the BBC has developed a common metadata model for all content, which they apply across the content tagging process – making use of mentions, “aboutness” and editorial tone of a piece of content. The “aboutness” of a story is a collection of tags that describe the organisation, place, theme and event related to a story. BBC Research & Development have also developed a classification system called Starfruit that automatically suggests tags using machine learning.
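As a rough illustration only, a model with these three parts might be represented as below. The class names, field names and example values are assumptions made for this sketch and do not reflect the BBC’s actual schema. The sketch treats a “mention” as something referred to in the content and “aboutness” as what the content is principally about, which is our reading of the terms.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Aboutness:
    # The four facets named in the text: organisation, place, theme and event.
    organisations: list[str] = field(default_factory=list)
    places: list[str] = field(default_factory=list)
    themes: list[str] = field(default_factory=list)
    events: list[str] = field(default_factory=list)


@dataclass
class ContentMetadata:
    title: str
    mentions: list[str] = field(default_factory=list)  # things referred to
    about: Aboutness = field(default_factory=Aboutness)  # principal subject
    editorial_tone: str = ""  # hypothetical free-text value


def shares_subject(a: ContentMetadata, b: ContentMetadata) -> bool:
    """True when two items overlap on at least one aboutness facet."""
    facets = ("organisations", "places", "themes", "events")
    return any(set(getattr(a.about, f)) & set(getattr(b.about, f)) for f in facets)


olympic = ContentMetadata(
    title="Olympic clay pigeon final",
    mentions=["shooting"],
    about=Aboutness(themes=["Shooting sport"], events=["Olympic Games"]),
    editorial_tone="celebratory",
)
crime = ContentMetadata(
    title="Police investigate city centre incident",
    mentions=["shooting"],
    about=Aboutness(themes=["Gun crime"]),
    editorial_tone="serious",
)
print(shares_subject(olympic, crime))  # False: same mention, different subject
```

Matching on aboutness rather than on mentions keeps the two “shooting” stories apart, and the tone field offers a further signal a recommender could use.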

Unlike Al Jazeera or the BBC, Reuters is a global news agency that produces and distributes news content used by broadcasters and other organisations. Ian McLaren, Technical Director, provided an overview of the Reuters Connect platform. It’s an online portal used as a catalogue for the distribution of news assets. Ian talked about the evolution of NewsML-G2, the foundation of the DPP Metadata Exchange for News framework.

Reuters have been using NewsML-G2 for over a decade, and now have the capability to integrate with third-party APIs. This enables seamless exchange of metadata with other organisations. For content classification, Reuters are using a range of IPTC and internal category codes and topics, augmented by automated entity extraction.
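To give a feel for what such an exchange involves, the following Python sketch reads a few fields from a heavily simplified NewsML-G2 item using the standard library. The sample XML is illustrative rather than a complete or validated document: the `ex-topic` and `ex-provider` scheme aliases, the GUID and the selection of fields are assumptions for this example, and they are neither the DPP’s agreed minimum metadata set nor Reuters’ actual output.

```python
import xml.etree.ElementTree as ET

# IPTC NewsML-G2 (News Architecture) XML namespace.
NAR = "http://iptc.org/std/nar/2006-10-01/"

# Simplified, hypothetical item; real items also carry catalogue references,
# rights information, a content set and more.
SAMPLE = f"""<newsItem xmlns="{NAR}" guid="urn:example:item:1" version="1"
          standard="NewsML-G2" standardversion="2.28" conformance="power">
  <itemMeta>
    <itemClass qcode="ninat:video"/>
    <provider qcode="ex-provider:demo"/>
    <versionCreated>2019-11-01T09:00:00Z</versionCreated>
  </itemMeta>
  <contentMeta>
    <headline>Example headline</headline>
    <subject qcode="ex-topic:sport"><name>Sport</name></subject>
    <subject qcode="ex-topic:olympics"><name>Olympics</name></subject>
  </contentMeta>
</newsItem>"""


def read_basic_metadata(xml_text: str) -> dict:
    """Extract a handful of descriptive fields from a NewsML-G2 newsItem."""
    ns = {"nar": NAR}
    item = ET.fromstring(xml_text)
    return {
        "guid": item.get("guid"),
        "item_class": item.find("nar:itemMeta/nar:itemClass", ns).get("qcode"),
        "headline": item.findtext("nar:contentMeta/nar:headline", namespaces=ns),
        "subjects": [
            s.get("qcode") for s in item.findall("nar:contentMeta/nar:subject", ns)
        ],
    }


print(read_basic_metadata(SAMPLE))
```

Because subjects are carried as coded values rather than free text, a receiving system can map them onto its own vocabulary instead of guessing at the meaning of a word.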

We also heard from Bruce MacCormack, Senior Director Corporate Strategy and Development, CBC/Radio-Canada, about the work they’re doing in collaboration with the New York Times, BBC and the Partnership on AI around news validation. Authenticating genuine news content and combatting deep fakes has become a high priority. They are exploring the use of a unique identifier that platforms such as YouTube and Facebook could query, to authenticate the source of news content. This work continues, and further updates will be shared on the DPP blog in due course.
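The session did not go into how such an identifier would be generated or queried, so the following Python sketch is purely hypothetical. It assumes the identifier is a SHA-256 hash of the media file and that publishers register it in a lookup service; the `ProvenanceRegistry` class and its methods are invented for illustration.

```python
import hashlib
from typing import Optional


def content_id(media_bytes: bytes) -> str:
    """Derive an identifier from the content itself (assumption: SHA-256)."""
    return hashlib.sha256(media_bytes).hexdigest()


class ProvenanceRegistry:
    """Hypothetical in-memory stand-in for a service that platforms could query."""

    def __init__(self) -> None:
        self._publishers: dict = {}

    def register(self, media_bytes: bytes, publisher: str) -> str:
        identifier = content_id(media_bytes)
        self._publishers[identifier] = publisher
        return identifier

    def lookup(self, identifier: str) -> Optional[str]:
        """Return the registered publisher, or None if the content is unknown."""
        return self._publishers.get(identifier)


registry = ProvenanceRegistry()
original = b"example video bytes"
registry.register(original, "CBC/Radio-Canada")

print(registry.lookup(content_id(original)))            # registered publisher
print(registry.lookup(content_id(b"altered footage")))  # None: not registered
```

A plain file hash changes whenever a platform re-encodes the media, so a real scheme would need something more robust; the sketch only shows the register-and-query pattern.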

Get involved

To find out more or to get involved with this work, please contact:

If your company is not a DPP member, you can learn more about the benefits of membership, or contact Michelle to discuss joining.

Abdul Hakim

Head of Business Development


