Data is king: Why content creators must rethink their role in the AI era - FT中文网

Data is king: Why content creators must rethink their role in the AI era

Content creators may feel the most profound shift and play a more important role as data becomes a strategic asset in the AI era

This article represents only the author's own views.

As the global AI race heats up, it’s becoming clear that data doesn’t lose its value once large models reach the reasoning stage. On the contrary, it becomes even more critical because of the need for dynamic knowledge. The so-called “last mile” of high-quality datasets often determines a model’s ultimate performance.

That is likely why Facebook parent Meta Platforms (META.US) made a $14.3 billion strategic investment in Scale AI, a company focused on data labeling and cleaning for AI training.

Scale AI provides structured, high-quality datasets to OpenAI, Meta, Google and other tech giants by combining large-scale human labor with automated pipelines. Its data labeling process involves tagging images, text or audio with meaningful metadata, such as identifying pedestrians in a photo or labeling the main point of an article. Data cleaning eliminates errors, duplicates or irrelevant material to ensure consistency and accuracy.

Another example of the growing value of quality data is a recent licensing deal between The New York Times and Amazon (AMZN.US), which allows fact-checked editorial content to be used for training AI models. The Associated Press and OpenAI have signed a similar agreement.

Though these arrangements are described as content licensing, they reflect a deeper shift: content has become data, and data has become a service. These deals highlight how media organizations are reassessing the value of their content, while AI developers continue to pursue high-quality material with growing urgency.

In contrast, the Chinese-language AI ecosystem faces unique challenges, such as a shortage of publicly available data, a lack of large-scale professional annotation and difficulty digitizing classical and cultural texts at scale. Such obstacles underscore how hard it is to develop localized large AI models.

Chinese-language materials are relatively scarce

A white paper published by Alibaba Research Institute notes that English accounts for 59.8% of all crawlable web text, while Chinese represents just 1.3%. Wikipedia, a commonly used open resource, has over 7 million English articles but only 1.5 million Chinese ones, less than a quarter of the volume.

This imbalance creates a major disadvantage. Without sufficient publicly available Chinese material, Chinese large language models may fall far behind their English-language counterparts in natural language understanding and text generation, potentially leading to culturally mismatched outputs and a sense that these models have “consumed too much foreign ink.”

Chinese authorities have long recognized this gap and have taken steps to address it. Platforms such as People’s Daily and Xinhua are actively constructing curated, high-quality materials, consisting of vetted news, commentary and policy interpretation, designed to ensure alignment with official values and to support AI safety from a moral and ideological standpoint.

Initiatives like the "Cyber Research Large Language Model" further concentrate on integrating data from legal and policy documents, state media and other publications, reinforcing alignment with Chinese values.

In China, such value alignment has become a basic requirement for any domestic AI system. While China has yet to produce a company of Scale AI’s size, several local firms, including Aishu Technology, Testin, iFlytek (002230.SZ) and Haitai Ruisheng (688787.SH), are building up their capabilities in large-scale data annotation and cleaning. The Shanghai AI Lab is also developing a platform-based material processing system in partnership with policy and academic resources, laying the foundation for a “Chinese version of Scale AI.”

According to market research firm IDC, the value of China’s AI training data market was estimated at $260 million in 2023, and is expected to grow to approximately $2.32 billion by 2032, representing a compound annual growth rate of 27.4%.

Ultimately, the performance of any AI model depends on the content it consumes. In the AI era, content creators, especially those in journalism, must recognize that they are no longer merely material providers. They are now an integral part of the data services supply chain.

When news stories, commentary, academic papers and cultural archives are structured, semantically labeled and integrated into AI training pipelines, their value shifts from real-time information to durable data assets. Content creators who proactively organize and annotate their materials, and pursue licensing partnerships with AI developers, may find themselves unlocking new revenue opportunities.

It’s time for content to be seen not just as narrative, but also as infrastructure.

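Copyright notice: The copyright of this article belongs to FT中文网. No organization or individual may reproduce, copy or otherwise use all or part of this article without permission. Violators will be held liable.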
