Insiders share views on China’s text-to-video development, market and industry prospects
Virtually perfect
Published: Mar 14, 2024 11:08 PM
Screenshots of the AI-generated animated video Tale of Lipu Village Photos: Courtesy of Lipu Village

When OpenAI released an AI-generated 60-second video in early February, the hyperrealistic footage and the cutting-edge text-to-video model behind it, Sora, sent shockwaves through the global AI-generated content (AIGC) industry.

"I was so impressed, thinking about the possible technologies and theories used in Sora," said advertising professional and AI-generated video enthusiast Huang Chujie.

He said that before Sora, most text-to-video models could only produce clips no longer than four seconds.

Huang leads a micro AIGC community named "Lipu Village" in his spare time. Last month, a three-minute AI animated video produced by Lipu Village won the "Best Video Generation Award" at the Massachusetts Institute of Technology (MIT) AI for Filmmaking Hackathon 2024, an international competition.

As AI-generated videos continue to be a public focal point, the Global Times recently spoke with several Chinese industry insiders. They shared their observations and thoughts on the current situation and application prospects of text-to-video technology in China, as well as how Chinese enterprises have explored and expanded the overseas market.

'At the first echelon'

After hearing that Lipu Village's video had won a prize, Huang shared the good news on social media with great excitement.

"We were surprised to win, and were very pleased to receive many congratulatory messages from fellow AIGC lovers," he recalled.

The video, named Tale of Lipu Village, shows audiences a fantastical world with imaginary creatures, such as eggplant pigs and potato chip flowers. The six-member team, including Huang, created numerous interesting characters and scenes for the video, and added music, sound and narration.

All this work was done with AIGC-related software within three days. "That was inconceivable in the time before AI," Huang told the Global Times. He estimated that it would have taken six people three months to make a similar three-minute animated video using traditional methods.

Now Sora, with its minute-long clips, will probably mark a new milestone for the text-to-video industry. Huang is looking forward to its official debut, saying that long scenes will greatly enrich what AI videos can express.

Back home, Huang said China's text-to-video development is "at the top echelon" globally. He mentioned that the majority of the winners of the MIT AI for Filmmaking Hackathon 2024 were Chinese teams.

"Although we are a little behind on the model side, we are very strong in the application of text-to-video technology, and in other elements that are equally essential to AI video making, such as imagination and storytelling," he told the Global Times.

Photo: Courtesy of StoryStorm

Exploring overseas markets

Moreover, some Chinese AIGC entrepreneurs have seized opportunities in the fast-growing text-to-video industry and explored overseas markets.

At present, there are at least "dozens" of Chinese enterprises that have text-to-video software products or related projects tailored for the international market, according to Song Donghuan, founder of StoryStorm. 

StoryStorm is one of China's earliest AI-generated video communities. It is made up of some 500 AI video creators, of whom about 100 core members are "at the top of the industry in China."

Song pointed to several AI-generated video software products and mobile applications that are popular in overseas markets, such as PixVerse and NeverEnds.

"They are all products by Chinese companies, although many people don't know it," Song told the Global Times. He added that Chinese users are probably more familiar with CapCut, the international version of ByteDance's Jianying. With more than 500 million downloads globally, CapCut recently added an AI feature.

On the whole, it is no exaggeration to say that Chinese enterprises "account for half of the overseas AI-generated video market," said Ni Kaomeng, who in June 2023 initiated the China AIGC Industrial Alliance, a nationwide civil group of China's AIGC industry insiders.

Ni said he is glad to see Chinese text-to-video software and applications gain a foothold in the international market, which not only benefits the Chinese companies involved, but also helps international users learn more about China in the long run.

"Current foreign AIGC tools know little about China, especially Chinese history," Ni told the Global Times. For instance, if someone types in keywords related to ancient China, such as clothes or a building from a certain dynasty, the foreign tools usually generate some "awkward things" that have nothing to do with the dynasty, and sometimes even mix in Japanese or South Korean elements.

Improving the situation requires Chinese companies and users to continuously train AIGC models with the correct materials and promote them in overseas markets, said Ni. "This is the very first step of exporting Chinese culture in the AI era."

Prospects and troubles

Text-to-video is a sunrise industry globally. Song predicted a 100-fold increase in the number of AI-generated videos within two years.

In the era of the information economy, the fast-developing text-to-video technology can generate a variety of creative and high-quality content at lower cost, which will directly raise efficiency in industries like TV and film production, advertising and gaming, Song said.

The technology will also benefit tourism and leisure industries in aspects like personalized trip customization, publicity material making and souvenir designing, he added.

Echoing Song, Huang said he is thinking about turning Lipu Village into a rural tourism brand that combines the virtual and the real.

However, alongside innovation and convenience, the proliferation of AI-generated videos may also bring ethical and intellectual property problems, eliciting both public interest and concern.

"Indeed, it will be increasingly difficult to prove that a video has been created by humans and not through AI tools," Song said frankly. 

In the near future, as AI-generated content looks more and more realistic, human creators may have to prove their authorship by uploading warrants to the blockchain, he said.