Chinese AI firms fight to stand out from rivals in text-to-video market
Chinese firms from start-up Zhipu AI to tech giant ByteDance have rushed to launch artificial intelligence (AI) video-generation tools in recent days, but face challenges in differentiating themselves from local rivals in the market.
Other new market entrants include short video platform operator Kuaishou Technology and start-up Shengshu AI, which released video generation tools for public use. E-commerce giant Alibaba Group Holding has also published a framework for a Sora-style tool. Alibaba owns the South China Morning Post.
While Chinese firms are a few months behind OpenAI’s Sora in developing models that can turn text into videos, they have shown potential to quickly catch up in the field, analysts said.
Lu Yanxia, research director for emerging technology at IDC China, said text-to-video models have mushroomed thanks to China’s significant investments in AI models. Microsoft-backed OpenAI pioneered text-to-video generation with the debut of Sora in February, but the San Francisco-based start-up has yet to make the product available to the general public, with only a limited number of pilot users given access.
ByteDance was the latest among its peers to introduce its version of Sora, with a video tool called Jimeng released on local Android app stores on July 31. It accepts both text and image prompts to generate a clip of up to 12 seconds, the longest maximum length among the Chinese tools.
Kuaishou’s model can generate clips with a maximum length of 10 seconds, while Zhipu AI’s Qing and Shengshu’s Vidu generate clips ranging between four and six seconds. Shengshu, on the other hand, stands out when it comes to the speed of generation. Its tool takes less than 30 seconds to generate a four-second clip, while most other services take longer to generate a video of similar length.
An employee at one of the AI firms, who requested anonymity, said the models developed by Chinese firms were largely homogeneous. Instead, companies would differentiate themselves through the services they provide and the industries they target, the employee said.
All four services have adopted a freemium model, letting users trial the services at no charge but with longer wait times during peak use periods. They also offer pricing plans so users can avoid delays and receive additional perks, such as higher definition clips.
IDC’s Lu expects the video models to first be adopted by the internet sector, in particular for live streaming and video games, with applications in smart cities and manufacturing to follow.
“This will be the main competitive field for generative AI technologies,” she said.