ByteDance researchers have developed an AI system that transforms single images into realistic videos of people speaking, singing and moving naturally, a breakthrough that could reshape digital entertainment and communications.
The new system, called OmniHuman, generates full-body videos that show people gesturing and moving in ways that match their speech, surpassing previous AI models that could only animate faces or upper bodies.
How OmniHuman uses 18,700 hours of training data to create realistic motion
“End-to-end human animation has undergone notable advancements in recent years,” the ByteDance researchers wrote in a paper published on arXiv. “However, existing methods still struggle to scale up as large general video generation models, limiting their potential in real applications.”
The team trained OmniHuman on more than 18,700 hours of human video data using a novel approach that combines multiple types of inputs: text, audio and body movements. This “omni-conditions” training strategy allows the AI to learn from much larger and more diverse datasets than previous methods.
AI video generation breakthrough shows full-body movement and natural gestures
“Our key insight is that incorporating multiple conditioning signals, such as text, audio and pose, during training can significantly reduce data wastage,” the research team explained.
The technology marks a significant advance in AI-generated media, demonstrating capabilities that range from creating videos of people delivering speeches to depicting subjects playing musical instruments. In testing, OmniHuman outperformed existing systems across multiple quality benchmarks.
Tech giants race to develop next-generation video AI systems
The development emerges amid intensifying competition in AI video generation, with companies like Google, Meta and Microsoft pursuing similar technologies. ByteDance’s breakthrough could give the TikTok parent company an advantage in this rapidly evolving field.
Industry experts say such technology could transform entertainment production, educational content creation and digital communications. However, it also raises concerns about potential misuse in creating synthetic media for deceptive purposes.
The researchers will present their findings at an upcoming computer vision conference, although they have not yet specified when or which one.