Stability.AI Debuted Most Provocative Open-Source Text-to-Image Model, ARK Says

Artificial intelligence is disrupting the graphic design industry, uncovering attractive investment opportunities.

OpenAI’s DALL·E 2, an AI system that creates realistic images and art from natural-language descriptions, impressed the industry this summer with its ability to generate creative images from text prompts. In the last few months, DALL·E 2 has passed some important milestones: commercial availability, pricing tiers, and more restrictive content moderation.

“During the same time, other research groups have developed similar text-to-image models,” William Summerlin, ARK Invest analyst, wrote in today’s newsletter. “In our view, Stability.AI has debuted the most provocative model, called Stable Diffusion. Indeed, its image-generation model seems superior to DALL·E 2 in certain domains, particularly face generation.”

Summerlin said two important differences set the models apart: Stable Diffusion is open source, and it places few constraints on generation requests. “Users can self-host it on their desktops, for example, and can generate any images they desire. In contrast, OpenAI curbs the generation of offensive and other controversial images,” he added.

The proliferation of open-source DALL·E 2-like models leaves investors wondering whether such models will be commoditized.

“We believe large models trained on publicly available data will be commoditized, while models trained on proprietary data will remain differentiated and difficult to duplicate,” Summerlin wrote. “Tesla, for example, collects massive amounts of data from millions of vehicles, including Autopilot interventions, also known as ‘corner cases’. Almost impossible to replicate, those data will be critical to the success of its autonomous vehicle program. In our view, the most successful models will combine proprietary data assets with publicly available data.”

