The next breakthrough to take the AI world by storm might be 3D model generators. This week, OpenAI open sourced Point-E, a machine learning system that creates a 3D object given a text prompt. According to a paper published alongside the code base, Point-E can produce 3D models in one to two minutes on a single Nvidia V100 GPU.

Point-E doesn’t create 3D objects in the traditional sense. Rather, it generates point clouds, or discrete sets of data points in space that represent a 3D shape - hence the cheeky abbreviation. (The “E” in Point-E is short for “efficiency,” because it’s ostensibly faster than previous 3D object generation approaches.) Point clouds are easier to synthesize from a computational standpoint, but they don’t capture an object’s fine-grained shape or texture - a key limitation of Point-E currently.

To get around this limitation, the Point-E team trained an additional AI system to convert Point-E’s point clouds to meshes. (Meshes - the collections of vertices, edges and faces that define an object - are commonly used in 3D modeling and design.) But they note in the paper that the model can sometimes miss certain parts of objects, resulting in blocky or distorted shapes.

Outside of the mesh-generating model, which stands alone, Point-E consists of two models: a text-to-image model and an image-to-3D model. The text-to-image model, similar to generative art systems like OpenAI’s own DALL-E 2 and Stable Diffusion, was trained on labeled images to understand the associations between words and visual concepts. The image-to-3D model, on the other hand, was fed a set of images paired with 3D objects so that it learned to effectively translate between the two.

When given a text prompt - for example, “a 3D printable gear, a single gear 3 inches in diameter and half inch thick” - Point-E’s text-to-image model generates a synthetic rendered object that’s fed to the image-to-3D model, which then generates a point cloud.

After training the models on a dataset of “several million” 3D objects and associated metadata, Point-E could produce colored point clouds that frequently matched text prompts, the OpenAI researchers say. It’s not perfect - Point-E’s image-to-3D model sometimes fails to understand the image from the text-to-image model, resulting in a shape that doesn’t match the text prompt.
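The difference between the two representations can be sketched in a few lines of NumPy. This is an illustrative toy, not Point-E’s actual output format or API: the `make_point_cloud` helper, the sphere sampling, and the array shapes are all assumptions made for the example. A point cloud is just coordinates (plus, here, a color per point), while a mesh adds explicit connectivity between vertices.

```python
import numpy as np

def make_point_cloud(num_points: int, seed: int = 0):
    """Build a toy colored point cloud: xyz coordinates plus per-point RGB.
    (Hypothetical example - not Point-E's real data structures.)"""
    rng = np.random.default_rng(seed)
    # Sample points uniformly on the unit sphere by normalizing Gaussians.
    xyz = rng.normal(size=(num_points, 3))
    xyz /= np.linalg.norm(xyz, axis=1, keepdims=True)
    # Derive a color per point by mapping coordinates from [-1, 1] to [0, 1].
    rgb = (xyz + 1.0) / 2.0
    return xyz, rgb

# A mesh, by contrast, stores explicit connectivity: a vertex array plus a
# face array of vertex indices. Here, a single tetrahedron.
mesh_vertices = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
mesh_faces = np.array([
    [0, 1, 2],
    [0, 1, 3],
    [0, 2, 3],
    [1, 2, 3],
])

xyz, rgb = make_point_cloud(1024)
print(xyz.shape, rgb.shape)  # (1024, 3) (1024, 3)
```

The point cloud carries no notion of surfaces, which is why a separate model is needed to decide how points connect into faces - and why that conversion step can produce the blocky or distorted shapes the researchers describe.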