Summary/Story at a Glance
Data visualization is essential for analyzing diverse datasets across fields such as biomedical sciences, physics, medicine, and astronomy. A significant challenge in this field is creating compelling materials for rendering scientific datasets. Recent advances in generative AI have paved the way for breakthroughs across many domains. Leveraging this progress, researchers at VIDI have developed an innovative material authoring pipeline tailored for data visualization. Using Intel® Extension for PyTorch* and high-performance Intel® GPUs, the team designed and trained a Stable Diffusion control pipeline, enabling prompt-driven material creation for visualization and rendering tasks. The Intel® Data Center GPU Max Series significantly accelerated development, delivering 6x faster training and inference than CPU-only execution and allowing rapid iteration on neural network designs and training parameters.
Prompt-Driven Material Generation: Leveraging Diffusion Models to Author Compelling Material and Texture Definitions for Rendering
For over a decade, VIDI, the visualization lab at UC Davis, has been at the forefront of data visualization research, making significant contributions to the visualization and rendering community. In this project, we address a longstanding challenge in scientific visualization: the creation of convincing materials that can be applied to scientific datasets to bring realistic appearances to medical scans, biological structures, astronomical phenomena, and more!
In 2007, Bruckner et al. introduced the Style Transfer Function, a data-driven approach for specifying illumination using example sphere maps, laying the groundwork for rendering techniques that integrate real-world lighting effects through sampled references.[a] Building on this foundational work, we extend this idea and develop an AI-powered approach to quickly author complete materials that can accommodate multiple lighting environments and complex illumination setups. Using diffusion models and custom control networks, we create multiple sphere maps, each lit from distinct angles. From these maps, materials are synthesized that respond realistically to diverse lighting conditions.
Challenge: Training and Serving Diffusion Control Models at Scale
To let users create expressive materials in a simple manner, we use multiple sphere map examples lit from different directions (see Figure 1). The input is therefore a set of images showing how the material appears under light arriving from different directions. We combine this with generative AI, allowing users to create entire materials from nothing more than a textual prompt. For this process to work, we must combine the multiple example maps into a single coherent material, train a generative AI model to create grids of such sphere maps with a consistent material and controlled lighting directions, and finally create a texture to use with the final material.
Figure 1. Grid of sphere maps lit from different directions. Black areas indicate invalid or unneeded light directions.
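The exact control encoding is baked into our trained model, but the sketch below shows one way such a lighting-control grid could be assembled. The grid layout, tile resolution, and the Lambertian shading used to encode each light direction are illustrative assumptions rather than our exact format; unused cells stay black, matching the invalid regions in Figure 1.

```python
# Sketch: build a grid control image of spheres, each lit from a different
# direction. Grid size, resolution, and the Lambertian encoding are assumptions.
import numpy as np
from PIL import Image

TILE, ROWS, COLS = 128, 3, 3

def lit_sphere(light_dir, tile=TILE):
    """Render a white Lambertian sphere lit by a single directional light."""
    ys, xs = np.mgrid[0:tile, 0:tile]
    # Map pixel coordinates to [-1, 1] on the sphere's projected disk.
    x = (xs + 0.5) / tile * 2.0 - 1.0
    y = 1.0 - (ys + 0.5) / tile * 2.0
    r2 = x * x + y * y
    inside = r2 <= 1.0
    z = np.sqrt(np.clip(1.0 - r2, 0.0, 1.0))
    normals = np.stack([x, y, z], axis=-1)           # unit surface normals
    l = np.asarray(light_dir, dtype=np.float64)
    l = l / np.linalg.norm(l)
    shade = np.clip(normals @ l, 0.0, 1.0) * inside  # N·L, black outside the disk
    return (shade * 255).astype(np.uint8)

# One light direction per grid cell; None marks an unused (black) cell.
directions = [
    [(-1, 1, 1),  (0, 1, 1),  (1, 1, 1)],
    [(-1, 0, 1),  None,       (1, 0, 1)],
    [(-1, -1, 1), (0, -1, 1), (1, -1, 1)],
]

grid = np.zeros((ROWS * TILE, COLS * TILE), dtype=np.uint8)
for r in range(ROWS):
    for c in range(COLS):
        if directions[r][c] is not None:
            grid[r * TILE:(r + 1) * TILE, c * TILE:(c + 1) * TILE] = lit_sphere(directions[r][c])

Image.fromarray(grid).save("lighting_control_grid.png")
```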
We seek to leverage a pre-existing diffusion model that encodes extensive knowledge about real-world objects, materials, and artistic styles. Unfortunately, a basic diffusion model cannot consistently generate images that satisfy specific image-space constraints; a ControlNet provides this specificity. However, no control model exists for lighting directions, so we train our own. This requires generating a large database of control images, prompts, and result images, and then training a diffusion control model on it. Training and refining such models can be challenging on consumer hardware due to long runtimes and memory constraints, requiring careful attention to parameter selection.
We therefore wanted a solution that enables more efficient training and faster iteration. Intel's high-performance compute hardware reduces training times significantly compared to the resources typically available in a research environment. Furthermore, cloud-based compute lets us make our pipeline available on systems that could not otherwise run models of this size.
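As an illustration of how Intel Extension for PyTorch slots into this workflow, the sketch below applies ipex.optimize to a training step on the "xpu" device. The tiny placeholder network and MSE loss stand in for the actual diffusion control model and its denoising objective, and the mixed-precision settings are assumptions rather than our exact training configuration.

```python
# Sketch: wiring Intel Extension for PyTorch into a training step on an Intel GPU.
import torch
import intel_extension_for_pytorch as ipex

device = "xpu"  # Intel GPU device exposed by Intel Extension for PyTorch

model = torch.nn.Sequential(  # placeholder for the control model being trained
    torch.nn.Conv2d(3, 16, 3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(16, 3, 3, padding=1),
).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# ipex.optimize prepares the model and optimizer for Intel hardware
# (operator fusion, memory-layout and dtype optimizations).
model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)

for step in range(100):  # stand-in for iterating over (control, target) batches
    control = torch.rand(4, 3, 256, 256, device=device)
    target = torch.rand(4, 3, 256, 256, device=device)
    with torch.xpu.amp.autocast(enabled=True, dtype=torch.bfloat16):
        loss = torch.nn.functional.mse_loss(model(control), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```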
Solution
Our approach combines Stable Diffusion with a custom-trained ControlNet model, leveraging Intel's compute hardware together with the Intel Extension for PyTorch for both training and inference (see Figure 2). In the first step, we generate sphere maps with pixel-level control over lighting directions: the ControlNet model consumes the encoded lighting directions, enabling the generation of multiple sphere maps under predefined lighting conditions. For model training, we use grids of spheres rendered with a variety of material models and spot lighting, varying the light orientations to maximize directional information.
Figure 2. Material map prediction pipeline at the core of our solution, accelerated by Intel’s compute hardware
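A minimal sketch of this inference path using the Hugging Face diffusers API is shown below. The ControlNet checkpoint path, base Stable Diffusion checkpoint, prompt, and sampler settings are placeholders rather than our exact configuration.

```python
# Sketch: prompt-driven sphere-map generation with Stable Diffusion and a
# custom-trained lighting ControlNet on an Intel GPU.
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  (registers the "xpu" backend)
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "path/to/lighting-controlnet",      # hypothetical path to the trained control model
    torch_dtype=torch.float16,
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # illustrative base checkpoint
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("xpu")

control_grid = load_image("lighting_control_grid.png")  # grid of light-direction spheres
result = pipe(
    prompt="polished red marble, studio lighting",
    image=control_grid,
    num_inference_steps=25,
    guidance_scale=7.5,
)
result.images[0].save("material_sphere_grid.png")
```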
To create a material from the generated sphere maps, we leverage the property that each pixel, for a given point light source, represents a unique combination of incoming and outgoing light directions. In standard rendering, these angles are used to calculate reflected light or to access values from a data-driven BRDF. Our approach inverts this process: using a control image that encodes angle directions, we create a BRDF from the generated diffusion output, interpolating data where image coverage is sparse.
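The sketch below illustrates this inversion for a single sphere-map tile, assuming an orthographic view along +z, one directional light per tile, and a simple (theta_in, theta_out) binning; the real pipeline also accounts for the azimuthal difference angle and interpolates sparsely covered bins.

```python
# Sketch: back out tabulated BRDF samples from a generated sphere-map tile.
import numpy as np
from PIL import Image

def brdf_samples(sphere_rgb, light_dir, bins=32):
    h, w, _ = sphere_rgb.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x = (xs + 0.5) / w * 2.0 - 1.0
    y = 1.0 - (ys + 0.5) / h * 2.0
    r2 = x * x + y * y
    mask = r2 <= 1.0
    n = np.stack([x, y, np.sqrt(np.clip(1.0 - r2, 0.0, 1.0))], axis=-1)  # normals

    l = np.asarray(light_dir, dtype=np.float64)
    l = l / np.linalg.norm(l)
    v = np.array([0.0, 0.0, 1.0])                  # orthographic view direction

    cos_in = np.clip(n @ l, 0.0, 1.0)              # incoming angle vs. normal
    cos_out = np.clip(n @ v, 0.0, 1.0)             # outgoing angle vs. normal

    table = np.zeros((bins, bins, 3))
    count = np.zeros((bins, bins, 1))
    bi = np.minimum((np.arccos(cos_in) / (np.pi / 2) * bins).astype(int), bins - 1)
    bo = np.minimum((np.arccos(cos_out) / (np.pi / 2) * bins).astype(int), bins - 1)
    np.add.at(table, (bi[mask], bo[mask]), sphere_rgb[mask] / 255.0)
    np.add.at(count, (bi[mask], bo[mask]), 1.0)
    return table / np.maximum(count, 1.0)          # average color per angle bin

# Top-left tile of the generated grid (tile size as assumed above).
tile = np.asarray(Image.open("material_sphere_grid.png").convert("RGB"))[:128, :128]
brdf = brdf_samples(tile, light_dir=(-1.0, 1.0, 1.0))
```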
We extend this functionality by enabling texture generation via two methods: extracting a texture from the sphere maps using re-synthesis[b] or using Stable Diffusion to create a texture directly with an automatically modified material prompt.
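For the second path, a sketch of the prompt rewrite followed by a plain text-to-image call is shown below; the prompt template and checkpoint are illustrative assumptions, not our exact rewriting rules.

```python
# Sketch: the direct texture path, assuming the material prompt is rewritten
# automatically into a texture prompt and fed to a plain Stable Diffusion pipeline.
import torch
from diffusers import StableDiffusionPipeline

def texture_prompt(material_prompt: str) -> str:
    # Hypothetical rewrite steering the model toward a flat, tileable close-up.
    return f"seamless tileable texture of {material_prompt}, flat surface, top-down close-up"

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # illustrative checkpoint
).to("xpu")
texture = pipe(texture_prompt("polished red marble"), num_inference_steps=25).images[0]
texture.save("material_texture.png")
```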
Results
The outputs from our pipeline, illustrated in Figure 3, demonstrate the ability to synthesize materials with highly realistic lighting interactions across diverse material properties. From biological tissues to astronomical simulations, the generated materials capture intricate light-material interactions, enabling more immersive and realistic visualizations (see Figure 4). These results are achieved thanks to the performance and scalability of Intel® hardware, which accelerates both training and inference processes. The Intel® oneAPI Base Toolkit allows seamless optimization of compute workloads, while Intel® Tiber™ AI Cloud provides the high-performance computing environment necessary to handle the complexity of our models.
Figure 3. Examples of materials generated by our pipeline
Intel® GPUs power model inference, ensuring low latency and high efficiency for generating outputs. Using the Intel® Data Center GPU Max Series, we can generate new material maps in as little as 3.18s (4.24 iterations/s), enabling a quick turnaround. Compared to CPU inference on a top-of-the-line Intel® Xeon® Platinum processor (18.98s, 1.36 iterations/s), this constitutes roughly 6x the performance and is on par with comparable Nvidia* compute hardware (Quadro RTX 8000: 3.46s, 3.66 iterations/s). This performance allows for rapid iteration and refinement, enabling the development of high-quality materials in a matter of seconds. Our current implementation supports exporting generated materials to a common format. In the future, we plan to extend the Intel® OSPRay renderer to load our exported materials, a simple extension that will make them immediately available for advanced scientific visualization applications in OSPRay.
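For reference, per-image latency and an approximate iteration rate can be measured with a few lines around the inference call, reusing the pipe and control_grid objects from the sketch in the Solution section; the snippet below folds VAE decoding into the timing, so it slightly underestimates the pure denoising rate.

```python
# Sketch: timing material-map generation (seconds per image, iterations/s).
import time

steps = 25
start = time.perf_counter()
pipe(prompt="brushed copper", image=control_grid, num_inference_steps=steps)
elapsed = time.perf_counter() - start
print(f"{elapsed:.2f} s per image, {steps / elapsed:.2f} iterations/s")
```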
Figure 4. An example of a multi-material scientific dataset with nine different materials generated using our pipeline.
Summary
We have presented a new method that addresses the longstanding challenge of creating convincing materials for scientific datasets, enabling realistic appearance modeling for applications ranging from medical scans to astronomical phenomena. Building on the foundational Style Transfer Function introduced by Bruckner et al., we developed an AI-based approach that uses diffusion models and custom control networks to generate sphere maps illuminated from diverse angles. These innovations enable the synthesis of materials that respond realistically to complex lighting environments.
Crucially, this work was made possible by integrating Intel’s cutting-edge hardware and software solutions. The Intel oneAPI Base Toolkit provided a unified programming environment, allowing us to develop and optimize our models across heterogeneous computing architectures efficiently. Additionally, leveraging Intel-powered cloud computing resources enabled scalable training and inference, drastically reducing development time and enhancing performance. These tools and technologies ensured that the project achieved both technical excellence and practical feasibility.
Acknowledgements
We thank the Intel Tiber AI Cloud team and the UC Davis oneAPI Center of Excellence, who provided invaluable support and feedback throughout this project, as well as access to the resources without which this work could not have been realized. We are also grateful to Kittur Ganesh and Nikita Shiledarbaxi from Intel for their support and guidance.
Get The Software
Explore the Intel oneAPI Base Toolkit for open, accelerated computing across heterogeneous architectures. The Intel Tiber AI Cloud platform lets you get hands-on with the toolkit components as well as other AI and HPC libraries, tools, and frameworks powered by oneAPI.
Looking for smaller download packages?
Streamline your software setup with Intel's tool selector to install full kits or new, right-sized sub-bundles containing only the components needed for specific use cases! Save time, reduce hassle, and get the perfect tools for your project with Intel® C++ Essentials, Intel® Fortran Essentials, and Intel® Deep Learning Essentials.
[a] Bruckner, S., & Gröller, M. E. (2007). Style Transfer Functions for Illustrative Volume Rendering. Computer Graphics Forum, 26(3), 715-724.
[b] Harrison, P. (2001). A Non-Hierarchical Procedure for Re-Synthesis of Complex Textures. International Conference in Central Europe on Computer Graphics and Visualization, 190-197.