Ayaan Haque
Hi! I'm Ayaan Haque, a 21-year-old researcher at Luma AI working on training multimodal foundation models. I worked on the training team for Dream Machine, Luma's video generation foundation model.
Foundation Models
Dream Machine is a video generation model, built on a highly scalable and efficient transformer model trained directly on videos. Dream Machine is capable of generating physically accurate, consistent and eventful shots of 120 frames in under 120 seconds, and is a first step towards building a universal imagination engine.
Genie is a 3D foundation model that can generate high-fidelity 3D objects from text instructions. Genie generates objects in under 10 seconds, and its outputs can be refined into higher-quality assets.
Selected Publications
Terminal Velocity Matching (TVM) is a scalable, single-stage generative training method that delivers diffusion-level quality with 25× fewer inference steps, now trained at 10B+ scale.
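For context, TVM sits in the flow-matching family of generative training objectives. The sketch below is a standard flow-matching training step, not the TVM objective itself; the `model(x_t, t)` velocity-prediction interface and the batch shapes are assumptions for illustration.

```python
import torch

def flow_matching_loss(model, x0):
    # Minimal flow-matching loss (the family TVM builds on);
    # this is NOT the TVM objective. `model(x_t, t)` is a hypothetical
    # network that predicts the velocity field for a batch x_t at times t.
    noise = torch.randn_like(x0)                   # terminal sample x1 ~ N(0, I)
    t = torch.rand(x0.shape[0], device=x0.device)  # per-sample time in [0, 1]
    t_ = t.view(-1, *([1] * (x0.dim() - 1)))       # reshape for broadcasting
    x_t = (1 - t_) * x0 + t_ * noise               # point on the linear path
    target_v = noise - x0                          # path velocity d x_t / d t
    pred_v = model(x_t, t)
    return (pred_v - target_v).pow(2).mean()       # regress predicted velocity
```

At inference time, samples are drawn by integrating the learned velocity field from noise back toward data, which is what lets methods in this family trade integration steps for quality.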
We propose a method for editing NeRF scenes with text instructions. Given a NeRF of a scene and the collection of images used to reconstruct it, our method uses an image-conditioned diffusion model (InstructPix2Pix) to iteratively edit the input images while optimizing the underlying scene, resulting in an optimized 3D scene that respects the edit instruction.
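In pseudocode, the loop alternates between diffusion edits of individual dataset images and ordinary NeRF optimization. The sketch below is a hypothetical outline of that iterative dataset-update idea; `nerf`, `ip2p`, and the dataset interface are illustrative stand-ins, not the actual Instruct-NeRF2NeRF API.

```python
def edit_nerf(nerf, ip2p, dataset, instruction, num_steps=30_000, edit_every=10):
    # Hypothetical sketch of the iterative editing loop described above;
    # all interfaces here are stand-ins, not the real Instruct-NeRF2NeRF code.
    for step in range(num_steps):
        if step % edit_every == 0:
            # Periodically replace one training image with an edited version.
            # Conditioning the edit on the current render (plus the original
            # image) nudges successive edits toward a 3D-consistent scene.
            idx = (step // edit_every) % len(dataset)
            render = nerf.render(dataset.camera(idx))
            edited = ip2p.edit(image=render,
                               condition=dataset.original_image(idx),
                               prompt=instruction)
            dataset.replace_image(idx, edited)
        nerf.train_step(dataset.sample_rays())  # standard NeRF optimization step
    return nerf
```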
More recently at Luma (now a Series C company), my work has focused on multimodal (video, audio, text) pre-training, real-time autoregressive diffusion models, model architecture design, and training algorithms. Previously, I was a student researcher at Google DeepMind working on autoregressive video generation, and before Dream Machine, I worked on Genie, Luma's 3D foundation model.
I started my MS in EECS at UC Berkeley with Prof. Alexei Efros and will graduate in Spring 2026, after completing my BS there in Spring 2025. In the past, I worked on self-supervised and unsupervised representation learning; I interned at Samsung SDSA and got my research career jump-started back in high school with the Wang Group at Stanford.
In a past life, I was a builder and hacker (I'm an MLH Top-50 Hacker!), and now I'm exploring deep-tech startups. Other than that, I enjoy writing, watching/playing sports, eating out with friends, and just having a good time.
Projects
I've listed just a few of my favorite projects here; the rest are available on my GitHub.
We propose a method for editing 3D Gaussian Splatting (3DGS) scenes with text instructions. This project is a follow-up to our previous work, Instruct-NeRF2NeRF. We improve the visual quality of edits, the training time, and the rendering speed of the model.
Stack: Python, PyTorch
Implemented DreamFusion and CLIP-NeRF in Nerfstudio, an open-source framework for NeRF development. Contributed (in small part) to the large-scale open-source project by reviewing code, writing documentation, and implementing research methods.
Stack: Python, PyTorch