Keyframer: A Deep Dive into HCI Research Prototyping
One of my favorite projects I built at Apple was Keyframer, an LLM-powered animation prototyping tool. The fastest way to learn about Keyframer is to watch the video below (and if you want to learn even more, you can read our paper about this work here or see more on the Apple Research website here).
This post is not about how Keyframer works, but more of a deep dive into the process by which Keyframer came about (and the complications of getting it published). I always wished I knew more about how research projects I like came to be (which is rarely represented in publications), so I thought I’d do my best to capture that progression here.
Keyframer Origins: SVG Experiments
In early 2023, I learned that the original group I had joined at Apple (which I had been part of for about 1.5 years) was dissolving. That group was formerly known as the Learning Sciences Group, an HCI research group in AIML focused on developing novel AI applications for supporting learning. As a result, everyone within this group got moved to other research groups as part of a re-org, and I got placed into a group around intelligent understanding of user interfaces. As in any re-org, determining how you fit into a new group is an ongoing process as you figure out your mutual interests and shared priorities. I was uncertain if any of my research interests exactly aligned with existing projects in the group. So I figured the best path forward was for me to pitch a project I would want to work on.
Rewind to several years prior to my joining Apple (starting around 2018) – I started playing around with SVGs (scalable vector graphics) after using them as a designer for many years (e.g., exporting them from Illustrator). During a sabbatical between jobs, I spent 6 weeks at the Recurse Center building web apps to process and manipulate SVGs (more on that in this blog post), including an app that turns any font into a stencil font for laser cutting (which you can use here and read about here). Then, while working at a startup after Recurse, I built a lot of silly front-end SVG animations for the site, largely using a combination of CSS and JavaScript, some of which you can see below (and you can find the code for here).
I also built some apps for fun that were designed to help people prototype their own SVG-based animations using CSS keyframes. (Unfortunately, the startup has since shut down and now I can’t share any live links to these apps, but you can see a GIF of the app below and the original Tweet is here).
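If you haven’t written this kind of animation before, here’s a minimal sketch of what animating an SVG with CSS keyframes looks like (the element id and the animation itself are made up for illustration, not taken from those apps):

```html
<!-- A tiny SVG with an id we can target from CSS (made up for illustration) -->
<svg viewBox="0 0 100 100" xmlns="http://www.w3.org/2000/svg">
  <circle id="sun" cx="50" cy="50" r="20" fill="gold" />
</svg>

<style>
  /* Pulse the circle by scaling it around its own center, looping forever */
  #sun {
    animation: pulse 2s ease-in-out infinite;
    transform-box: fill-box;   /* measure the transform relative to the shape itself */
    transform-origin: center;  /* so the scale happens around the circle's center */
  }
  @keyframes pulse {
    0%, 100% { transform: scale(1); }
    50%      { transform: scale(1.3); }
  }
</style>
```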
Prototyping Keyframer
Back to my time at Apple (at this point, early 2023) – LLMs were taking over, and I was interested in seeing how well LLMs could manipulate SVG images, which are essentially XML. Since SVGs are widely used in graphics and user interfaces, I thought I could reasonably position this project for the UI Understanding group I had just joined – and luckily my manager supported this!
Once I pitched the project, I started mucking around with how well GPT-3.5/4 could generate SVG code itself, as well as the CSS and JavaScript for animating SVGs. This gave me some insight into what LLMs were good at, and not so good at, when it came to creating animated SVGs.
One amazing part about working on this project is that I was given a lot of freedom to prototype (I had some responsibilities on other projects but basically spent ~80% of my time on this pet project for a few months). After some initial experimentation with different UI use cases, I decided to focus specifically on animated illustrations (an area I had a lot of personal interest and excitement about, which hopefully is clear from the stuff I wrote about in the previous section). That led me to start prototyping the Keyframer app, where a user can supply their own SVG to begin with, prompt an LLM to animate the image, and receive a preview of the animated SVG along with the CSS code to animate it (again, check out the video above for how it works).
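To make that loop a little more concrete, here’s a rough sketch of the round trip that I wrote for this post (it is a simplification, not the actual Keyframer implementation): the user’s SVG and their natural-language prompt go to an LLM, and the CSS that comes back is injected next to the SVG to render a live preview. The callLLM helper is a hypothetical stand-in for whatever chat completion API you happen to use.

```javascript
// Rough sketch of the prompt-to-preview loop (not the actual Keyframer code).
// `callLLM` is a hypothetical helper wrapping whatever chat completion API you use.
async function animateSvg(svgCode, userPrompt) {
  const prompt =
    `Here is an SVG:\n${svgCode}\n` +
    `Write CSS @keyframes that animate it so that: ${userPrompt}\n` +
    `Target elements by their existing ids and return only CSS.`;

  const css = await callLLM(prompt); // hypothetical LLM request

  // Drop the generated CSS and the original SVG into the page to preview the animation
  const preview = document.getElementById("preview");
  preview.innerHTML = `<style>${css}</style>${svgCode}`;

  return css; // also surfaced to the user as editable code
}
```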
Keyframer incorporates some of my specific points of view about the role of LLMs in creative work. First, I found that LLMs are sort of garbage when it comes to generating SVGs from scratch. As someone who cares a lot about design and illustration, I’m also generally uninterested in using LLMs to create artwork directly from prompts, mostly because images (and other artwork) generated straight from a prompt tend to look pretty bad / generic / clearly AI-generated, and doing so removes a lot of creative freedom from the process, which I find unfulfilling. I find it a lot more interesting to see what LLMs can do from a user’s given artistic starting point (like adding interactivity to an image you’ve already created), which is why Keyframer was designed this way. Also, writing animation code from scratch can be very tedious (something I had direct experience with, having worked on front-end animations), so why not let an LLM simplify that process?
Another reason working on Keyframer was fun is that it let me get into a lot of pockets of Apple I was curious about and meet the people in them. First, we wanted to see how well this idea might align with how professional designers and animators work on animations in their existing practice. So while I was developing Keyframer, I started working with a fellow HCI research scientist in the group (Regina Cheng) on interviewing these creators internally. In the process, I got to meet some amazing animators who have been doing this work professionally at Apple for years, from all across the company (like the people who work on the incredible animations on the Apple.com website!).
Using this early feedback, I continued to iterate on the design of the Keyframer app, which was super fun and also led me to learn a lot of new things (like how to deploy an app internally with Docker, plus a bunch of internal stuff I won’t talk about). For example, the app initially had no notion of generating multiple designs in response to a given prompt, but the formative interviews revealed designers’ interest in comparing different design directions, so I decided to build a feature for generating variants. When we were ready to have people try out Keyframer, I even created the illustrations used in the study (shown below); I illustrated a bunch of alternatives as well, including an underwater scene that’s in the Appendix of the paper. One challenging part of creating the illustrations is that they had to be believable as images someone might want to animate, but not so complex that the SVGs would get too big and we’d run into token limits when passing the SVG code to the LLM.
At this point, I worked with Regina to test out the app with a variety of folks internally from different backgrounds, including those with and without prior animation and programming experience. For any exploratory study, I think it’s super useful to get feedback from different people to see where a given tool might be best suited. We ran a pilot user study with 13 people at Apple in Fall 2023, wrote up the results, and submitted it to an HCI conference in January of 2024.
The Initial Reaction
Once we wrote up the paper for Keyframer, we quietly put it up on arXiv, which is a typical approach in HCI and AI research these days. We didn’t expect anyone to react to it because we didn’t advertise its existence at all. But then a popular Twitter account picked up the paper and shared it (https://x.com/_akhaliq/status/1756867429047640065), and that opened up a can of worms (the post has 88k views).
In the days after this tweet, we ended up seeing speculative articles about Keyframer pop up on a bunch of sites (some of which I personally read news on for years before I joined Apple, so this felt pretty surreal), like MacRumors, The Verge, 9to5Mac, and Lifehacker. Even crazier to me was seeing content creators actually make videos about what they thought Keyframer was about:
Keep in mind that at this point there were no videos released about the project, just a PDF on arXiv showing some static images of the Keyframer app and interface. So it was wild to me that Keyframer got picked up the way it did!
Anyway, as most things go, once people outside started writing about Keyframer, it generated a good amount of internal interest, which again gave me a really cool excuse to meet lots of people across the company!
The Publication Grind
The light media attention (relative to Apple standards) for Keyframer in February was then followed by a disappointing paper rejection in April. The other lead author and I were both ACs for the conference it was rejected from, so we were quite surprised it didn’t make it to the PC meeting, and I found the reviews very frustrating. I know rejection is a regular occurrence in academia, often due to the randomness of reviewers, but this one was especially challenging for me because it was around that time that I knew I was going to leave my job at Apple and wouldn’t really have the same bandwidth to continue working on the project. Luckily, we were able to open source the project before I left (which you can find on GitHub here).
Unfortunately, this initial surprising rejection was followed by two additional rejections at other conferences over the next year. The most annoying part is that at the time we initially put the paper on arXiv, I believe it was the first work to examine applying LLM code generation to SVG-based animations; since then, other papers (which cite Keyframer) have been published, which made it harder to publish our own. This is just the reality of publishing anything in the AI space: it moves really fast, so if you get a paper rejected early on, it becomes very difficult to keep revising it given how quickly new features are introduced. For example, at the time I worked on Keyframer, no commercial LLMs had previews of generated code built directly into their chat interfaces, but since then Claude and later ChatGPT have introduced this feature. Even so, those tools lack a lot of the animation-specific features available in Keyframer, like variant generation and the prompt engineering we did to reduce the chances of hitting token limits and to improve response time. It was clear that most of our reviewers saw these features as an incremental improvement rather than innovative enough to be publishable, but I think usability is severely underrated in HCI research, and I will stand by how much easier it is to use Keyframer for animation generation than any of these tools, even today.
Another gap I find “interesting” between product prototyping and research prototyping is the role of user studies and feedback. Having worked in product for years at several companies, I know that a lot of features are driven by designer intuition, and it is rare for features to be fully user tested. I subscribe to the Nielsen model that you learn the most from the first few user tests you do, and that it’s often inefficient to put the same prototype in front of many users because it’s more productive to use early feedback to revise the design before testing again. However, I’m always surprised by HCI reviewers who insist that you run large user studies, often with the same narrowly defined audience (we got a lot of pushback for including both people with no animation or programming experience and experts in both in our study). I don’t think it’s very productive to do user studies this way, but that’s my opinion.
Anyway, there is a small happy ending to this publication journey, which is that a short version of this paper was accepted to VL/HCC this year. I personally still think this should be a full paper, but there isn’t enough time or energy to keep revising it (the paper already has almost 30 citations). We will keep the arXiv version up for those interested in the full details about this work (and hope that people continue to use this paper for their citations!).
Wrap Up
I’ll end this deep dive by saying: 1) I was really thankful to have the space to prototype this idea! Building stuff you’re excited about is always the most fun; 2) AI HCI publishing is hard; 3) seemingly random side projects can funnel into more fully manifested ideas years later, so it’s always beneficial to keep experimenting and keep your eyes open for new opportunities!
Acknowledgments
Thanks to my awesome co-authors and collaborators on this work – Regina Cheng, Jeffrey Nichols, and Andrew McNutt – who have all been instrumental in finally getting this work out there!