
Ask AI what goes with chicken and the answer depends on whether it learned from recipes or molecules

Kaikaku.AI's Epicure models separate recipe-based from chemistry-based ingredient pairing. The chemistry-based model classifies taste and nutrition better than the others, even though neither is directly encoded in its training data. The models draw on 4.14 million recipes in seven languages and on the FlavorDB chemistry database.

Source: The Decoder. Author: Jonathan Kemper

What goes with an ingredient? The answer depends on whether you're looking for a recipe companion or a flavor relative. Previous AI models have mixed the two. In new research, the startup Kaikaku.AI keeps the two perspectives apart.

With "Epicure," Jakub Radzikowski and Josef Chen present three nearly identical AI models. They differ only in training data. The first model, "Cooc," only sees which ingredients appear together in real recipes. The second, "Chem," only sees which flavor molecules the ingredients share, drawing on the FlavorDB chemistry database. The third, "Core," blends both.
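The two training signals can be illustrated with a small sketch. It is a toy, not the authors' method: the article does not describe the training objective, so the pair counting and the Jaccard overlap below are stand-ins, and the recipes, the molecule IDs, and the function names `cooccurrence_counts` and `shared_molecule_score` are all invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Toy data, invented for illustration (not taken from the paper or FlavorDB).
RECIPES = [
    {"chicken", "garlic", "onion"},
    {"chicken", "garlic", "black pepper"},
    {"beef", "onion", "black pepper"},
]
MOLECULES = {
    "chicken": {"m1", "m2", "m3"},
    "beef": {"m1", "m2", "m4"},
    "garlic": {"m5", "m6"},
}


def cooccurrence_counts(recipes):
    """Recipe-style signal: count how often each ingredient pair shares a recipe."""
    counts = Counter()
    for recipe in recipes:
        # Sort so every pair is stored under one canonical (a, b) key.
        for a, b in combinations(sorted(recipe), 2):
            counts[(a, b)] += 1
    return counts


def shared_molecule_score(a, b, molecules):
    """Chemistry-style signal: Jaccard overlap of two molecule sets (assumed metric)."""
    set_a = molecules.get(a, set())
    set_b = molecules.get(b, set())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


counts = cooccurrence_counts(RECIPES)
print(counts[("chicken", "garlic")])
print(shared_molecule_score("chicken", "beef", MOLECULES))
print(shared_molecule_score("chicken", "garlic", MOLECULES))
```

In this toy data, chicken and garlic share recipes but no molecules, while chicken and beef share molecules but never a recipe. That is the same split the article describes between recipe companions and flavor relatives.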

Each point represents an ingredient, with similar ingredients clustered together. The models were never told which cuisine an ingredient belongs to, yet they sort themselves into clear regional cuisine groups. | Image: Radzikowski & Chen

Same question, three answers

The difference shows up in specific queries. Type in "chicken," and Cooc returns garlic, onion, and black pepper, ingredients that frequently appear alongside it in recipes. Chem returns beef or pork, ingredients with a similar flavor profile. For "basil," Cooc serves up parsley, olive oil, and parmesan, the typical pasta pantry lineup. Chem serves up oregano, tarragon, and rosemary, the herb relatives.

The test measures how accurately properties like fruity, bitter, or protein content can be read from each model. The farther right a point sits, the more reliable the reading. The chemistry-based Chem model leads almost across the board. | Image: Radzikowski & Chen

The chemistry-driven model also performs better in areas where it shouldn't have any information, according to the authors. Flavors like sweet, sour, or bitter and nutritional values like protein or fat content aren't directly coded in the training data. Yet Chem classifies ingredients along these axes more clearly than the other variants. The chemical relationships apparently act as a shortcut that also tunes the model to other culinary concepts.

Multilingual corpus instead of English-heavy data

The most complete public ingredient model to date, FlavorGraph, is built on an English-language recipe corpus. Epicure, by contrast, processes 4.14 million recipes from eleven sources in seven languages. These include Chinese, Russian, Vietnamese, Turkish, Indonesian, and German. A pipeline built on Claude and Gemini embeddings translates and cleans up about 200,000 raw terms, including spelling variants, brand names, and preparation instructions, and reduces them to 1,790 clean ingredients.

This chart shows how distinct each cuisine's ingredients are from the rest. South Asian cuisine stands out the most, Western Atlantic the least. Chem separates the regions most sharply in every case. | Image: Radzikowski & Chen

The corpus remains unevenly distributed, though. About half the material comes from East Asian sources, while Latin American, Eastern European, and South Asian cuisines each contribute single-digit percentages. Only about a third of the ingredients are directly anchored in the chemical database. The rest pick up the chemical signal indirectly through related ingredients.

A dial for direction

Two modes of operation run on the finished model. The first is a simple neighbor search: which ingredients are closest to a given one? The second lets users shift a seed ingredient by an adjustable angle toward a target direction. At zero degrees, the original stays untouched. At sixty degrees, the target neighborhood takes over.
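Both modes can be sketched on toy vectors. This is a sketch under assumptions, not the Epicure code: the article does not say how the rotation is computed, so it is modeled here as a rotation of the unit seed vector within the plane spanned by the seed and the target direction, with cosine similarity for the neighbor search. The vectors, the target direction, and the helper names `nearest` and `rotate_toward` are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def nearest(query, vectors, k=3, exclude=()):
    """Mode 1: rank ingredient names by cosine similarity to the query vector."""
    q = normalize(query)
    scored = [
        (dot(q, normalize(vec)), name)
        for name, vec in vectors.items()
        if name not in exclude
    ]
    return [name for _, name in sorted(scored, reverse=True)[:k]]


def rotate_toward(seed, target, degrees):
    """Mode 2 (assumed form): rotate the seed by `degrees` toward the target."""
    u = normalize(seed)
    t = normalize(target)
    # Keep only the part of the target that is orthogonal to the seed.
    proj = dot(t, u)
    residual = [ti - proj * ui for ti, ui in zip(t, u)]
    if math.sqrt(dot(residual, residual)) < 1e-12:
        return u  # target is parallel to the seed, so there is nothing to rotate toward
    w = normalize(residual)
    theta = math.radians(degrees)
    return [math.cos(theta) * ui + math.sin(theta) * wi for ui, wi in zip(u, w)]


# Hypothetical 3-dimensional embeddings, invented for illustration.
VECTORS = {
    "rice": [1.0, 0.0, 0.0],
    "bread": [0.9, 0.1, 0.0],
    "curry leaf": [0.1, 1.0, 0.0],
    "fenugreek": [0.0, 0.9, 0.2],
}
SOUTH_ASIA = [0.0, 1.0, 0.0]  # hypothetical target direction

plain = nearest(VECTORS["rice"], VECTORS, k=1, exclude={"rice"})
steered = nearest(
    rotate_toward(VECTORS["rice"], SOUTH_ASIA, 60), VECTORS, k=2, exclude={"rice"}
)
print(plain, steered)
```

With an angle of zero the function returns the normalized seed unchanged, which matches the behavior described for the dial. How far the neighborhood shifts at a given angle depends entirely on the embedding geometry, so the sixty-degree figure should not be read off this toy.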

Without any predefined categories, the analysis finds groups of ingredients that belong together. The groups then get Claude-generated labels like "dessert ingredients" or "Chinese wok cooking essentials." | Image: Radzikowski & Chen

Turn "rice" slightly toward South Asia, and curry leaf, urad dal, chana dal, and fenugreek seeds appear. Turn "chicken" more toward processed Western Atlantic cuisine, and you get cream of chicken soup, crescent rolls, and ranch dressing, typical US home cooking staples.

The choice of model can even decide which culture an answer comes from. Turn "chocolate" in the direction of "sweet pastries," and Cooc and Core land on Western baking ingredients like cocoa, vanilla, and baking powder. Chem lands on an East Asian dessert cluster with red bean paste, matcha powder, and purple sweet potato.

The dotted line shows how similar randomly selected ingredients would be. The groups the model actually found sit far to the right, meaning they are coherent clusters. | Image: Radzikowski & Chen

Authors are building robot restaurants

Behind the research is a restaurant tech startup. Kaikaku was founded in London in 2023 and runs its own robotic restaurant, Common Room, in the Brunswick Centre, with plans to expand it into a chain.

The company uses its own machine learning systems to weigh and portion ingredients. Its machine, called "Fusion," can theoretically dispense 360 bowls per hour. The system also includes ML-powered inventory management and 3D-printed food-safe components. The company raised about $1.8 million in a pre-seed round in 2024.

Given that background, the interest in a machine-readable map of the ingredient world makes sense. A model that switches between recipe companions and flavor relatives on demand, translates ingredients across cuisines, or shifts them along axes like "fatty" or "fermented" would be useful in several places. It could help with menu development at a bowl restaurant, suggest replacements during supply shortages, or assist when scaling to new locations.

Whether this works in practice remains to be seen. Model weights and datasets are now available on Hugging Face, making independent verification possible in principle. But the examples shown in the paper are hand-picked. In sparsely represented regions like South Asia or Latin America, the answers are likely far less stable than for the dominant East Asian and Western cuisines.

The vocabulary cleanup also depends on the output of language models, which carry their own cultural biases. The fact that chocolate ends up near matcha in one model variant's "sweet pastry" direction is a nice effect. But it says little about how reliably such rotations work beyond the cherry-picked examples.

Co-author Josef Chen promotes the model on X as "the largest multilingual food model ever built," saying they've got "all of human cooking compressed into 2 megabytes." An older version of the model is available as a demo at epicure.kaikaku.ai.