Strange India

Apple’s been a bit behind on the generative AI front, minus some small features added to iOS 17. That said, 2024 is shaping up to be Apple’s big AI year. All eyes are fixed on iOS 18, which should be packed with AI features, including an upgraded Siri.

Ahead of that release, Apple researchers, in partnership with the University of California, Santa Barbara, have unveiled an open-source AI model that edits images based on natural language instructions. In short, you tell the AI how you want a photo changed, and it makes the change.

What is Apple’s MGIE AI image editor?

This new AI model, named “MGIE” (MLLM-Guided Image Editing), takes in standard commands from the user to achieve three different editing goals: “Photoshop-style modification, global photo optimization, and local editing.”

Photoshop-style modification includes actions like cropping, rotating, and changing backgrounds; global photo optimization adjusts effects across the entire image, such as brightness, contrast, or sharpness; and local editing changes specific areas of the image, such as an object’s shape, size, or color.

MGIE is mainly powered by an MLLM (multimodal large language model), a kind of LLM capable of interpreting visuals and sounds in addition to text. In this case, the MLLM takes in user commands and interprets them as concrete editing directions. MGIE’s research paper explains that this is a traditionally difficult task, as user commands are often too vague for a system to understand without additional context. (What should the program think “make the pizza look healthier” means?) But the researchers say MLLMs like MGIE’s are effective here.

Based on the research paper, MGIE is capable of many different kinds of visual edits. You can ask it to add lightning to an image of a body of water and make the water reflect that lightning; remove an object in the background of an image, such as a person unintentionally photo-bombing; turn things into other things, such as a plate of donuts into a pizza; increase focus on a blurry subject; or remove text from an otherwise nice photo, among many other possibilities.

You can get a sense of how the tech will function by perusing the complete research paper, which includes examples of the editor in action; it’s available here.

This isn’t the first application of AI in photo editing, of course. Photoshop has had plenty of AI editing tools for some time now, including ones driven by user prompts. But MGIE might be the most fully realized vision yet of an AI image editor based on commands.

How to try out Apple’s MGIE image editor yourself

As the model is open-source, anyone can download and integrate it with their own tools. However, if you’re like me, and wouldn’t know where to start with that, you can try this demo hosted by one of the researchers of the project. You can upload an image you want to edit, enter a command, then process it.

At this time, however, the demo has quite the queue of requests backed up. I’m currently one of 237, which I imagine could keep growing as more people want to try the model.

It’s not clear if or how Apple will integrate MGIE into its own platforms. But if there were a year for the company to do so, 2024 would definitely be it.
