How to Avoid the Pitfalls of Character References
I've been working on quite a few projects for clients recently. Some of these fall into the "graphic novel" category while others require AI video generation. What they all have in common is that they require consistent characters.
We've come a long way since I released my very first YouTube video about consistent character creation. In fact, the --cref parameter has made my life dramatically easier, and the same can be said about --sref.
However, as I've begun to use these more frequently in projects, I've come to realize that they still come with their own set of pitfalls.
It's quite funny, actually. You would think that all these great features largely eliminate the need for problem solving. But the reality is quite different. All of these features create their very own set of new challenges.
In today's newsletter, I'd like to address a very specific one.
The Problem with Photographic References
As you know, Midjourney almost always defaults to photorealism. There's nothing inherently wrong with that, but it also means that the majority of examples you'll find online tend to showcase photorealism.
If you check social media, you will notice that the vast majority of people choose like-for-like examples. What I mean by this is that a photographic character reference is used to create a photographic image.
Or, an illustrated character reference will be used to create a new illustration. It's quite rare to encounter examples that use mixed media (e.g. a photographic reference to create an image for a graphic novel).
To be clear, I'm not saying there's anything wrong with that per se. However, it means that some of the weaknesses and challenges of --cref aren't necessarily addressed.
In my case, I had a typical headshot photo of a character. The objective was to incorporate this character into the graphic novel that I was working on.
/imagine
Upper body of a 35 year old lumberjack posing in front of a gray wall. The lumberjack has a bald head and a long beard.
--ar 2:3
Normally, I could try to use this character reference and then simply adjust the style in my prompt. The results would look like the images below, and for some people that might actually be good enough.
/imagine
A lumberjack giving a speech in front of a mountain lodge in the style of a graphic novel illustration.
--ar 16:9 --cref https://s.mj.run/HN-OD-Q6vXk
But the art style was meant to look like a classic graphic novel, so it's supposed to be a little rough around the edges and not particularly polished. The next step would be to try to use a style reference (--sref) to force the rough look that I was looking for.
That should do the trick, right?
Not quite. You can imagine my frustration when I kept getting images in a graphic novel style but with characters that looked almost photorealistic.
/imagine
A lumberjack giving a speech in front of a mountain lodge in the style of a graphic novel illustration.
--ar 16:9 --cref https://s.mj.run/HN-OD-Q6vXk --sref https://s.mj.run/LlQeT9g36as
I know that many of you will think this is absolutely fine. But with the additional context that I have, I can tell you that it's simply not going to work for the client.
How to Fix the Style
I spent quite a bit of time trying to figure this out. Luckily, the solution is actually quite simple.
Step 1: Transforming the Original Character Reference
You start by taking your original photographic character reference. Write a brief prompt as if you were trying to recreate the character with a straight text prompt.
/imagine
Upper body of a 35 year old lumberjack posing in front of a gray wall in the style of a graphic novel. The lumberjack has a bald head and a long beard.
--ar 2:3
Alternatively, you can use /describe if you can't think of anything. Make sure to add "in the style of a graphic novel" at the relevant location in your prompt (of course, you can use whatever style you like).
Next, add your original photographic character reference using the --cref parameter. You can set a character weight (--cw) if you like, but it's not necessary.
Then you add the style reference (--sref) for your art style. This will create a first round of images that already transform your character quite a bit.
/imagine
Upper body of a 35 year old lumberjack posing in front of a gray wall in the style of a graphic novel. The lumberjack has a bald head and a long beard.
--ar 2:3 --cref https://s.mj.run/HN-OD-Q6vXk --sref https://s.mj.run/LlQeT9g36as
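If you prefer to script your prompts rather than type them out, here is a minimal sketch of how the Step 1 prompt could be assembled. The helper name build_prompt is my own invention, not part of any Midjourney tooling. It only builds the text, so you still paste the result into /imagine yourself.

```python
def build_prompt(text, aspect_ratio, cref=None, sref=None):
    """Assemble a Midjourney prompt string (hypothetical helper, not an official API)."""
    parts = [text.strip(), f"--ar {aspect_ratio}"]
    if cref:
        # Character reference: the original photographic headshot
        parts.append(f"--cref {cref}")
    if sref:
        # Style reference: the rough graphic novel look
        parts.append(f"--sref {sref}")
    return " ".join(parts)


step1 = build_prompt(
    "Upper body of a 35 year old lumberjack posing in front of a gray wall "
    "in the style of a graphic novel. The lumberjack has a bald head and a long beard.",
    "2:3",
    cref="https://s.mj.run/HN-OD-Q6vXk",
    sref="https://s.mj.run/LlQeT9g36as",
)
print(step1)
```

Keeping the text, the character reference and the style reference as separate arguments makes it easy to change one of them without touching the others.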
Step 2: Repeat the Process (Optional)
Some of the images might already be good enough while others may still look a bit too realistic. You can fix this by simply doing the exact same thing one more time.
/imagine
Upper body of a 35 year old lumberjack posing in front of a gray wall in the style of a graphic novel. The lumberjack has a bald head and a long beard.
--ar 2:3 --cref https://s.mj.run/YPHu-9gEUyk --sref https://s.mj.run/LlQeT9g36as
All you need to do is replace the original character reference with one of the newly created characters in the new style. By repeating the process, you can eliminate all photographic elements from the image.
What you get are 4 new character references to choose from. They should all be in the style you need and also resemble your original character.
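Since the only thing that changes between passes is the URL after --cref, the swap can be sketched in a few lines. The function name swap_cref is hypothetical, and the sketch assumes the prompt carries a single character reference URL.

```python
import re


def swap_cref(prompt, new_url):
    """Return the prompt with the URL that follows --cref replaced by new_url."""
    updated, count = re.subn(
        r"(--cref\s+)\S+",
        lambda match: match.group(1) + new_url,  # keep "--cref ", swap only the URL
        prompt,
    )
    if count == 0:
        raise ValueError("prompt has no --cref parameter")
    return updated


first_pass = (
    "Upper body of a 35 year old lumberjack posing in front of a gray wall "
    "in the style of a graphic novel. The lumberjack has a bald head and a long beard. "
    "--ar 2:3 --cref https://s.mj.run/HN-OD-Q6vXk --sref https://s.mj.run/LlQeT9g36as"
)

# Second pass: point --cref at one of the newly generated images instead of the photo
second_pass = swap_cref(first_pass, "https://s.mj.run/YPHu-9gEUyk")
print(second_pass)
```

The text and the --sref value are left untouched, which is the whole point of the repeat step.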
Step 3: Use the New Character Reference
Now that you have a new character reference in the correct style, it's time to apply it to the new image.
/imagine
A lumberjack giving a speech in front of a mountain lodge in the style of a graphic novel illustration.
--ar 16:9 --cref https://s.mj.run/UfdpAOKWQE0 --sref https://s.mj.run/LlQeT9g36as
There really isn't much to change. Prompt whatever sort of scene you would like to create. Then add your new character reference as well as the style reference.
If you like, you can also play around with the character and style weights (--cw and --sw), but that really depends on what you are trying to achieve. The beauty of this is that you can now create pretty much any scene you want.
You no longer need to worry about the stylistic look of your character.
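If you do want to experiment with the weights, here is a rough sketch of how they could be appended to a finished prompt. The helper add_weights is hypothetical. The range checks assume --cw accepts 0 to 100 and --sw accepts 0 to 1000, which is my understanding of the current limits, so please double-check them against the Midjourney documentation. The values 80 and 200 are arbitrary examples, not recommendations.

```python
def add_weights(prompt, cw=None, sw=None):
    """Append --cw / --sw to a prompt, checking the assumed value ranges."""
    if cw is not None:
        if not 0 <= cw <= 100:  # assumed range for character weight
            raise ValueError("--cw is expected to be between 0 and 100")
        prompt += f" --cw {cw}"
    if sw is not None:
        if not 0 <= sw <= 1000:  # assumed range for style weight
            raise ValueError("--sw is expected to be between 0 and 1000")
        prompt += f" --sw {sw}"
    return prompt


scene = (
    "A lumberjack giving a speech in front of a mountain lodge "
    "in the style of a graphic novel illustration. "
    "--ar 16:9 --cref https://s.mj.run/UfdpAOKWQE0 --sref https://s.mj.run/LlQeT9g36as"
)

# Example values only: a slightly looser character match, a stronger style pull
weighted_scene = add_weights(scene, cw=80, sw=200)
print(weighted_scene)
```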
Wrapping Up
As much as --cref has made our lives considerably easier, it comes with its own unique set of challenges. Mixed media inputs can inject stylistic elements that may not be what you are looking for.
The best way to fix this is to first convert your character references to the actual style you need. Once you've harmonized the styles of your inputs, you'll get a far more consistent style in your outputs too.
That's it for today.
See you next week!