I have to admit to doing my own bit of tossing phrases into one of the DALL-E Mini generators to gawk/shrug/tweet about what emerged. There is a natural curiosity to see whether you get a result that is a stunning graphic or, what seems more common in my attempts… crap.

So I like seeing my colleague Bryan Alexander testing the waters.

and seeing the limitations pointed out by Mark Sample

I have a lot of questions about what is going on here that deserves to be called intelligence, and I squint a bit in pain when some article calls this “jaw-dropping” (admission: I have seen about 0.000000000000000001% of all AI-generated images, yet this stops me in no way from drawing conclusions).

This sideline, quasi-curiosity has been part of my own experiment with a sort of blog-ish effort in the OEG Connect community space (though hardly anyone else has taken up the offer to make their own thread).

There’s a bit of a fad rush to be tossing prompts into the machine. What makes for a good AI image prompt? There seems to be a bible of sorts, and it might be worth testing out.

Yet I feel like this is all a distraction, like someone is harvesting the data of all the things tossed into these machines. I read inferences that the AI is learning as it goes, but who is teaching it? Why can’t we tell craiyon et al when something it turns out is good or crap?

On Promptism

But going above and beyond this, a quote from Seb Chan’s Fresh and New email issue on Generative things and choosing the right words keeps circling back:

My social media feeds overflowed with DALL-E and Midjourney ‘promptism’ visuals. The coining of ‘promptism’ by others nicely deflects attention from the underlying technologies to the craft of finding the right language (the prompt) to ‘talk to the machine’. It’s a bit like those people who seem to have a magical ability to craft ‘the perfect Google search query’ but aren’t trained librarians and have really just done a bit of SEO work in their past.

https://buttondown.email/sebchan/archive/79-generative-things-and-choosing-the-right-words/

Apparently, as Seb mentioned, promptism has been manifesto-ized as a “movement” (gack).

Again, I feel like the public attention on spawning images of orangutans riding avocados done in Impressionist style is diverting focus from asking – can we do more than create 9-box cartoons? The use of AI to identify Holocaust victims at least seemed something a tad more useful, even if, as usual, when it comes down to explaining how it works, things get fuzzy and vague fast.

Can AI Improve Old Photos?

So I perked up when I saw a ResearchBuzz tweet sharing the article GFP-GAN is a New Free AI Tool That Can Fix Most Old Photos Instantly (GFP = “Generative Facial Prior”). Sure, I could have retweeted and moved on to creating things like “A bottle of ranch testifying in court” (look, there is a twitter account to follow, @weirddalle), but I feel I understand better, and blog better, when I can do things hands on.

The methodology is ’splained as:

The improved 1.3 version of the GFP-GAN model tries to analyze what is contained in the image to understand the content, and then fill in the gaps and add pixels to the missing sections. It uses a pre-trained StyleGAN-2 model to orient their own generative model at multiple scales during the encoding of the image down to the latent code and up to reconstruction.

Using additional metrics helps the AI enhance facial details, focusing on important local features like a person’s eyes, mouth, and nose. The system then compares the real image to the newly restored image to see if they still have the same person in the generated photo.

https://petapixel.com/2022/07/28/gfpgan-is-a-new-free-ai-tool-that-can-fix-most-old-photos-instantly/
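Peeking under the hood a bit: the GFP-GAN project also publishes a Python package, so the browser demo is not the only way in. Below is a minimal sketch of restoring a photo locally, following the usage shown in the project’s README; the weights path and photo filenames here are placeholders of my own, and the knobs are just the README defaults for the v1.3 model.

```python
# pip install gfpgan   (pulls in basicsr and facexlib; the GFPGANv1.3.pth weights
# file is a separate download from the project's releases page)
import cv2
from gfpgan import GFPGANer

# Build a restorer around the pretrained v1.3 model
restorer = GFPGANer(
    model_path='GFPGANv1.3.pth',  # wherever you saved the downloaded weights
    upscale=2,                    # also upsample the whole image 2x
    arch='clean',                 # the architecture variant used by v1.3
    channel_multiplier=2,
    bg_upsampler=None)            # faces only; leave the background alone

img = cv2.imread('dad_and_uncle_george.jpg', cv2.IMREAD_COLOR)

# enhance() detects the faces, restores each crop, and pastes them back in
cropped_faces, restored_faces, restored_img = restorer.enhance(
    img, has_aligned=False, only_center_face=False, paste_back=True)

cv2.imwrite('restored.jpg', restored_img)
```

The results come back in pieces – the detected face crops, the restored version of each, and the whole image with the new faces pasted in, which is what the demo site hands you.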

The PetaPixel article used as examples the pristine ones the software provides in a gallery, but I wanted to push the buttons myself. So I headed over to the GFP-GAN demo site.

I started with a 1970s era photo of my Dad and his best friend, who we called “Uncle George” though he was not a relative, and you can see it did do well to sharpen the faces.

Before and after photo with some image “improvement” by GFP-GAN.

The creators do note the limits:

GFP-GAN is not without its flaws though. According to the developers, while the restored images are much more detailed than previous and other versions, the current restored images are not very sharp and can have a “slight change of identity.” This means that in some cases, the restored images can sometimes look like a different person. …

Bouchard says even though the results are mostly fantastic and remarkably close to reality, “The resulting image will look just like our grandfather if we are lucky enough. But it may as well look like a complete stranger, and you need to keep that in consideration when you use these kinds of models.”

https://petapixel.com/2022/07/28/gfpgan-is-a-new-free-ai-tool-that-can-fix-most-old-photos-instantly/

And I see this a bit in my Dad’s face. To the average person, the improvement is clear, but to me, his son, I can say it looks like him, yet there is something intangibly wrong, in the smile, the jaw, that leaves me feeling like it’s more “Dad-ish” than “Dad”.

I tried another old photo of my Dad as a teen and his Dad, my grandfather. I enjoy this photo because I never knew my Dad as a kid and my grandfather passed away before I was born. The results are again impressive; note especially the addition of some reflections in my grandfather’s eyeglasses.

Okay, the image is sharper; can we do more? Quite some time ago I played with some early AI web tools that attempted to colorize old photos. There’s a slew of them now; I just tried the one from Hotpot.

My Dad and Grandfather colorized

It’s again “interesting” but problematic in my grandfather’s coat colors, and there are also some weird changes in the background.
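Hotpot is point and click and does not say what runs underneath, but for the script-inclined there are open source stand-ins that do the same trick, like DeOldify. A sketch, assuming DeOldify’s published interface as shown in its README, with filenames I made up – and to be clear, this is a substitute tool, not whatever Hotpot actually uses:

```python
# Clone github.com/jantic/DeOldify and fetch its pretrained weights first
from deoldify import device
from deoldify.device_id import DeviceId
device.set(device=DeviceId.CPU)   # or DeviceId.GPU0 with a CUDA card

from deoldify.visualize import get_image_colorizer

# artistic=True selects the model tuned for vivid color over strict fidelity
colorizer = get_image_colorizer(artistic=True)

# render_factor trades color stability against detail; ~35 is the suggested default
result = colorizer.get_transformed_image(
    'dad_and_grandfather_restored.jpg', render_factor=35)
result.save('dad_and_grandfather_color.jpg')
```

Like the face restoration, it is guessing statistically at what the colors should be, which is likely exactly where odd coat colors and weird backgrounds come from.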

Back I went to GFP-GAN to see what it could do with a photo of my mother’s brother, Harvey, when he served in the Navy during World War II.

GFP-GAN has really cleared up this photo of an uncle I barely knew

I have to be impressed here, not only with the clarity of his face, blurred in the original, but also how much more realistic his jacket and slacks look. Let’s try some Hotpot AI color:

Uncle Harvey comes to life in color

This is pretty compelling, and I find myself wondering where this might have been taken, with the palm trees in the background. I had found his draft card indicating he served on the USS LST-892, which likely would have operated out of San Diego.

One more: I tried a collage of scanned photos of my grandparents in the house that my parents moved into when my Dad’s father passed away. We moved from there when I was 2, so I don’t remember it at all.

My grandparents in the 1950s.

Again, the sharpness in the faces counts as a definite improvement and almost makes me realize how much my memory/mind can fill the gaps of the slightly blurry original. They really pop to life in the colorized version, though I am fairly sure grandma did not have blue legs.

So What

I do find it useful to see what this image-improving algorithm can do. I am less sure where the AI is coming into play here; isn’t this just algorithms at work? I don’t yet have an understanding of exactly what it is doing. Note: maybe I need to read the paper or cheat and watch the video (I just noticed I skipped over it).

And probably this still is promptism in the machine: I toss an image into an AI site and it returns something. If it does not produce desirable results, I try to adjust by using a different image. I still have no insight at all into what steps the machine went through.

But a difference here, over just tossing a phrase like “Santa taking a picture in a tornado” into some box, is my own stake in the photos being improved. There are things about the images no algorithm will ever know.

Promptism from 1964

As serendipity goes, today I experienced something that connected to this post in an unanticipated way. This being the first day of a vacation, during the travel portion I partook in something I’ve not done in a long while: reading for pleasure. I grabbed a paperback off the shelf, a novel I had purchased a while ago but never read, Philip K. Dick’s The Penultimate Truth (light post-apocalyptic reading!).

In the first pages, the speechwriter Joseph Adams is putting a device to use:

At the keyboard of the rhetorizor he typed, carefully, the substantive he wanted. Squirrel. Then, after a good two minutes of sluggish, deep thought, the limiting adjective smart.

“Okay,” he said, aloud, and sat back, touched the rerun tab.

The rhetorizor, as Colleen reentered the library with her tall gin drink, began to construct for him in the auddimension. “It is a wise old squirrel,” it said tinnily (it possessed only a two-inch speaker), “and yet this little fellow’s wisdom is not its own; nature has endowed it–”

The Penultimate Truth, page 2

Adams gets angry with the generated response and slaps the machine to turn it off. His companion advises him that his prompt needs improvement.

“Dear,” Colleen said, and sighed. “I heard you type out only two semantic units. Give it more to go on.”

…

“Listen,” he said grimly. “I want to see what this stupid assist machine that cost me fifteen thousand Wes-Dem dollars is going to do with that. I’m serious, I’m waiting.” He jabbed the rerun tab of the machine.

The Penultimate Truth, page 2

So my casual reading of a sci-fi novel links me right back to this idea of us entering phrases into a machine with expectations of it generating content for us. Reminder: PKD envisioned this in 1964.

Here we are in 2022 and promptism is peaking.


Featured Image: My own prompt-generated creation, by me, a human. I took the results from craiyon for the prompt “A techno utopian stares in boredom at his mobile phone” – I was going to comment on the gender, but I did enter “his” rather than “a”. Mea culpa. This is superimposed on a Google Images search on “dall-e examples” – call it shared under CC BY as if anyone really could use this.


Comments

  1. Thanks for the bit of reportage on Promptism. I had not run into the term prior to this weblog entry. It goes without saying, it gives a whole new meaning to “prompts”. When I think of prompts I naturally associate them with creative ones. I like to think of the #ds106 Daily Make as the prompt we can all believe in, the one we can get behind. The one that’s built #4Life. :^)

  2. Very relevant, I read a post today about a company that took time and money to generate cover art for blog posts that aren’t served well by stock images (technical writing). I think their approach to overly-descriptive desired images highlights the difficulty of using AI to do this kind of thing.

    Do you feel different about the photos that cleaned up well? As I was reading, I wasn’t impressed until your uncle’s photo and I don’t even know him. I wonder how our perception of family from old photographs is affected by the quality of the photo. Maybe it’s nostalgia in the “oldness” of the image outweighing a clearer view of who that person was.

    Or maybe it’s just feeling weird about some computer out there filling in what it thinks those people should look like.

    1. Thanks for that indeed relevant link, Brian. I have to admit that for their purposes, it works well, and I would disagree with the author’s assertion that there is not a consistent style. And as someone who devoted much time in writing to finding and creating featured images, I feel an urge to stand up as being more John Henry than AI steam engine. Well, for $45 / 100 images, I might lose.

      A thing to consider might be how much relationship there is to the featured image – the articles are not dependent on them to convey information, they are decorative. And thus it also matters what our relationship to the photo is. For me, seeing a bit clearer image of the face of an uncle I really do not remember works, whether or not it is more real. That’s because I have a stake in the photo, which is different from yours. But I think what you are suggesting is that making an older photo more crisp and clear is not always an improvement?

      But yes, all of it feels weird, and it gets weirder all the time. Thanks for dropping the comment in!

      1. Since writing that comment, I was looking at a book about The Legacy Museum, which focuses on the US history of terror against Black citizens, and it featured a recolored photo of a lynching, but the coloration was done by a person.

        The colorized version of a photo done by AI feels strange – less personal – maybe because it’s done statistically rather than by a person who studies the period and can interpret the gray and black into realistic color.

        Maybe it’s the math driving everything more than the product that sits funny. I’m still not sure.

        Next time I’m at my parents’, I might do the same experiments with some of dad’s old photos to see what he thinks.
