The Lindahl Letter

Trust and the future of digital photography

A zero trust image paradigm

This week, based on the backlog, I should be covering the topic of Bayesian optimization. Something different happened during the course of sitting down to write. Apparently, I badly misbehaved against the backlog prompt this morning. Instead of digging into that topic, I'm going to spend some time talking about a more pressing philosophical question related to the future of trust and digital images. This missive is more about expressing and recognizing concern than delivering information. In April of 2022, OpenAI shared the DALL-E 2 model, which the researchers described this way: "DALL·E 2 is a new AI system that can create realistic images and art from a description in natural language" [1]. Strangely enough, the word ethics does not really appear on the homepage for the DALL-E 2 system. It was apparently more important to spend time setting up an Instagram account for art created by the model [2]. You can pretty easily go see how photorealistic some of these images are. When OpenAI made this release, I never requested to join the waitlist to kick the tires on this one. I'm generally more interested in natural language processing than visual image processing.

Let's set the stage as clearly as possible on this one. People are used to being able to go to the photo finish. Races have been decided by photos, and ultimately video, for years. We trust video replay in sports, and it remains the visual record of our times. Things have changed. Full stop. It used to require a lot of effort to make a deepfake or to alter photographs. It required software and time spent to accomplish that task. Right now, a host of new models and other ML implementations make it possible to hand a prompt to a model and get a reasonable approximation of the requested image within a few seconds. Some of these images are really high quality. They are photorealistic renderings. You can call up pictures of people who never existed doing things that never happened.

All of this raises really interesting ethical questions about the use and propagation of such ML technology. Keep in mind that these models are openly being used and served up online. No degree of concern for the potential social impact stopped the distribution or the additional model development that followed. Right now I'm going to say that you cannot trust the future of digital photography. Don't believe what your eyes report at this point within the digital space. These models can very quickly generate and alter digital images. Soon enough people will extend that technology to a series of images, and realistic video will be produced from a prompt. Essentially, they just have to extend the model to the context of a few frames in series and short videos will spring into existence. The evening news could galvanize popular opinion with a story and a photograph. At this point, I'm not sure we can trust that type of evidence anymore. Lingering implications remain for how civil society is going to change in the face of a zero trust image paradigm. I'm not sure people even understand the alternate realities that could be created and presented as fact. Somebody could bring forward a very news-forward YouTube channel powered by DALL-E 2 created images. For example, you could introduce a new continent, talk about the discovery of Atlantis, and potentially go on for years presenting an alternate reality as truth. Somebody will probably make a living doing that or something equivalent to it. That is where the ethical considerations of this technology and the impacts on society as a whole took a backseat to the race to share and demonstrate effectiveness.

Take a moment and consider that just because a technology can do a thing does not mean it should be used to do those things. We make choices. Ethical considerations have to be at the forefront of that type of decision making. We are getting to a point where a zero trust image paradigm will effectively make it a necessity to question everything you see in terms of digital photography and ultimately video. That realization and reality will reverberate across interactions in daily life. At this point, based on the evidence we have, I'm going to declare that we have to embrace and ultimately enforce a zero trust image paradigm. How do we even label actual historical documents accurately at this point? Historians will have to be very careful going forward in analyzing the times about to happen. This may very well be a watershed moment in how we evaluate the truth in front of us and how we verify and validate that narrative. My argument here is not intended to be hyperbolic or presented with any sarcasm whatsoever. A very real situation is developing within our ability to trust the visual world being presented to us. We have to consider the possibilities in front of us and begin to evaluate a path forward. I'm assuming that the path forward is zero trust. That should be clear within the argument being presented. You will have to decide what to do with the world being brought to life by models and systems like DALL-E from OpenAI.

Other prompt-based text-to-image generating models exist as well: DALL-E mini, GLID-3, CLIP-guided approaches, ruDALL-E, and X-LXMERT. My focus here is on the DALL-E model from OpenAI, as it seems to have captured a higher degree of interest from the public mind [3]. You can run the DALL-E mini model from Hugging Face Spaces online for free [4]. However, that site is apparently migrating to craiyon, and you can find a link to that shared within the footnotes [5]. You can check it out for yourself and see if you share my concern. At the very least, you need to be prepared to openly question any images that are presented going forward. They could very well be synthetically generated.

Links and thoughts:

“Intel Messed Up - WAN Show June 24, 2022”

“Vergecast: M2 MacBook Pro review, Solana’s crypto phone, and this week’s tech news” 

Top 7 Tweets of the week:

Sebastian Raschka @rasbt
Has anyone tried diffusion-based models, yet? Heard that they produce better results than GAN (e.g., is quite convincing) and heard they are easier to train. True? People also say they are pretty slow to sample from. Anyone any experience with these?

What’s next for The Lindahl Letter?

  • Week 79: Why is diffusion so popular?

  • Backlog: What is GPT-NeoX-20B? Bonus topic: What is XGBoost?

  • Week 80: Deep learning. Bonus topic: Bayesian optimization

  • Week 81: Classic ML algorithms

  • Week 82: Classic neural networks

  • Week 83: Neuroscience

I'll try to keep the what's next list forward-looking with at least five weeks of posts in planning or review. If you enjoyed this content, then please take a moment and share it with a friend. If you are new to The Lindahl Letter, then please consider subscribing. New editions arrive every Friday. Thank you and enjoy the week ahead.

Thoughts about technology (AI/ML) in newsletter form every Friday
Dr. Nels Lindahl