Publishing a model or selling the API?
OpenAI and Hugging Face both have teams doing great things with machine learning models, but their delivery models are very different. You can visit OpenAI at https://openai.com/api/ and look around at the machine learning models being sold to customers like IBM, Salesforce, Intel, and Cisco. Delivering machine learning models via an API is one way to go about publishing and sharing your work. The alternative is to publish the models themselves, which is what is happening over at Hugging Face https://huggingface.co/models where, if I’m reading the page correctly, you can search a list of 36,028 models. Both of these organizations are delivering excellent machine learning content to a world of people looking to operationalize machine learning within their corporate strategies. Deciding whether to publish a model or to sell it via an API is a major decision. Selfishly, I much prefer the open source models that I can download and play with.
Before we move on to the rest of this analysis, please consider my full disclosure that I participated in the OpenAI private betas for both Codex and GPT-3. Both of those betas provided API access and not full downloads of the models in question. That participation may have given me a good idea of how the system works and let me kick the tires, but it did not cloud my judgment or make me want to give OpenAI favorable treatment. I do agree with the original assessment by the OpenAI team that the GPT-3 and GPT-2 models open the door to misuse. Back in 2019 The Verge team noted that “OpenAI has published the text-generating AI it said was too dangerous to share.”
We face a very real possibility that the models could be misused to flood our information streams to the point where it would become almost impossible for communication to function. Some people believe that bots and other flooding techniques used to astroturf and falsely drive news cycles are already breaking a problematic news ecosystem. A truly asymmetric delivery problem exists when the amount of content being produced is massively larger than what an individual can consume. Traditional media has transformed from the highly curated view that national and local newspapers, mixed with nightly news broadcasts, once provided to near real-time broadcasting. The level of curation within a 24-hour broadcasting channel is itself fundamentally different from the single-serving, real-time publishing cycle that happens online. The topic of information flooding deserves an entire post of its own, especially on the mechanics of how it works, but I’m going to move on to the ethics part of the question.
Ethicists have been debating the potential release of dangerous machine learning models for some time. It is a serious debate that probably needs to happen at a governmental and ultimately international consensus level, given the potential influence on civil society as a whole from a dangerous intersection of technology and modernity. You can easily give a model like GPT-3 a prompt on a topic and it will very quickly spit out content. If you elected to do that over and over again for a nefarious purpose, then you could flood comments, posts, news, and other points of information. It is a truly great tragedy of the public information commons that takes the power of sharing information online and tips it to an extreme.
Setting aside the ethical considerations of these large language or foundational models, we are probably going to see the heavily used and curated machine learning models deployed via the API method of selling and providing accessibility. An API that is curated by an organization handling all the maintenance and training reduces the friction of access and use, and that has a certain value proposition going forward. You almost get to set it and forget about the ongoing cost of training, enhancing, and maintaining the machine learning model. Your machine learning return on investment model may very well allow for some additional cost per transaction within an externally sold API to get the benefits of speed to access and ongoing scalability. That creates an advantage for the biggest companies that can provide proven uptime and reliable service. My attention turned to searching Google Scholar for “machine learning API marketplace” to see what publications surfaced. A lot of the articles felt like pitches or introductions to specific technology. They described parts of the landscape, but they were missing the bigger picture of what would happen in the overall marketplace.
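To make the return on investment trade-off a little more concrete, here is a minimal sketch in Python of a break-even comparison between paying per request for an external API and hosting a downloaded model yourself. The function names and every number in it are my own hypothetical placeholders, not real prices from OpenAI, Hugging Face, or any other provider. It also assumes a very simple cost shape: the API has only a per-request price, while self-hosting has a fixed monthly cost plus a smaller per-request cost.

```python
def monthly_cost_api(requests, price_per_request):
    """Monthly cost of an external API billed purely per request (assumed pricing shape)."""
    return requests * price_per_request


def monthly_cost_self_hosted(requests, fixed_cost, cost_per_request):
    """Monthly cost of hosting a model yourself: fixed overhead plus a per-request cost."""
    return fixed_cost + requests * cost_per_request


def break_even_requests(fixed_cost, self_hosted_cost_per_request, api_price_per_request):
    """Monthly request volume at which both options cost the same.

    Returns None when the API is not more expensive per request, because
    in that case self-hosting never catches up under this simple model.
    """
    saving_per_request = api_price_per_request - self_hosted_cost_per_request
    if saving_per_request <= 0:
        return None
    return fixed_cost / saving_per_request


if __name__ == "__main__":
    # All values below are hypothetical and only illustrate the arithmetic.
    fixed = 6000.0      # assumed monthly cost of hardware, staff time, retraining
    self_cost = 0.25    # assumed per-request cost when self-hosting
    api_price = 0.75    # assumed per-request API price

    volume = break_even_requests(fixed, self_cost, api_price)
    print("Break-even monthly requests:", volume)
    for requests in (1000, 50000):
        print(
            requests,
            "requests -> API:", monthly_cost_api(requests, api_price),
            "self-hosted:", monthly_cost_self_hosted(requests, fixed, self_cost),
        )
```

Below the break-even volume the API is the cheaper option under these assumptions, which is the set it and forget it case. Above it, the fixed cost of running the model yourself starts to pay for itself. A real comparison would also need to account for things this sketch ignores, like uptime guarantees and the effort of keeping a model current.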
Links and thoughts:
1. I watched Linus and Luke during the WAN Show episode from April 8, 2022. They were super excited about launching a screwdriver, a backpack, and maybe getting a second giant product testing studio. It's good that Linus is getting into the product testing part of the review space, as that is an area that needs more focus and professional attention. Our ability to get high quality independent reviews seems to have been shrinking over the last few years. We have seen a lot more focus on unboxing and reviews on YouTube as people branch out to try to get unbiased and unfiltered insights into products before buying them, increasingly online.
2. This week in ML News Yannic Kilcher talked to people on the street and we learned about Google's 540B PaLM Language Model and the OpenAI DALL-E 2 Text-to-Image model.
3. On this episode of Decoder Nilay Patel talks to Chris Dixon about a lot of topics related to web3. It was a very interesting hour of discussion about the future of the internet and what will happen online.
4. Nilay and friends were really excited this week on The Verge podcast where they discussed a couple of topics including Mark Zuckerberg’s big plans for AR glasses, Google’s apparent lack of interest in AR hardware, and Elon Musk’s Twitter drama.
Top 5 Tweets of the week:
nilay patel @reckless: Today’s Decoder is one I’ve been looking forward to for a while: @cdixon comes on to discuss web3 and what problems it can actually solve. And, of course, the fact the Apple won’t currently allow any web3 apps to sell without taking a huge cut: https://t.co/xDyIKNqIvL https://t.co/RlofzFrXDl
What’s next for The Lindahl Letter?
Week 69: A machine learning cookbook?
Week 70: ML and Web3 (decentralized internet)
Week 71: What are the best ML newsletters?
Week 72: Open source machine learning security
Week 73: Symbolic machine learning
I’ll try to keep the what’s next list for The Lindahl Letter forward looking with at least five weeks of posts in planning or review. If you enjoyed this content, then please take a moment and share it with a friend. Thank you and enjoy the week ahead.