In a prior post, Will Generative AI Hurt Search & Publishers, I discussed how Generative AI could impact content publishers. Here I’ll suggest some strategic options for publishers, limited as they may be. It’s been a while since I’ve personally built large-scale content products, but the category is in my blood from a deep history in digital publishing. As I’ve studied and worked with AI, I’ve felt compelled to share some ideas.
Generative AI and other machine learning tools are not just the latest tech flavor of the month. Their value is powerful and accelerating, especially for information like text, imagery, audio, and video. They handle numeric data as well, of course, though finance and science have their own specialized tools. We’re not focusing here on coding, life sciences, atmospheric science, killer robots, or other technical fields, but rather on general content publishing.
Things aren’t all bad for publishers. GenAI brings both opportunities and challenges, offering efficiencies and creative possibilities, but also introducing complexities in quality and originality, with business model and revenue impacts.
Brief Recap of GenAI Business-Specific Risks for Publishers
Snippet from earlier article on Will Generative AI Hurt Search & Publishers?
- Reduced Traffic to Sites – Direct answers in search engine summaries may preclude the need to go to a site, even more so than with today’s link snippets, answer boxes, or knowledge panels.
- Aggregation without Attribution – Sources are used without attribution or compensation.
- Erosion of Value – As GenAI content floods digital spaces, original content could get crowded out. It may be hard to justify the costs of high-quality content if users go to free, summarized information.
- Lowered Revenue – Much content is ad-supported. Less traffic means shrinking budgets. And paywalls / subscription models may suffer as well.
- Copyright Enforcement – As discussed, it will be challenging to enforce intellectual property rights.
- Audience Fragmentation / Brand Dilution – If content is delivered via AI summaries, even with attribution, publishers lose control of their brand and brand voice. Information becomes generic.
What’s the News on the Content Production Side?
Content creators use GenAI to their benefit. It helps brainstorm ideas, draft articles, create social media, and check work. Tools work best in the hands of skilled craftspeople. Talented writers can still thrive.
However, there are production risks, especially for originality and authenticity when blending AI and human output. GenAI’s inaccuracies can harm credibility. GenAI can simply make things up; these fabrications are called ‘hallucinations,’ and it’s shameful when human editors fail to catch them. The technical reasons are beyond scope here, but see: Why AI Makes Stuff Up, and What are AI hallucinations? And ironically, those worried about copyright might inadvertently commit violations through unchecked GenAI use.
What’s a Publisher to Do?
Publishers have options from using tools effectively to addressing threats.
Opportunities on the Production Side
Publishers can use GenAI as an editorial tool, helping with idea generation, drafts, social media, and workflow. AI can be a brainstorming partner and quality checker in the production pipeline.
Publishers can blend human and AI input, focusing on originality. With human oversight, they can avoid errors and copyright risks. While GenAI supports creativity, human expertise shapes final products, maintaining brand voice and reliability.
Business Opportunity Considerations
Content publishers can leverage their strengths to create new revenue streams, offering unique services tailored to their content. Options include custom summaries, premium products, and immersive experiences.
By exploring these, publishers can make the most of GenAI’s financial opportunities by adding value. Below are some of the benefits of being a primary source of quality content, and where new opportunities may lie. Not all of them will translate into revenue, but they’re worth considering…
- Content Quality: You control quality. Despite content commoditization, a solid destination site holds value. Search still brings access; 93% or 68% of online experiences start there, depending on the source you believe. GenAI will impact this, but quality still matters. GenAI improvements can stumble, sometimes lowering rather than raising quality. Even the MMLU dataset used to test GenAI output is arguably wrong in places. Maintaining high editorial standards can become a strategic advantage over others who may increasingly overuse content-production tools to churn out what I’ve called Mediocrity at Speed.
- Personality Matters: Your producers and authors contribute to brand value. While there’s a risk they may leave or go independent, they remain part of what makes your site a destination.
- Source Attribution: You hold the trust. GenAI has known issues, and even as it improves, your brand remains a trusted authority. Make sure, insofar as possible, that your content is tagged in whatever ways make it easiest for AI tools to reference you as the source. There’s no GenAI-specific markup standard (as yet), but it’s reasonable to assume the tools will hitchhike on existing standards, which are good to have anyway (e.g., schema.org, Open Graph, and so on). A minimal markup sketch appears after this list.
- Content Safety: Certain groups (parents, schools, etc.) may prefer GenAI limited to known, quality publishers. It’s not difficult to bypass GenAI safety measures and get undesirable results. Content safety includes accuracy, especially in fields like children’s education. There’s no real preparation line item here for AI, just a potential value proposition.
- Governance / Compliance: Trusted source material will matter, especially for safety or mission-critical content. Such information may continue to thrive behind paywalls.
- Use GPTs in Consumer Products: Offer GPT-based subscription products with exclusive features like interactive summaries, personalized newsletters, or targeted recommendations for subscribers.
- Content Aggregation Consortia: LLM quality depends on content. With limited technical differences and overlapping pre-training data, models become less differentiated, which affects publishers. This ties into the next idea, ‘Join Them.’
- Join Them: Partner with AI developers to offer high-quality datasets or pay-per-access feeds, ensuring fair compensation and visibility. Quality content impacts AI training. (See the State of AI report from Air Street Capital, page 31: “diverse, human-generated data will become increasingly critical for maintaining model quality.”) We’re seeing early agreements, like Hearst, Time, and others with OpenAI. And book deals as well, such as Wiley and others. Attribution is key; without it, GenAI tools satisfy many user needs without a clickthrough, fully disintermediating publishers.
AI companies know quality could suffer without partnerships, and perhaps also increase regulatory risks. Not to mention risks of intentional injection of bad content into training data. Though some publishers are starting to include AI publishing rights in author contracts, authors may still be a bit unclear as to what to make of all this.
This could get complicated fast. AI companies are vying for recognition. Will they leverage quality partnerships? Under what financial terms? How will smaller niche publishers, lacking volume or legal weight, manage?
- Unique and Specialized Content Creation: Develop content GenAI can’t easily replicate, such as live discussions, interactive visuals, and multimedia narratives, much like live events for TV and radio. This type of value will likely be hard for AI to match, at least for now. In terms of topics, broad AI-generated summaries often lack depth in niche areas. Consider in-depth, specialized content for niche communities, such as enthusiasts, professionals, or academic audiences, who value specificity and expertise.
- Leverage Real-Time Content Updates and Breaking News: This is just one use case and not suited to every topic, but even as AI tools integrate with search, they may struggle to keep up with breaking news.
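To make the Source Attribution point above a bit more concrete, here is a minimal sketch of the kind of structured markup a publisher can already emit today. The schema.org NewsArticle vocabulary and JSON-LD embedding are existing standards; the Python wrapper, the field values, and any hope that AI tools will actually honor the markup are illustrative assumptions only.

```python
import json

# Minimal schema.org NewsArticle metadata. The vocabulary (NewsArticle, headline,
# author, publisher, datePublished) is standard schema.org; the values are placeholders.
article_metadata = {
    "@context": "https://schema.org",
    "@type": "NewsArticle",
    "headline": "Example Headline",
    "datePublished": "2024-11-01",
    "author": {"@type": "Person", "name": "Jane Reporter"},              # hypothetical
    "publisher": {"@type": "Organization", "name": "Example Publisher"}, # hypothetical
    "isAccessibleForFree": True,
}

# Embed as JSON-LD in each article page so crawlers (and, plausibly, future AI
# retrieval pipelines) have an unambiguous record of who published what, and when.
json_ld_tag = (
    '<script type="application/ld+json">'
    + json.dumps(article_metadata, indent=2)
    + "</script>"
)
print(json_ld_tag)
```

None of this guarantees attribution, but machine-readable provenance costs little and, as noted above, is good hygiene for search regardless.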
A note on AI quality: we’re at the start of this innovation wave. Evaluating GenAI quality remains a developing field, and data curation is key. Low-quality sources harm models, and smaller, specific models may lack general utility. Synthetic data (artificially generated information) is used to train or test AI and helps with privacy, scarcity, and bias issues, and in risk-heavy AI areas. But in day-to-day content, human-generated variety and quality still matter.
Semi-Hidden Benefits of Being a Publisher vs. an AI
The following issues may not bear directly on publishers’ risks, and things may change, but they’re worth considering for a holistic view of the changing landscape.
- Cost: Content publishing may become a higher-margin business than GenAI as GenAI commoditizes. Ignoring fixed tech costs, let’s focus on incremental unit economics — considering content serving and search results vs. GenAI results.
Serving static pages or search results costs less than AI. A lot less. GenAI has major fixed and variable costs, notably in chips and power. While the GenAI market is forecast to reach $356B by 2030 (Statista), the investment will eventually need returns. AI’s major costs include storage, compute, and foundational model training, with fine-tuning and Retrieval Augmented Generation (RAG) adding to inferencing costs. Perhaps most important are the variable costs of advanced chips and power. The industry is still driven by startup economics focused on capability and growth right now, but ongoing operating costs are clearly already a deep concern, and even startup economics need to face reality.
It’s challenging to estimate, but the variable cost of a search result (Google, Microsoft) is likely fractions of a cent, and similar for static web pages. Inferencing costs, however, could range from $0.01 to $0.10 per query as of late 2024, depending on how many “tokens” are processed, that is, the size of the query and response. With billions of queries, costs add up quickly. (See Decoding the True Cost of Generative AI for Your Enterprise and Cost of AI: Inference, and this most excellent article: The Inference Cost Of Search Disruption – Large Language Model Cost Analysis.) This being said, AI companies up and down the stack are focusing on functional economics, and costs are expected to drop over time. A quick back-of-the-envelope comparison follows below.
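To make the gap concrete, here is a back-of-the-envelope comparison in Python. The per-query costs come from the rough ranges cited above, and the daily query volume is an arbitrary assumption; the point is the multiple between the two serving models, not any absolute figure.

```python
# Back-of-the-envelope unit economics, using the ranges mentioned above.
# All figures are illustrative assumptions, not measurements.

QUERIES_PER_DAY = 100_000_000        # hypothetical daily query volume

cost_per_search_result = 0.002       # "fractions of a cent" per classic search result
cost_per_genai_answer_low = 0.01     # low end of the late-2024 inference estimate
cost_per_genai_answer_high = 0.10    # high end of the late-2024 inference estimate

def daily_cost(per_query_cost: float) -> float:
    """Total variable serving cost per day at the assumed volume."""
    return per_query_cost * QUERIES_PER_DAY

print(f"Search-style serving:  ${daily_cost(cost_per_search_result):,.0f}/day")
print(f"GenAI answers (low):   ${daily_cost(cost_per_genai_answer_low):,.0f}/day")
print(f"GenAI answers (high):  ${daily_cost(cost_per_genai_answer_high):,.0f}/day")
# Even at the low end, a GenAI answer costs roughly 5x a classic result here;
# at the high end, roughly 50x.
```

Tweak the assumptions however you like; the ratio survives most reasonable inputs, which is why serving costs matter so much to anyone putting GenAI in front of search-scale traffic.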
Where Humans Still Do Better (But AI is catching up)
- Multimodal Challenges – For product recommendations, AI may struggle to integrate text sources, numeric data, and multimedia. It may also struggle with temporal consistency, that is, understanding sequences over time. And there can be interpretation problems across modalities, such as having abundant text but limited video on a topic. These issues are known and under active improvement, so GenAI may close these gaps over time. (Generally. The voracious energy needs of AI remain a question mark in terms of cost and availability. If you thought blockchain was a Grid Gobbler, just wait for AI to keep stringing servers together.)
What Publishers Might Need
- Develop Verified Content Initiatives: Publishers may need a verification system for brands, authors, or human-sourced content. Blockchain identity tools could help, despite potential costs. This would require a sophisticated industry consortium. Like ISBN for books or GS1 for products, a content classification, provenance, and verification system would be complex, but it may become essential. A toy sketch of the core idea follows below.
Truth today seems to be quite the moving target. Whatever your definition or philosophical perspective, the source matters. Anonymous speech has always had its value, of course. But for the purposes of judging credibility in an era of misinformation (however you want to define that), source attribution could become a premium asset, distinguished from GenAI-created content.
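As a thought experiment on what a verified-content system could look like mechanically, here is a toy Python sketch: the publisher fingerprints each article and keeps an authenticated provenance record that can later be checked against any copy of the text. Everything here (the publisher ID, the key, and the use of a simple HMAC rather than a consortium-grade signature or blockchain scheme) is a simplifying assumption for illustration, not a proposed standard.

```python
import hashlib
import hmac
from datetime import datetime, timezone

PUBLISHER_ID = "example-publisher"            # hypothetical
SIGNING_KEY = b"replace-with-a-real-secret"   # symmetric key, toy example only

def register_content(body: str) -> dict:
    """Produce a provenance record for one piece of content."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    tag = hmac.new(SIGNING_KEY, digest.encode("utf-8"), hashlib.sha256).hexdigest()
    return {
        "publisher": PUBLISHER_ID,
        "sha256": digest,
        "registered_at": datetime.now(timezone.utc).isoformat(),
        "auth_tag": tag,
    }

def verify_content(body: str, record: dict) -> bool:
    """Check that the text matches the record and the record wasn't forged."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    expected = hmac.new(SIGNING_KEY, digest.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest == record["sha256"] and hmac.compare_digest(expected, record["auth_tag"])

record = register_content("Original article text...")
print(verify_content("Original article text...", record))    # True
print(verify_content("A tampered copy of the text", record))  # False
```

A real initiative would also need shared registries, key management, and agreement on what counts as “human-sourced,” which is exactly why an industry consortium would be required.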
Publishers Can Fight
The question is, ‘fight for what?’ No one wants to be left out, as that’s akin to ‘opting out’ of Google. Most fights will center on attribution and revenue sharing.
This is challenging due to a tangled web of legal issues. Is GenAI output a derivative work? And if so, is it mostly Fair Use? If not, what are the damages and remedies? It’s complex. When it comes to derivative work, it shouldn’t even be a question; theoretically, it’s 100% derivative work. Whereas search engines merely aggregate and link to indexed content, Generative AI is, by definition, generated from the source materials of others. But is it? Really? There’s no direct copying. Legally, courts have already rejected expansive arguments against OpenAI and Meta because there was no proof of substantial similarity, though other claims may be allowed to proceed. Cases continue against Anthropic, OpenAI, Meta, Midjourney, Stability, Runway, Udio, Suno, and others, brought by news outlets, image suppliers, authors, creative artists, and record labels.
Bottom Line
Publishers once again face a new digital frontier, one that demands adaptability and foresight. For centuries, publishers made money via newsstand sales, subscriptions, and basic advertising. Centuries. Yes, radio and television and other media took some time away from them as they proliferated in the early 20th century. But these too had fairly simple revenue models. Along came the 1990s. All of a sudden, publishers were hit with major sea changes: declines in print circulation due to online content, eventually self-publishing and social media content, shifts to lower-value digital advertising, the costs of managing digital, and so on. And now, Generative AI. Those in the industry with a dedication and long commitment to journalism were once described as having “Ink in their Veins.” I’m not sure what they’ll need now. But grit and resilience will need to be part of the formula.