Case studies: why AI-generated versions underperform on LinkedIn

AI Content | By the SocialNexis Editorial Team | June 2026 | 10 min read

Around 54% of long-form LinkedIn posts are now AI-generated, and majority status has not helped them. It has made the feed hostile to them. AI-written case studies posted without heavy human editing run 45% below their human counterparts on engagement, and the collapse is faster than the benchmarks admit.

[Figure: Marketing and Branding, human vs AI LinkedIn post engagement (avg likes + comments). Human-written posts: 771. AI-generated posts: 207.]

AI-Generated Case Study LinkedIn Performance: What the 2025 Data Confirms

The short version

AI-generated case studies on LinkedIn receive an average of 45% less engagement than human-written ones, based on analysis of 3,368 long-form posts. LinkedIn's 360Brew algorithm penalizes generic, template-style content through authenticity and dwell-time scoring, and B2B buyers detect AI writing at the proof stage, which directly reduces purchase trust.

Start with the number every AI case study workflow should be measured against. AI-generated posts received 45% less engagement than human-written ones. That figure comes from an analysis of 3,368 long-form posts pulled from 99 influential profiles across 11 industries between January and November 2025, with engagement defined as likes plus comments. The AI versions collected a little over half of what the human versions did. This is the baseline. If a workflow cannot beat it, the workflow is the problem.

The saturation context makes the gap worse, not better. By late 2025, 53.7% of long-form LinkedIn posts were classified as likely AI-generated. Human-authored case studies are now the minority in the feed. That inversion is the surprising part. When most of the content around you is machine-produced, a post with real texture stops blending in. It stands out to readers who have been trained by months of scrolling, and it stands out to the algorithm scoring for the same signals readers respond to.

Look at the vertical where most B2B case studies actually live. In Marketing and Branding, human posts averaged 771 likes and comments per post against 207 for AI posts. That is a 273% gap: human posts drew roughly 3.7 times the engagement of AI posts. It is also the category most directly tied to thought leadership and proof-stage credibility. This is not a fringe result from an obscure niche. It is the worst-performing category for unedited AI output, and it is the one B2B teams publish into most.
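A gap like this can be stated three ways (a ratio, a "percent more", a "percent less"), and mixing them up is an easy way to misread a benchmark. The sketch below is a minimal illustration using only the two rounded averages quoted above; the function name and dictionary keys are made up for the example. Because the inputs are rounded, the result lands near the published 273% rather than exactly on it.

```python
def engagement_gap(human_avg: float, ai_avg: float) -> dict:
    """Express the same two averages as a ratio, a 'percent more', and a 'percent less'."""
    ratio = human_avg / ai_avg
    return {
        # How many times the AI average fits into the human average.
        "human_to_ai_ratio": ratio,
        # How much more engagement human posts drew, relative to AI posts.
        "human_pct_more": (ratio - 1) * 100,
        # How much less engagement AI posts drew, relative to human posts.
        "ai_pct_less": (1 - ai_avg / human_avg) * 100,
    }


# Rounded averages quoted in the article for Marketing and Branding.
gap = engagement_gap(771, 207)
for name, value in gap.items():
    print(f"{name}: {value:.1f}")
```

Note that "percent more" and "percent less" describe the same pair of numbers from opposite baselines, so they are never interchangeable.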

What the published benchmarks miss is the shape of the decline. In SocialNexis account data, accounts that introduce AI-generated case studies without a quality review checkpoint do not drift downward gradually. They drop. We see reach fall 40 to 60% within 72 hours, and the suppression carries forward to the next two or three posts even after the account returns to human-reviewed content. The 360Brew model appears to keep a recency-weighted author quality score. A single unreviewed AI post does not sink it. Two or three consecutive ones do.

That detail changes how you should think about risk. The danger is not one bad post that underperforms and gets forgotten. The danger is a short run of unreviewed AI case studies during a busy week that drags down the account's distribution floor for the posts that follow. The penalty outlives the post that triggered it.
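To make the idea of a recency-weighted score concrete, here is a toy model. It is a sketch under loud assumptions: the decay factor, the suppression threshold, and the per-post quality values are all hypothetical, chosen only so the behavior is visible, and nothing here reflects how LinkedIn actually computes anything.

```python
def author_score(post_qualities, decay=0.7):
    """Exponentially weighted average of per-post quality.

    post_qualities is ordered oldest -> newest, each value in [0, 1].
    The newest post gets weight 1, the one before it `decay`, then decay**2, ...
    """
    weighted = 0.0
    total_weight = 0.0
    for age, quality in enumerate(reversed(post_qualities)):
        weight = decay ** age
        weighted += weight * quality
        total_weight += weight
    return weighted / total_weight


THRESHOLD = 0.55      # hypothetical suppression cut-off
GOOD, BAD = 0.8, 0.2  # hypothetical quality of a reviewed vs unreviewed post

baseline = [GOOD] * 6
one_bad = baseline + [BAD]
three_bad = baseline + [BAD] * 3
first_good_after = three_bad + [GOOD]
second_good_after = three_bad + [GOOD, GOOD]

for label, history in [
    ("baseline", baseline),
    ("one bad post", one_bad),
    ("three bad posts", three_bad),
    ("first good post after", first_good_after),
    ("second good post after", second_good_after),
]:
    score = author_score(history)
    state = "ok" if score >= THRESHOLD else "suppressed"
    print(f"{label}: {score:.2f} ({state})")
```

With these made-up parameters, one weak post stays above the cut-off, a run of three falls below it, and the first good post afterwards is still below it. That mirrors the pattern described above, although the real weights and thresholds are unknown.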

Does LinkedIn's 360Brew Algorithm Know Your Case Study Was AI-Written?

It does not need to know in the way a detector flags a single sentence. LinkedIn's 360Brew is a 150-billion-parameter LLM-based recommendation model, deployed across 40 to 100% of LinkedIn surfaces by fall 2025. It scores posts on topic authority, authenticity, and engagement quality, and it actively suppresses generic, template-style content. There is no single AI tell it hunts for. It weighs the combination of post structure, engagement velocity, and account-level behavioral history, and generic case studies lose on several of those axes at once.

The filter sitting in front of distribution has also tightened. LinkedIn's quality filter now rejects over 50% of all posts before they reach any audience, up from roughly 40% in 2024. AI-generated case studies are disproportionately exposed to that filter because they share structural patterns with millions of other posts and score low on topic authority when the account's history does not establish recognized expertise in that content cluster. The post never reaches a reader, so the author reads it as a dead post rather than a rejected one.

There is a second penalty most people never diagnose correctly, and it has nothing to do with the words. LinkedIn's behavioral spam classifiers evaluate account-level patterns: sentence-length distribution, paragraph rhythm, and posting-time jitter. They run before the content quality model evaluates the post. When an account posts AI-generated case studies at fixed intervals with invariant structure, the behavioral classifier flags the pattern first. In our data, accounts running on cloud-based schedulers with fixed posting times absorb this compound suppression. Accounts operating with human-paced scheduling intervals and local IP routing avoid it. The content can be identical. The outcome is not, because the penalty is on the account behavior, not the post.

This is the trap. The account owner sees a low-reach case study and concludes the content was bad, so they rewrite the content. But the suppression came from the posting signature, and a better-written post on the same fixed cadence inherits the same penalty.
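One way to check your own posting signature is to measure the two things named above: how uniform your sentence lengths are and how much your posting time varies. The sketch below is a self-audit aid, not a reproduction of any LinkedIn classifier. The function names and sample data are hypothetical, and no threshold is implied because none has been published.

```python
import re
from datetime import datetime
from statistics import mean, pstdev


def sentence_lengths(text):
    """Word count per sentence, using a naive split on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def sentence_length_cv(text):
    """Coefficient of variation of sentence length; 0 means every sentence is the same length."""
    lengths = sentence_lengths(text)
    return pstdev(lengths) / mean(lengths)


def posting_time_jitter_minutes(timestamps):
    """Standard deviation of time-of-day across posts, in minutes.

    Ignores wrap-around at midnight, which is fine for daytime posting.
    """
    minutes = [t.hour * 60 + t.minute for t in timestamps]
    return pstdev(minutes)


# Hypothetical samples for illustration only.
uniform_text = "We cut churn by half. We grew seats by ten. We won the deal fast."
varied_text = (
    "The budget froze in March. Nobody told us until the vendor called, "
    "and by then the rollout plan was two weeks stale. We paused."
)
fixed_schedule = [datetime(2025, 11, day, 9, 0) for day in (3, 4, 5, 6)]
human_schedule = [
    datetime(2025, 11, 3, 8, 42),
    datetime(2025, 11, 4, 9, 17),
    datetime(2025, 11, 5, 11, 5),
    datetime(2025, 11, 6, 8, 58),
]

print("sentence CV:", sentence_length_cv(uniform_text), sentence_length_cv(varied_text))
print(
    "time jitter (min):",
    posting_time_jitter_minutes(fixed_schedule),
    posting_time_jitter_minutes(human_schedule),
)
```

Values at or near zero on both measures describe the invariant pattern the section warns about. Compare your own accounts against each other rather than against a fixed number.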

LinkedIn has reported 94% accuracy in detecting AI-generated content as part of its active filtering program. Take that figure at face value and the implication is direct: most fully AI-generated case studies face distribution suppression regardless of topic relevance or how the keywords are tuned. You are not optimizing your way around a 94% detector with better phrasing. The lever is the workflow, not the prompt.

Rather not do this by hand? SocialNexis drafts posts and comments in your own voice and schedules them across LinkedIn and X.

B2B Buyers Catch AI-Written Case Studies Before Your LinkedIn Engagement Metrics Do

The algorithm is not the only judge, and it may not be the harshest one. Research found that 46% of people trust a brand less upon learning AI produced content they believed was human-written. The same work showed that explicit disclosure of AI authorship does not recover that trust, and in some tests it made perception worse. The trust deficit does not live in the disclosure. It lives in the gap between what the reader expected and what was produced. A case study is a trust document by definition, which is the worst possible place for that gap to open.

B2B buyers are good at this. They identify AI-written case studies through formulaic language, over-polished structure, and the absence of real-world friction: the budget that almost killed the project, the stakeholder who blocked it, the rollback decision nobody wanted to make. Detection happens at the proof stage, which is the exact moment the content is supposed to close consideration. A buyer who senses a fabricated-feeling case study does not just discount that post. They discount the claim it was supporting.

The audience mismatch shows up in the comment section before it shows up in any aggregate metric. AI-written case studies consistently attract lower-seniority commenters: individual contributors and students rather than directors and VPs. The reason is mechanical. Generic problem framing resonates with people who have encountered the topic conceptually, not operationally. SocialNexis users who track commenter job titles against their ICP definitions see this pattern within 48 hours of posting, well before the post's total engagement settles into a number worth reacting to.

Human-written case studies that carry specific friction details pull the opposite audience. They draw comments from people who have lived the same scenario, and those people are the decision-makers the content was built to reach. The difference is not the topic. Two posts can describe the same project. The difference is the texture of how the problem is described, and texture is exactly what unedited AI strips out in favor of a clean, resolved narrative.

This is why comment quality is a leading indicator and total engagement is a lagging one. By the time the like count tells you a case study underperformed, the wrong audience has already self-selected into the comments, and the algorithm has already read the seniority and substance of that audience as a signal about the post.

What the Hybrid AI Workflow Gets Wrong About LinkedIn Case Study Voice

Hybrid is the right answer, and most people still get it wrong. Buffer's analysis of 1.2 million posts found that AI-assisted content, meaning AI used as a drafting aid with genuine human editing, reached near-parity with non-AI posts: 6.85% versus 6.22% engagement rate. The performance-safe path is a hybrid workflow with real editing, not full AI replacement. Read that comparison carefully. The gap in the data is between full AI replacement and human-edited AI output, not between AI-assisted and purely human writing. AI in the loop is fine. AI without the editing is the failure.

The editing has to be heavy to count. An analysis of 500 AI-generated posts found that posts scoring 8 to 10 on a 10-point AI-polish scale averaged 0.4% engagement rate, while posts scoring 1 to 3 averaged 2.1%, more than 5x higher. The differentiating factor was the amount of post-generation editing, not whether AI was used. Counterintuitively, more polish was worse. The over-clean version reads as machine output. The less polished version keeps the fingerprints of the person who wrote it. The editing is the work, and skipping it is where most hybrid workflows quietly fail.

The most common hybrid failure mode is not the use of AI at all. It is failing to voice-match the output. We see practitioners write the hook and CTA by hand and let AI generate the case study body. That seems safe. It is not. The tonal register shifts at the transition point between the human opening and the AI middle, and that shift creates a dwell-time cliff. Readers drop off mid-post because the voice changes under them. The cruel part is that the hook still collects reactions from passive scrollers who stopped reading early, so the like count looks fine while the post is bleeding attention exactly where the proof was supposed to land.

Because the like count hides it, the only place this shows up is in comment-to-impression ratios and in commenter seniority. The metrics that average well will tell you nothing. The metrics that measure depth will tell you everything.
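The comment-to-impression ratio is simple to compute from whatever export your analytics tool provides. The sketch below assumes a hypothetical `PostStats` record and invented numbers. It shows how two posts with near-identical like counts can differ sharply on the depth metric.

```python
from dataclasses import dataclass


@dataclass
class PostStats:
    """Hypothetical per-post record; field names are assumptions, not a LinkedIn export schema."""
    impressions: int
    likes: int
    comments: int


def comment_to_impression_ratio(post: PostStats) -> float:
    """Comments per impression; returns 0.0 for a post that was never shown."""
    if post.impressions == 0:
        return 0.0
    return post.comments / post.impressions


# Invented numbers: similar likes, very different depth.
hand_hook_ai_body = PostStats(impressions=10_000, likes=180, comments=4)
voice_matched = PostStats(impressions=10_000, likes=170, comments=35)

for label, post in [("hand hook + AI body", hand_hook_ai_body), ("voice-matched", voice_matched)]:
    ratio = comment_to_impression_ratio(post)
    print(f"{label}: likes={post.likes}, comments per 1,000 impressions={ratio * 1000:.1f}")
```

Reporting the ratio per 1,000 impressions keeps the numbers readable when comments are rare.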

Accounts that train the AI on the author's prior 20 to 30 posts before generating case study copy do not exhibit this pattern. The lexical fingerprint in the body matches the opening, the register holds, and the cliff disappears. Voice-matching before scheduling is the step that decides whether a hybrid approach performs like human content or like AI content. It is not a finishing touch. It is the load-bearing part of the process.

Dwell Time, Not the Like Count, Decides AI Case Study LinkedIn Reach

LinkedIn has documented this signal directly, so it is not speculation. Dwell time is an officially documented ranking input. Posts that hold reader attention for 61 or more seconds achieve a 15.6% engagement rate. Posts viewed for under 3 seconds average 1.2%. That spread is enormous, and it reframes the whole problem. Narrative depth is not a stylistic preference a copywriter argues for. It is the mechanism that earns distribution, because the platform measures how long the post held you and feeds that measurement back into who else sees it.

This is most of why generic AI case studies bleed reach. Generic AI copy-paste content sees roughly 30% less reach and 55% less engagement than authentic posts. A large share of that gap traces to dwell time. Thin narrative, absent specifics, and homogenized structure cause readers to scroll past before the algorithm ever registers the post as worth pushing further. The post is not penalized for being AI in the abstract. It is penalized for being skippable, and skippable is what unedited AI reliably produces.

Structure compounds the effect. Posts built around a personal story or a lesson learned generate 38% more engagement than promotional posts, and "How I" framing generates 3x more saves than listicles. AI-templated case studies systematically fail to produce either signal because they default to outcome descriptions. The model writes "We improved retention by 40%" because that is the clean, summarizable shape. But the outcome is not the tension. The decision and the friction are the tension, and tension is what holds a reader past the first scroll into 61-second territory.

Format is the last lever, and it favors human authorship. Native document posts currently average the highest engagement rate for case study-length content at 7.00%, a 14% year-over-year increase. Document posts reward the level of specific detail and narrative structure that is hard to fake. This is the format where the human-authorship advantage is most measurable and most repeatable, because the format itself demands the depth that AI output tends to skip.

The practical takeaway is to stop optimizing for the like and start optimizing for the second. If a case study does not earn dwell time, the like count is a vanity number attached to a post the algorithm has already decided not to distribute.

Comment Quality Reveals the Audience Mismatch AI Case Studies Create

The most visible output of AI case study underperformance is not a low like count. It is a low comment-to-impression ratio paired with low-seniority commenters. AI-written case studies draw individual contributors and students instead of directors and VPs, because generic problem framing has broad conceptual recognition but little operational resonance. The trap is that total engagement can look acceptable while the ICP-fit of the audience is poor. You can hit your numbers and reach the wrong people, and the dashboard will congratulate you for it.

Reposting an underperforming AI case study makes this worse, not better. SocialNexis account data shows the 360Brew model carries negative engagement-velocity signals forward. A post that earned low dwell time and few saves in its initial distribution window is distributed even more narrowly on repost, because the author's topic-authority score for that content cluster has already been downweighted. Recirculating the post does not give it a second chance. It confirms the model's verdict and accelerates the suppression instead of correcting it.

The correct recovery workflow is to rewrite the case study with first-person friction details and publish it as net-new content. The rewrite should not keep the AI-generated body and bolt on a fresh hook. That preserves the exact tonal mismatch and thin middle that suppressed it the first time. Rebuild from the operational details that were missing in the original: the constraint, the objection, the decision that almost went the other way. Those are the elements that produce dwell time and pull the right commenters, and they are the elements a fresh hook cannot supply on its own.

Track the right metric and you catch this early. Accounts that treat ICP-fit of commenters as a primary success metric, rather than raw engagement, identify the audience mismatch within 48 hours and course-correct before the suppression compounds across multiple posts in the same content cluster. The accounts that only watch likes find out weeks later, after the damage has already spread to the next several posts in the same topic area.
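Tracking ICP-fit of commenters can start as a simple share: of the people who commented, how many hold a title that matches your ICP? The sketch below is deliberately naive. The keyword list, function names, and sample titles are hypothetical, and real title matching needs more care than substring checks.

```python
# Hypothetical ICP definition: senior titles, matched by lowercase substring.
ICP_KEYWORDS = ("director", "vp", "vice president", "head of", "chief")


def is_icp_title(title: str, keywords=ICP_KEYWORDS) -> bool:
    """True if any ICP keyword appears in the commenter's job title (naive substring match)."""
    lowered = title.lower()
    return any(keyword in lowered for keyword in keywords)


def icp_fit_share(titles) -> float:
    """Fraction of commenters whose title matches the ICP; 0.0 if there are no comments."""
    if not titles:
        return 0.0
    return sum(is_icp_title(t) for t in titles) / len(titles)


# Invented commenter titles for one post.
commenter_titles = [
    "VP of Marketing",
    "Marketing Student",
    "Director, Demand Gen",
    "Content Intern",
]
print(f"ICP-fit share: {icp_fit_share(commenter_titles):.0%}")
```

Checked at the 48-hour mark, a share like this gives an early read on audience fit before aggregate engagement settles.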

Read the comments, not the count. The comment section is the earliest honest signal you have about whether a case study reached the people it was written for.

How to Write LinkedIn Case Studies That Earn Dwell Time and Buyer Trust

The differentiating variable is specificity, not polish. Name the budget constraint. Name the internal objection that almost killed the project. Name the metric that moved and by how much. Specifics are what create dwell time, and dwell time is what distributes the post. A case study without friction reads as marketing copy regardless of who or what wrote it, and the algorithm and the buyer both read marketing copy the same way: skip it. The instinct to sand a story smooth is the instinct that kills it. Resolved, frictionless narratives are precisely what scores high on the AI-polish scale and low on engagement.

For hybrid workflows, run voice-matching before scheduling, not after. Train the AI draft on the author's prior 20 to 30 posts so the lexical fingerprint in the body matches the hook and the CTA. A tonal shift at the transition point is detectable to readers and creates the dwell-time cliff that suppresses distribution. This step is not optional. It is what separates hybrid content that performs like human writing from hybrid content that performs like AI output, and it is the single step most teams skip because the metrics that would expose its absence are not the ones they watch.

Choose the framing the platform rewards. Use "How I" framing over outcome summaries. "We improved retention by 40%" is promotional. "How we almost shipped the wrong fix, caught it at rollout, and what the post-mortem changed" is a story. The second version holds attention past 61 seconds. The first does not. Where the evidence supports it, use the document post format. At a 7.00% average engagement rate it is the highest-performing format for case study content, and it rewards exactly the detail level that human authorship produces credibly.

Watch the posting pattern as carefully as the prose. Avoid high-frequency publishing of AI-generated case studies at fixed schedule intervals. LinkedIn's behavioral classifiers flag posting patterns with invariant sentence-length distribution and fixed posting-time windows before the content quality model even evaluates the post, which means a good case study can be suppressed for the schedule it arrived on. Vary the structure and the timing.

After publishing, measure depth instead of volume. Track comment-to-impression ratio and commenter seniority rather than aggregate engagement. Those are the signals that show whether the case study reached its intended audience or simply collected reactions from passive scrollers. Get those two right and the reach tends to follow, because they are the same signals the algorithm is reading on its side of the glass.

Frequently asked questions

Why do AI-generated case studies get less engagement on LinkedIn than human-written ones?

AI-generated case studies underperform because they lack the friction, specifics, and tonal authenticity that hold reader attention long enough to register as quality content. LinkedIn's 360Brew algorithm scores posts on dwell time and engagement quality. Posts held for 61 or more seconds average 15.6% engagement rate versus 1.2% for posts read in under 3 seconds. AI-templated case studies rarely hold attention past the opening because they default to generic outcome descriptions rather than the lived narrative that produces dwell time.

How does LinkedIn's 360Brew algorithm detect and penalize AI-generated content?

360Brew is a 150-billion-parameter LLM that scores posts on topic authority, authenticity, and engagement quality. It evaluates the combination of post structure, dwell time, engagement velocity, and account-level behavioral patterns rather than a single AI tell. Generic, template-style content scores low on authenticity. LinkedIn has also reported 94% accuracy in detecting AI-generated content through a separate filtering program, meaning most unedited AI case studies face distribution suppression before reaching their intended audience.

Do B2B buyers trust AI-written case studies on LinkedIn?

Research shows 46% of people trust a brand less upon learning AI produced content they assumed was human-written, and explicit disclosure does not recover that trust. B2B buyers are skilled at identifying AI case studies through formulaic language, over-polished structure, and the absence of real-world friction. Detection at the proof stage directly undermines purchase confidence at the moment the case study is most critical to the buyer's decision process.

What is the engagement rate difference between AI and human LinkedIn posts in 2025?

Analysis of 3,368 long-form posts from 99 influential LinkedIn profiles found that likely-AI-generated posts received 45% less engagement (likes plus comments) than likely-human-written ones. In Marketing and Branding, the gap reaches 273%: human posts averaged 771 likes and comments versus 207 for AI posts. A separate analysis of 500 AI-generated posts found posts scoring 8-10 on an AI-polish scale averaged 0.4% engagement rate versus 2.1% for posts scoring 1-3.

Which LinkedIn industries see the biggest engagement gap between AI and human content?

Marketing and Branding shows the largest documented gap, with human posts averaging 771 likes and comments versus 207 for AI posts, a 273% difference. This is also the vertical most directly tied to B2B case study content and professional credibility on LinkedIn. The gap is widest in categories where authenticity is central to perceived expertise, because generic AI framing is most conspicuous where readers expect demonstrated operational experience rather than summarized best practices.

Does using AI to write LinkedIn posts hurt your personal brand credibility?

It depends on how AI is used. Buffer's analysis of 1.2 million posts found that AI-assisted content with genuine human editing achieved near-parity with non-AI posts: 6.85% versus 6.22% engagement rate. The credibility risk comes from unedited AI output or hybrid workflows where the AI-generated body does not match the author's established voice. That tonal mismatch is detectable to readers, creates a dwell-time cliff mid-post, and compounds into a lower topic-authority score over time.

What makes a LinkedIn case study post perform well algorithmically?

Three factors matter most: dwell time (how long readers stay), engagement quality (saves and substantive comments rather than passive reactions), and topic authority (whether the account is recognized as credible in that content cluster). Human-written case studies with specific friction details and personal framing generate the narrative depth that produces 61-plus second dwell times. Document posts currently average the highest engagement rate at 7.00%, rewarding the detail level that well-constructed case studies require.

How can I use AI to help write LinkedIn case studies without triggering quality penalties?

Use AI as a drafting aid, not a replacement. Train the AI on your prior posts before generating case study copy so the output matches your lexical fingerprint. After generation, edit heavily: add the budget constraint that nearly killed the project, the specific objection you had to overcome, the exact metric that moved. Posts scoring 1-3 on a 10-point AI-polish scale average 2.1% engagement rate versus 0.4% for posts scoring 8-10. The editing is the work, not the generation.

Why do heavily polished AI LinkedIn posts underperform compared to lightly edited ones?

Over-polished AI output reads as promotional rather than personal. It removes the friction and imperfection that signal authentic experience. LinkedIn's 360Brew model and B2B buyers both respond to the same tells: uniform sentence structure, resolved outcomes with no setbacks, and the absence of a specific named constraint or decision. Posts that retain the editing fingerprints of a real practitioner, including tonal variation and specific numbers, hold attention longer and draw comments from people who recognize the operational experience described.

What percentage of LinkedIn case studies and long-form posts are now AI-generated?

By late 2025, 53.7% of long-form LinkedIn posts (100 or more words) were classified as likely AI-generated. The climb began with a 189% surge between January and February 2023, following ChatGPT's mainstream adoption. For case study-length content, the proportion is likely higher because case studies are among the most labor-intensive formats to write from scratch. That majority share means human-authored case studies stand out to both the algorithm and to readers accustomed to AI output.
