
Niels Hoven

Watercooler moments
Because it’s so hard to measure, people tend not to think of word-of-mouth as a product feature. But it can be designed for and optimized, just like anything else. Television has been designing and engineering word of mouth virality for years. This essay is about how to do it in software.

Specifically, I want to talk about a tactic that was once prevalent in television that is now beginning to resurface in software: the watercooler moment.

Word of mouth virality is driven by watercooler moments – experiences that are so memorable that you can’t wait to talk about them with your friends at the watercooler the next day.

Famous watercooler moments

In 1980, CBS used the advertising catchphrase “Who shot J.R.?” to promote the TV series Dallas. Viewers had to wait 8 months to find out the answer. A session of Turkish Parliament was even suspended so that legislators could get home to see the answer revealed. It was the highest-rated TV episode in US history at the time, with 83 million people tuned in to discover what happened.

who shot jr magazine cover

When Ellen DeGeneres came out as gay, there was rampant speculation about whether her character on her sitcom Ellen would come out as well. And she did, in an award-winning episode in April 1997 that generated enormous publicity and a nationwide conversation. The episode was the highest-rated episode of Ellen ever, with 42 million people tuned in to see the event.

ellen says "yep, I'm gay"

During the live broadcast of Super Bowl 38’s halftime show, Janet Jackson’s chest was exposed during a dance routine with Justin Timberlake. The moment, which became the most watched moment in TiVo history, resulted in 540,000 complaints to the FCC, “Janet Jackson” becoming the most searched phrase of 2004, and the phrase “wardrobe malfunction” entering the popular lexicon.

Janet Jackson wardrobe malfunction

The fact that moments can be planned or scripted doesn’t make the emotions they create any less genuine. Watercooler moments transcend the boundaries of their medium, sparking conversations in the real world to become communal experiences.

Designing watercooler moments

People are social animals. We have an instinctual desire to tell stories. Stories help us make sense of the world, share useful information, and reinforce bonds. They are the currency of human connection.

Watercooler moments turn a one-off event into a communal experience. People retell the story, share the story, interpret the story, discuss and argue its meaning. Interesting drama involving interesting participants provides endless fodder for discussions of motivations, ethics, and morality.

So creating a compelling story is the first step in creating a watercooler moment. But since you (a software developer) presumably have no script or characters to rely on, your app itself will have to create the story on its own.

Products that generate stories

Unexpected emotions create compelling stories. The more unexpected the event, and the more extreme the emotion, the more powerful the desire is to share it.

Any extreme emotion will get people talking. But while negative ones (outrage, anger, disgust, etc) are exploited to great success by the media, they’re generally not emotions you’d like your product to generate. So for now, let’s focus on tactics that generate unexpected moments of delight.

Example: Asana monster

Asana yeti monster
Emotions don’t necessarily have to be that extreme. Case in point: the little blue yeti that occasionally pops his head up after you move a card in Asana. An unexpected moment of delight can be enough to get people talking. Unconvinced? Just search for “asana narwhal” on Twitter.

Example: Hearthstone

This is the only example I’ve included from gaming, but it’s my favorite due to both the intensity of emotion and the intentionality behind its design. To intentionally engineer watercooler moments, Hearthstone’s designers created a number of cards (such as Millhouse Manastorm, shown below) with probabilistic effects that would, on rare occasions, completely change the course of the game in a spectacular way.
millhouse manastorm hearthstone card
Dramatically snatching victory from the jaws of a punishing defeat (or vice versa) is the sort of intensely emotional experience that you can’t help talking about, no matter which side of it you were on.

Example: Zappos customer service

Zappos uses exceptional customer service to create memorable moments for their customers. Sometimes these stories are so powerful that they even make the news, like overnighting a pair of shoes to a wedding for free because the original pair was routed to the wrong location.

Example: Tinder

Some apps are fortunate enough to generate watercooler moments naturally. Tinder grew to 50 million users in 2 years through word of mouth by allowing people to get laid (or at least matched) on-demand without fear of rejection.

Example: ClassDojo

ClassDojo (shameless plug: come work with me!) has also grown entirely through organic word of mouth. It surprises and delights teachers by solving problems they previously considered intractable: creating classroom community and growing parent involvement. ClassDojo is now used in 90% of US schools.

Creating your own watercooler moments

To create watercooler moments, find opportunities to design experiences that are extremely unexpected (e.g. an albino giraffe), extremely delightful (e.g. flying first class), or both (e.g. a surprise party).

Obviously the best case scenario is that your core use case massively exceeds users’ expectations to the point where they can’t stop talking about it. (Think Napster in 1999.) Another great scenario is if your core use case is a series of unexpectedly delightful moments delivering a variable reward stream of dopamine hits directly to the brain. (Think Tinder, which is basically a slot machine that pays out sex.)

But for those of us not fortunate enough to be working on products whose core use cases tap directly into the brain’s pleasure centers, here are some tactics that might help:

Tactics for creating unexpectedness

  • Probability
  • User behaviors
  • Real world events

The simplest way to introduce unexpectedness into your product is to add some kind of probabilistic event. The celebration monsters in Asana, for example, don’t appear every time a card is moved. If they did, they would be expected and therefore boring and unworthy of comment.

Slack uses randomness to great effect with its randomized loading messages. It’s little touches like these lighthearted random messages that let Slack inject personality and delight into a corporate productivity tool.
Slack loading message
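To make the mechanics concrete, here’s a minimal sketch in Python. The names, probabilities, and messages are all hypothetical (this is not Asana’s or Slack’s actual code); the point is how little machinery it takes to add a rare celebration or a randomized loading message:

```python
import random
from typing import Optional

# Keep the probability low: if the celebration fired every time,
# it would be expected, and therefore boring. (Hypothetical value.)
CELEBRATION_PROBABILITY = 0.05
CELEBRATION_CREATURES = ["yeti", "narwhal", "unicorn", "phoenix"]

# Hypothetical lighthearted loading messages.
LOADING_MESSAGES = [
    "Reticulating splines...",
    "Warming up the hamsters...",
    "Counting backwards from infinity...",
]

def maybe_celebrate() -> Optional[str]:
    """Occasionally return a creature to show after a completed action.

    Returning None is the common case -- the rarity is what makes the
    moment worth mentioning at the watercooler.
    """
    if random.random() < CELEBRATION_PROBABILITY:
        return random.choice(CELEBRATION_CREATURES)
    return None

def loading_message() -> str:
    """Pick a random message to show while the app is loading."""
    return random.choice(LOADING_MESSAGES)
```

Tuning the probability is the interesting part: frequent enough to be discovered, rare enough to stay surprising.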
Another option is using user behaviors, particularly ones outside of core usage. Could something interesting happen if a user accidentally swipes instead of taps? Maybe some parts of the UI that don’t look interactable are actually responsive. Or maybe there are some easter eggs for your users to discover.

Real world events are also good opportunities to deliver unexpected experiences. This is becoming common enough that it doesn’t have the impact that it used to, but it still gets users talking to see snow collecting on the UI during the holidays, or rainbow trails during Pride, or pumpkins on Halloween.

Tactics for creating delight

  • Next level visual polish
    • Animation
    • Particle effects
  • Characters
  • Personal messages from us, or for you
  • Celebrate a real user accomplishment

A classic way to create delight is through UX and visual polish. While a baseline level of polish and usability is expected in any app these days, taking your polish to a level above and beyond is a great opportunity to create delight.

Fabulous is one recent app that made me feel that sense of delight. Its clean yet whimsical UI was so enjoyable to use that even my non-designer friends couldn’t stop talking about it.

There are countless tactics beyond visual polish to “juice” up the delightfulness of an experience, but animations, particle effects, and cute characters are always safe bets.

Personalization is another great way to surprise and delight a user. In a world where we’re used to being on the receiving end of impersonal corporate emails, a message stands out when it is clearly written to me personally, with empathy and understanding for my unique situation. Alternatively, you can get personal on the sender side: open up about yourself and send an authentic email from you, not just from a faceless company.

Finally, recognizing your users’ accomplishments is a great way to delight them. If a user does something exciting in your app, help them celebrate! Maybe they just made their first post, maybe they returned to your app after a month away, maybe they discovered emojis for the first time.
Foursquare mayor popup

Realize that many actions that seem mundane to you still feel like big accomplishments to your users, so help them celebrate! Pop a congratulatory message, shower them with confetti, send them a certificate of accomplishment, or something else creative.

In summary

To grow word of mouth, delight users in surprising ways. Find opportunities to increase delight or increase surprise until people can’t stop talking about you.

Buckets with eggs

Here’s a familiar experience: You’re trying to improve retention so you run a series of experiments. You end up releasing the same control experience to several cohorts, with dramatically different results each time. Your sample size was large, your source of users hasn’t changed, and the tests were close enough together that there shouldn’t be any seasonality effects. What’s going on?

It turns out that there’s a nuance in retention calculations that trips a lot of people up. Let’s call it “Bad Bucketing”, and even some analytics companies are getting it wrong.

Wait, isn’t retention just a standard calculation?

While most metrics have a straightforward intuitive explanation, if you’ve ever rolled your own analytics and done the actual calculations, you’ll quickly realize that even basic metrics require you to make numerous decisions.

(For example, for retention: Are we looking at all events, or only session-start events? Or for conversion: Are we calculating it as a percentage of our active users? Or only the ones who opened the app? Or only the ones that viewed the sales page?)

Frequently the right answer to these decisions is obvious. And sometimes the answer doesn’t really matter that much. But sometimes, the right answer is non-obvious and also REALLY matters. Calculating retention is one of those times.

Calculating retention

As a concept, retention is pretty intuitive. It answers the question of “Do people like my app enough to keep coming back to it?” Retention measures the percentage of users that come back to an app on a specified time scale: usually daily, weekly, or monthly. 1-day retention is frequently described as “What percentage of today’s users come back tomorrow?”, and 1-month retention as “What percentage of this month’s users come back next month?”

Conan, what is best in life? Retention

Retention is one of the most fundamental product metrics. It’s a proxy for product market fit, user lifetimes, and everything that is good. It is arguably the most critical metric to track for any product, but the most common and intuitive way of calculating retention has serious flaws, regardless of sample size.

The Bucket Blunder

The most intuitive way of calculating retention considers each day as a separate bucket. Count the number of new users in today’s bucket – that’s the cohort for today. Then calculate what percent return tomorrow. That percentage is your 1-day retention. This calculation is simple, intuitive… and wrong.

Treating all the users in your bucket the same glosses over the fact that users who show up earlier in the day have to stay engaged for a longer period of time in order to count as retained, as compared to users who show up late in the day.

regular retention

Basic retention: Both green and purple arrive on Day 0 and return on Day 1

A more reliable way to calculate retention is to consider each user’s install time individually. A user counts as retained for one day if they show up between 24 and 48 hours after their initial install. In other words, instead of asking “What percentage of today’s users come back tomorrow?”, ask “What percent of people who install today come back 24 hours later?”

Rolling retention windows: Green user has retained for 1 day after install. Purple user has not returned 1 day after initial install.

How serious is this problem, really?

I’ve seen retention measurements literally double because the user acquisition (UA) bursts happened to hit just right. This results in false celebration now, followed by a wild goose chase when the next test inevitably comes back far lower.

Consider two users: Early Ellie installs at 12:01 am on June 1, and Late Larry installs at 11:59 pm on June 1. Ellie has to engage 24 hours after install to count as retained for 1 day. Larry only has to return 2 minutes later. As a result, installs later in the day will show much higher retention numbers.

The size of this effect further depends on how you’ve defined “being active”. Does a user count as being active on June 2nd if we see any activity from him? Or are we only looking at session start events? If taking any action inside our app qualifies Larry as an active user (a reasonable assumption), then a single 2-minute first session, from 11:59pm to 12:01am, is enough for our system to say he’s been retained for 1 day.
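To make the difference concrete, here’s a minimal sketch in Python. The data structures and function names are hypothetical (this isn’t any particular analytics provider’s implementation), but it shows how calendar-day bucketing and rolling 24-hour windows disagree on exactly this scenario:

```python
from datetime import datetime, timedelta

def bucketed_d1_retention(installs, events):
    """Naive calendar-day bucketing: % of users who install on day D
    and have any event on day D+1. (The intuitive-but-wrong approach.)"""
    returned = 0
    for user, install_time in installs.items():
        next_day = install_time.date() + timedelta(days=1)
        if any(e.date() == next_day for e in events.get(user, [])):
            returned += 1
    return returned / len(installs)

def rolling_d1_retention(installs, events):
    """Rolling windows: % of users with any event 24-48 hours after
    their individual install time."""
    returned = 0
    for user, install_time in installs.items():
        window_start = install_time + timedelta(hours=24)
        window_end = install_time + timedelta(hours=48)
        if any(window_start <= e < window_end for e in events.get(user, [])):
            returned += 1
    return returned / len(installs)

# Early Ellie installs at 12:01am on June 1, Late Larry at 11:59pm on June 1.
installs = {
    "early_ellie": datetime(2018, 6, 1, 0, 1),
    "late_larry": datetime(2018, 6, 1, 23, 59),
}
events = {
    # Ellie never comes back; Larry's only session runs two minutes past midnight.
    "late_larry": [datetime(2018, 6, 2, 0, 1)],
}

print(bucketed_d1_retention(installs, events))  # 0.5 -- Larry counts after a 2-minute session
print(rolling_d1_retention(installs, events))   # 0.0 -- neither user returned a day later
```

Larry’s two-minute session straddling midnight counts as a retained day under bucketing, while the rolling calculation correctly reports that nobody came back a day later.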

How analytics companies are calculating retention

If you don’t feel up for the challenge of rolling your own analytics, one of the benefits of an off-the-shelf solution should be that you don’t have to worry about any of this. Unfortunately, that’s not the case, because all the top analytics providers calculate retention differently.

Consider how incredible it is that after at least a decade of retention being widely recognized as the single most important product metric, there’s still no standardized way to calculate it and each of the top analytics-as-a-service providers is just using their own judgment.

Mixpanel's retention calculation

Mixpanel: YES!

Mixpanel calculates retention correctly. Hooray Mixpanel!

Flurry's retention calculation

Flurry: Intuitive, but wrong

Flurry calls this metric “return rate” and calculates it the intuitive-but-wrong way, but even that is an improvement over their awful retention calculation.

Amplitude retention calculation

Amplitude: Rounded to the nearest hour? Fine, I can live with that

Amplitude changed their calculation recently and now calculates retention correctly for dates after August 18, 2015. They do round to the nearest hour (I’m not sure why, since we have these things called computers that are really good at dealing with clunky numbers) but that’s probably close enough.

Heap's retention calculation

Heap: I’m just confused by this

Heap is unclear. Their description of daily retention looks correct, but their description of weekly retention looks incorrect. I’ve emailed them for clarification. (EDIT 7/13/2018: Heap was very helpful and it sounds like they’re calculating retention correctly, using the same methodology as Mixpanel. Hooray Heap!)

How concerned should I personally be?

This is a particularly serious problem if you tend to burst user acquisition when running your experiments.

If the UA faucet gets turned on early in the morning for experiment 1, but late in the day for experiment 2, v1’s test will be full of Early Ellies, and v2’s test will be full of Late Larrys. The product changes won’t even matter; v2’s retention metrics will dominate v1’s.

Turning on UA at the same time of day for each test doesn’t solve the problem either, because the time required for ad networks to ramp up the volume on your campaign varies from day to day and week to week.

This happens on longer timescales, too. Does your August cohort have great monthly retention? Maybe that’s because all your August users installed during back-to-school in the last week of August, so they only had to stick around for a few days to count as retained in September.

Rolling retention and you

I’ve been referring to “What percent of people who install today come back 24 hours later?” as “rolling retention”, because of the rolling 24-hour buckets that are specific to each user. In rolling retention, any user counts as retained if she returns between 24 and 48 hours after her initial install, no matter what time of day she installed.

(Ideally, we would just call it “retention”, but until everyone starts calculating retention the same way, I guess we’re stuck qualifying the name somehow.)

“What % of people who install today return tomorrow?” is an intuitive question, but gives unreliable results. Instead ask, “What % of people who install today come back 24h later?” On the surface the questions are the same, but the latter gives much more trustworthy results.

If you start calculating retention this way, be aware that there will be some weirdness around the end of your retention curves.

You’ll now need to wait 48 hours to get your day 1 retention, to give the Late Larrys a full 24 hours to return. And while you’re waiting to see if Larry returns for his day 1 retention, there could be an Ellie from the same cohort who’s already come back for day 2.

It’s a pretty minor nuisance, though, and well worth it to have retention metrics that you can actually rely on. Have you run into something similar? If so, I’d love to hear about it.

“Everybody gets so much information all day long that they lose their common sense.” – Gertrude Stein

My first job as a product manager was in games. I worked at Playdom, Zynga’s primary competitor during the social gaming boom of 2009. The sophistication of our data analysis techniques and the platform supporting them played a large role in our eventual $700 million acquisition by Disney.

data-picard

For most companies at the time, “analytics” just meant counting pageviews. If you were really fancy, you could track the order in which users viewed certain pages and assemble a funnel chart to quantify dropoff. Gartner’s report on the state of web analytics in 2009 describes a range of key challenges, like “how to obtain a sustainable return on investment” and “how to choose a vendor”.

web analytics challenges 2009

“Why would we need an analyst to tell us what our hitcounter is saying?”

In contrast, social gaming powerhouses like Zynga and Playdom were custom building their own event-based analytics systems from the ground up. They tracked almost every action that players took in a game, allowing them to deeply understand their users’ needs and build features to fulfill them, rather than simply taking their best guesses.

For me, it was incredibly exciting to be on the cutting edge of analytics. For the first time, we could get real insights into players’ actions, aspirations, and motivations. Games are tremendously complex software products with huge design spaces, and even now it blows my mind that for most of the industry’s history, development decisions were made purely on gut instinct.

The power of these new data analysis techniques seemed limitless. Zynga went from zero to a billion-dollar valuation in under 3 years. And while gaming companies were the first to really showcase the potential of event-based metrics, they certainly weren’t the only ones. There was a digital gold rush as startups popped up left and right to bring the power of quantitative data insights to every industry imaginable.

Perhaps the most famous example of putting data-driven design on a pedestal is Marissa Mayer’s test of 41 shades of blue. (It’s an absurd test for many reasons, not the least of which is that with so many different variations, you’re basically guaranteed to discover a false positive outlier simply due to random noise.)
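(As a rough back-of-the-envelope check: if you treat the variants as 41 independent tests each evaluated at a 5% significance level, the probability that at least one clears the bar by chance alone is 1 − 0.95^41 ≈ 88%.)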

In this brave new world, metrics were king. Why would you need a designer? Everything could be tracked, measured, tested. A good PM was one who could “move the metrics”. MBAs and management consultants were hired by the boatload. One friend told me about the time he had to talk his CEO out of firing all the game designers in the company and replacing them with analysts.

A quick note about the game industry

As an aside, the game development industry has interesting market dynamics because of how many people dream of working in it. In some unglamorous industries (e.g. Alaskan crab fishing, logistics, B2B startups), demand for labor vastly exceeds supply. In games, it’s the opposite – many people stay in games out of passion, even when the money doesn’t justify it, leading to a market that is oversaturated and extremely competitive.

The evolutionary pressures of this absurdly competitive market mean that the pace of product innovation is extremely quick. The quality bar constantly increases, production costs go up, advertising prices rise, margins disappear, and mediocre products fail.

The gaming market’s competitiveness forces rapid innovation just to keep up, and when better tactics emerge, they are quickly adopted and rapidly bubble up to dominate the top of the market. As a result, the gaming market can be a bellwether of trends in the larger tech market, such as the power of the freemium model, microtransactions, sophisticated performance marketing, and strong product visions.

The competitive advantage of a strong product vision became undeniable in early 2012. At that time, Zynga had been around for about 5 years, with a peak market cap over $10 billion, and the company’s success had been repeated on a smaller scale by other strongly “data-driven” gaming companies on Facebook and on mobile.

However, an interesting trend was beginning to occur, with new games like Dragonvale and Hay Day dominating the mobile charts with innovative mechanics supported by a single, unified product vision.

Purely metric-driven iteration with no vision or direction could bring a product to a local maximum, which was good enough in the very early days of mass-market casual gaming. But as the market matured and competition intensified, a local maximum wasn’t good enough. Derivative products and products developed by only metric-driven iteration were vastly inferior to products driven by a strong creative vision from their inception, like Supercell’s Clash of Clans or Pocket Gems’ Episode. That vision was a necessary prerequisite to create a product strong enough to land at the top of the charts.

apple top grossing

Fortnite was announced in 2011 and launched in 2018. Gaming is a tough industry.

And being at the top of the charts is critical – revenue on the Top Grossing Charts follows a power law, with the handful of apps at the very top of the charts making more money than all the rest of the apps put together. As Zynga’s apps slipped down the charts, their inability to adapt to this new world became apparent and their stock price fell 80%.

Data-driven design had failed, just as intuition-driven design had before it. The industry needed a more fundamental shift in perspective. Good teams now design for the long term, guided by intuition but informed by data.

Personally, I like to emphasize the difference between data-driven design (relying on data to make decisions because we have no user empathy) and data-informed design (using data to understand our users, then building features to delight them).

Data-driven design

When I say “data-driven design”, I’m referring to the mentality of “letting the data decide”. In this paradigm, PMs and designers surrender to the fallibility of their intuition, and thus they elect to remain agnostic, using A/B testing to continuously improve their products.

A number of companies I’ve talked to have bragged about the fact that they’ve removed intuition from the decision-making process. It’s comforting to be able to say “We don’t have to depend on intuition because the data tells us what to do!”

Of course, everyone knows that data is noisy, so companies use large test groups and increased rigor to mitigate those concerns. But the real problem isn’t tests giving the wrong answer, so much as it is the assumption that the infinite degrees of freedom of creating a compelling product can be distilled to a limited number of axes of measurement.

With the exception of straightforward changes like pricing, most design changes have complex effects on the overall user experience. As a result, treating metrics as end goals (rather than simply as indicators of good product direction) results in unintended consequences and a degraded user experience. Testing isn’t a magic bullet either. Sometimes this degradation occurs in an unexpected part of the user experience, and sometimes it occurs on a different timescale than the test.

Split tests typically gather data for a period of days or weeks. User lifetimes are typically months or years. If you’re only looking at the data you’ve gathered, it’s easy to unintentionally trade off difficult-to-measure metrics like long term product health in exchange for easy-to-measure short-term metrics like revenue.

Example: Aggressive paywalls

Zoosk is a dating app that built a huge userbase as a Facebook app during the heyday of data-driven design. They’re extremely aggressive with their monetization, with misleading buttons designed to constantly surprise the user with paywalls.

Oh boy, a message!

Gotcha! Paywall!

A company naively focusing on revenue will naturally iterate their way to this point, experimenting with increasingly early and aggressive paywalls and discovering that the spammier the app becomes, the more money they make.

However, while an aggressive approach can be very profitable in the short run, it quickly drives away non-payers and makes it difficult to engage new users. In the dating space, this results in a user experience that becomes worse every month for subscribers.

Sure enough, judging from AppAnnie/SensorTower estimates, Zoosk’s revenue has probably fallen about 50% since their 2014 high of $200 million.

Example: Searches per user

One of my favorite stories is from a friend who worked on improving the search feature at a major tech company. Their target metric was to increase the number of searches per user, and the most efficient way to do that was to make search results worse. My friend likes to think that his team resisted that temptation, but you can never be totally sure of these things.

Example: Brand tradeoffs

If you start a free trial with Netflix, you’ll get an email a few days before the end of the free trial reminding you that your credit card is about to be charged. I’m sure that Netflix has tested this, and I’m sure that they know that this reminder costs them money. However, they’ve presumably decided to keep the reminder email because of its non-quantifiable positive effect on the Netflix brand (or more precisely, to avoid the negative effect of people complaining about forgetting to cancel their free trial).

Netflix email

Short term revenue loss, long term brand gain

Notably, Netflix only reminds you before billing your card for the first time, and not for subsequent charges. At some point, a decision was made that “before the first charge but not before subsequent ones” was the correct place to draw the line on the completely unquantifiable tradeoff between short term revenue loss and long term brand benefits.

Example: Tutorial completion

A standard way to measure the quality of an onboarding experience is to measure what percent of users who start a tutorial actually finish it. However, since there will always be a natural drop off between sessions or over days, one obvious way to increase tutorial throughput is to build a tutorial that attempts to teach all the features in a single session.

Sure enough, tutorial throughput goes up, but now users are getting overwhelmed and confused by the pace of exposure to new menus and features. How to help them find their way? Maybe some arrows! Big, blinking arrows telling the user exactly which button to tap, directing them into submenus 7 levels deep and then back out.

You’ll be able to do this on your own next time, right?

Arrows everywhere can boost tutorial throughput, but all the users will be tapping through on autopilot, defeating the purpose of having the tutorial in the first place! Excessive handholding of users increases tutorial completion (an easy-to-measure metric), but decreases learning and feelings of accomplishment (difficult-to-measure but very important metrics).

Example: Intentionally uninformative communication

“You’ve been invited to a thing! I could tell you where and when it is in the body of this email, but I’d rather force you to visit my website to spam you with ads. Oh, look at how high our DAUs are! Thanks for using Evite!”

email from evite

If this email were helpful, Evite would have to find a different way to make money

Equally frustrating to users: Push notifications that purposely leave out information to force users to open the app. Users will flee to the first viable alternative that actually values the user experience.

Example: User experience

In a purely data-driven culture, justifying investment in user experience is a constant uphill battle.

Generally, fixing a minor UI issue or adding some extra juice to a button press won’t affect the metrics in any kind of a measurable way. User experience is one of those “death by 1000 cuts” things where the benefits don’t become visible until after a significant amount of work has already been put in.

As a result, it’s easy to constantly deprioritize improvements to the user experience under the argument of “why would we fix that issue if it’s not going to move the needle?”

To create great UX requires a leap of faith, a belief that the time you’re investing is worthwhile despite what the metrics say right now.

Hearthstone is a great example. Besides being a great game, it’s full of moments of polish and delight like the finding-opponent animation and interactive backgrounds that are completely unnecessary from a minimum viable product perspective, but absolutely critical for creating a product that feels best-in-class.

Example: Sales popups

When I was at Playdom, we would show popups when an app was first opened. They’d do things like encourage users to send invites, or buy an item on sale, like this popup from Candy Crush does.

candy crush sale popup

Do you want revenue now or a userbase in the future?

I hate these. They degrade the user experience, frustrate the user, hurt the brand, and generally make interacting with your product a less delightful experience.

On the other hand, though, they WORK. And you can stack them: the more sales popups you push users through, the more money you make – right up until the point where all of your metrics fall off a cliff because your users are sick of the crappy experience and have moved on.

It always gave me a bit of schadenfreude to open a competitor’s game and see a sale popup for the first time, because the same pattern always repeated itself: As the weeks went by, more and more aggressive and intrusive popups would invade the user experience, right up until the game disappeared from the charts because all the users churned out.

Even retention isn’t foolproof

As a final note: most of the examples above involve some variation on accidentally degrading retention, but even optimizing for retention doesn’t prevent these mistakes if you’re optimizing over the wrong timescale or for the wrong audience of users.

Typically, companies will look at metrics like 1-day, 7-day, 30-day retention because those numbers tend to correlate highly with user lifetimes. But focusing on cohort retention runs the risk of over-optimizing your product for the new users that you’re measuring, perhaps by over-simplifying your product, or neglecting the features loved by your elder users, or creating features that benefit new users at the expense of your existing audience.

Data-informed design

In contrast to “data-driven design”, which relies on data to drive decisions, in “data-informed design” data is used to understand your users and inform your intuition. This informed intuition then guides decisions, designs, and product direction. And your intuition improves over time as you run more tests, gather more data, and speak to more users.

When I’m making the case for the benefits of introducing intuition back into the decision-making process, there are two benefits that I keep coming back to: leaps of faith, and consistency.

Leaps of faith

Purely data-driven product improvement breaks down when a product needs to get worse in order to get better. (If you’re the sort of person who likes calculus metaphors, continuous improvement gets you to a local maximum, but not to a global maximum.) Major product shifts and innovations frequently require a leap of faith, committing to a product direction with the knowledge that initial metrics may be negative for an extended period of time until the new direction gets dialed in and begins to mature.

When Facebook introduced its newsfeed, hundreds of thousands of users revolted in protest, calling for boycotts and petitioning for removal of the feature. Now we can’t imagine Facebook without it.

Consistency

When products are built iteratively, with decisions made primarily through testing and iteration, there’s no guarantee of a consistent vision. Some teams take pride in the fact that their roadmaps only extend a week into the future. “Our tests will tell us what direction to go next!”

Data-informed design helps your product tell a consistent story. This is the power of a cohesive product vision.

It can be hard to explain exactly WHY a cohesive product vision translates to a better product, and also why it’s so hard to get there purely by data-driven iteration. Perhaps an extremely contrived example can help illustrate my point.

Let’s say you’re designing a new experience. You’re committed to good testing practices, and so over the next several months, you run tests on all 20 features you release. Each test is conclusive at the 5% significance level, and sure enough, users respond very positively to the overall experience that your tests have led you to.

Now, even with rigorous testing at a 5% significance level, 1 out of 20 tests will be wrong, and interestingly enough, 19 of the tests are consistent with the belief that your users are primarily young women, while 1 of them conclusively indicates that your users are middle-aged men.
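(A quick sanity check on that “1 out of 20”: at a 5% significance level, each test has a 1-in-20 chance of coming back “conclusive” by chance alone, so across 20 tests you should expect roughly one spurious result – the probability of at least one is 1 − 0.95^20 ≈ 64%.)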

Allowing your decision-making to be informed by data rather than dictated by it allows the team to say “Let’s just ignore the data from this particular test. Everything else we’ve learned makes us quite confident that we have a userbase of young women, and we believe our product will be better if all our features reflect that assumption.”

Obviously, if more tests come back indicating that your users are middle-aged men, your entire product vision will be thrown into question, but that’s ok. It’s preferable to ignore data in order to build a great product reflecting a unified vision that you’re 95% confident in, rather than creating a Frankenstein with 95% confidence on each individual feature.

The role of data in data-informed design

I believe that saying “just let the data decide” isn’t good product management, it’s an abdication of responsibility. As a PM or a designer, your job is to develop empathy for your users. Use data to understand them, their aspirations, and their motivations, and then take a position on what direction the product needs to move to best serve them.

Sometimes this means knowing your users better than they know themselves, as in the Facebook newsfeed example. More commonly, it means having enough faith in your product vision to recognize early false negatives for what they are, and being willing to grind through the trough of sorrow to realize your product’s potential.

Eric Ries gives an example of a registration flow that he worked on that performed poorly after launch. But based on earlier conversations with users, the team still believed in the design, and chose to continue working on it despite the data. Sure enough, it turned out that there was just one relatively minor design flaw, and once that was discovered, the new flow performed much better.

In this case, it was a relatively small feature with a relatively small flaw. But the same pattern holds on a larger scale as well – as visions become more innovative and ambitious, sometimes it requires commitment to a product vision over an extended period of time to see a product achieve its potential.

When to stop

I’m often asked, “If you know you’re just going to keep building no matter what the data says, then what’s the point in having data at all? How will we know when to kill the project?”

That’s a great question, since it’s often difficult to tell the difference between a false negative and a true negative. But there are two clear red flags to watch for: when a team loses faith in the project, and when a project stops improving. Ed Catmull cites the same criteria in Creativity, Inc. for knowing when one of Pixar’s movies is in trouble. Recognizing when a product is stuck is a challenge for any company committed to creativity and innovation, regardless of medium.

In data-informed design, learning is a continuous and parallel process. Rather than trying to design a rigorous enough test to validate/invalidate a direction at a particular moment in time, data is consistently gathered over time to measure a trajectory. If the team understands their users well, their work should show a general trend of improvement. If the product isn’t improving, or even if the product IS improving, but the metrics aren’t, then that’s a sign that a change is needed.

Some rules of thumb for data-informed design

It can be hard to know how to strike the right balance between data and intuition, but I do have a few rules of thumb:

Protect the user experience

Peter Drucker famously wrote: “What gets measured gets managed.” That’s true, but in my experience, “What gets measured gets manipulated, especially if you are being evaluated on that metric.”

The challenge in product development is recognizing when we’re “teaching to the test”, regardless of whether it’s intentional or not. For anything that we’re measuring, I like to ask “is there a way I could move this metric in a positive way that would actually be really bad for our product long-term?” Then I ask, “is the feature I’m thinking about doing some flavor of that accidentally?”

A few examples of good intentions with potential for unintended consequences:

Metric                Tactic                           Result
Tutorial completion   Shorten the tutorial             Users learn less
Conversion            Create a misleading sales page   Buyer’s remorse
Revenue               Run frequent sales               Users trained to only buy at a discount

Have a “North Star” vision

I always advocate for having a “North Star” vision. This is a product vision months or years away that you believe your users will love, based on your current best understanding of them.

Since products take a lot of iterations to get good, early product development is full of false negatives on the way to that North Star. People love to talk about the idea of “failing fast” or “invalidating an idea early”, but a lot of times that just isn’t possible. The threshold for viability in a minimum viable product isn’t always obvious, and sometimes it does just take a little more polish or a few extra features to turn the corner.

The best way to get a more trustworthy signal is to just keep building and shipping. A North Star lets you maintain your momentum during the inevitable periods of uncertainty. Over time, small sample sizes accumulate, and noise averages out. Evidence about the product direction will build with time.

Treat metrics as indicators/hints, not goals

It’s important to remember that metrics are leading indicators, not end goals. Similar to how taking a test prep class to improve your SAT score doesn’t actually increase your odds of college success, features that overfocus on moving metrics may not actually improve the underlying product.

The most important question that data can answer is “does the team understand the users?” If so, features will resonate and metrics will improve over time. To validate/invalidate a product direction, look at the trajectory of the metrics, not the result of any individual test.

The right time to kill a project is when the trajectory of improvement flattens out at an unacceptably low level. Generally this means that a few features have shipped and flopped, which is an indicator that there’s some kind of critical gap in the team’s understanding of their users.

This also means that it can be difficult to walk away from innovative product/feature ideas quickly. This can be an unpopular opinion in circles that are dogmatic about testing, but the fact of the matter is that I have never seen the “spray and pray” approach work well when it comes to product vision.

I’ve tried a wide range of nutrition hacks over the years (high carb, low carb, paleo, fasting, etc) but the only habit that has actually stuck with me is my spinach smoothies.

For over 10 years now, I’ve had a mostly-vegetable smoothie nearly every day. They’re delicious and nutritious and the only way I know to get 3 or 4 servings of fruits and veggies in about 5 minutes.

Most people know that they should be eating more fruits and vegetables (“5-a-day” was the hot campaign a while ago). Most people aren’t even close to that, and even those who try don’t realize that:

  • an entire bag of fresh salad from the supermarket is only about 1.5 servings of vegetables
  • you should really be eating closer to 9 servings a day

That’s a lot of veggies! I actually tried to do that at one point and after a few weeks of eating absurd quantities of salad, I just got tired of chewing. It was time to find a better way.

After a few false starts, my roommate and I discovered that orange juice was the secret to hiding the taste of spinach in a smoothie. Seriously, you can put an absurd amount of greens in a smoothie and not even taste them if you have an orange juice base. Hence, this recipe:

Super Spinach Smoothie

  • 2 servings frozen spinach/kale (1/3 bag)
  • Orange juice (try half OJ, half water if you want it less sweet)
  • 1 serving frozen berries or pineapple or banana or mango etc

Blend and drink.

Fills two glasses with a little left over. Whether this serves one or two is up to you.

A medium-rare steak is 135 degrees Fahrenheit in the center. For thousands of years, the best way to accomplish this was to put the steak on a really hot grill and attempt to pull it off at just the right time. This is silly. Fortunately, technology has found a better way.

Take your steak, vacuum seal it in a plastic bag, and then lower it into a water bath whose temperature is carefully maintained at exactly 135 degrees. Let the steak come up to the temperature of the surrounding water, then pull it out, sear it with a blowtorch or a hot pan, and you’re ready to serve!

This method of cooking (known as “sous-vide”, French for “under vacuum”) has several advantages.

1) No clean-up. Just open the bag, torch the steak, and you’re ready to serve!
2) No overcooking. Overcooking means accidentally bringing food above your target temperature. With sous-vide, the water bath holds your food at the exact desired temperature, so overcooking is impossible.
3) No food safety concerns. Want a super-rare hamburger, but worried about E. coli? Pasteurization is a function of both temperature and time, so you can pasteurize your meat at a relatively low temperature by holding it there for a couple of hours.

Fish takes about 20 minutes and is perfectly cooked every single time. Chicken is moist and tender in a way I’ve never had it before. Sous-vide duck is amazing. After 2 hours, it’s deep red and juicy, unlike the dry grey stuff I had at Chinese restaurants growing up. Flank steak is one of the most flavorful cuts, but is usually one of the toughest. After 2 days in the sous vide, it’s as tender as filet mignon.

I have no desire to eat out anymore, because the food I make at home is faster and tastier. “Making dinner” now consists of taking a piece of meat (still in its original vacuum-sealed packaging from the supermarket) and dropping it into my sous-vide. For vegetables, I blend a spinach smoothie, or if I’m feeling fancy, I’ll put a tray of broccoli in the oven to roast. I can prepare an entire dinner in less than 60 seconds.

I’m writing this blog post because I’ve had a number of friends ask me how I put together my sous-vide setup. This is the email I’ve been forwarding them:

You can buy a countertop sous vide machine for $450. Alternatively, you can build your own for about $75.

I did neither, and bought a temperature controller that I can plug a rice cooker into. It’s cheaper and more flexible than a dedicated sous-vide machine. It has lower risk of electrocuting me than a DIY solution. And finally, if I ever want to sous vide something larger (say, an entire animal), I can just swap out my rice cooker for a larger heating element and I am good to go.

So without further ado, here’s my sous vide set-up (booze is optional, but recommended):

sous-vide-magic

Temperature Controller, $170
(This is the HD version, which is $10 extra, but get it in case you want to power a bigger heater later.)
Update: I’ve been informed that there are cheaper alternatives.

Perforated plate, $15
(You need something to keep the temperature probe away from the food. I use a small metal cheese grater, which works fine.)

Non-digital rice cooker, $30:
(This is big enough to do 2 flank steaks, a small roast, or a rack of ribs)

(optional, but fun) Cooking torch, $35:
(Get a butane refill from your local smoke shop, or just sear your meat in a pan after cooking it. Caveat – some people believe that butane can flavor the meat, and recommend just getting a blowtorch.)

I get my meat from Trader Joe’s already vacuum sealed. You might eventually want a vacuum sealer or a water bath that can handle larger items, but the above setup has been working great for me.

The definitive guide to sous-vide cooking times and temperatures can be found on Douglas Baldwin’s website. If I can’t find the info I need there, a quick google search usually turns up good suggestions. But to get you started, here are some times and temperatures that have been working well for me:

Food             Temperature (°F)   Time      Notes
Duck breast      135                2 hours   Crisp skin-side down in a pan before serving
Flank steak      131                2 days
Pork loin        137                2 hours
Soy ginger cod   132                20 mins   Find it in Trader Joe’s frozen foods aisle. Thaw first.
Salmon           126                30 mins   Add some slices of lemon if you vacuum seal it yourself
Pork shoulder    140                2 days
Pork chops       138                4 hours
Pork belly       155                2 days    Leave it under the broiler afterwards to get super-crispy, and don’t forget plenty of salt!