Defining Your Product’s North Star Metrics and Leading Indicators

Last Updated on July 17, 2023 by Editorial Team

Author(s): Lisa Cohen

Originally published on Towards AI.

One key role that data science teams play is defining metrics and setting targets for the product and company. A clear set of success metrics is an effective way to communicate the organization’s priorities objectively. It also helps the organization scale, because each team can set its own goals and innovate autonomously while laddering up to the overall objective. The OKR (objectives and key results) framework is one way to roll this out, with the KRs tracking progress toward the long-term objective. While metric reporting can be a more repeatable and automated part of the job, a lot of data science innovation goes into developing good metrics.

Metric terminology

There are typically two levels of metrics: north star metrics and leading indicators.

North star metric: The north star metric provides a clear vision for how to measure long-term success.

Leading indicators: Leading indicators (also called surrogates or drivers) represent specific actions, which happen on a shorter time scale, and cause long-term growth of the north star metric.

Overall, the north star represents where you want to go, and the leading indicator is how to get there.

Qualities of effective metrics

Effective metrics should be representative, understandable, predictive, and sensitive, and should come with associated guardrails.

North star metrics:

– Represent the mission: The north star metric choice and definition provide an opportunity to clearly articulate the vision of the product and organization in an objective, measurable way.

– Understandable, simple and inspirational: In order to rally the company around this goal, a simple metric that everyone understands is more impactful than a complicated or obscure one.

Leading indicators:

– Predictive of future success: We want the metric to be predictive of future success, to ensure that short-term metric optimization drives long-term desired results.

– Sensitive: The metric should be sensitive to change on the timescale measured. Otherwise, the team will get stuck in metric reviews, unable to evaluate the impact of their work or compare the value of different investments, because every investment shows the same immovable metric.

Guardrails:

– Protect quality and safety: Having the right guardrails in place is a key step in any metric rollout. Guardrails reduce gaming of the metric and protect things like quality and safety, so that they are not sacrificed for the sake of metric growth.

Identifying leading indicators

In determining leading indicators, we want to identify engagement actions that are particularly impactful for this specific product, and verify that they causally drive long-term outcomes (so that investing in them will produce the desired result). Examples include “emails sent” for an email app or “social connections” for a social media app. These metrics are actionable for the team, since they can design product experiences that help customers successfully complete these tasks.

In terms of sensitivity, for online products, we’re looking to see that leading indicators change (by statistically significant amounts) on an experiment timescale (1–2 weeks), and that north star metrics move on a roughly quarterly timescale (1–3 months). With these timescales, we can use leading indicators to define the overall evaluation criterion (OEC) for experiments and then track north star metrics in monthly or quarterly reviews.
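
As a rough illustration of the sensitivity check, the sketch below tests whether a leading indicator moved by a statistically significant amount during an experiment. It assumes the indicator is a conversion-style rate (for example, the share of users who sent at least one email) and applies a standard two-proportion z-test. The function name and the counts are hypothetical and are not taken from a real product.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (absolute lift of B over A, two-sided p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms share one rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, p_value


# Hypothetical two-week experiment: control vs. treatment.
lift, p_value = two_proportion_z_test(1000, 10_000, 1100, 10_000)
is_sensitive = p_value < 0.05
print(f"lift={lift:.3f}, p={p_value:.3f}, significant={is_sensitive}")
```

If most experiments in an area cannot move the indicator past this kind of threshold within one to two weeks, that suggests the indicator is not sensitive enough to serve as an OEC.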

Meta-analysis is a very effective way to identify leading indicators when you have a sufficient history of experiments to review. We can run a regression analysis across past experiments and identify the top metrics whose changes on an experiment timescale led to long-term changes in the north star metric. We’re looking to see that the leading and north star metrics move together over time, with the leading indicator moving first. We assess the long-term impact by shipping with a holdback (a small group of users who do not receive the change) and then viewing the impact over time. (“Zero” or feature-level holdbacks can also be developed to view the all-time impact for a particular feature area, but they are less sensitive to recent changes.)
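
A minimal sketch of such a meta-analysis follows. It assumes each past experiment has been summarized as a short-term change in each candidate metric plus a long-term change in the north star measured against a holdback. The metric names and numbers are made up for illustration, and a simple one-variable least-squares fit stands in for whatever regression the team would actually use.

```python
import math
from statistics import mean


def slope_and_correlation(x, y):
    """Ordinary least-squares slope of y on x, plus the Pearson correlation."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Hypothetical data: one entry per past experiment, in percent change.
short_term_change = {
    "emails_sent": [1.0, 2.0, -1.0, 0.0, 3.0],
    "profile_views": [2.0, -1.0, 1.0, 0.5, -0.5],
}
north_star_change = [0.6, 0.9, -0.4, 0.1, 1.4]  # long-term, vs. holdback

# Rank candidates by how strongly they track the north star.
ranked = sorted(
    (
        (name, *slope_and_correlation(deltas, north_star_change))
        for name, deltas in short_term_change.items()
    ),
    key=lambda row: abs(row[2]),
    reverse=True,
)
for name, slope, r in ranked:
    print(f"{name}: slope={slope:.2f}, r={r:.2f}")
```

A high correlation across experiments makes a metric a strong candidate, but with only a handful of experiments the estimate is noisy, so it is still worth confirming with a targeted experiment.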

In cases where experimentation isn’t possible or available (e.g., when it would be unfair not to treat all top customer accounts, or when a medical treatment would take decades to measure), observational analyses offer a way to identify causal drivers. DoWhy, DoubleML, and EconML are a few causal inference libraries we can use to construct comparable control groups (e.g., through propensity score matching) in the existing dataset and then evaluate the leading indicator treatment. We can also use causal inference to compare two populations with different outcomes and identify the causal drivers or cohorts that led to the difference.
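
To show the idea without depending on any particular library’s API, the sketch below adjusts for a single discrete confounder by stratification. This is a deliberately simplified relative of propensity score matching: with only one discrete confounder, the propensity to be “treated” is constant within each stratum, so comparing treated and untreated users inside each stratum plays the same role. The segments, counts, and field names are hypothetical.

```python
from collections import defaultdict


def build_rows(cells):
    """Expand (segment, treated, n_users, n_retained) cells into user rows."""
    rows = []
    for segment, treated, n_users, n_retained in cells:
        for i in range(n_users):
            rows.append({"segment": segment, "treated": treated,
                         "retained": 1 if i < n_retained else 0})
    return rows


def rate(values):
    return sum(values) / len(values)


def naive_effect(rows):
    """Raw difference in retention, ignoring the confounder."""
    treated = [r["retained"] for r in rows if r["treated"]]
    control = [r["retained"] for r in rows if not r["treated"]]
    return rate(treated) - rate(control)


def stratified_effect(rows):
    """Size-weighted average of within-segment retention differences."""
    strata = defaultdict(lambda: {True: [], False: []})
    for r in rows:
        strata[r["segment"]][r["treated"]].append(r["retained"])
    # Keep only strata that contain both treated and untreated users.
    usable = [g for g in strata.values() if g[True] and g[False]]
    total = sum(len(g[True]) + len(g[False]) for g in usable)
    return sum(
        (len(g[True]) + len(g[False])) / total * (rate(g[True]) - rate(g[False]))
        for g in usable
    )


# Hypothetical data: large accounts both adopt the action more often
# and retain better, which inflates the naive comparison.
rows = build_rows([
    ("large", True, 80, 64), ("large", False, 20, 14),
    ("small", True, 20, 8), ("small", False, 80, 24),
])
naive = naive_effect(rows)
adjusted = stratified_effect(rows)
print(f"naive={naive:.2f}, adjusted={adjusted:.2f}")
```

In this made-up dataset the naive gap is much larger than the adjusted one, because account size drives both the action and retention. Real analyses usually involve many confounders, which is where the propensity score methods in the libraries above become useful.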

Lastly, machine learning can be a scalable technique to identify candidates for leading indicators. For example, if we build a classification model that predicts whether or not a customer will be successful in the long-term metric, we can then analyze the SHAP feature importance of the predictive variables and test their causality. As discussed above, this can be done through experimentation where we run short-term experiments that move the leading indicator and check that the north star metric change follows in the holdback.

User research, hands-on use of the product, domain expertise, and analysis of the customer journeys of successful (versus churned) users are additional ways to spark ideas for leading indicators to test.

Photo by Anna Nekrashevich: https://www.pexels.com/photo/magnifying-glass-on-top-of-document-6801648/

Magic moments

As a related concept, magic moments are where we identify the “aha” or “wow” moment in product usage (e.g., 7 Facebook friends within 10 days), after which the customer experiences the value and is significantly more likely to retain and grow on the platform. One way to find them is through a cluster analysis of past user behavior that identifies the inflection points in growth and retention. An advantage of data-driven approaches to establishing these thresholds is that they mark meaningful inflection points in the user experience (so customers are less likely to move above and below the thresholds by chance). As in the previous section, we can use techniques including causal inference and experimentation to verify whether these moments truly lead to successful outcomes.
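
One simple way to look for such an inflection point is sketched below. It is a stand-in for the cluster analysis mentioned above, not a replacement for it: it groups users by how many times they completed an action in their first days, computes retention per group, and reports where retention jumps the most. The cohort counts are invented for illustration.

```python
def largest_retention_jump(cohorts):
    """Find the action count where retention rises most versus the previous count.

    cohorts maps an action count to a (users, retained_users) tuple.
    Returns (action_count, jump_in_retention_rate).
    """
    rates = {k: retained / users for k, (users, retained) in sorted(cohorts.items())}
    counts = list(rates)
    best_count, best_jump = None, 0.0
    for prev, cur in zip(counts, counts[1:]):
        jump = rates[cur] - rates[prev]
        if jump > best_jump:
            best_count, best_jump = cur, jump
    return best_count, best_jump


# Hypothetical cohorts: friends added in the first 10 days -> (users, retained).
friends_cohorts = {
    3: (200, 40), 4: (200, 44), 5: (200, 50), 6: (200, 56),
    7: (200, 110), 8: (200, 114), 9: (200, 118),
}
threshold, jump = largest_retention_jump(friends_cohorts)
print(f"candidate magic moment: {threshold} friends (retention +{jump:.0%})")
```

The threshold found this way is only a candidate. Users who reach it may differ from those who do not, so it still needs the causal verification described above.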

Active use

Similarly, as we’re counting customers, it’s important to have a meaningful definition of active use. This helps provide a clear view of “who is a customer?”, and helps prevent over-counting customer adds and churn for users that were not truly customers yet.
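
As a sketch of what a meaningful definition might look like, the example below counts a user as active only if they performed a minimum number of value-generating actions within a trailing window, so that logins alone do not qualify. The action names, the 28-day window, and the threshold of three actions are all assumptions chosen for illustration.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical set of actions that reflect real product value.
MEANINGFUL_ACTIONS = {"email_sent", "email_replied"}


def active_users(events, as_of, window_days=28, min_actions=3):
    """Return users with at least min_actions meaningful actions in the window.

    events is an iterable of (user, action, day) tuples.
    """
    start = as_of - timedelta(days=window_days)
    counts = Counter(
        user
        for user, action, day in events
        if action in MEANINGFUL_ACTIONS and start < day <= as_of
    )
    return {user for user, n in counts.items() if n >= min_actions}


events = [
    ("ana", "email_sent", date(2023, 7, 1)),
    ("ana", "email_sent", date(2023, 7, 5)),
    ("ana", "email_replied", date(2023, 7, 10)),
    ("bo", "login", date(2023, 7, 2)),          # logins alone do not count
    ("bo", "login", date(2023, 7, 9)),
    ("bo", "login", date(2023, 7, 14)),
    ("cy", "email_sent", date(2023, 5, 1)),     # outside the window
    ("cy", "email_sent", date(2023, 7, 2)),
    ("cy", "email_sent", date(2023, 7, 3)),
]
active = active_users(events, as_of=date(2023, 7, 15))
print(active)
```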

Setting targets

Once we identify our leading indicators and north star metrics, typically, the next step is to determine targets for the goals. Often we will start with a forecast of the metric. The forecast represents where we will end up (within confidence intervals), assuming we continue the current level of investment. Typically, we will set an aspirational goal above that forecast, which represents the level of ambition for accelerated growth. Known events, changing budgets, and market growth levels are additional factors we can take into account.
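
The forecast-then-stretch approach can be sketched as follows. This example assumes a simple linear trend as the baseline forecast and a flat percentage uplift as the aspirational stretch. Both are placeholders: a real forecast would typically account for seasonality and report confidence intervals, which this sketch omits. The monthly values are hypothetical.

```python
from statistics import mean


def linear_forecast(history, periods_ahead):
    """Extrapolate a least-squares linear trend periods_ahead past the last point."""
    xs = range(len(history))
    mx, my = mean(xs), mean(history)
    slope = (
        sum((x - mx) * (y - my) for x, y in zip(xs, history))
        / sum((x - mx) ** 2 for x in xs)
    )
    intercept = my - slope * mx
    return intercept + slope * (len(history) - 1 + periods_ahead)


def stretch_target(forecast, stretch=0.10):
    """Aspirational target set a fixed fraction above the baseline forecast."""
    return forecast * (1 + stretch)


# Hypothetical monthly values of the metric (e.g., thousands of active users).
monthly_metric = [100, 110, 120, 130]
baseline = linear_forecast(monthly_metric, periods_ahead=3)
target = stretch_target(baseline, stretch=0.10)
print(f"baseline forecast={baseline:.0f}, target={target:.0f}")
```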

Meta-analysis can also be a useful input for opportunity sizing: we can review the distribution (max, median, average) of how far past feature investments have moved these metrics.

Organizing for success

Across the company, different teams may be better positioned than others to move certain metrics. For example, Growth drives adoption, Product experiences can drive engagement, Support drives satisfaction, etc. (Meta-analysis across teams’ experiments can provide a quantitative view of which metrics they’ve been able to most successfully move in the past.)

The leadership team drives portfolio planning to ensure we have the right composition of teams to accomplish the desired goals, and then teams can proceed with their specific focus. As a company grows, there can be north star metrics, leading indicators, and guardrails at successive levels, such as the company and team levels. We can use the cascading OKR framework to help the team and company levels connect.

Ship criteria

Sometimes an experiment can move two metrics in opposite directions. This can present a “launch or not” dilemma for the team running the experiment. Of course, the most desirable outcome is to find another implementation of the feature that better accommodates both metrics. However, this may not always be possible. One way to approach this situation is to consider the relative priority of the two metrics. Another way to manage the “tie breaker” is to weigh the relative sizes of the two metric changes and develop ship criteria that quantify acceptable tradeoffs. For example, in order to help “good” (authentic) users sign up easily, how much potential fraud can we allow (for cases that are inconclusive)? Comparing the lifetime value of a good user with the cost of fraud can help optimize this tradeoff. (If we can also limit the exposure from a fraudulent sign-up through progressive access, that allows us to let through more good users as well.) Another tradeoff might be the increased engagement from sending users notifications versus the number of users who then turn their notifications off. Adoption versus revenue is a similar tradeoff. Guardrails are a key aspect of this as well; for example, a change that degrades product performance below user expectations cannot be shipped.
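
The sign-up example can be turned into an explicit ship criterion, sketched below. It assumes we can estimate the number of additional good and fraudulent accounts a change would let through, along with a lifetime value per good user and a cost per fraudulent account. All figures, names, and guardrail limits are hypothetical. Guardrails act as hard vetoes here, regardless of the net value.

```python
def evaluate_ship(extra_good_users, ltv_per_good_user,
                  extra_fraud_accounts, cost_per_fraud_account,
                  guardrails):
    """Return (ship, net_value, failed_guardrails).

    guardrails maps a name to (observed_value, max_allowed_value).
    """
    failed = [name for name, (value, limit) in guardrails.items() if value > limit]
    net_value = (extra_good_users * ltv_per_good_user
                 - extra_fraud_accounts * cost_per_fraud_account)
    # Ship only if the tradeoff is net positive and no guardrail is breached.
    return (net_value > 0 and not failed), net_value, failed


# Hypothetical estimates from an experiment on a relaxed sign-up flow.
ship, net_value, failed = evaluate_ship(
    extra_good_users=1000, ltv_per_good_user=50,
    extra_fraud_accounts=40, cost_per_fraud_account=300,
    guardrails={"p95_latency_ms": (420, 500)},
)
print(ship, net_value, failed)

# Same tradeoff, but the change breaches the performance guardrail.
ship_slow, _, failed_slow = evaluate_ship(
    1000, 50, 40, 300, guardrails={"p95_latency_ms": (620, 500)},
)
print(ship_slow, failed_slow)
```

Writing the criterion down before the experiment runs makes the decision less dependent on who happens to be in the review.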

Monitoring production metrics

Over time, it’s good to check in on the effectiveness of your metrics. The north star metric represents the overall mission, so it should be stable and rarely change. There may be improvements to the productionalization of the metric as measurement bugs and edge cases are found, or as the product and its feature measurement change. (Company-level metrics are considered production-level priority and should also have data lineage, data quality monitoring, and SLAs in place.) Still, it’s worth reflecting every 6–12 months to confirm that the metric still captures the top priority for the product. Also, if you’re seeing adverse side effects or gaming of the metric, there may be additional guardrails to put in place or edits to make to the metric definition (e.g., no longer counting a particular behavior you don’t want to promote). If you change the metric, make sure to version it and update the data catalog. (You can also backfill historical data with the new metric to analyze long-term trends.)

Conclusion

Leading indicators and north star metrics are key aspects of product development and provide a clear focus for the broader organization. Data science teams play a key role in defining and validating these metrics, which in turn enable customer success.
