Theory: My Intuitions on Opportunities
Where I suspect the highest leverage projects will be found
This is a Theory post, one of many post categories found on this blog. To see what to expect from posts in each category, along with categorized lists of past posts, check out the Table of Posts. As always, feedback or collaboration of any kind is greatly appreciated either here in the comments or privately on Twitter (@Damien_Laird).
So far I’ve done reviews of the state of research on both prediction markets and prediction polling, theorized about the different inputs into the quality of a forecast, and shared my own experiences trying to forecast global catastrophic risks (GCRs). Though I haven’t captured it in any post, I’ve also spent a lot of time lately trying to get a feel for the current state of forecasting outside of academia: perusing forecasting blogs, checking out the various online forecasting platforms, and joining Discords. This will definitely inform future writing.
For now however, I finally feel like I’m building an intuition for what threads to pull on to improve our ability to forecast GCRs. This doesn’t mean I’m right! But it starts to shift my internal compass from “observe and think” to “try and collaborate”. I need to start throwing things at the wall and bouncing them off of people with different perspectives to find out what might stick.
This post captures my initial intuitions, both to provide context for those future projects and why I think they could be important, and to carve in stone what I believe right now before testing these hypotheses. This should make it a lot easier to learn from whatever I get wrong here.
Open Online Platforms
It seems to me that over the history of forecasting, most of the improvements in quality came from academia. There was lots of low quality research and dead-end (to me, of course) theorizing, but there were also some ambitious studies that produced deep insights that have endured over time and jump-started the field. In hindsight, this progress looks frustratingly slow, but it was undeniably powerful, and I think there’s still plenty to learn from controlled research environments.
However, I suspect the balance of progress has shifted and that moving forward most advances will come from open online platforms like Metaculus, Manifold Markets, or Good Judgment Open, to name a few of the most popular. These are already seeing large scale participation, they have demonstrated an ability to innovate on their own structure and improve over time, and they are in a position to collect the reams of data needed to generate insights about the field. They can even make a lot of this data available for others to innovate with at minimal cost to themselves.
This combination of factors gives me a lot of confidence that experiments, natural or otherwise, on these platforms will be the frontier of forecasting performance and innovation moving forward. Therefore, it follows that the highest leverage interventions will involve interacting with them in some way.
Incentivizing Knowledge Creation
Forecasting to date seems to have largely treated the information that goes into forecasts as an external variable. Generalist forecasters can do some research to inform their work, and you can put them together on a team so that they can share information the others have missed. There seems to be an implicit assumption in the research I’ve reviewed that this converges onto high quality forecasts and is generally sufficient.
I expect this to be one aspect of forecasting research that is domain- and question-dependent, even though the field’s findings have otherwise been shown to be largely consistent across domains. When an event is fast-changing, covered thoroughly by the media, or just in a relatively “shallow” topic, I expect this model to hold up well. A search engine, a modest amount of time and effort, and some collaboration to check blind spots will let participants in forecasting tournaments converge to high quality forecasts that couldn’t be meaningfully improved. If, however, the bulk of the information relevant to a forecast is buried in the specialized media of a niche field, and requires domain expertise to parse well or even discover, this convergence seems impossible. No matter how many generalist forecasters you throw at that kind of question, each will only be able to incorporate a small amount of the relevant information, and they’ll struggle to interpret even that correctly. I believe forecasting GCRs tends to land squarely in that second case.
I expect significant gains in forecast quality to be possible if we better connect generalist forecasters to the information and understanding they need. To date, the attempts I’ve seen at this have involved asking domain experts to forecast alongside generalists so they can learn from each other, but I believe we can create much better incentive schemes for the creation and curation of this information. It’s also critical that this information accumulates over time so it can be built on rather than repeatedly rediscovered.
Prediction Polling Over Markets
I think prediction markets are a fantastic technology with tremendously valuable applications in forecasting. However, short of some new innovation that I haven’t found a hint of yet, I think they’re vastly weaker than prediction polling when it comes to forecasting GCRs.
The strength of markets is their intrinsic incentive scheme, which motivates the aggregation of separate, private knowledge. But this incentive has been shown to break down over long time horizons: tying up funds for long durations is not appealing to investors. I have seen some proposed solutions to this, and with play-money markets much more innovation might be possible.
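To make that breakdown concrete, here’s a minimal sketch in Python of the annualized return available from correcting a mispriced contract. The price, probability, and horizons are made-up numbers chosen for illustration, not drawn from any real market.

```python
# Illustrative only: how the financial incentive to correct a mispriced
# market shrinks as the resolution date moves further out.

def annualized_return(price: float, expected_payout: float, years: float) -> float:
    """Annualized expected return from buying at `price` a contract whose
    expected payout at resolution is `expected_payout`."""
    return (expected_payout / price) ** (1 / years) - 1

# Suppose you believe the true probability is 90% while the market price is
# 0.80, so the expected payout of a $1 contract is $0.90 (made-up numbers).
price, expected_payout = 0.80, 0.90

for years in (0.25, 1.0, 5.0, 20.0):
    r = annualized_return(price, expected_payout, years)
    print(f"{years:5.2f} years to resolution: {r:6.1%} expected annualized return")

# Roughly 60% at 3 months, 12.5% at 1 year, 2.4% at 5 years, 0.6% at 20 years:
# the same informational edge earns almost nothing on long-horizon questions.
```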
Unfortunately, a second and more serious restriction is that their fundamental structure encourages hoarding any information other than the bets themselves. If all you care about is the exact question being forecasted, this is fine. As I just explained, though, I think large gains could be made in forecasting GCRs by accumulating relevant knowledge over time, and prediction markets just don’t seem compatible with this process. Prediction markets on highly relevant, somewhat short-term questions might be useful components in a forecasting scheme, but I expect they’ll need to be used to inform prediction polling done in an environment that encourages shared rationales that others can build on.
A Culture of Practice
There are some training resources for forecasters, and even some studies showing that certain trainings improve accuracy and that forecasters improve over time. You could imagine a world where we improve the state of training by incorporating feedback from top forecasters into different lessons and then running high-powered studies (maybe by tracking training participation on online platforms) to discover what works. This would surely work, but it would be slow and expensive.
I think there’s a tradeoff to be made between the measurability of progress and the speed of that progress. Forecasting is pretty close to a meritocracy: skill, at least in the form of performance, is readily measurable. I expect that if you embed participants who want to improve into a positive-sum environment, i.e. one where they aren’t in direct competition with each other, they’ll naturally share techniques and information that others can learn from. The transparency of performance will allow participants in these cultures to readily know who to listen to, and continuous dialogue around what’s being shared can improve it over time in a very tight feedback loop.
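As a concrete illustration of how readily that performance can be measured, here is a minimal sketch of the Brier score, a standard scoring rule for binary forecasts; the two forecasters and their numbers are invented for the example.

```python
# Brier score: mean squared error between probabilistic forecasts and 0/1
# outcomes. Lower is better; always guessing 50% scores 0.25 on binary questions.

def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared difference between forecast probabilities and outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Invented example: two forecasters on the same four resolved questions.
outcomes = [1, 0, 0, 1]
alice = [0.90, 0.20, 0.10, 0.70]  # confident and well calibrated
bob = [0.60, 0.40, 0.55, 0.50]    # hedges everything toward 50%

print(f"Alice: {brier_score(alice, outcomes):.3f}")  # 0.038
print(f"Bob:   {brier_score(bob, outcomes):.3f}")    # 0.218
```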
Creating collaborative environments that pair high-bandwidth communication, easy association between identity and performance, and the ability to curate knowledge should accelerate improvements in forecast quality over time. My guess is that something like this already happens within top forecasting teams.
Next Steps
Upcoming posts will mix these ingredients into interventions that I think are viable and publicly share my best guess at the recipes for making them happen. Depending on what they are and how they work out, I might chronicle my efforts to make them real, but I’ll at least report any progress in that direction and the knowledge updates I make as a result of these attempts.
What’s Not On This List?
There are three categories of things that I think are critical to improving the field of GCR forecasting that I’ve omitted:
Efforts to recruit more forecasters
Improving accuracy with better aggregation
Efforts to connect forecasts to stakeholders that can use them
To some extent this is because I think progress is already being made on them and I’m not seeing opportunities to accelerate it much. Metaculus and Manifold seem to be attracting large audiences, open-sourcing the data from their forecasting collections should lead to the best aggregation algorithms we’ve ever had, and both Good Judgment Open and Metaculus are creating compelling partnerships with stakeholders and producing written content that other stakeholders should find persuasive.
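To give a sense of what an “aggregation algorithm” can mean in practice, here is a minimal sketch of one approach discussed in the forecasting literature: averaging individual forecasts in log-odds space and extremizing the result. The individual forecasts and the extremizing exponent are assumptions chosen for illustration, not values taken from any platform.

```python
import math

def aggregate(probs: list[float], d: float = 1.5) -> float:
    """Extremized mean of log-odds for a set of binary-event forecasts.

    The exponent d > 1 pushes the pooled forecast away from 50% to offset
    the caution of individual forecasters; d = 1.5 is an illustrative value.
    """
    mean_log_odds = sum(math.log(p / (1 - p)) for p in probs) / len(probs)
    return 1 / (1 + math.exp(-d * mean_log_odds))

forecasts = [0.65, 0.70, 0.60, 0.80]  # invented individual forecasts
print(f"simple mean:         {sum(forecasts) / len(forecasts):.3f}")  # 0.688
print(f"extremized log-odds: {aggregate(forecasts):.3f}")             # ~0.772
```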
I also believe that the levers I laid out, if pulled, will naturally improve these other three. New forecasters will be more likely to join a lively culture, improving the quality of individual forecasts with better information will improve aggregation, and better-reasoned, more accurate forecasts will be more compelling to stakeholders. These represent intrinsic hypotheses that I’ll also be testing as I work on interventions in the areas I described.