Are you a procrastinator too? A mathematical method for eliminating procrastination
Author: Ddmond
Original article in Chinese: https://www.zhihu.com/question/19888447/answer/1930799480401293785
This goal may sound a bit bold. If you have reservations, I invite you to skim through the article first and get a feel for its style and content, which may give you a little more confidence :)
In this article, the core idea I want to convey is that self-control is not necessarily a psychological or physiological problem; it may also be an engineering problem that can be solved with mathematics and physics. From this idea follows a possibility: people can impose lasting and far-reaching constraints on their own behavior, and even shift the long-term steady state of their lives to some extent, without deadlines (DDLs), environmental constraints, or any external supervision, relying only on a few abstract and carefully designed thought experiments.
(Of course, it should be noted that even the most powerful methodology can only partially solve behavioral problems. Many serious neurological and pathological problems still need to be treated by professional medical means.)
To test this idea, I, as someone who has been troubled by severe ADHD and self-control problems since childhood, spent more than ten years, from elementary school through doctoral studies, on countless rounds of trial and error, reflection, and verification before gradually working out the two generations of self-control techniques introduced in this article.
Without external constraints or medication, they have taken me from struggling to concentrate for even an hour to being able to study at home with full concentration all day, for months on end, with no external pressure, and to leading a well-organized life. Exploring these two generations of techniques was itself an interesting and winding journey of repeated trial, error, and iteration on the phenomenon of "self-control".
So, I have been writing this article as a wish that must be fulfilled: I hope to give it as a gift to those who, like me, are suffering from ADHD and self-control disorders. Dear stranger, I hope it can help you and others who are suffering, even if it only solves a little bit of the trouble caused by self-control problems.
Next, I will walk you through these two methods. They are called CTDP (Chained Time-Delay Protocol) and RSIP (Recursive Steady-state Iterative Protocol). Their ideas are very different from the usual clichés about "putting down the phone", "making plans", "goal decomposition", "reward and punishment mechanisms", "delayed gratification", "intrinsic drive", and "habit check-ins". What I want to do is abstract the underlying mathematical and physical mechanisms from everyday behavior, start from first principles, and (partially) solve, as elegantly as possible, self-control problems that have plagued humans for thousands of years: procrastination, difficulty getting started, giving up halfway, and being in poor form.
Along the way I will introduce some basic mathematical concepts, and a thorough understanding may require a little first-year university calculus. But rest assured that I will do my best to explain the concepts qualitatively and semi-quantitatively in plain popular-science language (fully quantitative analysis is not realistic for this topic). In addition, so that more friends with attention difficulties like mine can read smoothly, this article uses an ADHD-friendly colloquial style, with text and pictures interspersed, organized into numbered sections.
Finally, a small tip: while reading, you may have various questions (for example, when reading about the Sacred Seat Principle, you may think "what if I cheat?" or "what if I don't want to sit on it?"). Please don't worry; the next section will usually answer them. Many limitations of the first-generation method are also discussed and solved in the second-generation method introduced after Section 13. In short, please read slowly and don't worry :)
1
Before we get into the details, let's consider the most common scenario:
Suppose it is 7pm now, you have just finished dinner and are sitting at your desk with homework or a paper you want to read. At the same time, your phone screen lights up and you see an interesting push notification on Xiaohongshu. At this point, you have two choices:
Play with your phone: you get instant relaxation and pleasure, but it will probably derail tonight's study plan, and you may feel anxious and guilty afterwards;
Go study: you face immediate boredom and fatigue, but it relieves the pressure of your pending assignments and helps your academic progress.
Faced with this choice, almost all people who lack self-control will tend to watch videos all night and then regret it. Why is this?
This problem is the most classic toy model of self-control. In response to it, people have proposed various discussions, such as the so-called willpower model, dopamine, reward and punishment mechanism, goal decomposition, delayed gratification, environmental control, psychological suggestion, identity recognition, to-do list and countless other academic concepts, empirical rules, folk remedies and so on.
However, in this article I will set aside all of the old and vague concepts above and explain this scenario with a simple mathematical model.
2
This is the first assertion of this article: when a person faces any choice, their true inclination $I$ toward a given behavior can be expressed as the integral of the product of that behavior's future value function $V(\tau)$ and the weighted discount function $W(\tau)$, from this moment ($\tau=0$) to infinity:
$$I=\int_0^{\infty}V(\tau)\cdot W(\tau)\,d\tau$$
where:
Future value function $V(\tau)$: the value that, in your current view, this behavior will bring at the future moment $\tau$;
Weighted discount function $W(\tau)$: how much weight you place on value at each future moment $\tau$ (the hyperbolic discount function in economics describes a similar phenomenon).
In other words, when we face a choice, we do not simply add up all the "future values" and then decide. Instead, the decision depends on the weighted sum of the values at all future moments. Generally speaking, value in the present carries a higher weight, and value in the future a lower one.
Take the example of "going to study vs. browsing the phone" just now:
If we choose to study, we face switching costs and boring study in the short term, so $V(\tau)$ is negative; in the medium term, the stress relieved by studying and the satisfaction gained turn $V(\tau)$ positive; in the more distant future, the impact of a single study session eventually dissipates, and $V(\tau)$ approaches 0 again;
The opposite is true for browsing your phone: short-term instant pleasure makes $V(\tau)$ positive, but in the medium term you feel anxious and guilty about delaying your plans, so $V(\tau)$ turns negative; and one night of playing with your phone will not change your life, so $V(\tau)$ also slowly approaches 0.
Ideally (that is, purely rationally), if the weight function $W(\tau)$ were a constant, the total net value of studying would be significantly higher than that of playing with your phone.
However, our brains tend to be extremely short-sighted: the weight function $W(\tau)$ is very high in the short term and quickly approaches zero in the long term.
Under this short-sighted weight distribution, the short-term advantage of the phone is multiplied by a high weight, and the resulting integral ends up much higher than that of studying. This is why we ultimately choose to play with our phones.
(Note: in this article we treat these two functions as "posited but not solved for" and analyze them only qualitatively. After all, quantitative analysis is not realistic for this problem.)
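The qualitative argument of this section can be checked with a few lines of numerical integration. The sketch below is only an illustration under assumed shapes: the step-shaped value functions `v_study` and `v_phone`, the hyperbolic weight $W(\tau)=1/(1+k\tau)$ with `k = 10` per hour, and the 10-hour horizon are all invented for the example and are not specified in the article.

```python
def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h


# Assumed toy value functions (tau in hours); the shapes are illustrative only.
def v_study(tau):
    if tau < 1.0:      # short term: switching cost and boredom
        return -1.0
    if tau < 5.0:      # medium term: relief and satisfaction
        return 1.0
    return 0.0         # long term: the effect fades out


def v_phone(tau):
    # Assumed to be the mirror image: pleasure first, guilt later.
    return -v_study(tau)


def rational_weight(tau):
    """Constant weight: every future moment counts equally."""
    return 1.0


def myopic_weight(tau, k=10.0):
    """Assumed hyperbolic discount weight W(tau) = 1 / (1 + k * tau)."""
    return 1.0 / (1.0 + k * tau)


def inclination(value, weight, horizon=10.0):
    """I = integral of V(tau) * W(tau), truncated at `horizon` hours."""
    return integrate(lambda tau: value(tau) * weight(tau), 0.0, horizon)


for name, w in [("constant W", rational_weight), ("hyperbolic W", myopic_weight)]:
    print(name,
          "study:", round(inclination(v_study, w), 3),
          "phone:", round(inclination(v_phone, w), 3))
```

Under these assumptions, studying has the higher inclination with a constant weight, while the phone has the higher inclination with the hyperbolic weight, which is the reversal described above.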
3
In fact, you will find that this seemingly simple mathematical model can explain almost all similar classic scenarios in life:
For example, let’s take the example of playing with your phone. You finally choose to pick up your phone, thinking you will only play with it for a while, and then what will happen?
From the moment you pick up your phone, the same story will repeat itself endlessly:
After watching short video A, the short-term temptation to watch short video B will be higher than putting down the phone;
After watching short video B, the short-term temptation of watching short video C will be higher than putting down the phone...
From your point of view at every moment, $I(\text{short video}) > I(\text{put down phone})$ holds, so at every moment you choose to watch the next short video.
So, you keep scrolling through the whole night until two or three in the morning. The cost of staying up late starts to become more and more significant, until it slowly offsets the temptation of short videos, and you finally go to bed full of regret.
(Of course, some people, unable to face the anxiety and self-blame they feel after staying up late, find that the psychological cost of putting down their phones is getting higher and higher, and they end up staying up all night.)
But what if you choose to study at the beginning? You may be surprised to find that once you really get into it, it becomes easier and easier to continue studying, and you may even gradually lose interest in browsing your phone.
This is because of the "switching cost" phenomenon in psychology: when we switch from one task to another, the switch itself naturally comes with a certain psychological resistance. It is also easy to understand in this model: to withdraw from the current activity, you must first stop the action at hand, clear the working memory already loaded in your brain, and then force yourself into the new behavior.
Mathematically, this cost is equivalent to inserting a negative impulse function at the most sensitive position, $\tau=0$!
This is why it is harder to get out of an overindulgence, but easier to stick to a task once you’ve started it.
(For convenience in what follows, I will not draw the impulse function of this switching cost separately; it is simply merged into the value function $V(\tau)$.)
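As a worked step, suppose the switching cost is modelled as an impulse of size $c>0$ placed just after the present moment. The symbols $c$ and $\varepsilon$ are introduced here purely for illustration and do not appear in the original. Its contribution to the inclination integral is then:

$$\Delta I=\int_0^{\infty}\bigl(-c\,\delta(\tau-\varepsilon)\bigr)\,W(\tau)\,d\tau=-c\,W(\varepsilon)\;\longrightarrow\;-c\,W(0)\quad(\varepsilon\to 0^{+})$$

Because $W(\tau)$ is largest at $\tau=0$, even a modest switching cost is felt at nearly full weight, whereas a cost of the same size placed far in the future would be heavily discounted.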
However, there are times when we can naturally resist the temptation to play with our phones. The review week before the exam and the eve of the DDL are such times.
As mentioned earlier, the short-term pleasure brought by checking your phone will usually only cause some anxiety in the medium term and will not really have a significant impact on our lives.
But it’s different during exam week - if you are still addicted to short videos at this time, you will fail the exam, which will lead to a series of serious consequences, and may even change your life.
As a result, the long-term negative value of playing with your phone expands dramatically, and it overwhelms the short-term temptation despite its very low weight! This is why review week usually shows a sharp threshold: after a certain point in time, your time investment suddenly surges. The popular saying "the deadline (DDL) is the primary productive force" rests on the same principle.
(Of course, for some people, the severity of this consequence will make the meaning of "reviewing" heavier and heavier, the cost of switching will increase, and the closer to the deadline, the less likely they will start, and finally they will fail the course)
There are many more examples like this. It can be said that, for a single behavior, the success or failure of our self-control depends on only one thing: whether the distribution of the value function $V(\tau)$ over time is favorable.
4
Since $W(\tau)$ represents the innate short-sightedness of human beings, it is almost fixed and hard for us to change fundamentally. So we have to ask: how is self-control possible?
Here, we can make the second assertion of this article:
All effective human self-control strategies essentially construct a transformation of the value function, $V(\tau)\rightarrow V'(\tau)$, that moves behavioral tendencies closer to the result of rational decision-making.
Here are some examples:
Useless method 1 (long-term reward): self-motivation, imagining "a better life after future success", or "gamification" that rewards you after studying. This is equivalent to linearly superimposing a positive incentive on the far end of the value function $V(\tau)$ of studying. Since the long-term weight is extremely low, this method is a lot of effort for little return and is usually useless;
Useless method 2 (long-term punishment): setting a penalty for playing with your phone, such as going for a run or writing a self-criticism afterwards. This is equivalent to inserting a negative value into the far end of the value function $V(\tau)$ of playing with your phone. This method is also useless;
Slightly useful method 3 (near-term punishment): locking up the phone or finding someone to supervise you. This is equivalent to raising the switching cost of indulgence in the near term, that is, inserting a negative value into the near end of $V(\tau)$. This method is useful, but not very;
Useful method 4 (nonlinear compression): many people are familiar with the "Pomodoro Technique". Its mechanism is to change the $V(\tau)$ of "giving up halfway" after you have started studying: it bundles the sunk cost of the entire pomodoro and nonlinearly compresses it to the present moment. This is a more useful method, and we will expand on it when explaining CTDP (the first-generation technique) later.
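To make the difference between the far-end and near-end methods concrete, here is a minimal sketch. It assumes a hyperbolic weight $W(\tau)=1/(1+k\tau)$ with `k = 10` per hour and treats each incentive as a brief pulse whose felt contribution is roughly its size times $W(\tau)$; the function names and all numbers are hypothetical and chosen only for illustration.

```python
def hyperbolic_weight(tau_hours, k=10.0):
    """Assumed discount weight W(tau) = 1 / (1 + k * tau), tau in hours."""
    return 1.0 / (1.0 + k * tau_hours)


def felt_value(amount, tau_hours):
    """Approximate contribution to the inclination I of a brief incentive
    of size `amount` placed at time tau: amount * W(tau)."""
    return amount * hyperbolic_weight(tau_hours)


far_reward = felt_value(+100.0, tau_hours=30 * 24)  # large reward a month away
far_penalty = felt_value(-100.0, tau_hours=24)      # large penalty tomorrow
near_penalty = felt_value(-1.0, tau_hours=0.1)      # small hassle in six minutes

print("far reward  :", far_reward)
print("far penalty :", far_penalty)
print("near penalty:", near_penalty)
```

With these assumed numbers, the small near-term hassle weighs more in the decision than either of the incentives that are a hundred times larger but placed far away, which is why methods 1 and 2 do so little and method 3 does somewhat more.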
In order to measure the effectiveness of these methods more intuitively, we can also define an indicator, which is the gain (G) of the self-control strategy:
$$G = \frac{I'(\text{learning})}{I'(\text{indulgence})} \Big/ \frac{I(\text{learning})}{I(\text{indulgence})}$$
Put simply, it is the ratio of the tendency toward the rational choice after applying the strategy to that before applying it. If you test most of the so-called "self-control methods" on the market against this indicator, you will find that their gains are generally pitifully low. They either act only on the far end, where the weight is extremely low, or do not act on the value function $V(\tau)$ at all!
For example, motivational slogans from marketing accounts such as “Just do it” and “Tell yourself that you always have a choice!” can actually get thousands of likes on Zhihu and Douyin, which shows how low the current average level of awareness of the topic of self-control is.
Those who can achieve self-control through inefficient means (many of these people are indeed excellent) do so not because these methods are so excellent, but because they already have good habits, environment, and their own conditions, and are only one step away from true self-control, so even a slight stimulus can motivate them to achieve positive behavior.
Sadly, because these advantages actually account for a very high proportion of personal achievement, people with these advantages rarely need to pursue truly effective self-control strategies. Instead, the superficial methods they share after their success have become the most widely spread mainstream cognition - this counterintuitive survivor bias will be discussed in more depth at the end of the article.
5
Now, here comes the fun part.
With this mathematical foundation, we can construct an extremely clever strategy that applies a nonlinear compression and a linear translation to $V(\tau)$, obtaining a remarkable positive gain for a single rational behavior almost out of thin air.
More importantly, it can solve the three most common problems in self-control: difficulty in starting, giving up halfway and short-lived enthusiasm.
All of this is based on three core principles.
The first core principle is called the "Sacred Seat Principle."
Let’s do a thought experiment like this:
Suppose there are many seats in a study room for you to choose freely. One day, you suddenly have an idea to designate one of the seats as the "sacred seat" and set the following rules for it:
Sitting in any other seat has no special restrictions, you can do whatever you want; but once your butt touches this "sacred seat", you must use your best state and study 100% focused for a full hour. On the other hand, if you don't have the confidence to concentrate, then simply don't allow yourself to sit in this seat, and prefer to choose other ordinary seats.
In short, this "sacred seat" must never be desecrated by an inattentive butt.
Of course, just imagining such a rule does not have any real binding force. But what if you really enforce it once?
One day, you really sat on it, and as soon as your butt touched that seat, you really studied for an hour with the utmost seriousness.
A magical thing happened: from the moment you successfully executed this rule for the first time, this seat that was originally just an imaginary seat was really given value in your mind! From then on, the possibility of you sitting in this seat and playing with your phone will really be much lower than before!
At this time, you add a new game rule:
The first time you concentrate is recorded as #1. After that, every time you successfully concentrate for an hour, it can be used as proof of work to add a number record to this "sacred seat": #1, #2, ..., #N. But if you fail once - for example, you use your phone on it, or leave after sitting for only ten minutes - then all the records will be cleared, and you can only start again from #1 next time.
As the chain lengthens and proof of work accumulates, the value of this imaginary seat increases again and again. When this "focused task chain" grows to #10, #20, #30, the constraint of this rule becomes substantial: you may even become so cautious that you hardly dare to breathe, for fear of showing the slightest disrespect for the rules.
(You are smart and must have thought of the problem that this rule may break down and you are unwilling to sit on it. Don’t worry, this is exactly what Sections 8 and 9 will solve.)
(In order to avoid misunderstandings from many people, it needs to be stated here that this is not the final version of the method. What really works is the mathematical mechanism, and it has nothing to do with the so-called "morality", "sense of ritual", "psychological suggestion" and "now or never". I will explain it in detail later)
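The record-keeping rule described in this section is simple enough to write down directly. The sketch below is one possible bookkeeping helper, not part of the original method; the class name `FocusChain` and its method names are hypothetical.

```python
class FocusChain:
    """Bookkeeping for the "sacred seat" record: #1, #2, ..., #N."""

    def __init__(self):
        self.length = 0  # number of consecutive successful focus sessions

    def complete_session(self):
        """A full, focused session succeeded: extend the chain by one node."""
        self.length += 1
        return self.length

    def fail(self):
        """Any violation clears the whole record; the next session is #1."""
        self.length = 0


chain = FocusChain()
for _ in range(3):
    chain.complete_session()
print(chain.length)              # 3
chain.fail()                     # e.g. used the phone while on the seat
print(chain.complete_session())  # 1
```

Note that the chain only changes while you are on the seat; nothing you do elsewhere touches it.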
6
This magical restraining force actually comes from humans' innate obsession with "keeping records."
Many fitness or learning apps have a "continuous check-in" function. Many people will force themselves to memorize words for 5 minutes no matter how sleepy or tired they are, just to maintain the "365-day continuous check-in" record on Duolingo; people who have quit smoking and drinking for 10 consecutive days will find it harder to give up when they see the number 10 days than on the first day.
A merely imaginary record is enough to create an almost absurd constraint.
In detail, this constraint actually comes from two points:
On the one hand, the longer the record is kept, the more time and energy you have to spend to maintain it. Behind every successful task node on the chain is a real "proof of work" and sunk cost;
On the other hand, this kind of record is often accompanied by an indescribable "future value expectation": you think this record is valuable → you are afraid of losing this record → it will have a binding force → this binding force will help you control yourself in the future, and your future self-control will rely on it → this record is more valuable. The higher the value, the stronger the binding force; the stronger the binding force, the higher the future expectation; the higher the future expectation, the higher the value...
However, every "record" is inherently "lose one, lose all": once you break the record, all the hard-earned sunk costs and future expectations are lost completely and instantly, right at $\tau=0$!
This is the mathematical essence behind the "Sacred Seat Principle": a nonlinear compression transform of $V(\tau)$.
When you sit in this seat, the past investment of every node in the task chain and its expected future value are sharply compressed, within the value function of the option "give up focusing", into a value concentrated extremely close to the origin ($\tau=0$). Any short-term temptation to break the rules immediately has to contend with the accumulated and future value of the entire chain.
And the best part is that this continues to hold true for every moment of task focus. When the sunk costs accumulate to a certain level, there will no longer be any short-term temptation to challenge such a formidable barrier.
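One way to write this compression down, as a sketch under assumptions rather than the author's own formula: let $S_N$ be the sunk cost accumulated over the $N$ nodes of the chain and $E_N$ the expected future value of the record (both symbols, and the small offset $\varepsilon$, are introduced here only for illustration). Breaking the chain then adds a loss of size $S_N+E_N$, concentrated just after $\tau=0$, to the value function of "give up focusing":

$$V'_{\text{give up}}(\tau)=V_{\text{give up}}(\tau)-(S_N+E_N)\,\delta(\tau-\varepsilon)$$

$$I'_{\text{give up}}=I_{\text{give up}}-(S_N+E_N)\,W(\varepsilon)\approx I_{\text{give up}}-(S_N+E_N)\,W(0)$$

Since $S_N+E_N$ grows as the chain lengthens while it is always weighted by the near-maximal $W(0)$, the inclination to give up keeps falling as $N$ increases.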
7
This "lose one, lose all" principle of value bundling is also why many classic self-control strategies really work:
Take the Pomodoro Technique: each pomodoro is effectively a "small sacred seat". It bundles the sunk costs and future expectations of the whole focus period into a single "pomodoro". If you slack off or give up partway through, you immediately bear, in the present moment, the large cost of losing the entire pomodoro.
In this way, the choice you make at each moment is no longer "the temptation in front of you versus the current task" but "the temptation in front of you versus the total value of the entire compressed pomodoro". This is the core mechanism that actually makes the Pomodoro Technique effective.
For example, many people have had this experience:
One day you learned a new self-control method that made a lot of sense, so you put it into practice with great enthusiasm. At the beginning, even if the method was actually useless, you would find it worked like a charm.
But the strange thing is that after a few days, the initial novelty gradually fades, you start to violate the rules frequently, the effectiveness of the method rapidly decreases, and eventually it becomes completely ineffective and you abandon it.
This is an illusion similar to the "newbie protection period": any self-control method seems to be useful in the first few days, but what really works may not be the method itself, but your "future expectations" of this method.
When you place your hopes for future self-control on this method, this "placement" temporarily and truly gives it binding force. However, if the method itself is inefficient, this temporary binding force will ultimately not be able to support its long-term survival in a complex real environment. Once any violation or wear and tear occurs, the credibility and value of the entire method will quickly collapse with the broken windows effect.
The most wonderful thing about the "sacred seat" design is that it is naturally distributed and decentralized in time. It is responsible only for your selected, purified state while you sit on the seat; it is not exposed around the clock or held responsible for your long-term state, which keeps wear and tear to a minimum.
In other words, after completing task #1 in full condition, you can go out for dinner, drink, play games, and waste a few days or even a week, but when you walk into the study room again to start #2, the sacred seat will still be the sacred seat, and its binding force will not weaken at all.
8
Of course, even in this selected state, the Sacred Seat Principle is not entirely without flaws.
As mentioned earlier, once you sit in this position, you must concentrate on studying for an hour in the "best state". So the question is, how do you define this "best state"?
If I go to the bathroom in the middle of a session, is that still considered “my best”?
If I answer a phone call or pick up a package, would that be considered my “best condition”?
If someone sends me a message and I reply, would that be considered my “best state”?
If all these are counted, then playing with my phone and watching a few short videos in the middle can also be considered the "best state", right?
You will find that the "ideal state" does not exist at all. Once any self-control strategy is put into real use, it must face complex and changing real situations. On the surface, a self-control strategy is just one simple constraint; in actual application, every self-control strategy amounts to a large number of implicit, tiny "sub-constraints", each of which can be tested and challenged:
If you allow yourself to exercise discretion, the binding force of the method will be worn away by the fluke mentality of "this time is special, it won't happen again", resulting in a broken window effect, and eventually being corroded and riddled with holes;
But if you don't allow any exceptions at all, this method will become rigid and fragile. When you encounter an unbearable situation, the rules may collapse instantly and completely.
This "gradual broken window effect" is the fundamental reason why most methods of the "gamification" or "rule-setting" kind fail sooner or later. Which game the athlete plays does not matter; what matters is keeping the athlete and the referee separate.
To solve this problem, let us introduce the more subtle second core principle: the "next must be an example" principle.
(Note: the name is a twist on the Chinese idiom 下不为例, "this must not be taken as a precedent", with "bù" (not) replaced by "bì" (must).)
Since you know that the first "just this once" will inevitably be followed by countless more, why not do the opposite and force this time to set a precedent for the next, that is, require yourself to rule the same way in the future!
Specifically, when you face any suspected violation, then, just as with "case law" in Western legal systems, you may choose only one of the following two options:
Immediately rule that the rules of this "sacred seat" have been violated: the chain is broken, all task records are cleared, and the binding force drops to zero; next time, honestly start again from #1;
Rule that the behavior is allowed, but once it is allowed this time, it must be allowed in all similar situations in the future. For the entire life cycle of this task chain, the rule completely loses its binding force over that behavior.
Your "best state" is not defined by any subjective or objective standard, but is dynamically defined by countless "cases":
Does going to the toilet midway still count as "the best state"? It can, but if it counts this time, it must count in the future too;
Does replying to a message midway still count as "the best state"? It can, but if it counts this time, it must count in the future too;
Does watching a two-minute short video midway still count as "the best state"? It can, but if it counts this time, it must count in the future too;
In this way, you are no longer considering an isolated choice, but whether to permanently abandon the constraints of this rule on this behavior at this moment. The price before you is the real long-term cost of allowing this behavior.
Now in your eyes:
As for behaviors that should not be allowed in reason (such as playing with mobile phones), you who are sitting in the seat know very well: as long as this "exception" is allowed today, then every time you sit in this seat in the future, you will cite this behavior as a "precedent" to cheat (besides, the rules require you to cheat), and you will no longer be able to redefine this behavior as a violation;
As for situations that should be allowed rationally (such as going to the toilet), your future self will be able to give yourself permission with peace of mind, without any concerns about the credibility of the rules.
In the end, the decision you make is truly the most rational one in the long run, because you as the athlete, thinking about the present, and you as the referee, thinking about the future, reach a consensus under this mechanism. In effect, you and yourself have reached a Nash equilibrium across time.
When this method runs for a long time, the boundary of the rules does not gradually erode and collapse as in traditional methods. Instead it slowly converges, in an $\epsilon$-$n$ fashion and with an accuracy of "always just enough", to the boundary closest to rational decision-making: it allows truly necessary exceptions while effectively preventing unnecessary self-indulgence.
This is the beauty of the "next must be an example" principle.
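The two-option ruling can be sketched as a small precedent book. This is an illustrative sketch only: the class `SeatRules` and its methods are hypothetical, and clearing the precedents when the chain breaks is an assumption made here (the text only says a precedent holds for the life cycle of the task chain).

```python
class SeatRules:
    """Sketch of the "next must be an example" ruling for one task chain."""

    def __init__(self):
        self.chain_length = 0
        self.allowed = set()  # behaviors already ruled "allowed" for this chain

    def complete_session(self):
        self.chain_length += 1
        return self.chain_length

    def rule_on(self, behavior, allow):
        """Rule on a suspected violation; returns True if the chain survives.

        An earlier "allowed" ruling is binding and cannot be overturned later.
        """
        if behavior in self.allowed:
            return True
        if allow:
            self.allowed.add(behavior)  # allowed now means allowed from now on
            return True
        # Ruled a violation: the chain breaks and the record is cleared.
        self.chain_length = 0
        self.allowed.clear()  # assumption: precedents end together with the chain
        return False


rules = SeatRules()
rules.complete_session()
rules.complete_session()
print(rules.rule_on("toilet break", allow=True))   # True
print(rules.rule_on("toilet break", allow=False))  # True: the precedent binds
print(rules.rule_on("short video", allow=False))   # False: chain broken
print(rules.chain_length)                          # 0
```

The point of the design is that `rule_on` has no "allow just this once" branch: every call either costs the whole chain or changes the rules permanently.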
9
OK, now we have a perfect "sacred seat". But a new problem arises:
The more sacred the seat is, the more you will dare not sit on it.
Indeed, the state of sitting on this "sacred seat" is perfect. However, this seat is so sacred and perfect that the commitment of "sitting on the seat" is too heavy, and you will be less and less willing to sit on it, which is often called the "starting difficulty" problem.
This is why people always like to say that "perfectionism leads to procrastination" (strictly speaking, the essence of procrastination is not perfectionism, but the increased switching costs caused by excessive expectations).
This brings us to the third core principle: the "linear delay principle".
From a mathematical point of view, this principle can solve the problem of "procrastination" that has plagued countless people for a long time in an elegant way.
Let's do another simple thought experiment:
Suppose you find a typical procrastinator and ask him, "Would you like to start studying right now?" He will probably shake his head and refuse.
But if you ask him in a different way: "Would you like to start studying tomorrow afternoon?" You don't even have to wait that long: "How about starting in 15 minutes?" This time, he will most likely agree! Moreover, the longer the delay, the higher the probability that this person will agree.
Why does this strange phenomenon occur?
Let's go back to the previous mathematical model to explain:
When you consider whether to start studying right now, as analyzed above, the negative part of the value function $V(\tau)$ falls exactly in the near-term region where the weight function $W(\tau)$ is high, so you feel strong resistance;
However, when you consider whether you are willing to start studying in 15 minutes, the situation is completely different: this is equivalent to shifting $V(\tau)$ 15 minutes later along the time axis, into a region where $W(\tau)$ is relatively flat! This almost eliminates its huge short-term disadvantage compared with the indulgent option.
This also explains a very common phenomenon in daily life: we always have blindly optimistic illusions about our future self-control.
In the hyperbolic discount function, the weight decreases steeply at first and then slowly over time - the weight after 10 minutes may be very different from the weight after 1 hour, but the weight after 1 day and 10 minutes is almost the same as the weight after 1 day and 1 hour.
When we think ahead about a future option, we are in effect integrating the shifted $V(\tau-\Delta\tau)$ against $W(\tau)$. The longer the shift, the flatter the part of $W(\tau)$ that actually takes part in the integral, and the closer the decision is to the rational one. This is why we are often blindly optimistic when making summer vacation plans, and why we naively think we can really use our phones for just five minutes before putting them down.
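The effect of the shift can be illustrated numerically. The sketch below integrates $V(\tau-\Delta\tau)\,W(\tau)$ for a few delays. Everything concrete in it is an assumption made for the example: the step-shaped `v_study`, the hyperbolic weight with `k = 10` per hour, and the 30-hour horizon.

```python
def integrate(f, a, b, n=30000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h


def v_study(t):
    """Assumed value of a study session, t hours after it starts."""
    if t < 0.0:
        return 0.0   # nothing happens before the session starts
    if t < 1.0:
        return -1.0  # switching cost and boredom
    if t < 5.0:
        return 1.0   # relief and satisfaction
    return 0.0       # the effect fades out


def myopic_weight(tau, k=10.0):
    """Assumed hyperbolic discount weight W(tau) = 1 / (1 + k * tau)."""
    return 1.0 / (1.0 + k * tau)


def delayed_inclination(delay_hours, horizon=30.0):
    """I(delay) = integral of V(tau - delay) * W(tau) over [0, horizon]."""
    return integrate(
        lambda tau: v_study(tau - delay_hours) * myopic_weight(tau),
        0.0, horizon)


for delay in (0.0, 0.25, 24.0):
    print("delay", delay, "h ->", round(delayed_inclination(delay), 4))
```

With these assumed shapes, the inclination toward "start now" is negative, while the inclination toward "start in 15 minutes" and "start tomorrow" is positive: the same session becomes acceptable once its unpleasant start is moved out of the steep part of $W(\tau)$.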
And here comes the climax of this method:
We can set up a parallel "auxiliary chain" in addition to the main task chain. It is also protected by the principle of the sacred seat, and the "next must be the same" principle applies to it as well. The constraints it defends are very simple:
Set a simple action as an appointment signal, such as snapping your fingers;
Once this signal is triggered, you must sit! on! that! sacred! seat! within! 15! minutes!
As the saying goes, it is better to channel the flood than to dam it. The best way to overcome procrastination is to first acknowledge and respect it.
Since the threshold of "starting in 15 minutes after the appointment" is much lower than the threshold of "starting immediately", you can now start the appointment without any pressure.
After 15 minutes, all the past and future value of that auxiliary chain is compressed toward $\tau=0$ and condensed into a sharp, sky-piercing firing pin that forcibly punches through the door of "switching cost" and "starting difficulty".
(By the way, an embarrassing thing happened: when I just came up with this method, I couldn’t sleep one night and accidentally snapped my fingers. I had no choice but to get up at 3 a.m. and study for an hour.)
Now, combining the previous three core principles, we finally get the complete first generation of self-control technology:
Chained Time-Delay Protocol (CTDP)
10
Here is the complete description of the Chained Time Delay Protocol (CTDP):
CTDP is a behavior constraint strategy based on three core principles (the principle of the sacred seat, the "next must be the same" principle, and the principle of linear delay). Specifically, it requires you to build two parallel task chains (the main chain and the auxiliary chain) and strictly follow these steps:
Main Chain (task chain):
First, designate a specific object as a symbol, as the "sacred seat" (in fact, the sacred seat is just a metaphor, it can be anything, a specific chair, a special pen, a hat, or even a message sent to your specific WeChat account)
Once you trigger this sign, you must complete a clear focused task in your “best state”;
Every time you successfully complete a focus task, you can record a node in the main chain: the first success is #1, the second success is #2, the third success is #3, and so on;
If during any task you act in a way that is not in line with your "best state", you must choose one of the following two options (the "next must be the same" principle):
The entire main task chain is judged to have failed immediately, and all currently accumulated node records are cleared. The next time, you can only restart from #1;
The current behavior is judged to be allowed, but from now on, this behavior must be permanently allowed in subsequent tasks and must no longer be considered a violation;
Auxiliary Chain (appointment chain):
Define a simple appointment signal, such as snapping your fingers or turning on an alarm, to indicate that the main task will start in 15 minutes;
Once you trigger this appointment signal, within the next 15 minutes, you must trigger the sign corresponding to that sacred seat to start a main chain task;
If you do not trigger the sign within 15 minutes after the appointment signal, the same "next must be the same" principle applies, and you must choose one of two options:
Either completely clear the record of the appointment chain and admit that the appointment chain has failed;
Or allow the current situation, in which case the appointment chain permanently loses all binding force over that situation;
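The two chains and the two possible rulings can be sketched as a small state machine. This is only an illustrative sketch, not the author's implementation: the class names `Chain` and `CTDP`, the method names, and the choice to pass time in as plain minutes are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

APPOINTMENT_WINDOW_MIN = 15.0  # from the text: start within 15 minutes


@dataclass
class Chain:
    """A chain of completed nodes plus the behaviors it has permanently allowed."""
    length: int = 0
    precedents: Set[str] = field(default_factory=set)

    def complete_node(self) -> int:
        """Record one more successful node (#1, #2, #3, ...)."""
        self.length += 1
        return self.length

    def rule_on(self, behavior: str, allow: bool) -> None:
        """Apply the "next must be the same" ruling to a questionable behavior."""
        if behavior in self.precedents:
            return                         # already allowed forever
        if allow:
            self.precedents.add(behavior)  # allowed now and in every later task
        else:
            self.length = 0                # chain fails, restart from #1


@dataclass
class CTDP:
    main: Chain = field(default_factory=Chain)
    aux: Chain = field(default_factory=Chain)
    appointment_at: Optional[float] = None  # minute at which the signal was given

    def trigger_appointment(self, now_min: float) -> None:
        """The appointment signal, for example a snap of the fingers."""
        self.appointment_at = now_min

    def trigger_sacred_seat(self, now_min: float) -> bool:
        """Start a main-chain task; return False if a pending appointment was missed."""
        if self.appointment_at is None:
            return True
        on_time = now_min - self.appointment_at <= APPOINTMENT_WINDOW_MIN
        self.appointment_at = None
        if on_time:
            self.aux.complete_node()
        return on_time
```

A missed appointment is not punished automatically here: when `trigger_sacred_seat` returns `False`, the caller still has to make the ruling explicitly with `aux.rule_on(...)`, which mirrors the either/or choice in the text.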
So far, we have cut away the impact of startup difficulty, the broken window effect, and short-sighted decision-making with a triple mechanism: nonlinear value compression (the sacred seat) + case law constraints (the "next must be the same" principle) + linear time shift (the appointment mechanism).
With just a few thought experiments, we have constructed a self-control strategy that can extract huge gains for rational behavior almost out of thin air, without external supervision and even when willpower is insufficient. Moreover, its binding force comes entirely from the proof of work of each node. It is responsible only for these discrete, selected nodes, so it is not exposed to long-term fluctuations in your state and is not eroded by them.
Relying on the CTDP strategy, we can make any important task easy to start, easy to stick to, and still effective in the long run: a seemingly impossible triangle, and we get all three corners.
11
Of course, the CTDP described above is more of an idealized version. In real life, its application is far more flexible and interesting than what has just been described.
For example, does the “holy seat” really have to be a seat?
In fact, it is not necessarily the case. It can be any specific and easily distinguishable symbol. For example, the symbol I usually use is a special WeChat account. Every time I start a task, I send a message to trigger the task and also record the node and goal declaration.
Secondly, the task chain itself does not necessarily have to be a strict linear advancement (#1, #2, #3...), you can completely construct a top-down hierarchical organization:
For example, three unit-level nodes can form a 3-hour task group, denoted as ##1;
Three task groups form a day-level task group, denoted as ###1;
Three day-level task groups form a three-day task column, denoted as ####1;
Two or three columns can be managed by a weekly task cluster, denoted as #####1;
Each unit can have its own requirements. For example, the ## task group can require the first two # units to execute a reservation signal after completion, so that the three # units can be linked together. In this way, we organize the huge task chain from top to bottom in a "three-three system" like a military command sequence!
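As a small illustration of the "three-three system", the sketch below converts a flat unit number into its position in the hierarchy. The function name `unit_address` and the fixed fan-out of three at every level are assumptions for this example; the text itself allows looser groupings.

```python
def unit_address(unit_number, fanout=3):
    """Map a flat unit number (#1, #2, ...) to (column, day, group, unit).

    All four indices are 1-based. Assumes exactly `fanout` units per ## group,
    `fanout` groups per ### day, and `fanout` days per #### column.
    """
    if unit_number < 1:
        raise ValueError("unit numbers start at 1")
    index = unit_number - 1
    unit = index % fanout + 1
    index //= fanout
    group = index % fanout + 1
    index //= fanout
    day = index % fanout + 1
    column = index // fanout + 1
    return column, day, group, unit


if __name__ == "__main__":
    print(unit_address(14))   # (1, 2, 2, 2): column 1, day 2, group 2, unit 2
```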
Similarly, the content of the task unit does not necessarily have to be monotonous and focused on learning. To be more specific, it can also be divided into different "arms".
For example, studying, doing experiments, reading papers = "assault unit", information collection = "reconnaissance unit", making plans = "command unit", handling chores = "special service unit", exercise = "engineering unit", preparing meals = "cooking unit"...
A large task group or task column can be composed of multiple arms, just like a modern combined arms force.
A dedicated combat mission group can be composed of 7 assault units + 2 reconnaissance units;
A logistics task group for a weekend break can be a combination of 1 command unit + 3 special service units + 2 engineering units + 1 cooking unit;
During the holidays, a three-day column that takes into account both sports and self-study can be composed of 6 assault groups + 3 engineering groups.
You will find that this task chain structure naturally completes the so-called "goal decomposition". At the same time, "gamification" does not need to be deliberately designed - because when you are really facing a big task, the style is like this:
Please note the schedule. I will make the following adjustments:
The fourth task group, the eleventh task group plus the fifteenth and sixteenth independent reconnaissance groups will strengthen the DDL defense line for next week's homework; the second, third, seventh, eighth, and ninth task groups, plus the sixteenth assault group of the sixth task group, will concentrate on completing the notes; the tenth assault group plus one assault group will be on the front line of TOEFL and GRE to block the words during the review time; the twelfth task group will cooperate with the twelve independent units to encircle and eliminate the knowledge gaps found before; the fifth task group and the two reconnaissance groups of the sixth task group will search for information; the fourteenth task group will be the general reserve and will not move!
Some off-topic remarks:
In practice, using only a simple appointment signal is still too fragile, because sometimes when 15 minutes are up, I may be in the toilet/outdoors, and it is not realistic to start the task. Therefore, I will design two start signals as a buffer:
Scheduled start signal: The signal is a snap of the fingers. Once triggered, the "immediate start signal" must be executed within 14 minutes and 30 seconds (retain a buffer to prevent the 15 minutes from being used up);
Immediate start signal: The signal is three snaps of the fingers. Once triggered, the task must be started as soon as possible based on the existing conditions.
In addition, this strategy of "sacred seat + next must be the same + linear delay" can be easily extended to any aspect of life:
For example, to start a running exercise, you can use a specific gesture as a reservation signal. Doing n actions means you must run for 5×n minutes.
In order to solve the common problem of ADHD people procrastinating on bathing and going out, you can also set up corresponding bathing appointment signals and going out appointment signals;
You can even expand this to the point where you can almost "remotely control" your daily life with simple gestures or movements - completely eradicating the problem of procrastination in an extremely easy and elegant way.
12
So far, the content of the first generation of self-control technology has been fully presented.
Looking back, this technology has far surpassed the superficial motivational slogans, to-do lists or gamification designs on the market in terms of principle design, practical implementation, and the sophistication of the method.
Sure enough, in the few years after CTDP was born, it produced amazing results for me.
I must be honest: my foundation in self-control was really bad. From elementary school to high school, I was addicted to games day and night, suffered from ADHD for many years, and my study habits and daily life were a mess. From childhood to adulthood I could not concentrate through a single class, and in the end I got into a 985 university by luck. In the early days of college, once the environment relaxed, I could not even review during exam week. The example mentioned above, in which the closer the deadline came, the less I could study, was me. That was the real me in the past: the very floor of self-control.
But since the birth of CTDP, I have achieved continuous self-discipline for weeks or even months for the first time in my life. With this technology, I can start tasks extremely easily and maintain high concentration for a whole day without any burden. The problem of ADHD seems to have been alleviated overnight. Not only did my grades improve significantly in the later period of college, I even went abroad for exchange, published papers, successfully conquered TOEFL and GRE, and finally enrolled in a hardcore master's program.
Especially when I was at my best (for example, when preparing for TOEFL and GRE), it allowed me to mobilize efficiently for 8-10 hours every day for two consecutive months, and more than a dozen task clusters and hundreds of task units were able to follow orders, advance and retreat in a reasonable manner, and function seamlessly.
There was even one time when I caught a cold for three days in the middle of the exam review period, but I was still able to calmly calculate its impact on the exam fifteen days later, and calmly deployed eight or nine ## task groups from the general reserve force deployed seven days later to fill the vacancies.
(For example, this is a schedule for more than a month before an exam, and each cell represents a ## level task group)
In the face of such unprecedented high efficiency and self-discipline, I was once optimistic that the building of self-control had already been built, and all that was left was just some repairs.
However, I soon discovered that CTDP does not work at all times. Specifically, I observed an obvious polarization phenomenon:
When the overall state is conducive to self-control, especially when there are clear pressures and goals such as a DDL (for example, the review period before an exam, or a pile of urgent tasks), CTDP can indeed make maximum use of these pressures, pushing self-control and discipline to an unprecedented level, so that almost every hour is under control;
However, when the overall state is not conducive to self-control, such as idle time at home when I am physically and mentally exhausted and have no clear task goals, I often lack the desire to trigger the appointment signal at all. Even if I force a start many times, I collapse after just a few # units.
Therefore, in the following three or four years, I have tried to improve and upgrade it countless times. However, CTDP seems to have exhausted the limit of the self-control strategy from the micro perspective of "single behavior at the scale of 1 hour". No matter how it was modified, it has not made any progress in a few years, and the self-control state has always shown strong stage and environment dependence.
This bottleneck troubled me for several years. It was not until five years later that I finally found a deeper perspective to explain it all.
The key is not to focus on the traditional "motivation", "rewards and punishments", and "constraints", but to look at the entire system from the perspective of "scale"!
13
In this new perspective, our daily lives are no longer a set of isolated, single decision nodes based on the immediate $V(\tau)$. In fact, they are more like a "behavior tree" consisting of countless continuous and interwoven decision nodes.
In this behavior tree, the direction of each node is highly dependent on the choice of the previous node; and its macroscopic direction at each time scale is highly determined by various large and small scale factors in life. Most of these nodes are not suitable for our limited free will or self-control strategies driven by free will to intervene.
Let's recall the example of the phone trap at the beginning:
One day after dinner, you lay on the sofa with the mentality of "just browsing for a while" and casually opened the short video app:
As analyzed above, every time you finish watching a video, you are faced with the micro-choice of "watching the next video" or "putting down the phone". But unfortunately, the relationship $I(\text{short video}) \gg I(\text{put down the phone})$ holds at every micro node, so at every node you fail to put down your phone.
If you calculate the probability of you eventually going to one of the two branches based on all your past choices, you will find that the gap between the two is extremely large - perhaps as high as 99% to 1% (this is just a conceptual example and does not require actual statistics).
Now, we can introduce a key assumption - the "limited free will" assumption :
Free will exists, but it is also limited. The smaller the difference between the options, the higher the probability that free will can effectively intervene. If the difference is 60%:40%, free will can still intervene; but when the difference between the options is too large, such as reaching 99%:1%, free will is almost powerless.
In fact, the moment you lie on the sofa and open the short video app, your behavior is like a fighter jet locked by a missile radar, trapped in a "no escape zone". Inside this zone, a tiny struggle of free will is not enough to break free, and in a statistical sense you can only sit and wait for death.
You can only successfully change this state when the anxiety of staying up late gradually grows, the numbness of scrolling through the phone gradually grows, and the gap in the tendencies of the two options gradually narrows to the range where free will can interfere.
In other words, the moment you choose to pick up your phone and lie down on the sofa, you have actually determined how you will waste the next few hours on a larger scale.
Based on the above observations, we can also propose a further definition:
When the probability gap of a series of decision nodes exceeds a certain threshold (for example, 90%: 10%), we can determine that it is beyond the scope of free will intervention. Then, we directly ignore the options with lower probabilities and roughly granularize these micro nodes into a whole "no escape zone".
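This coarse-graining step can be sketched in a few lines. The sketch is only illustrative: the function name `find_no_escape_zones`, the input format (one probability of the dominant option per decision node), and the sample numbers are assumptions, and the 90% threshold is the example value from the text.

```python
from itertools import groupby


def find_no_escape_zones(dominant_probs, threshold=0.9):
    """Merge consecutive decision nodes into coarse segments.

    dominant_probs: probability of the more likely option at each node.
    Nodes whose dominant probability is strictly above the threshold are
    treated as beyond the reach of free will and merged into one zone.
    Returns a list of (kind, first_index, last_index) tuples.
    """
    segments = []
    start = 0
    for locked, run in groupby(dominant_probs, key=lambda p: p > threshold):
        length = len(list(run))
        kind = "no escape zone" if locked else "free choice"
        segments.append((kind, start, start + length - 1))
        start += length
    return segments


if __name__ == "__main__":
    # Hypothetical evening: one open choice, a long run of videos, then fatigue.
    evening = [0.6, 0.99, 0.99, 0.97, 0.95, 0.7, 0.55]
    for segment in find_no_escape_zones(evening):
        print(segment)
```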
If we look at it from a larger scale, within these "no escape zones", those seemingly independent decision nodes have been statistically determined by larger-scale factors (such as immediate temptation, current emotions, physical fatigue, and deep-rooted habits). By contrast, the small-scale choices left to free will, such as whether to watch videos or play games, or which short video to watch, have become unimportant.
14
This phenomenon, in which the importance of different factors rises and falls with scale, is not limited to the field of self-control; it is widespread in complex systems.
In statistical physics, there is an elegant and profound theory called "renormalization group theory" that describes this phenomenon. In 1966, American physicist Leo Kadanoff proposed the following idea:
When you change the scale of observation, the degrees of freedom within the system will continue to be merged (coarse-grained), the system's macroscopic behavior will gradually be dominated by a few key variables, while a large number of microscopic variables will gradually become less important.
To simply understand this method, imagine a small game like this:
You have a large chessboard in front of you, and each square has an arrow on it, pointing either up (↑) or down (↓). We can design some rules to affect the direction of the arrow, such as:
A rule can be added so that each arrow tends to be consistent with its neighbors;
Or conversely, make the arrow tend to be opposite to its neighbor;
You can also add local noise and disturbance;
For another example, sometimes external intervention can be imposed as a whole;
Under this dense and intertwined local rule, many tiny disturbances, even a tiny change in a grid, can spread to the surrounding area, forming a complex chain reaction. You might think that in order to analyze the overall rule, you have to count every arrow clearly.
But Kadanoff said: It doesn't have to be so complicated. As long as we are willing to be a little "blurry" and look at it more roughly, the rules will automatically emerge.
He devised a clever "coarse-grained" game rule:
Merge adjacent 2×2 small grids into one large grid;
Use a "winner takes all" approach to determine the direction of the large grid (for example, if there are three ↑s and one ↓, then the entire block is recorded as ↑. If the number is the same, a direction is randomly selected, or handled according to some simple rules);
Next, repeat this merging operation with the newly obtained large grid...
After rounds of coarse-graining through "merging", our perspective becomes increasingly blurred. The arrow map that initially looked like a mess of details finally only had a few large areas, and perhaps most of the directions were converging.
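One round of this merging game can be written out directly. The sketch below is a minimal illustration, assuming arrows are stored as +1 (↑) and -1 (↓) in a square grid with an even side length. Ties are broken by copying the top-left cell of the block, which is just one possible "simple rule" chosen here so that the result is reproducible.

```python
def coarse_grain(grid):
    """Merge each 2x2 block of +1/-1 arrows into one arrow by majority vote."""
    size = len(grid)
    if size % 2 != 0 or any(len(row) != size for row in grid):
        raise ValueError("expected a square grid with an even side length")
    merged = []
    for r in range(0, size, 2):
        merged_row = []
        for c in range(0, size, 2):
            total = grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]
            if total > 0:
                merged_row.append(1)           # majority points up
            elif total < 0:
                merged_row.append(-1)          # majority points down
            else:
                merged_row.append(grid[r][c])  # tie: copy the top-left arrow
        merged.append(merged_row)
    return merged


if __name__ == "__main__":
    board = [
        [1, 1, -1, -1],
        [1, -1, -1, -1],
        [1, 1, 1, -1],
        [-1, -1, -1, 1],
    ]
    once = coarse_grain(board)
    twice = coarse_grain(once)
    print(once)    # [[1, -1], [1, 1]]
    print(twice)   # [[1]]
```

In this toy board, the lone ↓ block in the upper right survives the first merge but is outvoted in the second, which is the "averaging out" of small-scale detail described above.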
What happened in this process?
First, those local rules that are small in scope and independent of each other will be submerged or averaged out when they merge, and eventually disappear completely on a large scale. The resulting structures will gradually fade away;
On the contrary, those trends with broader scope and stronger consistency can survive each layer of coarsening and gradually stand out at larger scales;
Therefore, this coarsening will not only change the grid, but also change the strength of the "rule" itself! Whether a rule is important depends on the scale at which you look at it. Factors that are crucial at the microscopic scale may not be found at the macroscopic scale; and those seemingly weak but consistent trends may become the ultimate dominant force of the system.
Let's consider a simple example.
Suppose, on this chessboard, there are two forces competing with each other:
One is a short-range exchange interaction, which is very strong but only affects the nearest-neighbor squares, making adjacent arrows tend to align;
The other is a long-range dipole interaction, which is relatively weak but acts over longer distances, trying to make the arrows of distant regions point in opposite directions.
(Note: In the real world, there may be complex factors such as anisotropic energy, thermal fluctuations, defects, stress, etc. This is just a very simplified heuristic discussion)
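For readers who prefer a formula, this kind of competition is often written as an Ising-type energy with a nearest-neighbor exchange term and a long-range dipolar term. The form below is a common textbook sketch rather than something taken from this article; the symbols $s_i$, $J$, $g$, and $r_{ij}$ are introduced here purely for illustration.

$$
H = -J \sum_{\langle i,j \rangle} s_i s_j \;+\; g \sum_{i<j} \frac{s_i s_j}{r_{ij}^{3}}, \qquad s_i = \pm 1, \quad J \gg g > 0
$$

Here $s_i$ is the arrow on square $i$, the first sum runs over nearest-neighbor pairs only, and $r_{ij}$ is the distance between squares $i$ and $j$. The first term lowers the energy when neighbors align, while the second raises the energy when distant arrows point the same way.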

In the absence of outside intervention, a tug-of-war began between the two forces:
The exchange interaction tends to pull the local lattice all in one direction, forming a large number of small regions with local orientation;
The dipole interaction does not allow the entire system to be in the same direction for too long. It tries to "pull" these regions from a distance, causing them to alternate in direction and form an arrangement that cancels out each other.
Finally, through the competition and compromise between these two forces, a delicate balance is reached: you will see large groups of arrows automatically gather into "islands" of different sizes and directions. The directions inside each island are unified, while the directions of different islands differ. These islands interlock and eventually settle into a complex but stable "puzzle" structure.
In the real world, this phenomenon is called "magnetic domains", and it is the result of the classic competition between short-range exchange and long-range dipole interactions.
15
So, what happens when we look at this chessboard at different scales?
At the microscopic scale, we mainly see local areas with highly uniform directions: the exchange interaction is very strong at this time, which quickly makes the surrounding lattice directions consistent; while the dipole effect is too weak to be detected from this microscopic perspective.
But when we coarse-grain to a larger scale, what we see are large blocks alternating in direction. Although the exchange interaction is strong, its range is too short: it can only enforce uniformity inside small regions, and its influence does not grow with coarse-graining. The dipole interaction, by contrast, reaches far enough that at large scales its weak effects keep superimposing and slowly accumulating, until it gradually becomes dominant and forms an obvious macroscopic structure.
This phenomenon is the most profound point revealed by the renormalization group idea.
The importance of a factor often depends on the scale of observation you are at. Rules that are crucial at a small scale may be completely averaged out at a larger scale. However, long-range trends that are almost imperceptible at a small scale may accumulate layer by layer in the process of coarsening and ultimately dominate the macroscopic direction of the system.

This is one of the most universal laws in the entire physical world:
A glass of water is a complex and disordered collision of countless molecules at the microscopic level, but at the macroscopic level it can be completely described by just a few simple variables: temperature, pressure, and volume.
The magnetic moment of a magnet is chaotic at the atomic scale, but it condenses into clear north and south magnetic poles at the macroscopic scale.
Each atom of a spring obeys complex quantum mechanics, but from the outside, the spring simply follows Hooke's law, $F = kx$.
In the financial market, no matter how complex and turbulent the trends of small cycles are, they will ultimately obey the script set by the trends of large cycles.
As the scale of observation continues to expand, most local fluctuations, short-range fluctuations, and microscopic rules will automatically be merged, offset, and absorbed into other variables. The ultimate macroscopic behavior of the system depends only on a very small number of variables that have a wide range, strong consistency, and can continue to accumulate and survive on a larger scale.
(Note: These two sections are only metaphors for inspiration of life, not rigorous reasoning)
16
The scary thing is that renormalization ideas like this can actually be extended to life itself.
Let’s review the rule set earlier: when the statistical probability gap between options exceeds a certain threshold (for example, 90%:10%), we consider free will unable to intervene and regard that stretch as a "no escape zone". You will find that most of our lives are made up of large and small "no escape zones", like magnetic domains.
When you lie in bed late at night and watch short videos, it is a negative "no escape zone" and you are unlikely to suddenly put down your phone in the middle of the video.
When you eat, it is still a "no escape zone". You are unlikely to stop eating and go to the library to study.
When you are studying, it is a benign "no escape zone". Once you enter the state, you will not stop easily unless you encounter resistance.
Maybe you’re saying, “Wait, there always seem to be times when we have a choice, right?”
But unfortunately, when viewed from a larger scale, those apparent freedoms are themselves enclosed in a larger "no escape zone":
If you only slept for three hours last night, your script for the whole day is likely to be a wasted one. Even if you choose to start studying, it will be difficult to stick to it. You will be stuck in the "no escape zone" of checking your phone while feeling tired, relying on the high stimulation of the phone to stay awake;
If you have not had any DDL in the past few days and have been sitting at home doing nothing, then you will most likely fall into a continuous cycle of "alternating between mobile phone and games", which is also a "no escape zone" on a macro scale;
With the final exams approaching, your script for this month is often to procrastinate at the beginning, but then easily get into the study state a few days before the exam, and finally regret why you didn't start earlier;
My friend, the Russian doll-like nesting of scripts, loops, and no-escape zones is the reality of our lives.
How many ways can we spend our one minute? Perhaps there are hundreds or thousands of ways: you may be watching short video A, or watching short video B, or reading a book, or doing a question.
But if we expand the time scale a little, we may have only dozens of ways to spend an hour; it may be watching short videos/studying/commuting/eating;
In a day, there may be only a dozen ways to live it; in a month, there may be only seven or eight typical patterns; if you look at it from the scale of a year, you may be surprised to find that there are only three or four general scripts left for our year.
As the perspective is extended, we gradually coarsen the entire chain of events, and the influence of small-scale details becomes smaller and smaller, and is gradually averaged out. The entire system is increasingly influenced by the corresponding large-scale factors, such as environment, work, habits, personality, etc. Small-scale variables control small-scale scripts, and large-scale variables control large-scale scripts:
Your state of attention controls your thoughts every second;
What is going on right now grabs your attention every minute.
Today’s schedule determines exactly what you do every hour;
Your energy and emotional state in recent days will affect your specific daily schedule;
Going further up, your daily routine, emotional cycles, and environmental conditions control your overall life state in the near future;
Your identity, personality, and social situation determine your long-term rhythm and ultimately shape the course of your entire life.
At each specific scale, once the corresponding factors are given, they will become a stable "boundary condition", allowing the system to spontaneously form a number of clear and stable "steady states" within the current scale.
Going downwards, each steady state is composed of a number of smaller-scale steady states arranged and combined; going upwards, several such steady states are further combined to form a larger steady-state structure.
These steady states are highly stable, highly repetitive, and highly dependent on the key factors of the corresponding scale. It is these large and small influencing factors and steady states that are intertwined together to form the "ecological environment" of our daily lives.
From this perspective, "free will" at the bottom level is really the biggest illusion in life.
We have complete free will over each second, and seemingly over each minute; we have less over each hour, and less still over each day and each week. In the macro sense, we are just a piece of duckweed floating in the vast ocean of environment, habits, circumstances, personality, and interests.
The essence of all human self-control strategies is actually to use the smallest scale of free will, with the support of various rules, to overcome large-scale unfavorable factors from the bottom up and leverage larger-scale behaviors. This is undoubtedly as difficult as climbing to the sky.
All efforts, slogans, and CTDP are nothing more than trying to stir up a pale and powerless wave in this vast ocean. This is the tragedy that we, who have weak self-control, will naturally encounter.
17
Now, we can finally look back and more clearly understand the bottlenecks that CTDP, as a typical local behavioral intervention strategy, has encountered.
The first bottleneck is the “scale limitation problem”: CTDP can only affect local behaviors by nature and cannot leverage the long-term factors behind these behaviors.
The core principle of CTDP is to amplify rational tendencies and reduce starting resistance in a short period of time (such as a one-hour focused task) through a series of sophisticated mechanisms, thereby greatly improving the execution of a single behavior. At the micro scale, it has indeed shown amazing effectiveness.
But here’s the thing: our lives aren’t made up of a series of isolated hours. “Not wanting to start a task right now” is just a surface phenomenon, and the larger scale factors that cause this “reluctance to start”, such as energy state, mood swings, a clear sense of purpose, rhythm of life, and even long-term habits, are the real essence.
When these large-scale factors are working against you, even the most sophisticated strategies to force yourself to study for an hour in the present moment are often of limited effectiveness:
If you have been staying up late frequently and are physically and mentally exhausted recently, you will not even want to study rationally, and CTDP cannot solve the problem of staying up late itself;
If you have no clear tasks or urgent DDL in the near future, then CTDP will not be able to urge you to take the initiative to study on your own, after all, you have never had the habit of self-study.
Therefore, the first fatal limitation of CTDP is that it can only affect micro nodes, but cannot shake the macro factors that determine the tone of our lives - it cannot make you energetic, nor can it make you mentally stable, and it is even less likely to shake your habits or life patterns.
The second bottleneck is the "steady-state regression problem": even if we temporarily change our living conditions, the system will spontaneously fall back to its original stable mode.
When factors such as energy state, life rhythm, mentality, habits, etc. on a larger scale are determined, the system will naturally form a series of highly stable behavior patterns. For example, when you first return home for the holiday, "playing games" and "watching videos" may be the easiest lifestyle to maintain, so your daily life will most likely cycle back and forth between "playing games" and "watching videos".
The problem with CTDP is that it can only temporarily interfere with certain nodes of the system. Even if it forcibly replaces a few hours with efficient learning mode, what does it matter?
The steady state of the system is still the same steady state as before! If we extend the time scale, the efficiency of these few hours will still be drowned in the large amount of playing games and watching videos.
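The regression to the old steady state can be illustrated with a toy two-state model. This is only a sketch under invented assumptions: each hour is either "study" or "play", and the hourly switching probabilities (5% chance of starting to study, 50% chance of continuing once started) are hypothetical numbers chosen for the example, not measurements.

```python
def study_share_after(hours, p_start_study=0.05, p_keep_study=0.5, initial_share=1.0):
    """Probability of being in the study state after a number of hours.

    initial_share=1.0 models an hour of study forced by a local intervention.
    """
    share = initial_share
    for _ in range(hours):
        share = share * p_keep_study + (1.0 - share) * p_start_study
    return share


def stationary_study_share(p_start_study=0.05, p_keep_study=0.5):
    """Long-run share of study hours, fixed entirely by the switching rates."""
    return p_start_study / (p_start_study + 1.0 - p_keep_study)


if __name__ == "__main__":
    for hours in (0, 1, 3, 10, 50):
        print(hours, study_share_after(hours))
    print("steady state:", stationary_study_share())
```

In this toy model, the forced hour decays back toward the same long-run share within a few steps, because the intervention changes only the starting point and not the switching rates themselves.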
For the larger-scale (daily, weekly) steady state determined by habits, work and rest schedules, and emotional states, these few hours of efficiency won’t even make a ripple, let alone any changes to the steady state itself.
This spontaneous fallback is the fundamental reason why "rebound", "reverting to the original state", and "intermittent bursts of efficiency" are so common.
The third bottleneck, the most fundamental and the most disheartening, is the "constraint dissipation problem": any attempt to deviate from the steady state consumes resources (such as willpower) to maintain a metastable state, and is therefore bound to be unsustainable.
When a person's behavioral system forms a certain stable structure on a large scale, it is like a self-consistent ecosystem with a natural "steady-state attraction." Any effort to get out of this steady state essentially means the need to continuously invest various resources:
Maybe it’s your precious willpower;
Perhaps it is the sunk cost of the CTDP’s ingenious design;
Maybe it’s various restrictive rules, clocking in, and supervision;
But the problem is that these resources are limited. When you try to maintain a "metastable state" that deviates from the original steady state, even a very successful start will be worn down by continuous negative factors; eventually the constraints collapse and you fall back to the original state. And if you try to change the entire state radically in one go, the resources and force required are too great to sustain.
Is it possible that there is a “next better steady state” that is higher than the current steady state?
It is possible. Sometimes, even when the big picture is not in your favor, you can stumble into a few days of productive time. But unfortunately, this state of being productive is rare, and before long, your life will go back to how it was before.
The saddest thing is that almost no self-control method has the power to shift the entire steady state as a whole, durably, and in a single stroke.
These methods merely spend limited resources propping up local constraints; they cannot move the system as a whole. When the shift fails, they fall into the dilemma of "pressing down one problem only for another to pop up": you keep your studying on track, but the order of your daily life falls apart; you keep a regular sleep schedule, but your emotional state collapses.
This is because a large negative steady state is often formed by several small negative steady states feeding into one another: staying up late drains your energy, low energy makes it easier to sink into games, sinking into games makes you chase stimulation, and chasing stimulation makes you stay up even later. This combined negative state is like the Changshan snake of legend: strike its head and its tail attacks; strike its tail and its head attacks; strike its middle and both head and tail attack.
Eventually, you have to admit a harsh reality:
Under the premise of limited resources, all attempts to escape from the current ground state almost face the same fate – a rapid decline after a brief success, and an inability to achieve a true steady-state transition.
In summary, these three bottlenecks together form a theoretical limit that CTDP cannot cross. To break through them at the root, we need a new, second-generation method with a more global perspective.
It was not until several years later that I found the key to truly solving these problems.
18
It was a rainy night. I had just received an offer for my PhD application, but was stuck with visa issues. Bored, I clicked on a xiangqi (Chinese chess) commentary video. It covered the famous "single horse captures the king" game between Li Yiting and Chen Deyuan, played in 1960 in an exhibition match during a visit by Sichuan players to Wuhan.
At the end, the narrator mentioned:
At this point, Red has reached a "three-step kill" position.
What does "three-step kill" mean?
It means that once this position is reached, Black is already doomed. However Black responds, Red can always deliver the kill within three rounds. All of Black's struggles can only postpone the end, and a clumsy struggle may even hasten it.
So why did Black end up in this position?
Because its previous move was wrong, of course. If Black could take that move back, it might avoid the "three-step kill" trap; and if, even after taking it back, Black still cannot avoid a "four-step kill", then the move before that was wrong as well. And so on.
You will find that if you keep backtracking, you are bound to reach some move at which Black's hope of winning reappears! (At least within the computer's search range, Red can no longer find a guaranteed win.)
This is precisely the key to the second-generation method.
Let’s go back to the example of lying on the sofa and playing with the phone at the beginning.
When we lie on the sofa and watch videos, we have inevitably entered an endless loop. But what if we go back?
To avoid starting to watch videos (99%:1%), we have to avoid holding our phones on the sofa (80%:20%).
To avoid holding our phones on the couch, we need to avoid bringing our phones to the couch (60%:40%).
We can continue. In order to avoid taking our phones to the sofa, we have to avoid being in a state where we are prone to taking our phones to the sofa (50%:50%)…
Often, when tracing back such a chain of events, the earlier the node, the smaller the gap in tendency between the two options!
Finally, we discovered an exciting pattern:
For any seemingly inescapable "no escape zone", we can definitely trace back to a certain node - at this node, the tendency gap between the two options is small enough to enter the effective intervention range of free will, allowing us to truly avoid entering the final negative dead end.
In other words, every seemingly powerful, large-scale negative steady state can be mapped to a weak, small-scale effective intervention node!
And this node is the real boundary of the "no escape zone".
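The backtracking just described can be sketched as a search along a chain of nodes. This is only a minimal illustration, assuming each node carries a rough estimate of how likely you are to slide toward the bad option (the figures are the ones from the sofa example). The threshold `max_gap` and the function name `find_intervention_node` are hypothetical, not part of the original method.

```python
# Each entry: (node description, estimated chance of sliding toward the bad option).
# Ordered from the earliest node to the final dead end.
chain = [
    ("being in a state where taking the phone to the sofa is likely", 0.50),
    ("bringing the phone to the sofa", 0.60),
    ("holding the phone on the sofa", 0.80),
    ("starting to watch videos", 0.99),
]


def find_intervention_node(chain, max_gap=0.25):
    """Walk backward from the dead end and return the first node whose
    tendency gap is at most max_gap (an assumed threshold for what free
    will can still overcome)."""
    for description, p_bad in reversed(chain):
        gap = abs(p_bad - (1.0 - p_bad))  # e.g. 60%:40% gives a gap of about 0.20
        if gap <= max_gap:
            return description, gap
    return None  # nothing weak enough in the recorded chain: trace further back


node = find_intervention_node(chain)
print(node[0])  # bringing the phone to the sofa
```

With a stricter threshold the search simply walks further back, which matches the observation that earlier nodes have smaller gaps.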
19
Now comes the interesting part:
As mentioned earlier, the types of negative states we face in our lives are actually extremely limited and highly repetitive; and for any negative state, we can always map it into an effective intervention node by backtracking.
Then, if we impose precise constraints on it, we can achieve the effect of "a little effort to achieve a great result", prevent problems before they happen, and avoid entering those negative states from the beginning. This constraint is the local optimal solution to this situation.
What’s even more exciting is that since negative states themselves are repeatable, the “local optimal solutions” for these states are naturally also repeatable!
Here are some examples:
"Not bringing the phone into the bedroom in the first place" is many times easier than "resisting the urge to use the phone once it is in the bedroom", so not bringing it in is naturally the best solution to the problem of "using the phone before going to bed".
"Showering right away, while you have nothing else going on after coming in the door" is much easier than "forcing yourself up for a shower when you are lying on the sofa with your phone". So the best solution to "shower procrastination" is a rule that you must shower within 15 minutes of entering the house.
(Where the binding force of this regulation comes from will be explained later)
In chess, Go, and other board games, this kind of reusable, optimal line of play for a specific situation is called a "pattern" (what Go players know as joseki).
A pattern is the local optimal solution distilled by countless predecessors through in-depth study: in a given local situation, both Black and White must follow the pattern strictly; whichever side deviates from it will inevitably leave a weakness.
Even though a game as a whole is ever-changing, players can greatly raise their overall strength by mastering one local pattern after another.
Just like chess players learn patterns, we can use the logic of the “divide and conquer algorithm” to solve the negative states in life one by one:
First, we can identify typical negative problems in life;
Each negative problem can be further broken down into several negative steady states;
And each negative steady state can be mapped to the corresponding effective intervention node through backtracking;
Finally, for each intervention node, we can tailor a precise "pattern" to crack it.
In this way, we can concentrate our limited self-control resources precisely on the truly critical nodes and knock out those highly repetitive negative states one by one.
If four ounces can move a thousand pounds, then why not use eight ounces to move two thousand pounds, twelve ounces to move three thousand pounds, and sixteen ounces to move four thousand pounds?
20
What’s even more interesting is that eight ounces of force may move more than two thousand pounds.
Suppose we really do find a local pattern, are willing to invest some resources (method design, willpower, or external constraints) in enforcing it, and successfully "ban" a certain negative state from our lives: for example, we never again lie on the sofa with our phone, or we always shower right after getting home.
Then, because its action scale is long enough, this "pattern" itself will be added to the many large-scale factors that affect the steady state of our lives and become one of them!
Is the steady state at this time still the original steady state?
Obviously not. If the initial steady state is denoted $E_0$, then once the first pattern is successfully introduced, your life gradually enters a slightly improved metastable state $E_1$. In the new state $E_1$, because the overall condition has improved a little, introducing the second pattern is easier. Likewise, after the second pattern is added, you enter a still better metastable state $E_2$, and introducing the third pattern in $E_2$ becomes easier again.
For example, once you stop playing with your phone on the sofa, showering right after getting home becomes a little easier; and once you shower right after getting home every day and feel refreshed, it becomes a little easier to stop checking your phone before bed, because shower procrastination is no longer in the way.
Each new pattern you introduce may produce a "1+1>2" effect, and life improves slice by slice, salami-style, gradually moving to a better long-term steady state.
As Sun Tzu’s Art of War says: “Those who are skilled in battle win their victories over those who are easy to defeat.”
After solving the simple problem, the original medium-difficulty problem becomes a simple problem; if we can then solve this new simple problem, then the originally difficult problem will also become a simple problem.
From beginning to end, we solve simple problems.
Well, we have found a series of patterns, each of which keeps us from entering a negative "no escape zone" in the first place. But if we want to fully achieve a large-scale steady-state migration, we eventually have to face the most fundamental challenge: the limits of our constraining resources.
As analyzed above, all self-control strategies essentially use limited resources to prop up certain local constraints. This inevitably leads to the dilemma of "pressing down one problem only for another to pop up": going from $E_0$ to $E_1$ and from $E_1$ to $E_2$ sounds nice, but what may actually happen is that you simply don't have enough energy to maintain all these patterns.
When you try to add a new pattern, an old pattern may suddenly break down;
Or you force in a pattern that is too demanding right from the start, and before it can improve the overall situation, all your resources are used up maintaining it;
Or maybe you introduce a new pattern that is incompatible with the existing ones, and the whole system crashes instantly.
In other words, the order in which you introduce the patterns is also crucial; not just any order will carry you to the next steady state. And besides, where does the binding force for so many patterns come from?
In order to completely solve this problem, we also need to introduce a new and ingenious algorithm - the "recursive backtracking algorithm".
21
The so-called recursive backtracking algorithm is actually a classic algorithm idea widely used in computer science. It is usually used to solve such problems: when you are faced with a system with extremely large possibilities (such as walking through a maze or playing chess), how to find a feasible or even optimal path under the premise of limited resources.
The most typical example is the maze problem:
Imagine you are trapped in a maze. You don't know which way leads to the exit, and you don't have a map. All you can do is keep trying every possible path:
Every time you come to a fork in the road, you first choose a path at random;
If you find that this road is blocked (dead end), go back to the previous intersection and try another possible road;
If the new path doesn't work, keep backing up and changing directions until you finally find a path that leads to the exit.
This is the core logic of the "recursive backtracking algorithm": try → fail → withdraw → change the path → continue trying until you succeed. Just like when playing chess, you make sure you find a winning path by constantly retracting your moves.
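The try, fail, withdraw, change-path loop above is the textbook form of recursive backtracking. The sketch below is a generic illustration rather than anything specific to this article: the grid format and the name `solve_maze` are assumptions, and directions are tried in a fixed order instead of at random to keep the result reproducible.

```python
def solve_maze(grid, start, goal):
    """Return a list of (row, col) cells from start to goal, or None.

    grid is a list of equal-length strings; '#' is a wall, anything else is open.
    """
    rows, cols = len(grid), len(grid[0])
    visited = set()
    path = []

    def explore(cell):
        r, c = cell
        if not (0 <= r < rows and 0 <= c < cols):
            return False
        if grid[r][c] == "#" or cell in visited:
            return False
        visited.add(cell)
        path.append(cell)  # try: step onto this cell
        if cell == goal:
            return True
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            if explore((r + dr, c + dc)):
                return True
        path.pop()  # fail: withdraw, so the caller can try another direction
        return False

    return path if explore(start) else None


maze = [
    "..#",
    ".##",
    "...",
]
route = solve_maze(maze, (0, 0), (2, 2))
print(route)  # [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
```

In this small maze the search first steps right into a dead end at (0, 1), withdraws, and then succeeds by going down.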
So, how do we apply the recursive backtracking algorithm to design the order of adding these "patterns"?
Suppose we design a series of corresponding patterns for various negative states in life, such as:
Make sure to take a shower when you get home (A);
Make sure you don’t bring your phone to the sofa (B);
Make sure you don’t check Xiaohongshu at night (C);
Make sure to wash the dishes as soon as possible after eating (D)
…
At this point, we can use the following "pattern tree" method to organize and manage them:
Rules for adding patterns: You can only add one new pattern as a child node to the pattern tree every day. For example, if I find that the H pattern is highly related to the existing F pattern, I can add the H pattern as a child node of F; if I find that the E pattern looks like a completely new field, I can also directly create a new branch.
Rules for deleting patterns: manage the tree like a stack. Once a pattern is deleted, all of its child nodes are deleted with it. For example, if pattern C fails even once, it means that C and the F and H patterns built on top of it are not yet stable. So delete C without regret, and delete F and H along with it. (Of course, you can try adding C back at the tip of the tree later.)
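The two rules above can be modeled with a small tree structure. This is a toy sketch under assumptions: the class name `PatternTree`, its methods, and the use of integer day numbers are all hypothetical, and the explicit `stack` inside `fail` is only a traversal helper, not the "stack structure" of the rule itself.

```python
class PatternTree:
    """Toy model of the pattern tree: one addition per day, cascading deletion."""

    def __init__(self):
        self.children = {}  # pattern name -> list of child pattern names
        self.parent = {}    # pattern name -> parent name (None for a new branch)
        self.last_added_day = None

    def add(self, name, day, parent=None):
        if self.last_added_day == day:
            raise ValueError("only one new pattern may be added per day")
        if name in self.children:
            raise ValueError(f"{name} is already in the tree")
        if parent is not None and parent not in self.children:
            raise KeyError(f"unknown parent {parent}")
        self.children[name] = []
        self.parent[name] = parent
        if parent is not None:
            self.children[parent].append(name)
        self.last_added_day = day

    def fail(self, name):
        """A pattern failed once: remove it and every pattern built on it."""
        removed = []
        stack = [name]
        while stack:
            current = stack.pop()
            removed.append(current)
            stack.extend(self.children.pop(current))
            parent = self.parent.pop(current)
            if current == name and parent is not None:
                self.children[parent].remove(current)  # detach the failed subtree
        return removed


tree = PatternTree()
tree.add("A", day=1)
tree.add("C", day=2, parent="A")
tree.add("F", day=3, parent="C")
tree.add("H", day=4, parent="F")
tree.add("E", day=5)  # a new, unrelated branch
removed = tree.fail("C")
print(removed)                # ['C', 'F', 'H']
print(sorted(tree.children))  # ['A', 'E']
```

After the failure, C can be added again on a later day, which mirrors the retry step of backtracking.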
Through repeated iterations like this, we naturally solve both the problem of searching for the order in which to add patterns and the problem of where the binding force comes from!
On the one hand, the more "natural" the addition, the easier it is to maintain the pattern, and the easier it is to stay stably at the roots of the tree.
The reason is: in such an iterative process, if the maintenance cost of a certain pattern is too high and it cannot be stably maintained in the current state, it will naturally collapse and retreat back to the tip of the tree; on the contrary, those high-quality patterns that are easy to join, have no burden to maintain, and have obvious improvements on the overall state can more easily be retained at the root of the tree.
Over time, the first nodes of this pattern tree are all small rules that seem trivial but can bring huge positive effects. Can you guess what my first root node pattern is now? It's just "wash the dishes as soon as possible after eating at home." Although it sounds insignificant, it can effectively avoid a greater state of decadence.
It is these simple, tiny improvements accumulated at the roots that make the foundation of the entire tree more and more solid. Like stacking passive buffs in a game, they continuously provide small improvements to the overall state and support the whole tree as it grows step by step. This is what it really means to "win where winning is easy".
On the other hand, this stack structure will naturally provide strong constraints for newly added patterns.
Think about it, since it takes one day of effort to introduce each pattern, if a pattern with four sub-nodes suddenly fails, it means that your five days of effort and five effective self-control patterns are instantly wasted - and this loss will happen the moment you are ready to give up.
This is its second beauty: the closer a pattern is to the root and the more child nodes it has, the greater the cost of loss, and therefore the better it is protected.
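The "cost of loss" can be made concrete by counting how many days of work sit on top of a pattern. This is a minimal sketch, assuming each pattern took exactly one day to add; the tree below and the name `days_at_stake` are hypothetical.

```python
def days_at_stake(children, name):
    """Days of work lost if `name` fails: the pattern itself plus every
    descendant, assuming one day of effort per pattern."""
    return 1 + sum(days_at_stake(children, child) for child in children.get(name, []))


# Hypothetical tree: P has four sub-nodes in total (Q, R, S, T).
children = {
    "P": ["Q", "R"],
    "Q": ["S"],
    "R": ["T"],
}
print(days_at_stake(children, "P"))  # 5
print(days_at_stake(children, "S"))  # 1
```

The closer a pattern sits to the root, the larger this number, which is why giving up on it costs the most.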
Eventually, those deep-rooted patterns will gradually become your habits because they have been executed for a long time, and the resources required to maintain them will become less and less, and eventually almost negligible. At this time, the constraints saved can be invested in the development of new patterns.
In this way, we finally achieve the "miracle" of self-control strategy: by organizing local optimal solutions into a pattern tree and exploring it with recursive backtracking, we can accumulate and amplify small-scale free will until it can affect the overall situation.
And from here, you can really start to change your daily routine, cultivate your energy, improve your health, regulate your pace of life, quit the habits you want to quit, develop the habits you want to develop, and let these large-scale factors that seemed unshakable before start to work for you.
This method is called the Recursive Stabilization Iteration Protocol (RSIP).
22
Of course, in actual application, if you want to gamify RSIP, it is also easy.
Because this method already has an almost identical counterpart, the National Focus Tree in the famous "Hearts of Iron" strategy game series:
In this game, each country has a huge "national policy tree" filled with various national policy nodes:
Some are used for industrial expansion;
Some strengthen the military;
Some determine the direction of diplomacy;
Each national policy can provide players with a variety of benefit effects, and choosing and designing your own country's national policy tree is one of the greatest charms of this game.
In real-life applications, I also like to use mind-mapping software (such as MindMaster) to manage such a national policy tree; the process of designing various national policies for RSIP is actually very interesting:
For example, to address the problem of shower procrastination at night, I designed this "national policy": using the iPhone's automation feature, a 15-minute countdown starts automatically as soon as I return home in the evening, and I must be in the bathroom and starting my shower before it ends;
For example, to deal with the problem of playing with the phone right after getting up, which leaves the whole day dull, the "national policy" I designed is: playing with the phone is strictly forbidden for the first 30 minutes after getting up. That time is reserved for proper tasks such as washing up, tidying, eating breakfast, or reading email, so as to get the day going;
To ensure that the RSIP system itself keeps running stably, I also designed a "root national policy" at the very root: when you wake up in the morning, you must open the mind map of the national policy tree, and you must add one new national policy every day.
And so on.
In fact, most online discussion of self-control today consists of unsystematic, highly fragmented "suggestions". But these, too, can be designed into "national policies" and absorbed into this system. You are free to explore further applications on your own.
Postscript
23
The inspiration for this article came from a note about ADHD written by @Allvinn that I saw on Xiaohongshu. At that time, I just wrote some comments about my own struggles with ADHD and self-control issues over the years. I didn't expect that this would give me the idea of organizing my thoughts accumulated over the years into a text.
Frankly speaking, I am not a particularly outstanding person. I am just an ordinary student who grew up doted on by my parents and addicted to games, long troubled by bad habits and serious self-control problems. Over the years, my grades often swung between the bottom and the top of the class. Fortunately, I stumbled my way through a master's and a doctoral program, became an ordinary research worker, and can now quietly do what I like at a lab bench.
In fact, most people have thought about "self-discipline" to some degree ever since their naive early years; I just happened to keep thinking about it a little longer.
However, at the end of the article, I actually want to convey a message:
I am extremely disgusted by the excessive deification and admiration of "self-discipline" and "methodology" by many marketing accounts and knowledge bloggers. It seems that as long as you learn their methodology and participate in their training camp, you will be self-disciplined, work hard, change your fate, and start to counterattack.
I don't think so.
In fact, self-control methods have never been a panacea that can make your life better.
Even if you achieve absolute self-discipline, so what? It is just one of the countless pieces of your life puzzle, and it may not even be the most important piece.
In life, there are many more important pieces of the puzzle than this: physical and mental health, family environment, social resources, personal character, interpersonal relationships, unpredictable luck, and even the era you live in. They can be like the reefs of fate, or like the pegs on a Galton board, deflecting the efforts of a smart and self-disciplined young person toward some unknown, distant place.
In our time, many people do suffer from a lack of self-discipline, but there are also many people who are overly self-disciplined, suffering from internal pressure and obsession with "excellence". Self-discipline is never something that everyone needs; what many people need more is to relax and do what they like in a healthy and free way.
Therefore, the effort of this article is actually just to try to help some people, to some extent, and solve some problems.
Not everyone needs self-discipline;
Not everyone who needs self-discipline is suitable for the methods proposed in this article;
Not all the right people will understand it;
Not everyone who understands it can truly benefit from it.
But I have personally experienced, over the past decade or so, the regrets, missed opportunities, internal friction and powerlessness that have resulted from a lack of self-discipline.
In this vast sea of people, is there another me? If there is such a person, I am willing to hold up a small umbrella for him/her.
Even if this umbrella can only help one person who is similar to me in the past out of every thousand people in the world, then it will all be worth it.
24
After talking about individuals, let’s talk about society.
Today, society has a certain degree of understanding and tolerance for depression. However, in comparison, the understanding and tolerance of problems such as "lack of self-control", "procrastination", and "ADHD" are still far from enough.
Today, more and more open-minded people are beginning to realize that depression is a real mental illness, not just "being melodramatic" or "dwelling on things". They are beginning to try to understand the real pain that patients face, such as collapsing attention, loss of interest, sleep disorders, emotional numbness, and even somatization.
However, people’s attitudes toward people with ADHD and chronic self-control deficits are often still rude, harsh, and lacking in empathy:
Their procrastination, laziness, and difficulty in taking action are often caricatured as “lazy people making excuses”;
They fall into distractible, addictive behavior patterns, but are seen as having “poor self-control” and “can’t control themselves”;
Even many ADHD patients who struggle to make progress by various means, while regretting their own loss of control, are often met with cold-blooded ridicule as "self-satisfied", "impatient", "just pretending to work hard", and "suitable for working in a factory".
In such a cultural atmosphere, people refuse to respect the objective fact that "human self-control has limitations." Many people are unwilling to admit that human subjective initiative is not unlimited, and it is limited by many objective factors such as neural structure, hormone levels, psychological state, external environment, and long-term habits.
"Lack of self-control" has never been a conceptual problem that can be solved by motivation or persuasion, but rather an objective engineering problem, a system problem, and even a medical problem.
Trying to solve such problems by shouting slogans, giving chicken soup for the soul, motivating and encouraging yourself, or "telling you how to do this and that" is as ridiculous as trying to comfort a depressed patient by saying "you are too fragile", "be more optimistic", or "why don't you just look at it more positively?"
I like this quote from The Great Gatsby:
“Whenever you feel like criticizing anyone, just remember that all the people in this world haven’t had the advantages that you’ve had.”
Just as human joys and sorrows are not shared, when it comes to self-discipline, human conditions are not shared.
Some people have built a good foundation of habits since childhood. In an environment that encourages self-improvement, they may only need to make a schedule and smile at the mirror a few times every day to achieve self-discipline on Easy difficulty.
However, some people have been addicted to mobile phones since childhood, have lazy habits, and come from a decadent family environment. Under such circumstances, achieving self-discipline may be a difficult task.
Self-control is a common problem we have faced since childhood, and almost everyone has thought about it to some extent. So, if the above people have thought about the question of "how to control oneself", what will be the result?
This is how society's general perception of "self-discipline" has formed: people stop after fifty steps, or after a hundred.
The essence of self-control methods is nothing more than compensating for the lack of self-control, just like crutches and wheelchairs are compensations for the motor ability of disabled people. Healthy people do not need crutches or wheelchairs, and patients who have recovered from injuries will no longer need crutches or wheelchairs.
Once this gap is filled, people no longer pursue more advanced crutches and wheelchairs. So the answer turns into the fable of "The Little Pony Crossing the River", where everyone judges the depth of the water by their own size: people on Easy difficulty will say that you just need to smile in the mirror; people on Medium difficulty will suggest that you put down your phone and make a plan; people on Hard difficulty believe that you must rely on external supervision, or even livestream your study sessions.
And an extremely counterintuitive social phenomenon originates from this.
Self-discipline is only one of many factors in personal achievement. Those players who play in the Easy difficulty level, in addition to being more self-disciplined, also have a better habit base, environmental help, and social resources, which makes it easier for them to achieve success. Therefore, if you observe those who have achieved success, you will find that they are most likely to have come from the "Easy difficulty level". In their eyes, self-discipline is as easy as doing push-ups in an elevator.
This creates a weird survivor bias: the more successful people are, the more likely they are to use ineffective self-discipline methods, just as the healthier people are, the less real experience they have with using crutches and wheelchairs.
What’s even more frightening is that when society holds up these Easy-mode players as role models for everyone and grants them the loudest voice, it creates an extremely cruel, “why don’t they just eat meat porridge?” atmosphere toward those who truly lack self-control.
I don't know how many comments I've seen from so-called "experienced people" who scoffed at the idea of self-control methods. They said, "Back then, we had only one way to go. We just had to work hard. Why were there so many twists and turns?"
Another time, I saw an interview video of a college entrance examination champion. When the host asked, "What do you think of many students who are decadent, depressed, unmotivated, and unable to discipline themselves?", the champion paused for a moment, raised his proud and childish face, and said sincerely and puzzledly:
"To be honest, I don't understand why anyone would be unmotivated."
Perhaps in his world, self-discipline is as simple as breathing and walking, and motivation is as natural as the sun every day. So, he may not have any malicious intent, but simply cannot understand.
More often, we see marketing accounts and self-media bloggers with glamorous titles sharing so-called "self-discipline experience". Click in, and the whole article is about "because I love it" and "because I have high energy", and those are the milder ones. Many more bloggers are selling courses and talking about "thinking models" and "cognitive upgrades", Taoism, Zen, energy sublimation, and spiritual healing.
I'm not criticizing those for whom things have come easily; it's not their fault.
I am just trying to point out a fact - human experiences are not naturally connected; human subjective efforts can never exist independently of objective conditions.
Of course, we cannot expect everyone to understand the complex psychological mechanisms and behavioral sciences, and we cannot even expect everyone to be kind enough.
However, unlike those who stopped after 50 or 100 steps, I was willing to start from the lowest point, as an ADHD patient with a very poor foundation of self-control, and set out on a long march of thousands of steps. Behind these two generations of methods lie hundreds of failed ideas: I have thought through, tried, and analyzed each of the hundreds of methodologies I found online and in books. Only after all that did I finally reach the end and attain the self-discipline of a normal person.
Maybe, maybe this road is only suitable for me. But dear stranger, if it can help even one person, even just you, then I think it is all worth it.
I hope this work can set out on a completely different path, one that belongs to "technology", beyond today's dazzling array of vague "advice" such as "put down your phone", "make plans", "imagine the future", and "tell yourself".
"Genghis Khan's cavalry had an attack speed comparable to that of 20th century armored forces; the Northern Song Dynasty's crossbows had a range of 1,500 meters, which was about the same as 20th century sniper rifles; but these were still just ancient cavalry and crossbows, and could not possibly compete with modern forces. Basic theory determines everything, and the futurist school clearly sees this."
This is the real reason why I wrote this article.
25
Finally, I would like to say that this content is not intended for any profit, does not build any community, and does not require attention or appreciation. Money can be earned or spent, attention can be gathered or dispersed, but ideas and technology are always there.
If this article can benefit someone, my only wish is not for your likes and rewards, but that more people can have more respect and empathy for those whose circumstances, habits, achievements, and academic qualifications fall short of their own. I will be satisfied then.
Finally, let me end with an MIT License:
Anyone has the right to use, reproduce, modify, merge, publish, distribute, sublicense and/or sell copies of the content in this article without restriction and for commercial purposes, without having to ask for permission or pay any fees to the author, provided that the original author is credited.