A Modern Reinterpretation of Keynes’ Bancor and the International Clearing Union
Abstract
Our work reevaluates John Maynard Keynes’ proposal of an international currency and the Clearing Union through the lens of modern economics and reinforcement learning. It explores the historical shift from the Bretton Woods System to the current floating exchange rate regime, focusing on Keynes’ Bancor plan designed to address global trade imbalances. The paper introduces a mathematical model of the Bancor system, emphasizing its unique trade balancing mechanisms. It models nations as independent reinforcement learning agents, which may cooperate to minimize global economic disparities. This approach integrates economic simulations and international relations dynamics, presenting a novel framework for managing global economic challenges and enhancing international economic cooperation.
A Brief History of World Currencies
International trade today is based on the Kingston System, under which exchange rates between currencies float. This was not always the case. After the US rose as the world’s economic superpower by the end of World War II, the Bretton Woods System, established in July 1944, fixed the exchange rate between the dollar and each nation’s currency, and pegged the value of the dollar to a fixed amount of gold. It was only after the Nixon Shock in August 1971, when the US suspended the convertibility of the dollar to gold, that floating exchange rates came into use.
At the Bretton Woods Conference, there were two main figures: John Maynard Keynes from the United Kingdom and Harry Dexter White from the United States. Keynes proposed an international currency named Bancor for international trade, whereas White proposed the US dollar.
Keynes’ proposal consisted of the Bancor currency and the International Clearing Union (ICU). As the name of the union suggests, Keynes’ plan aimed to prevent excessively large trade imbalances between countries. We will refer to it as the Clearing Union, since the international setting is obvious. In this system, countries are given quotas to limit the amount of trade deficit or surplus, and the negative control loop of the Bancor minimizes the trade imbalances, and ideally, zeros them out. In fact, Keynes mentions:
The problem of maintaining equilibrium in the balance of payments between countries has never been solved … the failure to solve this problem has been a major cause of impoverishment and social discontent and even of wars and revolutions. John Maynard Keynes (1941)
It is now widely acknowledged that large trade deficits, and the corresponding massive capital inflows, preceded and helped fuel economic bubbles, according to Hirai et al. (2013). Although the exact cause-and-effect relationship remains unknown, Hirai et al. argue that excessive trade deficits and foreign exchange reserves are a structural problem, which can be alleviated by reconsidering Keynes’ proposal at the Bretton Woods Conference.
However, with the US winning both World Wars and holding more than 70% of the world’s gold in 1944, White’s plan to peg the dollar to gold and establish it as an international currency was better accepted by the participating nations. The use of the dollar in international trade allowed the US to collect seigniorage, the revenue generated by issuing the currency. It also lowered the interest rates on US treasury bonds, since the stable value of the dollar attracted more investors.
Keynes’ Plan
But the privilege, while yielding seigniorage, came with its own costs, such as the risk of accruing massive trade deficits and the challenge of maintaining effective monetary policy. Most importantly, it allowed the accumulation of global imbalances, which may lead to escalated global tensions and a destabilized global economy.
One example of a persistent imbalance is foreign exchange reserves, the largest of which is China’s. The Chinese foreign exchange reserve is unreasonably large at 3.12 trillion USD, already the largest in the world, and China is speculated to hold an additional 3 trillion USD worth of ‘hidden’ reserves across the world, according to Setser (2023). If these assets were released abruptly to the market, they could cause significant economic disruption worldwide. In fact, Setser mentions in his article that China’s massive purchases of US Agency bonds pushed private investors into riskier securities, fueling the 2007-2008 Financial Crisis. The adoption of Keynes’ proposal would have significant implications for South Korea as well, because it was heavily hit by the Asian Financial Crisis in 1997 and has since maintained a huge foreign exchange reserve.
The problem of global imbalances can potentially be addressed by Keynes’ proposal. Two key differences set the Bancor-ICU system apart from other world currencies:
- It is created ex nihilo: the Bancor is created only when a trade imbalance occurs, and is destroyed when the imbalance is reabsorbed.
- The burden of adjustment is distributed symmetrically between debtors and creditors.
Our contribution
We model Keynes’ Bancor and the Clearing Union mathematically, and set up a Multi-Agent Reinforcement Learning (MARL) objective to study the Bancor’s potential advantages and drawbacks. Keynes’ plan cannot be studied through data alone, as the Bancor is a hypothetical currency that has never been put to use in practice. We find that the goal of the Bancor and the Clearing Union can be expressed as a minimization problem, which we approach using cooperative MARL.
Notations
- For time-dependent variables, $t$ denotes time, whether continuous or discrete (e.g., $y(t),\;Z(t)$).
- Any variable with one or two indices implies a vector or a matrix, and vice versa. For example, the variable $y_i$ with one index $i$ implies the existence of a column vector $y=(y_i)$, and the variable $Z_{ij}$ implies the matrix $Z=(Z_{ij})$.
- For a matrix $Z$, $Z^D$ is the diagonal part of $Z$, $Z^*=Z-Z^D$ is the matrix obtained by zeroing its diagonal, and $Z^S=(\sum_{j}Z_{ij})$ is the vector created by summing over each row of $Z$. Also, given the vector $z=(z_i)$, we define the operation $\mathrm{diag}(z)=(\mathrm{diag}(z)_{ij})$ and its inverse operation as:
\[\mathrm{diag}(z)_{ij}=\begin{cases} z_i, & \text{if } i = j\newline 0, & \text{if } i \neq j \end{cases}\;,\;\;\mathrm{diag}^{-1}(Z)_i=Z_{ii}.\]
- $\star$ denotes the element-wise multiplication of tensors of equal size.
- The indices $i$ and $j$ run through $\{1,2,\cdots,n\}$, unless otherwise mentioned.
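
The matrix notation above maps directly onto array operations. The following is a minimal NumPy sketch for illustration only; the example matrix `Z` and the variable names are assumptions and are not taken from the accompanying code.

```python
import numpy as np

# Hypothetical 3x3 example matrix
Z = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

z = np.diag(Z)          # diag^{-1}(Z): vector of diagonal entries Z_ii
Z_D = np.diag(z)        # Z^D: diagonal part of Z (diag applied to a vector)
Z_star = Z - Z_D        # Z^*: Z with its diagonal zeroed
Z_S = Z.sum(axis=1)     # Z^S: vector of row sums
hadamard = Z * Z_star   # star: element-wise product of equal-size arrays

print(Z_S)
```

Note that `np.diag` plays both roles: applied to a vector it builds a diagonal matrix, and applied to a matrix it extracts the diagonal.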
The Bancor
A mathematical description of the Bancor is given below.
Suppose $n$ countries have joined the Clearing Union. Each nation is associated with one account, which is initialized to zero. The account is denominated in a new, international unit of account, the ‘Bancor’. Letting $i \in \{1,2,\cdots,n\}$, and writing the amount of Bancor in the $i$-th country’s account at time $t$ as $b_i(t)$,
\[b_i(t)=0 \;\; \text{for} \; \;t\le s_i,\]
where $s_i$ is the time the $i$-th country joined the Clearing Union. Let’s narrow our focus to two countries, $i$ and $j$. When the cumulative net export $NX_{ij}(t)$ from $i$ to $j$ is positive, $i$ is said to have gained a trade surplus, and $j$ is said to be running a trade deficit. The accounts keep track of these imbalances: country $i$ gains Bancor and country $j$ loses Bancor. Assuming $t>s_i$ for all $i$, and denoting the cumulative export from $i$ to $j$ during the interval $[\max\{s_i,s_j\},t]$, measured in Bancor, as $X_{ij}(t)$,
\[\begin{aligned} NX_{ij}(t)&=X_{ij}(t)-X_{ji}(t),\newline b_i(t)&=NX^S_i(t)=\sum_{j=1}^n {NX_{ij}(t)}. \end{aligned}\]
Minimization of Global Imbalances
$NX(t)$ is antisymmetric, i.e., $NX_{ij}(t)=-NX_{ji}(t)$. The sum of Bancor over all accounts, $\sum_i b_i(t)$, equals the sum over all elements of the matrix $NX(t)$, which is therefore zero. In this sense the Bancor is indeed created ex nihilo: every credit is matched by an equal debit. Hirai et al. clarify that Keynes’ system does not block capital movement or the efficient allocation of savings. Instead, it addresses the problem of ‘persistent’ imbalances, the credits and debits that have accumulated without connection to actual investments or repayment. This is easily seen from the equation, since $X_{ij}(t)$ and $X_{ji}(t)$ need not be zero for $NX_{ij}(t)$ to be zero.
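
The zero-sum property can be checked numerically. The sketch below is illustrative only: the export matrix `X` is randomly generated (an assumption, not data), and the second case shows that balanced but nonzero bilateral trade leaves every account at zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Hypothetical cumulative exports X_ij (in Bancor); no exports to oneself
X = rng.uniform(0.0, 10.0, size=(n, n))
np.fill_diagonal(X, 0.0)

NX = X - X.T           # NX_ij = X_ij - X_ji
b = NX.sum(axis=1)     # b_i = sum_j NX_ij

print("antisymmetric:", np.allclose(NX, -NX.T))
print("sum of balances:", b.sum())

# Balanced trade: every pair exports the same nonzero amount to each other
X_bal = np.full((n, n), 5.0)
np.fill_diagonal(X_bal, 0.0)
b_bal = (X_bal - X_bal.T).sum(axis=1)
print("balances under balanced trade:", b_bal)
```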
Because the accounts are centralized at the Clearing Union, credits and debits are multilateral, meaning the credits gained from a surplus with one country can be spent with any other member country. Therefore, exporting to any member country will reduce a country’s debit. In this way, the Clearing Union can finance international trade without any given amount of money, provided that the system is trusted globally.
Keynes viewed global imbalances as the root cause of many of the world’s problems. D.H. Robertson, the British economist who participated with Keynes in the Bretton Woods Conference, stated:
It is arguable that the proudest day in the life of the Manager of the Clearing Union would be that on which, as a result of the smooth functioning of the correctives set in motion by the Plan, there were no holders of international money—on which he was able to show a balance sheet with zero on both sides of the account. Robertson (1943)
Mathematically speaking, $b_i(t)=0$ for all $i$ is preferable. To realize this, both debtors and creditors are subject to periodic charges proportional to the size of their imbalance. This might sound counterintuitive at first, especially because the creditor, and not only the debtor, has to pay. But when we frame Keynes’ idea as an optimization problem, it becomes evident why this is necessary. Let $b(t)\in\mathbb{R}^n$ denote the vector of account balances of all countries. Then the objective becomes
\[\text{minimize}\int_0^\infty \|b(t)\|_1\,dt=\int_0^\infty \left(\sum_{i=1}^n|b_i(t)|\right)dt,\]
provided that the integral converges. A discretized version of this would be:
\[\text{minimize}\sum_{t=0}^\infty\|b(t)\|_1=\sum_{t=0}^\infty\sum_{i=1}^n |b_i(t)|\]
We again assume that the summation converges. Since every term is a nonnegative absolute value, the sum attains its minimum, zero, if and only if $b_i(t)=0$ for all $i\in\{1,2,\cdots,n\}$ and $t\in\{0,1,2,\cdots\}$. To accomplish this, we can penalize each nation for accumulating debits or credits. We can see from this objective why leaving the creditor free of charge would not solve the problem of accumulating trade imbalances. Any country that can issue a world currency can use expansionary monetary policies and leave the financial burden to the rest of the world, which may lead to negative consequences, as Keynes pointed out in his works. He attempts to solve exactly this with the Bancor and the Clearing Union.
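
To make the symmetric charge concrete, the sketch below evaluates the discretized objective over a finite horizon and applies a proportional charge to creditors and debtors alike. The balance trajectory, the function names, and `charge_rate` are hypothetical values chosen for illustration.

```python
import numpy as np

def l1_imbalance(balances):
    """Sum over t and i of |b_i(t)|; `balances` has shape (T, n)."""
    return float(np.abs(balances).sum())

def periodic_charges(balances, charge_rate):
    """Charge per country and timestep, proportional to |b_i(t)|.

    Creditors (b > 0) and debtors (b < 0) are treated identically.
    """
    return charge_rate * np.abs(balances)

# Hypothetical trajectory: 3 timesteps, 3 countries; each row sums to zero
balances = np.array([[0.0,  0.0,  0.0],
                     [2.0, -2.0,  0.0],
                     [3.0, -1.0, -2.0]])

charges = periodic_charges(balances, charge_rate=0.01)
print(l1_imbalance(balances))   # total imbalance over the horizon
print(charges.sum())            # total charges collected
```

Under this sketch the total charge collected is the charge rate times the objective value, so reducing charges and reducing the objective coincide.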
In addition to this ‘carrying cost’, the balance in each account is bounded from above and below. In other words, each country is given a limit, or ‘quota’, by the Clearing Union in proportion to its trade volume. Once the upper limit is reached, the national currency is revalued to encourage imports from other countries. Once the lower limit is reached, the currency is devalued to promote exports of the country’s goods.
Possible extensions
Although Keynes proposed the periodic charge to be proportional to the size of the imbalance, i.e., proportional to the $L^1$ norm, choosing a different minimization objective might be more beneficial. For example, one can use the sum of squares loss, shown below. The minimization objective is valid as long as the minimum value exists, and the objective $b_i(t)=0 \;\; \text{for } i=1,2,\cdots, n$ is met at the minimum.
\[\text{minimize}\sum_{t=0}^\infty\|b(t)\|_2^2=\sum_{t=0}^\infty\sum_{i=1}^n b_i^2(t)\]
Modeling the global imbalances problem as an RL objective
Realism, a popular ideology in international politics, finds its ideological roots in Thomas Hobbes’ seminal work “Leviathan.” In realism, the nation is seen as the primary unit in international politics, since there is no higher authority than the state. Also, it assumes that the behavior of states is more significantly influenced by the international environment than by domestic conditions, and that the primary differentiator among nations is their relative power. Lastly, the realist perspective views international competition as inherently zero-sum, which leads to a high potential for extreme escalations in conflicts.
Since each country is driven by its own interest, it is natural to model each nation as an independent RL agent. With each nation as an agent, we set the RL objective as follows:
\[\underset{\pi_i\in\mathcal{P}_i}{\text{maximize}}\; \sum_{t=0}^\infty\gamma^t \hat{R}_i(t) = \sum_{t=0}^\infty\gamma^t\left(\tilde{R}_i(t)-\zeta|b_i(t)|\right)\]
Here, $\pi_i$ is the monetary policy of country $i$ and $\mathcal{P}_i$ is the space of all possible monetary policies for country $i$. Given some discount factor $\gamma\in(0,1]$, each country attempts to maximize its cumulative discounted return, where the reward at timestep $t$ is given by $\hat{R}_i(t)=\tilde{R}_i(t)-\zeta|b_i(t)|$. $\tilde{R}_i(t)$ represents the GDP of country $i$ at timestep $t$, and $\zeta\in \mathbb{R}_0^+$ sets the penalty per unit Bancor and unit time.
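
As a small illustration of this objective, the sketch below evaluates the penalized discounted return for one country over a finite trajectory. Truncating the infinite sum to a finite horizon is an assumption of the sketch, and the function name and default values of `gamma` and `zeta` are hypothetical.

```python
import numpy as np

def discounted_return(gdp, balances, gamma=0.99, zeta=0.1):
    """Finite-horizon estimate of sum_t gamma^t * (GDP_i(t) - zeta * |b_i(t)|)."""
    gdp = np.asarray(gdp, dtype=float)
    balances = np.asarray(balances, dtype=float)
    rewards = gdp - zeta * np.abs(balances)        # penalized reward per timestep
    discounts = gamma ** np.arange(len(rewards))   # gamma^0, gamma^1, ...
    return float(np.sum(discounts * rewards))

# Same GDP path, with and without an accumulated imbalance
no_imbalance = discounted_return([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], gamma=0.5)
with_imbalance = discounted_return([1.0, 1.0, 1.0], [0.0, 2.0, -2.0], gamma=0.5, zeta=0.5)
print(no_imbalance, with_imbalance)
```

A surplus and a deficit of the same size reduce the return by the same amount, which mirrors the symmetric treatment of creditors and debtors.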
Economics simulation
We set our goal to calculate the GDP $\tilde{R}_i(t)$ and start with the demand. Suppose the demand of the country $i$ before shock is given by $D_i(t)$. When some shock $\beta_i(t)>-1$ is present, the demand after shock becomes
\[\tilde{D}_i(t)=(1+\beta_i(t))D_i(t)\]
As its action, each country $i$, or agent $i$, selects an interest rate $I_i(t) >0$ at time $t$. The total demand is then reduced to
\[\hat{D}_i(t)=\frac{1}{1+I_i(t)} \tilde{D}_i(t)\]
The theory of Uncovered Interest Rate Parity (UIRP) postulates that the expected ratio of the future to the current nominal exchange rate between two countries $i$ and $j$ over some time interval $[t, t+\tau] \;(\tau>0)$ equals the ratio of gross nominal interest rates, so that for small rates the expected relative change is approximately the interest rate difference:
\[\frac{\mathbb{E}[E_{ij}(t+\tau)\mid E_{ij}(t)]}{E_{ij}(t)}=\frac{1+I_i(t)}{1+I_j(t)}\]
The conditional expectation on the left-hand side is known to be hard to calculate, so we approximate it with a heuristic. If $\tau$ is large enough, we can expect the nominal exchange rate to be centered around the ratio of commodity prices between $i$ and $j$, since the same goods should have equal economic value in both countries. Denoting the commodity price of country $i$ (denominated in its own national currency unit) at time $t$ as $P_i(t)$, we approximate the expectation as follows:
\[\mathbb{E}[E_{ij}(t+\tau)\mid E_{ij}(t)]\approx P_i(t)/P_j(t)\]
Substituting the expectation with the above price ratio and rearranging gives
\[E_{ij}(t)=\frac{P_i(t)/P_j(t)}{(1+I_i(t))/(1+I_j(t))} \tag{∗}\]
Real commodity prices are commodity prices denominated in the same unit. The real exchange rate $RE_{ij}(t)$, as distinguished from the nominal exchange rate $E_{ij}(t)$, is the ratio of real commodity prices and is given by:
\[RE_{ij}(t)=E_{ij}(t)(P_j(t)/P_i(t))\]
Substituting $(∗)$ gives the ratio of gross interest rates:
\[RE_{ij}(t)=\frac{1+I_j(t)}{1+I_i(t)}\]
How is one country’s demand distributed across products from different countries? A simple answer is that more demand is allocated to countries with cheaper real commodity prices. We assume that, for fixed $i$, the demand for products of country $j$ is inversely proportional to the real exchange rate $RE_{ij}(t)$. This leads to the definition of the trade coefficient:
\[\alpha_{ij}(t)=\frac{1/RE_{ij}(t)}{\sum_{k=1}^{n}1/RE_{ik}(t)}=\frac{RE_{ji}(t)}{\sum_{k=1}^{n}RE_{ki}(t)}\]
Note that $1/RE_{ij}(t)=RE_{ji}(t)$, and that $\sum_{j=1}^n \alpha_{ij}(t)=1$. The total demand is then split according to the ratio $\alpha_{ij}(t)$, which creates the flow of goods $F_{ji}(t)$ from $j$ into $i$. Here, $\Delta$ is the discrete difference operator, i.e., $\Delta x(t)=x(t)-x(t-1)$.
\[\begin{aligned}F_{ji}(t)&=\hat{D}_i(t)\alpha_{ij}(t) \newline \Delta X_{ij}(t)&=F_{ij}(t)=\alpha_{ji}(t)\hat{D}_j(t)\;\;\;(i\ne j)\newline \Delta NX_{ij}(t)&=\Delta X_{ij}(t)-\Delta X_{ji}(t)\end{aligned}\]
Country $i$’s instantaneous export, import, and net export at time $t$ are given by:
\[\begin{aligned} EX_i(t)&=\sum_{j \ne i}\Delta X_{ij}(t)=\sum_{j \ne i}\alpha_{ji}(t)\hat{D}_j(t),\;\;EX(t)=(\alpha^*)^T(t)\hat{D}(t) \newline IM_i(t)&=\sum_{j \ne i}\Delta X_{ji}(t)=\sum_{j \ne i}\alpha_{ij}(t)\hat{D}_i(t),\;\;IM(t)=(\alpha^*)^S(t)\star \hat{D}(t) \newline NET_i(t)&=EX_i(t)-IM_i(t)\end{aligned}\]
The GDP $\tilde{R}_i(t)$ is then the sum of total demand and net export:
\[\tilde{R}_i(t)=\hat{D}_i(t)+ NET_i(t)\]
The amount of Bancor in the $i$-th country’s account is the cumulative sum of its net exports:
\[b_i(t)=\sum_{\tau=0}^t{NET_i(\tau)}\]
The commodity prices $P_i(t)$, the inflation rates $\pi_i(t)$, the demands $D_i(t)$, and the shocks $\beta_i(t)$ are updated according to the following:
\[\begin{aligned} P_i(t+1) &= P_i(t) \left(1+ \frac{NET_i (t)}{\hat{D} _i(t)}\right)\left(\frac{1}{1+I_i(t)}\right) \newline \pi_i{(t+1)} &= \frac{P_i{(t+1)}-P_i{(t)}}{P_i{(t)}} \newline D_i{(t+1)} &= \left(\frac{1}{1+\pi_i{(t+1)}}\right)D_i{(t)} \newline \beta_i(t+1) &= \max\{\epsilon-1,\rho \beta_i(t)+\eta_i(t)\},\newline &\text{ where }\epsilon,\rho>0 \text{ and }\eta_i(t) \sim \mathcal{N}(0,\sigma^2)\end{aligned}\]
Incorporating International Relations
International relations between countries can be implemented using the following reward sharing mechanism.
Let $A_{ij}(t)$ denote the affinity between two countries $i$ and $j$. $A(t)=(A_{ij}(t))$ is initialized to the identity matrix $I$ at time $t_s=\max\{s_i\}_{i=1,2,\cdots, n}$. Two countries can increase their mutual affinity by sending and accepting invitations. When country $i$ sends an invitation to another country $j$, and $j$ accepts the invitation, the affinity $A_{ij}(t)$ is increased by some $\delta>0$ in the next timestep.
The matrix $A(t)$ is set to be symmetric, with $0 \le A_{ij}(t) \le 1$ and $A_{ii}(t)=1$ for all $i,j$ and $t\ge t_s$. Multiplying the reward vector $\hat{R}(t)$ by $A(t)$ gives the shared reward $R'(t)=A(t)\hat{R}(t)=A(t)(\tilde{R}(t)-\zeta|b(t)|)$, where $|b(t)|$ is taken element-wise. Each component $R_i'(t)$ is a linear combination of the rewards $\hat{R}_j(t)$, weighted by $A_{ij}(t)$.
Because $A_{ii}(t)=1$ for all $t$, the weight of country $i$’s own reward $\hat{R}_i(t)$ is $1$. We then replace the reward in the RL objective with the shared reward $R_i'(t)$:
\[\begin{aligned}\underset{\pi_i\in\mathcal{P}_i}{\text{maximize}}\; \sum_{t=0}^\infty{\gamma^t R_i'(t)} &= \sum_{t=0}^\infty{\gamma^t(A(t) \hat{R}(t))_i} \newline &= \sum_{t=0}^\infty{\gamma^t\sum_{j=1}^n{A_{ij}(t)\left(\tilde{R}_j(t)-\zeta|b_j(t)|\right)}}\end{aligned}\]
How, specifically, does the invite-and-accept mechanism work? The action space for country $i$ when there are $n$ countries is $\mathcal{A}_i=\mathbb{R}\times\mathbb{R}^{n}\times\mathbb{R}^{n}$. When $a_i(t)=(I_i(t), P_i(t),Q_i(t))\in\mathcal{A}_i$ is selected, $I_i(t)$ is used as the interest rate, and the invitation probabilities and the acceptance probabilities are calculated from the preference vectors $P_i(t)$ and $Q_i(t)$ (here $P_i(t)\in\mathbb{R}^n$ is a preference vector, distinct from the commodity price denoted by the same symbol in the economics simulation):
\[p_{ij}(t)=\mathrm{softmax}(P_i(t))_j\;\;\text{and}\;\;q_{ij}(t)=\mathrm{sigmoid}(Q_i(t))_j\]
In each timestep, country $i$ sends a single invitation to one country $j$, chosen with probability $p_{ij}(t)$. Then, $q_{ij}(t)$ is the probability that country $i$ accepts country $j$’s invitation, given that $j$ selected $i$. Let $U(t)$ denote the boolean invitation matrix, whose $i$-th row is a one-hot vector drawn from the categorical distribution $(p_{ij}(t))_{j=1,2,\cdots,n}$, so that $U_{ij}(t)=1$ when $i$ invites $j$. Let $V_{ij}(t)\sim \mathrm{Bernoulli}(q_{ij}(t))$ be the boolean acceptance matrix, so that $V_{ij}(t)=1$ when $i$ accepts an invitation from $j$, if there is one. The agreement made when $i$ sends an invitation to $j$ and $j$ accepts it can then be represented by the matrix $M(t)$:
\[M_{ij}(t)=U^*_{ij}(t)V^*_{ji}(t)\]
where the diagonals of $U$ and $V$ are intentionally set to zero to ignore invitations to oneself. Symmetrizing this matrix gives the increment of the affinity matrix:
\[\Delta A_{ij}(t)=\delta(M_{ij}(t)+M_{ji}(t))\]
In matrix form,
\[\begin{aligned} M(t)&=U^*(t)\star(V^*)^T(t) \newline \Delta A(t) &= \delta(M(t)+M^T(t)) \end{aligned}\]
The maximum increment per timestep is $2\delta$, since an invitation can be sent and accepted in both directions. To be certain that $0 \le A_{ij}(t) \le 1$, $\delta>0$ is chosen so that $(2\delta) T\le1$, where $T$ is the total number of timesteps.
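
The invite-and-accept update and the reward sharing can be sketched in a few lines of NumPy. This is an illustrative reading, not the repository implementation: it assumes that $V_{ij}$ records whether $i$ accepts $j$'s invitation, so an agreement initiated by $i$ requires both $U_{ij}$ and $V_{ji}$, and the function names `affinity_step` and `shared_reward` are hypothetical.

```python
import numpy as np

def affinity_step(A, P, Q, delta, rng):
    """One affinity update; row i of P and Q holds country i's preferences."""
    n = A.shape[0]
    e = np.exp(P - P.max(axis=1, keepdims=True))   # numerically stable softmax
    p = e / e.sum(axis=1, keepdims=True)           # invitation probabilities p_ij
    q = 1.0 / (1.0 + np.exp(-Q))                   # acceptance probabilities q_ij

    U = np.zeros((n, n))
    for i in range(n):
        U[i, rng.choice(n, p=p[i])] = 1.0          # exactly one invitation per country
    V = (rng.random((n, n)) < q).astype(float)     # V_ij = 1: i accepts j's invitation

    np.fill_diagonal(U, 0.0)                       # ignore self-invitations
    np.fill_diagonal(V, 0.0)
    M = U * V.T                                    # M_ij = U*_ij V*_ji (element-wise)
    return A + delta * (M + M.T)                   # symmetric increment

def shared_reward(A, R_hat):
    """R'_i = sum_j A_ij * R_hat_j."""
    return A @ R_hat

rng = np.random.default_rng(0)
n, T = 3, 20
delta = 1.0 / (2 * T)                              # ensures (2 * delta) * T <= 1
A = np.eye(n)
for _ in range(T):
    A = affinity_step(A, rng.normal(size=(n, n)), rng.normal(size=(n, n)), delta, rng)
print(A)
print(shared_reward(A, np.array([1.0, 2.0, 3.0])))
```

Because the diagonal of $M$ is zero, $A_{ii}$ stays at $1$, and the choice of `delta` keeps every off-diagonal entry within $[0,1]$ without clipping.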
Code
economic_env.py: The economics environment for the RL agents is implemented using PettingZoo.
politics_env.py: The reward sharing mechanism is tested for randomly generated rewards.
APPO.py: The RL agent is created and trained using Ray RLlib, using APPO+LSTM combination.
load_and_play_ui.py: The trained model is loaded and tested in the economics environment, with one agent’s actions replaced by human input.
PreprocessingInternationalRelationsData: The affinity matrix $A_{ij}(t)$, which is initialized to the identity matrix $I$, may be replaced with the IGO_adjmat.csv or DCAD_adjmat.csv, which are created by preprocessing the Intergovernmental Organization (IGO) dataset or the Defense Cooperation Agreement Dataset (DCAD), respectively.
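
As a reading aid independent of the files listed above, one timestep of the economics simulation equations can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in economic_env.py: the function name `economy_step`, the default values of `rho`, `sigma`, and `eps`, and the example inputs are hypothetical, and the net-export term in the price update is taken to be the instantaneous net export $NET_i(t)$.

```python
import numpy as np

def economy_step(D, beta, I, P, b, rng, rho=0.9, sigma=0.05, eps=1e-3):
    """Advance demand, prices, shocks, and Bancor balances by one timestep."""
    gross = 1.0 + I
    D_hat = (1.0 + beta) * D / gross                    # demand after shock and interest rate
    RE = gross[None, :] / gross[:, None]                # RE_ij = (1 + I_j) / (1 + I_i)
    inv_RE = 1.0 / RE
    alpha = inv_RE / inv_RE.sum(axis=1, keepdims=True)  # trade coefficients, rows sum to 1
    alpha_star = alpha.copy()
    np.fill_diagonal(alpha_star, 0.0)                   # exclude the domestic share

    EX = alpha_star.T @ D_hat                           # EX_i = sum_{j != i} alpha_ji * D_hat_j
    IM = alpha_star.sum(axis=1) * D_hat                 # IM_i = sum_{j != i} alpha_ij * D_hat_i
    NET = EX - IM
    gdp = D_hat + NET                                   # GDP (reward before the Bancor penalty)
    b_next = b + NET                                    # cumulative net exports

    P_next = P * (1.0 + NET / D_hat) / gross            # price update
    inflation = (P_next - P) / P
    D_next = D / (1.0 + inflation)                      # demand update
    noise = rng.normal(0.0, sigma, size=D.shape)
    beta_next = np.maximum(eps - 1.0, rho * beta + noise)  # keeps beta > -1
    return gdp, b_next, P_next, D_next, beta_next

rng = np.random.default_rng(0)
D0 = np.array([100.0, 80.0, 120.0])
I0 = np.array([0.02, 0.05, 0.03])
gdp1, b1, P1, D1, beta1 = economy_step(D0, np.zeros(3), I0, np.ones(3), np.zeros(3), rng)
print(gdp1, b1)
```

In this sketch total exports equal total imports, so the balances sum to zero at every step, consistent with the Bancor being created ex nihilo.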
Experiments
Training results are not as expected, and debugging is in progress. Further investigation of the simulation is necessary.