We can start with the shallow water equations: \(\require{cancel}\)
\begin{align*} \frac{Du}{Dt} -fv &= -g\frac{\partial h}{\partial x} \\
\frac{Dv}{Dt}+fu &= -g\frac{\partial h}{\partial y} \\
\frac{Dh}{Dt} + h\nabla \cdot \vec{u} &= 0
\end{align*}
where \(\vec{u}=(u,v)\) is the horizontal velocity, \(h=h(x,y,t)\) is the depth of the fluid, \(g\) is the gravitational acceleration, and \(f\) is the Coriolis parameter. We are on a \(\beta\)-plane, so let \(f = f_0+\beta y\).
We can break the velocities into a geostrophic and ageostrophic component:
\begin{align*} u &= u_0 + \epsilon u_1 \\ v &= v_0 + \epsilon v_1
\end{align*}
and the height field into a constant mean height and a surface deviation field:
\begin{align*} h &= H + h'
\end{align*}
Note that while we will assume \(\frac{h'}{H}\) is order-\(\epsilon\), \(h'\) itself is not necessarily small compared with the geostrophic terms, and it should not be viewed as the ageostrophic height component.
In order to group the terms of the equations by orders of \(\epsilon\), one would need to non-dimensionalize the equations by choosing the appropriate scales for length, velocity, etc. In QG, \(\epsilon\) refers to the Rossby number, defined as
$$ \epsilon = \frac{U}{f_0 L}$$
A key scaling assumption of QG is that rotation is a fast process compared with lateral advection, so \(\epsilon\) is a small parameter. Variations in the Coriolis parameter with latitude are also assumed to be small, so terms with \(\beta\) are order-\(\epsilon\).
When non-dimensionalizing, under the assumptions of QG one would find that in the momentum equation, the advective derivative and \(\beta\) terms are scaled by \(\epsilon\), so they drop out of the zeroth order equations. At order zero, the height gradient is in balance with the Coriolis term multiplying \(f_0\) and the horizontal divergence is zero, so the following equations hold:
\begin{align*}
-f_0v_0 &= -g\frac{\partial h}{\partial x} =-g\frac{\partial h'}{\partial x}\\
f_0u_0 &= g\frac{\partial h}{\partial y} =-g\frac{\partial h'}{\partial y}\\
\end{align*}
And in the height evolution equation, the advective term drops out, leaving us with the statement that the divergence of geostrophic velocity is zero:
$$ \nabla \cdot \vec{u_0} = 0 $$
At order \(\epsilon\), after canceling out the appropriate terms and taking the curl of the momentum equation, we arrive at these order-\(\epsilon\) evolution equations:
\begin{align*}
\frac{D_0\zeta}{Dt} +f_0\left(\frac{\partial u_1}{\partial x} +\frac{\partial v_1}{\partial y}\right) + \beta v_0 &= 0 \\
\frac{D_0h}{Dt} + h\left(\frac{\partial u_1}{\partial x} +\frac{\partial v_1}{\partial y}\right) &= 0\\
\frac{D_0(\cancel{H}+h')}{Dt} + (H+\cancel{h'})\left(\frac{\partial u_1}{\partial x} +\frac{\partial v_1}{\partial y}\right) &= 0\\
\frac{D_0h'}{Dt} + H\left(\frac{\partial u_1}{\partial x} +\frac{\partial v_1}{\partial y}\right) &= 0
\end{align*}
where \(\frac{D_0}{Dt} = \vec{u_0} \cdot \nabla\) and \(\zeta = (\nabla \times \vec{u_0}) \cdot \hat{k}\)
We can replace the divergence term with an expression we get from the mass conservation equation:
\begin{align*}
\frac{\partial u_1}{\partial x} +\frac{\partial v_1}{\partial y} &= -\frac{1}{H}\frac{D_0h'}{Dt}\\
\frac{D_0\zeta}{Dt} -\frac{f_0}{H}\frac{D_0h'}{Dt}+ \beta v_0 &= 0
\end{align*}
Our goal now is to combine these terms into an evolution equations for PV. Note that
$$ \beta v_0 = \beta \frac{Dy}{Dt} = \frac{D}{Dt}(\beta y) $$
So we end up with
\begin{align*}
\frac{D_0}{Dt} \left(\zeta + \beta y - \frac{f_0 h'}{H}\right) = \frac{D_0 q}{Dt} = 0
\end{align*}
And this is the shallow water QG equation!
Because we're in a state of rest, we only need to consider the order-\(\epsilon\) terms we found before. Let us look at the evolution equation after rewriting everything in terms of the geostrophic streamfunction \(\psi\):
$$\left( \frac{D}{Dt}\left[\nabla^2 - \frac{f_0^2}{gH}\right] + \beta \frac{\partial}{\partial x} \right)\psi = 0$$
To linearize, we simply get rid of the non-linear advective term, so we are left only with a partial derivative with respect to time:
$$\left( \frac{\partial}{\partial t}\left[\nabla^2 - \frac{f_0^2}{gH}\right] + \beta \frac{\partial}{\partial x} \right)\psi = 0$$
To study waves in this system, we will substitute a plane-wave solution via a Fourier series expansion:
$$ \psi(\vec{x}) = \sum\limits_{k,l,\omega}\tilde{\psi}_k\text{exp}\left[i\left(kx+ly - \omega t\right)\right] $$
Because the equation is linear, we can study individual modes and set the streamfunction equal to just one term in this sum (physically it must equal the real part of this term, but by linearity the coefficient will simply drop out). Differentiation becomes multiplication under this transform.
Substituting,
\begin{align*}\left( -\cancel{i}\omega\left[-(k^2+l^2) - \frac{f_0^2}{gH}\right] + \cancel{i}k\beta \right)\tilde{\psi} &= 0\end{align*}
Let \(k_d = \frac{f_0}{\sqrt{gH}}\). This is the inverse of \(L_d\), the Rossby radius of deformation. So
\begin{align*} 0 &=\left(-\omega\left[-k^2-l^2 - k_d^2\right] + k\beta \right)\tilde{\psi} \\
\omega &= -\frac{\beta k}{k^2+l^2+k_d^2} \end{align*}
Let \(\vec{K} = (k,l)\), the wavenumber vector. The phase velocity is
\begin{align*}
\vec{v}_{p} &= \frac{\omega\vec{K}}{|\vec{K}|^2} = -\frac{\beta k\vec{K}}{|\vec{K}|(|\vec{K}|^2+k_d^2)}\\
\end{align*}
For the group velocity, we need to calculate the derivatives with respect to \(k\) and \(l\).
\begin{align*}
u_g &= \frac{\partial \omega}{\partial k} \\
&= -\beta \frac{(k^2+l^2+k_d^2) - k(2k)}{(k^2+l^2+k_d^2)^2} \\
&=-\beta \frac{(-k^2+l^2+k_d^2)}{(k^2+l^2+k_d^2)^2}\\
v_g &= \frac{\partial \omega}{\partial l}\\
&= \frac{2\beta k l}{(k^2+l^2+k_d^2)^2}
\end{align*}
So
\begin{align*}
\vec{v}_g &= \frac{\beta }{(k^2+l^2+k_d^2)^2} \begin{pmatrix} k^2-l^2-k_d^2 \\ 2kl\end{pmatrix}
\end{align*}
We found the meridional group velocity of the waves in the last part:
$$ v_{g} = \frac{2\beta k l}{(k^2+l^2+k_d^2)^2}$$
So this is proportional to \(kl\), and we need to show that the direction of the eddy momentum flux is proportional to \(-kl\).
Since we linearized about a state of rest, \(u=u'\) and \(v=v'\). Recall the relationship between the streamfunction and velocities:
\begin{align*}
u' &= -\frac{\partial \psi}{\partial y}\\
v' &= \frac{\partial \psi}{\partial x}
\end{align*}
which gives the following relationship between their Fourier coefficients:
\begin{align*}
\tilde{u} &= -il \tilde{\psi} \\
\tilde{v} &= ik \tilde{\psi}
\end{align*}
To compute \(\overline{u'v'}\), we can use the following formula: if
\begin{align*}
A &= \Re\left\{\tilde{A} \exp(i\theta)\right\}\\
B &= \Re\left\{\tilde{B} \exp(i\theta)\right\}
\end{align*}
where \( \theta = kx + ly - \omega t\), then
\begin{align*}
\overline{AB} &= \frac{1}{2}\Re\left\{ \tilde{A}^{*} \tilde{B}\right\}
\end{align*}
We use this to arrive at our desired expression:
\begin{align*}
\overline{u'v'} &= \frac{1}{2}\Re\left( il \tilde{\psi}^{*} ik \tilde{\psi}\right)\\
&=-\frac{|\tilde{\psi}|^2 kl}{2}
\end{align*}
and so this is in the opposite direction of the meridional group velocity.