I am trying to understand how they use the random oracle to solve the CDH. For example, in the security proof on page 7 of the following paper; A Lightweight Message Authentication Scheme for Smart Grid Communications, Fouda et al. 2001. The authors used a random oracle to break the security of an authenticated DH protocol. They assumed an adversary who can break the semantic security of the shared key and used this adversary along with a random oracle to solve CDH.
Answers to your questions:
1) Random oracle - is a replacement for hash-function. Its input is any bitstring, and output - random string. In some sense - it's an ideal hash-function. But in order to evaluate it, you (attacker) need to explicitly ask the oracle.
2) "Why would the adversary query g^ab to the random oracle. Is it part of the game?" - no, adversary of course isn't obligated anything for you, and it isn't obligated to ask ROM for $g^{ab}$. But, easy to see, that if the attacker didn't ask oracle for $g^{ab}$, he has only a negligible probability of success. In the article it's simply expressed as "has no idea on the session key"; it's indeed true - recall that oracle will return some random string for this input, regardless of all previous actions/queries of the attacker. While it's not a rigorous mathematical statement, it still correctly describes what happens.
In addition, I would try to explain why this ROM model good for modeling hash function. In this proof we essentially use the fact that any attacker will explicitly ask the oracle for hash-values, and we can track all these queries. While in real life, for real hash function, we can't track attacker actions. But, if hash-function is good (ideal), to get hash-value everyone needs to honestly calculate it by applying hash-function to some input (e.g., you can't derive hash-value of some input from hash-values of other, even somehow related, inputs). So, if hash function is good enough, any successful attacker at some point should calculate hash from $g^{ab}$, i.e. this attacker should know $g^{ab}$ at some step. So, this attacker is able to solve CDH. This is a meaning of this proof in ROM. It doesn't mean that for any hash-function it's true - it's only true for good hash functions, which are really unpredictable and behave like random oracles.
External links referenced by this document: