Hi all,
The official paper (http://arxiv.org/abs/2307.09288) mentions “Llama 2-Chat Temporal Perception” abilities (p. 33), and Figure 22 illustrates the model’s “time awareness”.
But how did they provide the yearly context to the model?
I’d like to run a similar benchmark on Llama 2 and other LLMs, but the paper doesn’t spell out how to reproduce the setup.
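For concreteness, here’s the kind of harness I had in mind: a minimal sketch that sweeps a “current year” over several values and compares the answers. The paper only says each of the 1,000 time-focused SFT examples was tied to the date the question was asked, not how that date was verbalized, so injecting it via the system prompt is my assumption; the model ID and prompt wording below are placeholders (the `[INST]`/`<<SYS>>` template itself is the standard Llama 2-Chat format).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-2-7b-chat-hf"  # gated on HF; any chat model could be swapped in

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def build_prompt(question: str, year: int) -> str:
    # Assumption: the paper doesn't specify how the query date was phrased,
    # so this system-prompt wording is hypothetical.
    system = f"The current year is {year}. Answer as of that date."
    # Standard Llama 2-Chat template; the tokenizer adds the BOS token itself.
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{question} [/INST]"

# Example question taken from the paper's temporal-perception discussion.
question = "How long ago did Barack Obama become president?"
for year in (2010, 2015, 2023):
    inputs = tokenizer(build_prompt(question, year), return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    reply = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    print(f"{year}: {reply.strip()}")
```

Does anyone know whether this matches what the authors actually did, e.g. whether the date went into the system prompt or was prepended to the question itself?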