The Q* hypothesis: Tree-of-thoughts reasoning, process reward models, and supercharging synthetic data
Thistleknot@alien.top to LocalLLaMA · English · 1 year ago · 8 comments
Feztopia@alien.top · English · 1 year ago
I just can’t wait until one of the wrong Q* hypotheses turns out to be even better than Q*.