I wonder if this could be used to identify security cam events and notify me.
This is awesome. I’ve been processing video one frame at a time.
Yes! I’ve been waiting for progress in video for a while! Imagine DIY automated classification for the sake of compilations and edits. This is going to be sick! Can’t wait to see an implementation in llama.cpp.
Holy shit. I’ve been holding off on looking too deeply into LLaVA given how many things are always popping up. But that’s just too cool to pass up on. The amount of potential applications, if it works as well as I’m hoping, is wild.
If only we could use custom LLMs to write descriptions.
Amazing!
Does it work with GGML or 4-bit quantized GPTQ models?