I just think you have to set proper expectations. I use LLaVA with my security cameras and it does what I want, which is to tell me when something interesting is happening, such as when it sees someone. LLaVA gave me this description from one of my security cameras earlier this morning:
The image features a person walking on a street, captured through a fisheye lens, which distorts the perspective of the scene. The person appears to be carrying a bag, possibly a backpack, while walking down the sidewalk.
Would love to use this for handling remote security camera footage.
Tried LLaVA with little success. Has anyone successfully applied any of the open vision models to the problem of security?
IMO a description like that is very useful.
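For anyone who wants to try the "tell me when it sees someone" setup described above, here is a rough sketch of one way it could be wired up. It is not the poster's actual setup. It assumes LLaVA is served locally through Ollama's `/api/generate` endpoint (the thread does not say how the model is run), and the keyword list, prompt, and function names are hypothetical choices made for illustration.

```python
import base64
import json
import re
import urllib.request

# Assumption: LLaVA is served by a local Ollama instance on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical list of words that make a frame "interesting".
KEYWORDS = ("person", "people", "someone", "man", "woman")


def build_payload(image_bytes, prompt="Describe what is happening in this image.",
                  model="llava"):
    """Build the JSON body for a single non-streaming request with one image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_frame(image_bytes, url=OLLAMA_URL, timeout=120):
    """Send one camera frame to the model and return its text description."""
    body = json.dumps(build_payload(image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


def is_interesting(description, keywords=KEYWORDS):
    """Return True if any keyword appears as a whole word in the description."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    return any(keyword in words for keyword in keywords)
```

With a frame saved to disk, something like `is_interesting(describe_frame(open("frame.jpg", "rb").read()))` would decide whether to raise an alert. Plain keyword matching is crude, but it fits the "set proper expectations" point: the model only has to mention a person, not understand the scene.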