ShareGPT4V - New multi-modal model, improves on LLaVA

Cradawx@alien.top · 2 years ago

yahma@alien.top · 2 years ago

Would love to use this for handling remote security camera footage.

I tried it with LLaVA, with little success. Has anyone successfully applied any of the open vision models to security camera footage?

fallingdowndizzyvr@alien.top · 2 years ago

I just think you have to set proper expectations. I use LLaVA with my security cameras and it does what I want, which is to let me know when something interesting is happening, such as when it sees someone. LLaVA gave me this description from one of my security cameras earlier this morning:

The image features a person walking on a street, captured through a fisheye lens, which distorts the perspective of the scene. The person appears to be carrying a bag, possibly a backpack, while walking down the sidewalk.

Which IMO is very useful.
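
For anyone wondering how a caption like that could become a "something interesting is happening" signal, here is a minimal sketch. It assumes you already have the caption text from the model. The keyword list and the `is_interesting` helper are hypothetical names made up for illustration, not part of LLaVA or my actual setup.

```python
import re

# Hypothetical keyword list (an assumption); tune it for your own cameras.
INTERESTING_WORDS = {"person", "people", "man", "woman", "child", "someone"}


def is_interesting(caption: str, keywords=INTERESTING_WORDS) -> bool:
    """Return True if the caption mentions any keyword as a whole word."""
    # Split into lowercase words so "woman" does not match "man" by substring.
    words = set(re.findall(r"[a-z]+", caption.lower()))
    return not words.isdisjoint(keywords)


caption = (
    "The image features a person walking on a street, captured through "
    "a fisheye lens, which distorts the perspective of the scene."
)
print(is_interesting(caption))
print(is_interesting("An empty driveway with a parked car."))
```

Plain keyword matching is crude, but it fits the "proper expectations" point: the model only has to describe the scene, and a simple filter decides whether to send an alert.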