In the case of Petals, where any client can drop off at any time, each client would need to hold extra layers for redundancy. Maybe not the full weights, but at least 20-30%, so that if someone drops off, another client can take over instantly.
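As a rough illustration of the overlap idea (this is not Petals' actual scheduling logic, just a sketch of assigning layers with ~25% redundancy):

```python
# Sketch: split transformer layers across clients, with each client
# also caching the start of the next client's block. If a neighbor
# drops off, another client already holds its first layers.

def assign_layers(num_layers: int, num_clients: int, overlap: float = 0.25):
    per_client = num_layers // num_clients
    extra = max(1, int(per_client * overlap))
    assignments = []
    for c in range(num_clients):
        start = c * per_client
        # hold own block plus `extra` layers of the next block (wrapping)
        layers = [l % num_layers for l in range(start, start + per_client + extra)]
        assignments.append(layers)
    return assignments

print(assign_layers(num_layers=32, num_clients=4))
# client 0 gets layers 0-9, client 1 gets 8-17, etc.
```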
Yes, you are right. Although I guess it could work in Petals as well: if each person has the full model downloaded, the GPU can be instructed to load the next set of weights locally when it is done with the current one?
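Something along those lines is doable with plain PyTorch. A minimal sketch of streaming layers through one GPU (the per-layer weight files and the `build_layer()` helper are hypothetical, not a real API):

```python
import torch

# Assumes each layer's state dict was saved to its own file
# (layer_0.pt, layer_1.pt, ...) and build_layer(i) reconstructs
# an empty layer module on the CPU -- both are assumptions.

def run_sequential(x, num_layers, build_layer, device="cuda"):
    x = x.to(device)
    for i in range(num_layers):
        layer = build_layer(i)
        layer.load_state_dict(torch.load(f"layer_{i}.pt"))
        layer.to(device)                 # stream this layer's weights in
        with torch.no_grad():
            x = layer(x)
        del layer                        # free VRAM for the next layer
        torch.cuda.empty_cache()
    return x
```

The tradeoff is that you pay the disk-to-GPU transfer time on every layer for every forward pass, which is why this is slow compared to keeping the model resident.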
Isn’t that how things like petals.dev work?
https://continue.dev. It supports many LLMs.
You might want to look into LlamaIndex’s SEC Insights repo: https://github.com/run-llama/sec-insights. They do a lot of parsing of financial documents.
Cost is really the main issue. You can train a local LLM, or fine-tune ChatGPT as well. I wouldn’t be surprised if someone is already making a custom GPT to help with Unity or Unreal Engine projects.
For privacy, companies with money will use a private instance on Azure. It is like 2-3 times the cost, but your data is safe: you have a contract with Microsoft to keep it safe and private, with large financial penalties if it isn’t.
Also, running an LLM locally isn’t zero cost, depending on the electricity price in your area. GPUs consume a LOT of power; the 4090 is rated at 450 watts.
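For a back-of-the-envelope estimate (hours per day and electricity price are assumptions, plug in your own numbers):

```python
# Rough electricity cost of local inference on a 450 W GPU.
watts = 450
hours_per_day = 8
price_per_kwh = 0.15  # USD; varies a lot by region

kwh_per_day = watts / 1000 * hours_per_day         # 3.6 kWh
cost_per_month = kwh_per_day * price_per_kwh * 30  # ~$16/month
print(f"{kwh_per_day:.1f} kWh/day, ~${cost_per_month:.2f}/month")
```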
Please look up fine-tuning and LoRA; those are the methods to “evolve” a model after it is born.
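For a taste of what that looks like in code, here is a minimal LoRA setup with Hugging Face’s peft library (the model name and hyperparameters are placeholders, not a recipe):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder base model -- substitute whatever you are adapting.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of the full model
```

The appeal of LoRA is that only the small adapter matrices are trained, so the VRAM and storage cost is a fraction of full fine-tuning.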
It does exist, but it really only works when you have very high-speed, low-latency connections between the machines, like InfiniBand.
If you just want to try it out, install privateGPT on your local PC/Mac, no GPU required.
You can try something like Claude.ai, which has a long context window and is free to use.
You can use a Python script to load the model, split the text into chunks, and ask the model to translate chunk by chunk. Then you don’t need a model with a 64K context window (which would take up a lot of memory, and such models are not that common).
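Something like this, where the model name, chunk size, and prompt format are all assumptions to adapt to your setup:

```python
from transformers import pipeline

# Placeholder model -- point this at whatever local model you run.
pipe = pipeline("text-generation", model="some-local-model", device=0)

def translate(text: str, chunk_chars: int = 2000) -> str:
    # Naive fixed-size chunking; splitting on paragraph boundaries
    # instead would avoid cutting sentences in half.
    chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
    out = []
    for chunk in chunks:
        prompt = f"Translate the following text to English:\n\n{chunk}\n\nTranslation:"
        result = pipe(prompt, max_new_tokens=1024)[0]["generated_text"]
        out.append(result[len(prompt):].strip())  # keep only the completion
    return "\n".join(out)
```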
It also depends on the language you are trying to translate. It would be best to find models that have been trained on the original language. Most models have a large English corpus, and many are fine-tuned with Chinese data, but there are specialty models for German/Arabic/Japanese. Try a Google search or look on Hugging Face.
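You can also search the Hub programmatically with the huggingface_hub library (the search term here is just an example):

```python
from huggingface_hub import list_models

# Most-downloaded models matching "japanese" -- swap in your language.
for m in list_models(search="japanese", sort="downloads", direction=-1, limit=5):
    print(m.id)
```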