Google released T5X checkpoints for MADLAD-400 a couple of months ago, but nobody could figure out how to run them. Turns out the vocabulary was wrong, but they uploaded the correct one last week.
I’ve converted the models to the safetensors format, and I created this space if you want to try the smaller model.
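In case it helps anyone getting started: translation with these checkpoints is driven by a target-language token like `<2de>` prepended to the input. A minimal sketch with transformers, assuming the converted `jbochi/madlad400-3b-mt` repo on the Hub (the `make_prompt` helper is mine, not part of the release):

```python
def make_prompt(target_lang: str, text: str) -> str:
    """Prepend the MADLAD-style target-language token, e.g. <2de> for German."""
    return f"<2{target_lang}> {text}"

if __name__ == "__main__":
    # Imported here so the helper above is usable even without transformers installed.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    model_name = "jbochi/madlad400-3b-mt"  # the converted safetensors repo
    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    inputs = tokenizer(make_prompt("de", "How are you, my friend?"), return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```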
I also published quantized GGUF weights you can use with candle. It decodes at ~15 tokens/s on an M2 Mac.
It seems that NLLB is the most popular machine translation model right now, but its license only allows non-commercial use. MADLAD-400 is CC BY 4.0.
If anything needed some minimalist app, this would be it.
Nice, I will check MADLAD later. I thought SeamlessM4T was the best translation model from Meta; I didn't even know NLLB existed. Has anyone used both who can point out the differences? SeamlessM4T seemed amazingly good in my experience, but it supports fewer languages perhaps, idk.
Nice, thank you!! Tried it in the space. Works well for me. Noob question: since it's GGUF, can I run this with llama.cpp? Can I download it and run it locally?
I tested the 3B model on Romanian, Russian, French, and German translations of "The sun rises in the East and sets in the West." and it works 100%: it gets 10/10 from ChatGPT.
@jbochi, is it possible to run the cargo example with batch inputs?
```shell
cargo run --example t5 --release --features cuda -- \
  --model-id "jbochi/madlad400-3b-mt" \
  --prompt "<2de> How are you, my friend?" \
  --temperature 0
```
Thanks
Yes, I would be interested to know if this is possible
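I'm not sure whether the candle `t5` example exposes batching, but with the transformers checkpoints you can batch by padding the tokenized prompts. A sketch under that assumption (model id taken from the command above; the `make_batch` helper is mine):

```python
def make_batch(target_lang: str, texts: list[str]) -> list[str]:
    """Prefix each input with the <2xx> target-language token."""
    return [f"<2{target_lang}> {t}" for t in texts]

if __name__ == "__main__":
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    model_name = "jbochi/madlad400-3b-mt"
    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    prompts = make_batch("de", ["How are you, my friend?", "The sun rises in the East."])
    # padding=True pads the shorter prompts so the batch fits in one tensor.
    inputs = tokenizer(prompts, return_tensors="pt", padding=True)
    out = model.generate(**inputs, max_new_tokens=64)
    for line in tokenizer.batch_decode(out, skip_special_tokens=True):
        print(line)
```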
Could you please convert the other versions as well or release the code you used ?
I tested two sentences: one from Hindi to English, which it translated fine. The other was romanized Hindi, which it couldn't handle. Input: "Sir mera dhaan ka fasal hai". The output was the same as the input. Both ChatGPT and Google Translate can handle this.
Not sure if this has been asked yet, but how good are the translations from this model compared to normal GPT-3.5 and Claude?
Thanks.
Good question. ALMA compares itself against NLLB and GPT-3.5, and the 13B barely surpasses GPT-3.5. MADLAD-400 probably beats GPT-3.5 on lower-resource languages only.
Hi, I get the following error when trying to run it with transformers, copying the code provided on Hugging Face:
```
Traceback (most recent call last):
  File "/home/XXX/project/translation/translateMADLAD.py", line 10, in <module>
    tokenizer = T5Tokenizer.from_pretrained('jbochi/madlad400-3b-mt')
  File "/home/lXXX/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1841, in from_pretrained
    return cls._from_pretrained(
  File "/home/lXXX/.local/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2060, in _from_pretrained
    raise ValueError(
ValueError: Non-consecutive added token '' found. Should have index 256100 but has index 256000 in saved vocabulary.
```
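Per the first comment in this thread, the originally uploaded vocabulary was wrong and was later fixed, so one guess (an assumption on my part, not a confirmed fix) is that a stale cached vocab file is being picked up. Forcing a fresh download is cheap to try:

```python
def fresh_download_kwargs() -> dict:
    """kwargs that make from_pretrained bypass any stale local cache."""
    return {"force_download": True}

if __name__ == "__main__":
    from transformers import T5Tokenizer

    # Re-fetch the (fixed) vocabulary instead of reusing a cached copy.
    tokenizer = T5Tokenizer.from_pretrained("jbochi/madlad400-3b-mt", **fresh_download_kwargs())
    print(tokenizer.vocab_size)
```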
What would be the equivalent models based on open source and free for commercial use?
Meta’s NLLB is supposed to be the best translator model, right? But it’s for non-commercial use only. How does MADLAD compare to NLLB?
The MADLAD-400 paper has a bunch of comparisons with NLLB. MADLAD beats NLLB on some benchmarks, is quite close on others, and loses some. But the largest MADLAD is 5x smaller than the original NLLB, and it supports over 2x as many languages.
NLLB has horrible performance. I've done extensive testing with it and wouldn't even translate a children's book with it. Google Translate does a much better job, and that's saying something. lol
Most people only need a few languages, such as en, cn, and jp. If there were versions for specific language combinations, I would use them to develop my own translation application.
Check the OPUS models by Helsinki-NLP: https://huggingface.co/Helsinki-NLP?sort_models=downloads#models
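Those OPUS-MT models are one small model per language pair, with Hub ids following an `opus-mt-<src>-<tgt>` pattern. Not every pair exists, so treat the exact repo name in this sketch as an assumption to verify on the Hub first:

```python
def opus_mt_repo(src: str, tgt: str) -> str:
    """Hub repo id for a Helsinki-NLP OPUS-MT language pair."""
    return f"Helsinki-NLP/opus-mt-{src}-{tgt}"

if __name__ == "__main__":
    from transformers import pipeline

    # Small per-pair model; check that the pair exists on the Hub before relying on it.
    translator = pipeline("translation", model=opus_mt_repo("en", "zh"))
    print(translator("How are you, my friend?")[0]["translation_text"])
```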
n00b here. can it run in oobabooga?
It should. Support for T5 based models was added in https://github.com/oobabooga/text-generation-webui/pull/1535
Yes, it indeed works. I managed to run the 10B model on CPU; it uses 40 GB of RAM, but somehow I felt like your 3B space gave me a better translation.
How do you load the model? I pasted jbochi/madlad400-3b-mt in the download-model field and used the "transformers" model loader, but it can't handle it: OSError: It looks like the config file at 'models/model.safetensors' is not a valid JSON file.
I think I did exactly like you say, so I have no idea why you got an error.
Why the shitty name?
Gibberish names have been a thing since the '90s. It's hard coming up with a name when everyone is racing to create the next Big Thing. Also, I think techies are more tolerant of cumbersome names/domains.
koboldcpp 1.46.1 (from October) says "ERROR: Detected unimplemented GGUF Arch". It's best to get the newest version of the backend.