Invalid parameters error FAMultiMapPack_fixed.cpp #138

Open
Kaelorn opened this issue Jan 17, 2022 · 0 comments


Kaelorn commented Jan 17, 2022

I am new to BlingFire and am trying to build a tokenizer for this model: https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment
I followed this tutorial: https://github.com/Microsoft/BlingFire/wiki/How-to-add-a-new-BERT-tokenizer-model
The compile took so long that I decided to follow the advice in issue #92.
I ended up with this script:

# Preparing a new tokenizer

# Initial Steps

mkdir bert_multi_uncased
cp bert_base_tok/* bert_multi_uncased
cd bert_multi_uncased
chmod 777 ./options.small
sed -i "5s#.*#OUTPUT = bert_multi_uncased.bin#" "./options.small"
sed -i "9s#.*#opt_build_wbd = --dict-root=. --full-unicode --no-min#" "./options.small"

# Enable Normalization

python gen_charmap.py > charmap.utf8

# Add a New vocab.txt

wget https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment/resolve/main/vocab.txt -O ./vocab.txt
python3 vocab_to_fa_lex.py

chmod 777 ./wbd.lex.utf8
sed -i "21s#.*#_include bert_multi_uncased/vocab.falex#" "./wbd.lex.utf8"

# Compile Your New Model

cd ..
make -f Makefile.gnu lang=bert_multi_uncased all

After 5 days of compilation, the build failed with the following error:

ERROR: Invalid parameters. in /home/.../BlingFire/blingfirecompile.library/src/FAMultiMapPack_fixed.cpp at line 131 in program fa_fsm2fsm_pack

What did I miss?
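
For anyone reproducing this, the vocab.txt downloaded in the script above can be summarized before starting the long compile. The sketch below is only an input sanity check: `summarize_vocab` is a hypothetical helper, not part of BlingFire, and it assumes one token per line. It reports the entry count, blank lines, duplicate tokens, and the longest token length; it does not diagnose the `fa_fsm2fsm_pack` error itself.

```python
from collections import Counter
from typing import Iterable


def summarize_vocab(lines: Iterable[str]) -> dict:
    """Summarize a BERT-style vocab listing (assumed: one token per line)."""
    # Strip only line terminators so tokens keep any other whitespace.
    tokens = [line.rstrip("\r\n") for line in lines]
    # Count non-blank tokens to find entries that appear more than once.
    counts = Counter(t for t in tokens if t.strip())
    return {
        "entries": len(tokens),
        "blank": sum(1 for t in tokens if not t.strip()),
        "duplicates": sorted(t for t, n in counts.items() if n > 1),
        "longest": max((len(t) for t in tokens), default=0),
    }


# Example usage (hypothetical path, relative to the bert_multi_uncased directory):
#
#     with open("vocab.txt", encoding="utf-8") as fh:
#         print(summarize_vocab(fh))
```

Comparing these numbers against the vocab.txt shipped in bert_base_tok might show how much larger the multilingual vocabulary is, which could be relevant to the compile time.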
