I am new to BlingFire and am trying to build a tokenizer for this model: https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment
I followed this tutorial: https://github.com/Microsoft/BlingFire/wiki/How-to-add-a-new-BERT-tokenizer-model
The compilation took so long that I decided to follow the advice in this issue: #92
I ended up with this script:
# Preparing a new tokenizer
# Initial Steps
mkdir bert_multi_uncased
cp bert_base_tok/* bert_multi_uncased
cd bert_multi_uncased
chmod 777 ./options.small
sed -i "5s#.*#OUTPUT = bert_multi_uncased.bin#" "./options.small"
sed -i "9s#.*#opt_build_wbd = --dict-root=. --full-unicode --no-min#" "./options.small"
# Enable Normalization
python gen_charmap.py > charmap.utf8
# Add a New vocab.txt
wget https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment/resolve/main/vocab.txt -O ./vocab.txt
python3 vocab_to_fa_lex.py
chmod 777 ./wbd.lex.utf8
sed -i "21s#.*#_include bert_multi_uncased/vocab.falex#" "./wbd.lex.utf8"
# Compile Your New Model
cd ..
make -f Makefile.gnu lang=bert_multi_uncased all
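The sed commands in this script replace lines 5, 9 and 21 purely by number, so if the copied files are laid out differently from what the tutorial expects, the wrong line gets overwritten without any warning. Whether that is related to the error is not known. The following is only a minimal sketch of a guard: `set_line_if_key` is a hypothetical helper (not part of BlingFire), and the demo file is a made-up stand-in whose contents are an assumption, not the real `options.small`.

```shell
#!/bin/sh
# Hypothetical helper: replace line LINENO of FILE with NEWTEXT, but only if
# the current line starts with KEY. Otherwise leave the file untouched.
set_line_if_key() {
    file=$1 lineno=$2 key=$3 newtext=$4
    current=$(sed -n "${lineno}p" "$file")
    case "$current" in
        "$key"*) ;;  # the line looks like the one we meant to edit
        *)
            echo "refusing to edit $file: line $lineno is '$current', expected prefix '$key'" >&2
            return 1
            ;;
    esac
    # Pass the replacement through the environment so that characters such as
    # '#' or '/' need no escaping, then swap the edited copy into place.
    NEWTEXT=$newtext awk -v n="$lineno" \
        'NR == n { print ENVIRON["NEWTEXT"]; next } { print }' \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Demo on a made-up file (assumed layout, NOT the real options.small).
demo=$(mktemp)
printf '%s\n' \
    '# made-up stand-in for options.small' \
    'OUTPUT = bert_base_tok.bin' \
    'opt_build_wbd = --dict-root=.' > "$demo"

# Line 2 starts with OUTPUT, so it is replaced.
set_line_if_key "$demo" 2 "OUTPUT" "OUTPUT = bert_multi_uncased.bin"

# Line 1 does not start with OUTPUT, so the helper refuses and changes nothing.
set_line_if_key "$demo" 1 "OUTPUT" "OUTPUT = oops.bin" || echo "line 1 left untouched"

cat "$demo"
```

In the script above, the equivalent calls would use line 5 with the prefix `OUTPUT` and line 9 with the prefix `opt_build_wbd` for `./options.small`, and line 21 with the prefix `_include` for `./wbd.lex.utf8`. Those prefixes are inferred from the replacement strings, so a quick `sed -n '5p;9p' ./options.small` beforehand would confirm them.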
However, after 5 days of compilation I got the following error:
ERROR: Invalid parameters. in /home/.../BlingFire/blingfirecompile.library/src/FAMultiMapPack_fixed.cpp at line 131 in program fa_fsm2fsm_pack
What did I miss?