Lernapparat

German LM for the Fast AI model zoo (work in progress)

June 4, 2018

Over at the excellent fast.ai course and website, they are building a language model zoo.

It's a charming idea, and here are the (not quite complete yet) code and the model I got for German.

I'll write some more later (I'm busy and excited with the first steps of my very fine new machine learning consulting company, MathInf GmbH), but for now here is the code. It assumes that you have preprocessed Wikipedia with this preparation notebook.
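
If you want to check that the preprocessing went through, the training script below expects two NumPy files of numericalized articles and a pickled vocabulary under the data directory. Here is a minimal sanity check; it is only a sketch, assuming the file names the training script uses, and the decoding line is just for illustration:

import numpy as np
import pickle
from pathlib import Path

PATH = Path('~/data/nlp/german_lm/data/de_wiki').expanduser()
trn_lm = np.load(PATH / 'tmp/trn_ids.npy')             # one int array of token ids per article
itos = pickle.load(open(PATH / 'tmp/itos.pkl', 'rb'))  # list mapping id -> token string
print(f'{len(trn_lm)} articles, {len(itos)} vocabulary entries')
print(' '.join(itos[i] for i in trn_lm[0][:20]))       # decode the start of the first article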

The code has been adapted from the fast.ai library and Sylvain Gugger's posts on the fast.ai forum. (Thanks!)

from fastai.learner import *
from fastai.rnn_reg import *
from fastai.rnn_train import *
from fastai.text import *
from fastai.lm_rnn import *
prefix = 'de_wiki'
torch.cuda.set_device(0)

cl = 10            # cycle length in epochs for the training run below
bs = 48            # batch size
backwards = False  # set to True to train a backwards model
lr = 8             # SGD learning rate for the 1cycle schedule
startat = 0        # only printed below
sampled = True     # only printed below
preload = False    # not used below
bptt = 70          # backprop-through-time sequence length
em_sz, nh, nl = 400, 1150, 3  # embedding size, hidden size, number of LSTM layers
wd = 1e-7          # weight decay

print(f'prefix {prefix}; cl {cl}; bs {bs}; backwards {backwards}; sampled {sampled}; '
      f'lr {lr}; startat {startat}')
PRE = 'bwd_' if backwards else 'fwd_'  # file-name prefixes; neither is used below
PRE2 = 'bwd_'
IDS = 'ids'

NLPPATH = Path('~/data/nlp/german_lm/data/').expanduser()
PATH = NLPPATH / prefix
opt_fn = partial(optim.SGD, momentum=0.9, weight_decay=wd)  # plain SGD with momentum
# Load the numericalized articles (one int array of token ids per article).
if backwards:
    trn_lm = np.load(PATH / f'tmp/trn_{IDS}_bwd.npy')
    val_lm = np.load(PATH / f'tmp/val_{IDS}_bwd.npy')
else:
    trn_lm = np.load(PATH / f'tmp/trn_{IDS}.npy')
    val_lm = np.load(PATH / f'tmp/val_{IDS}.npy')

# Concatenate all articles into one long token stream and cap the lengths.
trn_lm = np.concatenate(trn_lm)
val_lm = np.concatenate(val_lm)
MAX_TRN_TOKENS = 100_000_000
MAX_VAL_TOKENS = 10_000_000
trn_lm = trn_lm[:MAX_TRN_TOKENS]
val_lm = val_lm[:MAX_VAL_TOKENS]

itos = pickle.load(open(PATH / 'tmp/itos.pkl', 'rb'))  # id -> token mapping
vs = len(itos)                                         # vocabulary size
# Data loaders over the full token streams and the model data object
# (the 1 is the padding token index).
trn_dl = LanguageModelLoader(trn_lm, bs, bptt)
val_dl = LanguageModelLoader(val_lm, bs, bptt)
md = LanguageModelData(PATH, 1, vs, trn_dl, val_dl, bs=bs, bptt=bptt)

# The usual AWD-LSTM dropouts, scaled down by 0.1 for the large corpus.
drops = np.array([0.25, 0.1, 0.2, 0.02, 0.15]) * 0.1
learner = md.get_model(opt_fn, em_sz, nh, nl,
    dropouti=drops[0], dropout=drops[1], wdrop=drops[2], dropoute=drops[3], dropouth=drops[4])
learner.metrics = [accuracy]
learner.clip = 0.2  # gradient clipping
learner.unfreeze()
# One 1cycle phase of cl (= 10) epochs.
learner.fit(lr, 1, cycle_len=cl, use_clr_beta=(10, 10, 0.95, 0.85))
learner.save("DE_model_trained2")
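
To get a feel for the result, you can greedily continue a prompt. The following is only a sketch along the lines of the sampling cells in the fast.ai notebooks: I am assuming the fastai 0.7 conventions that learner.model is a SequentialRNN returning the decoder logits as its first output and that m[0].bs sets the batch size; the prompt is made up and its tokens need to be in the vocabulary.

stoi = {s: i for i, s in enumerate(itos)}  # token -> id, the inverse of itos

m = learner.model
m[0].bs = 1   # sample a single sequence
m.eval()
m.reset()

ids = [stoi[w] for w in 'die deutsche sprache'.split()]
res, *_ = m(V(np.array(ids)).unsqueeze(1))  # feed the prompt, keeping the hidden state
for _ in range(50):
    next_id = int(to_np(res[-1]).argmax())  # greedy: take the most likely next token
    ids.append(next_id)
    res, *_ = m(V(np.array([next_id])).unsqueeze(1))  # feed it back in
print(' '.join(itos[i] for i in ids))

m[0].bs = bs  # restore the training batch size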

And here is the model that was generated: