tiny-llm-demo

tiny-llm-demo - small plain-Python LLM learning demos.
git clone git://git.beep.wimdupont.com/tiny-llm-demo.git

corpus.txt (641B)


language models predict the next token.
language models start as code and random numbers.
language models learn by repeated correction.

the code defines the model.
the weights start random.
training changes the weights.
prediction creates error.
error updates the weights.

humans write the tokenizer.
humans write the training loop.
humans choose the architecture.
the model learns the numerical patterns.

read tokens.
predict next token.
measure error.
adjust weights.
repeat.

small models show the same loop in a simpler form.
large models scale the same loop up.
the structure is chosen by people.
the behavior is shaped by training.
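
Note (not part of corpus.txt): the loop the corpus describes (read tokens, predict next token, measure error, adjust weights, repeat) can be sketched in plain Python. This is a hypothetical illustration, not the repository's code. The whitespace tokenizer, the bigram weight table, the function names (tokenize, softmax, train, predict_next) and the hyperparameters are all assumptions chosen to keep the example short.

```python
import math
import random

# A few lines copied from corpus.txt; everything else here is hypothetical.
CORPUS = (
    "language models predict the next token.\n"
    "the weights start random.\n"
    "training changes the weights.\n"
    "error updates the weights.\n"
)


def tokenize(text):
    """Whitespace tokenizer (an assumption; the real demo may tokenize differently)."""
    return text.split()


def softmax(scores):
    """Turn raw scores into probabilities, subtracting the max for stability."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(text, epochs=200, lr=0.5, seed=0):
    """Train a bigram table where weights[i][j] scores token j following token i."""
    rng = random.Random(seed)
    tokens = tokenize(text)
    vocab = sorted(set(tokens))
    index = {tok: i for i, tok in enumerate(vocab)}
    ids = [index[t] for t in tokens]                    # read tokens
    size = len(vocab)
    # The weights start random.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(size)] for _ in range(size)]
    losses = []
    for _ in range(epochs):                             # repeat
        total = 0.0
        for cur, nxt in zip(ids, ids[1:]):
            probs = softmax(weights[cur])               # predict next token
            total += -math.log(probs[nxt])              # measure error (cross-entropy)
            for j in range(size):                       # adjust weights
                grad = probs[j] - (1.0 if j == nxt else 0.0)
                weights[cur][j] -= lr * grad
        losses.append(total / (len(ids) - 1))           # mean error for this epoch
    return vocab, index, weights, losses


def predict_next(token, vocab, index, weights):
    """Return the token with the highest predicted probability after `token`."""
    probs = softmax(weights[index[token]])
    return vocab[probs.index(max(probs))]


if __name__ == "__main__":
    vocab, index, weights, losses = train(CORPUS)
    print("first epoch mean error:", round(losses[0], 3))
    print("last epoch mean error:", round(losses[-1], 3))
    print("after 'language':", predict_next("language", vocab, index, weights))
```

The code fixes the structure (tokenizer, table shape, update rule), and only the numbers in weights change during training, which is the split the corpus draws between what people choose and what the model learns.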