[[llms]]
# XTuner
### Single-turn and multi-turn conversation datasets
A single-turn dataset is effective for simple *FAQ bots and text-classification tasks.*
A multi-turn conversation dataset is required for applications that need sustained (long-running) interaction, like *customer support, mental health counselling, and talkbot robots.*
#### Incremental pre-training:
training Llama 2 on a Nepali corpus to boost its Nepali language understanding.
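In data terms this is just raw corpus text: the labels equal the input ids, so the next-token loss covers every position. A minimal sketch (the field name and sentence are placeholders, not necessarily XTuner's exact schema):

```python
# Incremental pre-training sample: raw text, no prompt/response split,
# so the language-modeling loss is computed on every token.
sample = {"text": "<one sentence from the Nepali corpus>"}
```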
**For instruction tuning, the loss on the response (output) tokens is used for weight updates, while the loss on the instruction part (system/user input) is ignored.**
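A minimal sketch of this masking, assuming a Hugging Face tokenizer and PyTorch's convention that label `-100` is ignored by the cross-entropy loss (the model name and strings are placeholders):

```python
# Sketch: mask instruction tokens so only the response contributes to the
# loss. -100 is the ignore_index of torch.nn.CrossEntropyLoss, so those
# positions are skipped when the loss is computed.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder

instruction = "Translate to Nepali: Hello!"
response = "नमस्ते!"

prompt_ids = tokenizer(instruction, add_special_tokens=False).input_ids
response_ids = tokenizer(response, add_special_tokens=False).input_ids

input_ids = prompt_ids + response_ids
labels = [-100] * len(prompt_ids) + response_ids  # loss only on the response
```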
*amalgamate: combine*
#### Multi-turn conversation dataset sample:
```
<|system|> You are a helpful assistant.
<|user|> What is the capital of France?
<|assistant|> Paris is the capital of France.
<|user|> What's the population?
<|assistant|> About 67 million people live in France.
<|user|> Who is the president?
<|assistant|> Emmanuel Macron is the current president.
```
**XTuner uses its own method to deal with multi-turn conversation datasets:**
i. concatenate the full conversation into one sequence.
ii. add special **<|user|> and <|assistant|>** tokens to *mark who said what.*
iii. compute the loss only for *assistant tokens* (a loss mask is used: 1 means compute the loss, 0 means ignore; see the sketch after this list).
iv. training becomes fast and efficient.
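A rough sketch of steps i–iii, reusing the `<|user|>`/`<|assistant|>` markers from the sample above (the helper name is mine; XTuner's real implementation differs in detail):

```python
# Pack a multi-turn dialogue into one sequence and build a loss mask:
# 1 on assistant tokens (compute loss), 0 elsewhere (ignore).
def pack_dialogue(turns, tokenizer):
    """turns: list of (role, text) pairs; role is "system", "user" or "assistant"."""
    input_ids, loss_mask = [], []
    for role, text in turns:
        ids = tokenizer(f"<|{role}|> {text}", add_special_tokens=False).input_ids
        input_ids.extend(ids)
        flag = 1 if role == "assistant" else 0
        loss_mask.extend([flag] * len(ids))
    return input_ids, loss_mask

turns = [
    ("system", "You are a helpful assistant."),
    ("user", "What is the capital of France?"),
    ("assistant", "Paris is the capital of France."),
]
input_ids, loss_mask = pack_dialogue(turns, tokenizer)  # tokenizer from above
```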
**The Alpaca dataset, a single-turn dataset, was generated with OpenAI's text-davinci-003 engine.**
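For reference, one Alpaca sample looks like this (the field names are from the released dataset; the values here are illustrative):

```python
# A single-turn Alpaca-style sample: one instruction, one response,
# with an optional "input" field for extra context.
sample = {
    "instruction": "Give three tips for staying healthy.",
    "input": "",
    "output": "1. Eat a balanced diet. 2. Exercise regularly. 3. Sleep well.",
}
```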
[arxiv_dataset](https://www.kaggle.com/datasets/Cornell-University/arxiv)
[MOSS: an open conversational llm](https://link.springer.com/article/10.1007/s11633-024-1502-8)
a 16B-parameter model that can follow a variety of instructions in *multi-turn interactions with humans*.
**Datasets** are also provided for *SFT*:
[moss-003-sft](https://github.com/OpenLMLab/MOSS/tree/main/SFT_data)
a multi-turn dataset with 1.1 million dialogue samples *(fully open-source)*
### Preference-aware training
a method to explicitly align the model with human preferences during training (RLHF).
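For intuition, preference data is usually stored as chosen/rejected pairs that the preference loss contrasts; a minimal sketch (field names follow common convention, not necessarily XTuner's exact schema):

```python
# One hypothetical preference sample: training pushes the model toward
# the "chosen" answer and away from the "rejected" one.
sample = {
    "prompt": "Summarize gradient clipping in one sentence.",
    "chosen": "Gradient clipping rescales overly large gradients so their "
              "norm stays below a threshold, stabilizing training.",
    "rejected": "Gradient clipping makes training faster by deleting gradients.",
}
```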
#### Spinning up a training job with XTuner
1. SLURM (Simple Linux Utility for Resource Management):
a fault-tolerant and highly scalable cluster-management and job-scheduling system.
It manages resources (CPU, GPU, RAM, nodes) on Linux machines.
Reference command: **srun** (see the sketch after this list).
2. Kubernetes:
a container-orchestration platform, used with XTuner to orchestrate containerized training jobs across multiple nodes.
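A hypothetical way to submit such a job from Python; the partition, GPU count, and config name are assumptions, while `xtuner train` and `--launcher slurm` come from XTuner's CLI:

```python
# Submit an XTuner training job to a SLURM cluster via srun.
import subprocess

cmd = [
    "srun",
    "--partition=gpu",     # assumed partition name
    "--gres=gpu:8",        # request 8 GPUs
    "xtuner", "train",
    "internlm2_chat_7b_qlora_oasst1_e3",  # example built-in config name
    "--launcher", "slurm",
]
subprocess.run(cmd, check=True)
```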
---
[**accumulative_counts = 4** *(we do 4 forward/backward passes before stepping the optimizer, so with a per-device batch size of 1 it effectively behaves like batch size = 4)*]
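Schematically, what this setting does inside the training loop (model, dataloader, and optimizer are assumed to exist; XTuner handles this internally):

```python
# Gradient accumulation with accumulative_counts = 4: sum gradients from
# 4 micro-batches, then take a single optimizer step.
accumulative_counts = 4

for step, batch in enumerate(dataloader):
    loss = model(**batch).loss / accumulative_counts  # scale so grads average
    loss.backward()                                   # grads accumulate in .grad
    if (step + 1) % accumulative_counts == 0:
        optimizer.step()
        optimizer.zero_grad()
```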
##### Norm-based gradient clipping
rescales the gradient vector g to g · (max_norm / ||g||) when ||g|| > max_norm (here max_norm = 1), capping its magnitude at 1.
if ||g|| <= 1, the gradient is left unchanged.
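The same rule written out by hand; PyTorch's built-in `torch.nn.utils.clip_grad_norm_` implements it (the helper below is my own sketch):

```python
import torch

def clip_grad_norm(params, max_norm=1.0):
    # Global L2 norm over all parameter gradients.
    grads = [p.grad for p in params if p.grad is not None]
    total_norm = torch.norm(torch.stack([g.norm(2) for g in grads]), 2)
    if total_norm > max_norm:
        for g in grads:
            g.mul_(max_norm / total_norm)  # rescale in place; direction preserved
    return total_norm

# Built-in equivalent:
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```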