Saturday, 6 September 2025



[[llms]]

# Xtuner 

### Single-turn and multi-turn conversation datasets

A single-turn dataset is effective for simple *FAQ bots and text-classification tasks.*

A multi-turn conversation dataset is required for applications needing sustained (long-running) interaction, like *customer support, mental-health counselling, and talkbot robots.*
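A hedged sketch of the rough shape of each dataset type, using a conversation-style JSON layout like the one XTuner documents (the exact field names here are my assumption):

    # single-turn: one instruction/response pair per sample
    single_turn = {
        "conversation": [
            {"system": "You are a helpful assistant.",
             "input": "What is the capital of France?",
             "output": "Paris is the capital of France."}
        ]
    }

    # multi-turn: several linked pairs per sample
    multi_turn = {
        "conversation": [
            {"system": "You are a helpful assistant.",
             "input": "What is the capital of France?",
             "output": "Paris is the capital of France."},
            {"input": "What's the population?",
             "output": "About 67 million people live in France."},
        ]
    }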


#### incremental pre-training: 

    training Llama 2 on a Nepali corpus to boost its Nepali language understanding.
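As a rough sketch (my own illustration, not an XTuner sample): incremental pre-training consumes raw text and computes loss on every token, with no instruction/response split to mask.

    # raw-corpus sample for continued pre-training; loss is taken on all tokens
    pretrain_sample = {"text": "नेपाल दक्षिण एसियामा अवस्थित एक देश हो।"}  # "Nepal is a country in South Asia."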


**For instruction tuning, the response-generation (output) loss is used for weight updates, while the loss on the instruction part (system/user input) is ignored.**
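A minimal sketch of that masking, assuming the common HuggingFace-style convention where label -100 is ignored by the cross-entropy loss (the token ids below are made up):

    import torch

    prompt_ids   = [101, 2054, 2003, 1996]   # hypothetical instruction tokens
    response_ids = [3000, 2003, 1996, 102]   # hypothetical response tokens

    input_ids = torch.tensor(prompt_ids + response_ids)
    # -100 masks the instruction part; only response tokens produce loss
    labels = torch.tensor([-100] * len(prompt_ids) + response_ids)

    # CrossEntropyLoss(ignore_index=-100) skips every masked position,
    # so weight updates come from the response-generation loss alone.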


*amalgamate: combine*


#### multi-turn conversation dataset sample: 


    <|system|> You are a helpful assistant.
    <|user|> What is the capital of France?
    <|assistant|> Paris is the capital of France.
    <|user|> What's the population?
    <|assistant|> About 67 million people live in France.
    <|user|> Who is the president?
    <|assistant|> Emmanuel Macron is the current president.



**XTuner uses its own method to deal with multi-turn conversation datasets:**

i. concatenate the full conversation into one sequence.

ii. add special **<|user|>** and **<|assistant|>** tokens to *mark who said what.*

iii. compute the loss only for *assistant tokens*, using a loss mask (1 means compute loss, 0 means ignore); see the sketch after this list.

iv. training becomes fast and efficient.
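A toy sketch of steps i-iii (my own illustration, not XTuner's actual code); a whitespace split stands in for a real tokenizer:

    def encode(text):
        # stand-in tokenizer: splits on whitespace
        return text.split()

    def build_sequence(turns):
        """turns: list of (role, text) pairs -> tokens plus per-token loss mask"""
        tokens, loss_mask = [], []
        for role, text in turns:
            ids = [f"<|{role}|>"] + encode(text)
            tokens += ids
            # 1 = compute loss (assistant tokens), 0 = ignore everything else
            loss_mask += [1 if role == "assistant" else 0] * len(ids)
        return tokens, loss_mask

    tokens, mask = build_sequence([
        ("user", "What is the capital of France?"),
        ("assistant", "Paris is the capital of France."),
    ])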


**The Alpaca dataset was generated with OpenAI's text-davinci-003 engine.** It is a single-turn dataset.



[arxiv_dataset](https://www.kaggle.com/datasets/Cornell-University/arxiv)

[MOSS: an open conversational llm](https://link.springer.com/article/10.1007/s11633-024-1502-8)

A 16B-parameter model that can perform a variety of instructions in *multi-turn interactions with humans*.

**Datasets** are also provided for *SFT*:

[moss-003-sft](https://github.com/OpenLMLab/MOSS/tree/main/SFT_data)

    a multi-turn dataset with 1.1 million dialogue samples *(fully open-source)*


### Preference-aware training

A method to explicitly align the model with human preferences during training (RLHF).
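A hedged sketch of what one preference-data sample often looks like for such training ("chosen"/"rejected" field names are a common convention, not necessarily XTuner's exact schema):

    preference_sample = {
        "prompt":   "Explain gradient clipping in one sentence.",
        "chosen":   "Gradient clipping rescales overly large gradients to stabilize training.",
        "rejected": "Gradients get clipped, whatever that means.",
    }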


#### Spinning up a training job with Xtuner

1. SLURM: Simple Linux Utility for Resource Management,

a fault-tolerant and highly scalable cluster-management and job-scheduling system.

manages resources (CPU, GPU, RAM, nodes) on Linux machines.

reference command: **srun**
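For illustration, a hypothetical SLURM launch (the node/GPU flags and config name are placeholders; `xtuner train CONFIG` is XTuner's CLI entry point):

    srun --nodes=1 --gres=gpu:8 xtuner train my_config.py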

2. Kubernetes

a container-orchestration platform, used with XTuner to orchestrate containerized training jobs across multiple nodes.
 
---

**accumulative_counts = 4** *(We do 4 forward/backward passes before stepping the optimizer, so it effectively behaves like a 4× larger batch size.)*
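A runnable toy sketch of what that accumulation loop does (generic PyTorch, not XTuner internals):

    import torch

    model = torch.nn.Linear(4, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = torch.nn.CrossEntropyLoss()
    data = [(torch.randn(8, 4), torch.randint(0, 2, (8,))) for _ in range(8)]

    accumulative_counts = 4
    optimizer.zero_grad()
    for step, (x, y) in enumerate(data, start=1):
        loss = loss_fn(model(x), y) / accumulative_counts  # scale so summed grads average
        loss.backward()                                    # accumulate gradients
        if step % accumulative_counts == 0:
            optimizer.step()       # one update per 4 forward/backward passes
            optimizer.zero_grad()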


##### norm-based gradient clipping

Rescales the gradient vector g to g / ||g|| when ||g|| > 1, limiting its magnitude (L2 norm) to 1.

If ||g|| <= 1, the gradient is left unchanged.
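In PyTorch this is a one-liner (a generic sketch, not necessarily how XTuner wires it up):

    import torch

    model = torch.nn.Linear(4, 2)
    model(torch.randn(8, 4)).sum().backward()

    # rescales gradients so the global L2 norm is at most 1;
    # if ||g|| <= 1 already, they are left untouched
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)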


