Saturday, 6 September 2025



[[llms]]

# Xtuner 

### Single-turn and multi-turn conversation datasets

A single-turn dataset is effective for simple *FAQ bots and text-classification tasks.*

A multi-turn conversation dataset is required for applications needing sustained (long-running) interaction, like *customer support, mental-health counselling, and talkbot robots.*
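A hedged sketch of the rough shape of each dataset type, using a conversation-style JSON layout like the one XTuner documents (the exact field names here are my assumption):

    # single-turn: one instruction/response pair per sample
    single_turn = {
        "conversation": [
            {"system": "You are a helpful assistant.",
             "input": "What is the capital of France?",
             "output": "Paris is the capital of France."}
        ]
    }

    # multi-turn: several linked pairs per sample
    multi_turn = {
        "conversation": [
            {"system": "You are a helpful assistant.",
             "input": "What is the capital of France?",
             "output": "Paris is the capital of France."},
            {"input": "What's the population?",
             "output": "About 67 million people live in France."},
        ]
    }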


#### incremental pre-training: 

    training Llama 2 on a Nepali corpus to boost its Nepali language understanding.
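As a rough sketch (my own illustration, not an XTuner sample): incremental pre-training consumes raw text and computes loss on every token, with no instruction/response split to mask.

    # raw-corpus sample for continued pre-training; loss is taken on all tokens
    pretrain_sample = {"text": "नेपाल दक्षिण एसियामा अवस्थित एक देश हो।"}  # "Nepal is a country in South Asia."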


**For instruction tuning, the response-generation (output) loss is used for weight updates, while the loss on the instruction part (system/user input) is ignored.**
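A minimal sketch of that masking, assuming the common HuggingFace-style convention where label -100 is ignored by the cross-entropy loss (the token ids below are made up):

    import torch

    prompt_ids   = [101, 2054, 2003, 1996]   # hypothetical instruction tokens
    response_ids = [3000, 2003, 1996, 102]   # hypothetical response tokens

    input_ids = torch.tensor(prompt_ids + response_ids)
    # -100 masks the instruction part; only response tokens produce loss
    labels = torch.tensor([-100] * len(prompt_ids) + response_ids)

    # CrossEntropyLoss(ignore_index=-100) skips every masked position,
    # so weight updates come from the response-generation loss alone.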


*amalgamate: combine*


#### multi-turn conversation dataset sample: 


    <|system|> You are a helpful assistant.
    <|user|> What is the capital of France?
    <|assistant|> Paris is the capital of France.
    <|user|> What's the population?
    <|assistant|> About 67 million people live in France.
    <|user|> Who is the president?
    <|assistant|> Emmanuel Macron is the current president.



**XTuner uses its own method to deal with multi-turn conversation datasets:**

i. concatenate the full conversation into one sequence.

ii. add special **<|user|>** and **<|assistant|>** tokens to *mark who said what.*

iii. compute the loss only for *assistant tokens*, using a loss mask (1 means compute loss, 0 means ignore); see the sketch after this list.

iv. training becomes fast and efficient.
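A toy sketch of steps i-iii (my own illustration, not XTuner's actual code); a whitespace split stands in for a real tokenizer:

    def encode(text):
        # stand-in tokenizer: splits on whitespace
        return text.split()

    def build_sequence(turns):
        """turns: list of (role, text) pairs -> tokens plus per-token loss mask"""
        tokens, loss_mask = [], []
        for role, text in turns:
            ids = [f"<|{role}|>"] + encode(text)
            tokens += ids
            # 1 = compute loss (assistant tokens), 0 = ignore everything else
            loss_mask += [1 if role == "assistant" else 0] * len(ids)
        return tokens, loss_mask

    tokens, mask = build_sequence([
        ("user", "What is the capital of France?"),
        ("assistant", "Paris is the capital of France."),
    ])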


**The Alpaca dataset was generated with OpenAI's text-davinci-003 engine.** It is a single-turn dataset.



[arxiv_dataset](https://www.kaggle.com/datasets/Cornell-University/arxiv)

[MOSS: an open conversational llm](https://link.springer.com/article/10.1007/s11633-024-1502-8)

A 16B-parameter model that can perform a variety of instructions in *multi-turn interactions with humans*.

**Datasets** are also provided for *SFT*:

[moss-003-sft](https://github.com/OpenLMLab/MOSS/tree/main/SFT_data)

    a multi-turn dataset with 1.1 million dialogue samples *(fully open-source)*


### Preference-aware training

A method to explicitly align the model with human preferences during training (RLHF).
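A hedged sketch of what one preference-data sample often looks like for such training ("chosen"/"rejected" field names are a common convention, not necessarily XTuner's exact schema):

    preference_sample = {
        "prompt":   "Explain gradient clipping in one sentence.",
        "chosen":   "Gradient clipping rescales overly large gradients to stabilize training.",
        "rejected": "Gradients get clipped, whatever that means.",
    }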


#### Spinning up a training job with Xtuner

1. SLURM: Simple Linux Utility for Resource Management,

a fault-tolerant and highly scalable cluster-management and job-scheduling system.

manages resources (CPU, GPU, RAM, nodes) on Linux machines.

reference command: **srun**
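For illustration, a hypothetical SLURM launch (the node/GPU flags and config name are placeholders; `xtuner train CONFIG` is XTuner's CLI entry point):

    srun --nodes=1 --gres=gpu:8 xtuner train my_config.py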

2. Kubernetes

a container-orchestration platform, used with XTuner to orchestrate containerized training jobs across multiple nodes.
 
---

**accumulative_counts = 4** *(We do 4 forward/backward passes before stepping the optimizer, so it effectively behaves like a 4× larger batch size.)*
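A runnable toy sketch of what that accumulation loop does (generic PyTorch, not XTuner internals):

    import torch

    model = torch.nn.Linear(4, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = torch.nn.CrossEntropyLoss()
    data = [(torch.randn(8, 4), torch.randint(0, 2, (8,))) for _ in range(8)]

    accumulative_counts = 4
    optimizer.zero_grad()
    for step, (x, y) in enumerate(data, start=1):
        loss = loss_fn(model(x), y) / accumulative_counts  # scale so summed grads average
        loss.backward()                                    # accumulate gradients
        if step % accumulative_counts == 0:
            optimizer.step()       # one update per 4 forward/backward passes
            optimizer.zero_grad()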


##### norm-based gradient clipping

Rescales the gradient vector g to g / ||g|| when ||g|| > 1, limiting its magnitude (L2 norm) to 1.

If ||g|| <= 1, the gradient is left unchanged.
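In PyTorch this is a one-liner (a generic sketch, not necessarily how XTuner wires it up):

    import torch

    model = torch.nn.Linear(4, 2)
    model(torch.randn(8, 4)).sum().backward()

    # rescales gradients so the global L2 norm is at most 1;
    # if ||g|| <= 1 already, they are left untouched
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)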


