
See axolotl config

axolotl version: `0.3.0`

```yaml
base_model: codellama/CodeLlama-7b-hf
base_model_config: codellama/CodeLlama-7b-hf
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
is_llama_derived_model: true
hub_model_id: EvolCodeLlama-JS-7b

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: harryng4869/Evol-Instruct-JS-1k
    type: alpaca
dataset_prepared_path: last_run_prepared
val_set_size: 0.02
output_dir: ./qlora-out

adapter: qlora
lora_model_dir:

sequence_len: 2048
sample_packing: true

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: axolotl
wandb_entity:
wandb_watch:
wandb_run_id:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 3
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 100
eval_steps: 0.01
save_strategy: epoch
save_steps:
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:

special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"
```
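For readers less familiar with axolotl, the quantization and LoRA settings above correspond roughly to the following PEFT / bitsandbytes objects. This is a minimal sketch, not the exact objects axolotl builds internally: the NF4 quantization type and the explicit `target_modules` list are assumptions, based on axolotl's usual QLoRA defaults and on `lora_target_linear: true` meaning "apply LoRA to all linear projection layers".

```python
# Approximate equivalent of the QLoRA-related settings in the axolotl config above.
# Illustrative only; nf4 and the target_modules list are assumed defaults.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# load_in_4bit: true  ->  4-bit quantization with bf16 compute (bf16: true)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# lora_r / lora_alpha / lora_dropout, with lora_target_linear: true
# interpreted as "all linear projections" of the Llama blocks.
lora_config = LoraConfig(
    r=32,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```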
EvolCodeLlama-JS-7b
This model is a QLoRA fine-tuned version of codellama/CodeLlama-7b-hf on the harryng4869/Evol-Instruct-JS-1k dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2897
Model description
EvolCodeLlama-JS-7b is a QLoRA adapter for codellama/CodeLlama-7b-hf: the base model is loaded in 4-bit and LoRA weights (r=32, alpha=16, dropout 0.05) are trained on all linear projection layers (`lora_target_linear: true`), using the axolotl configuration shown above.
Intended uses & limitations
More information needed
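As a rough illustration only, the adapter could be loaded on top of the 4-bit base model and queried with an Alpaca-style prompt, which matches the `type: alpaca` dataset setting. The Hub repository id below is a placeholder and the prompt template is an assumption, neither is confirmed by this card.

```python
# Illustrative inference sketch: "<user>/EvolCodeLlama-JS-7b" is a placeholder
# repo id, and the Alpaca prompt template is assumed from `type: alpaca`.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "codellama/CodeLlama-7b-hf"
adapter_id = "<user>/EvolCodeLlama-JS-7b"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16
    ),
    device_map="auto",
)
model = PeftModel.from_pretrained(base, adapter_id)

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWrite a JavaScript function that deduplicates an array.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```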
Training and evaluation data
The adapter was trained on harryng4869/Evol-Instruct-JS-1k, formatted with axolotl's `alpaca` prompt template; 2% of the examples were held out for evaluation (`val_set_size: 0.02`).
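A minimal sketch of pulling the same dataset for inspection with 🤗 Datasets follows. The `train` split name and the Alpaca-style column names are assumptions based on the `type: alpaca` setting, and the exact split seed used by axolotl is not known, so the `train_test_split` call only approximates the 2% hold-out.

```python
# Sketch: inspect the training data and reproduce an approximate 2% validation split.
# Column names follow the usual Alpaca convention and are not verified here.
from datasets import load_dataset

dataset = load_dataset("harryng4869/Evol-Instruct-JS-1k", split="train")
splits = dataset.train_test_split(test_size=0.02, seed=42)

print(splits)
print(splits["train"][0])  # expected keys such as "instruction", "input", "output"
```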
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: paged AdamW 32-bit (`paged_adamw_32bit`) with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 3
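The run was driven by axolotl rather than a hand-written Trainer script, but the hyperparameters above map roughly onto 🤗 Transformers `TrainingArguments` as in the sketch below; values are copied from the list and config, and the mapping itself is only an approximation of what axolotl sets up.

```python
# Approximate TrainingArguments equivalent of the hyperparameters listed above.
# Illustrative mapping only; the actual run was configured through axolotl.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./qlora-out",
    learning_rate=2e-4,
    per_device_train_batch_size=2,   # micro_batch_size: 2
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,   # effective train batch size: 2 * 4 = 8
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    optim="paged_adamw_32bit",
    weight_decay=0.0,
    bf16=True,
    gradient_checkpointing=True,
    logging_steps=1,
    evaluation_strategy="steps",
    eval_steps=0.01,                 # fraction of total steps, as in the config
    save_strategy="epoch",
    seed=42,
)
```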
Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.4099 | 0.02 | 1 | 0.4313 |
| 0.5677 | 0.04 | 2 | 0.4313 |
| 0.4255 | 0.08 | 4 | 0.4315 |
| 0.4352 | 0.12 | 6 | 0.4312 |
| 0.4457 | 0.17 | 8 | 0.4312 |
| 0.4705 | 0.21 | 10 | 0.4309 |
| 0.4492 | 0.25 | 12 | 0.4303 |
| 0.5233 | 0.29 | 14 | 0.4294 |
| 0.3795 | 0.33 | 16 | 0.4275 |
| 0.456 | 0.37 | 18 | 0.4248 |
| 0.5132 | 0.41 | 20 | 0.4204 |
| 0.3543 | 0.46 | 22 | 0.4136 |
| 0.4132 | 0.5 | 24 | 0.4046 |
| 0.4219 | 0.54 | 26 | 0.3936 |
| 0.3956 | 0.58 | 28 | 0.3813 |
| 0.3587 | 0.62 | 30 | 0.3697 |
| 0.409 | 0.66 | 32 | 0.3587 |
| 0.3093 | 0.7 | 34 | 0.3483 |
| 0.3717 | 0.75 | 36 | 0.3407 |
| 0.3357 | 0.79 | 38 | 0.3345 |
| 0.2912 | 0.83 | 40 | 0.3289 |
| 0.3171 | 0.87 | 42 | 0.3243 |
| 0.3368 | 0.91 | 44 | 0.3210 |
| 0.3906 | 0.95 | 46 | 0.3180 |
| 0.3491 | 0.99 | 48 | 0.3159 |
| 0.274 | 1.02 | 50 | 0.3133 |
| 0.2474 | 1.06 | 52 | 0.3126 |
| 0.3236 | 1.1 | 54 | 0.3106 |
| 0.3327 | 1.14 | 56 | 0.3092 |
| 0.3153 | 1.18 | 58 | 0.3081 |
| 0.3809 | 1.22 | 60 | 0.3079 |
| 0.2792 | 1.26 | 62 | 0.3072 |
| 0.2465 | 1.31 | 64 | 0.3055 |
| 0.2831 | 1.35 | 66 | 0.3060 |
| 0.408 | 1.39 | 68 | 0.3064 |
| 0.2881 | 1.43 | 70 | 0.3045 |
| 0.2715 | 1.47 | 72 | 0.3018 |
| 0.2686 | 1.51 | 74 | 0.3008 |
| 0.3605 | 1.55 | 76 | 0.3008 |
| 0.2644 | 1.6 | 78 | 0.3002 |
| 0.3479 | 1.64 | 80 | 0.2990 |
| 0.2821 | 1.68 | 82 | 0.2983 |
| 0.3193 | 1.72 | 84 | 0.2980 |
| 0.2857 | 1.76 | 86 | 0.2969 |
| 0.2484 | 1.8 | 88 | 0.2965 |
| 0.236 | 1.84 | 90 | 0.2957 |
| 0.3554 | 1.89 | 92 | 0.2946 |
| 0.2968 | 1.93 | 94 | 0.2931 |
| 0.3792 | 1.97 | 96 | 0.2914 |
| 0.2574 | 2.01 | 98 | 0.2909 |
| 0.3192 | 2.02 | 100 | 0.2915 |
| 0.2519 | 2.06 | 102 | 0.2934 |
| 0.2165 | 2.1 | 104 | 0.2968 |
| 0.2499 | 2.14 | 106 | 0.2960 |
| 0.2243 | 2.18 | 108 | 0.2931 |
| 0.2523 | 2.22 | 110 | 0.2923 |
| 0.2644 | 2.26 | 112 | 0.2943 |
| 0.2048 | 2.31 | 114 | 0.2946 |
| 0.1853 | 2.35 | 116 | 0.2932 |
| 0.2441 | 2.39 | 118 | 0.2927 |
| 0.2494 | 2.43 | 120 | 0.2928 |
| 0.2184 | 2.47 | 122 | 0.2927 |
| 0.2376 | 2.51 | 124 | 0.2932 |
| 0.2496 | 2.55 | 126 | 0.2924 |
| 0.2029 | 2.6 | 128 | 0.2915 |
| 0.2602 | 2.64 | 130 | 0.2908 |
| 0.2137 | 2.68 | 132 | 0.2907 |
| 0.2617 | 2.72 | 134 | 0.2901 |
| 0.2532 | 2.76 | 136 | 0.2901 |
| 0.2743 | 2.8 | 138 | 0.2900 |
| 0.2181 | 2.84 | 140 | 0.2900 |
| 0.254 | 2.89 | 142 | 0.2899 |
| 0.2463 | 2.93 | 144 | 0.2897 |
Framework versions
- PEFT 0.7.2.dev0
- Transformers 4.37.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.16.1
- Tokenizers 0.15.0