Update README.md
README.md
@@ -25,7 +25,7 @@ The training corpus consists of the following datasets:
 |----------|-------------|
 | Business & Finance | 736,071,807 |
 | News | 1,700,662,378 |
-| Education |
+| Education | 576,489,778 |
 | Social | 211,000,000 |
 | Government | 40,492,117 |
 | Medical | 42,987,587 |
@@ -34,7 +34,6 @@ The training corpus consists of the following datasets:
 | Research Articles | 4,185,649,758 |
 | Law | 467,994,847 |
 | Travel | 6,948,290 |
-| Buddhism | 21,600,000 |
 | Others | 4,410,619 |
 
 *Token counts calculated using Qwen3 Tokenizer