Kshitij Thakkar
PRO
kshitijthakkar
18 followers · 80 following
Mandark-droid
kshitij-thakkar-2061b924
AI & ML interests
AI observability + MoE efficiency engineer. Building tools that make GenAI traceable, measurable, and production-ready.
Recent Activity
Published an article (1 day ago): Scaling Mixture of Experts: Architecture Search for Billion-Parameter Language Models
Updated a collection (1 day ago): Large MoE Architecture Search (1B-2B)
Updated a dataset (1 day ago): kshitijthakkar/large-moe-inference-benchmark
kshitijthakkar's models (107)
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-bs4-ctx1024 • Updated 4 days ago • 25
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-bs2-ctx2048 • Updated 4 days ago • 26
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-bs2-ctx1024 • Updated 4 days ago • 23
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-bs1-ctx2048 • Updated 4 days ago • 34
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-bs1-ctx1024 • Updated 4 days ago • 18
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-lr1e-03 • Updated 4 days ago • 25
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-lr5e-04 • Updated 4 days ago • 32
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-lr3e-04 • Updated 4 days ago • 25
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-lr2e-04 • Updated 4 days ago • 23
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-lr1e-04 • Updated 4 days ago • 33
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-lr5e-05 • Updated 4 days ago • 19
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-lr3e-05 • Updated 4 days ago • 20
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-lr1e-05 • Updated 4 days ago • 23
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b-lr5e-06 • Updated 4 days ago • 22
kshitijthakkar/moe-1422m-667m-8x2-10L-large-wide-1.5b • Updated 4 days ago • 11
kshitijthakkar/moe-1687m-781m-12x4-16L-large-deep-1.5b • Updated 4 days ago • 5
kshitijthakkar/moe-1083m-554m-16x2-8L-large-moe-1.3b-top2 • Updated 4 days ago • 9
kshitijthakkar/moe-1083m-781m-16x8-8L-large-moe-1.3b • Updated 4 days ago • 17
kshitijthakkar/moe-1002m-399m-8x2-16L-large-moe-1b • Updated 5 days ago • 33
kshitijthakkar/loggenix-moe-255m-optimized-sft-v2-checkpoints • Updated 5 days ago
kshitijthakkar/loggenix-moe-255m-optimized-sft-v1 • Text Generation • 0.3B • Updated 7 days ago • 29
kshitijthakkar/loggenix-moe-255m-optimized-sft • Text Generation • 0.3B • Updated 7 days ago • 11
kshitijthakkar/loggenix-moe-255m-optimized-v1 • Text Generation • 0.3B • Updated 8 days ago • 63
kshitijthakkar/loggenix-moe-255m-optimized-test-v1 • Updated 8 days ago
kshitijthakkar/loggenix-moe-255m-optimized • Text Generation • 0.3B • Updated 8 days ago • 21
kshitijthakkar/loggenix-moe-255m-optimized-test • Updated 9 days ago
kshitijthakkar/moe-255m-114m-12x2-12L-full-attention-no-gqa-bs8-ctx512 • Updated 9 days ago • 31
kshitijthakkar/moe-255m-114m-12x2-12L-full-attention-no-gqa-bs4-ctx1024 • Updated 9 days ago • 29
kshitijthakkar/moe-255m-114m-12x2-12L-full-attention-no-gqa-bs4-ctx512 • Updated 9 days ago • 22
kshitijthakkar/moe-255m-114m-12x2-12L-full-attention-no-gqa-bs2-ctx2048 • Updated 9 days ago • 24