@atelier
Qwen3 8B
A dense Qwen3 model with all parameters active on each forward pass. It works well as a general-purpose baseline for reasoning, instruction following, and comparing tuned checkpoints against a stable base model.
Base model
Qwen/Qwen3-8B
Method
Hybrid
Size
8B
Model ID
Qwen/...en3-8B
Actions
Type
Base
First-party entry
Size
8B
Model scale
Architecture
Dense
Activation
Training Type
Hybrid
Optimization