rabbit

Zhaojjiahui

AI & ML interests

None yet

Recent Activity

liked a model 3 days ago
AIDC-AI/Marco-o1
liked a model 9 days ago
NexaAIDev/omnivision-968M
liked a model 9 days ago
BAAI/bge-m3

Organizations

Chinese LLMs on Hugging Face

Zhaojjiahui's activity

Reacted to singhsidhukuldeep's post with 👀 10 days ago
It's always exciting to revisit Google's DCN paper—impractical but good!

Deep & Cross Network (DCN) - a groundbreaking approach to click-through rate prediction that's revolutionizing digital advertising!

Key Innovation:
DCN introduces a novel cross-network architecture that automatically learns feature interactions without manual engineering. What sets it apart is its ability to explicitly model bounded-degree feature crossings while maintaining the power of deep neural networks.

Technical Deep Dive:
- The architecture combines a cross network with a deep network in parallel.
- The cross network performs automatic feature crossing at each layer.
- The embedding layer transforms sparse categorical features into dense vectors.
- Cross layers apply x_{l+1} = x_0 (x_l^T w_l) + b_l + x_l, a formula that enables efficient high-degree polynomial feature interactions.
- Memory-efficient design with linear complexity O(d) in the input dimension.
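
The parallel layout described above can be sketched in plain Python. This is a minimal illustration rather than the paper's implementation: it assumes the cross-layer update x_{l+1} = x_0 (x_l · w_l) + b_l + x_l, and the function names (`cross_layer`, `dense_relu`, `dcn_forward`), the ReLU deep tower, and the sigmoid over the concatenated outputs are choices made for this sketch. Embeddings are assumed to be already looked up and concatenated into `x0`.

```python
import math

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def cross_layer(x0, xl, w, b):
    """One cross layer: x_next = x0 * (xl . w) + b + xl.

    Only a weight vector w and a bias vector b are stored per layer,
    so parameters grow linearly with the input dimension d.
    """
    s = dot(xl, w)  # scalar projection of the current layer's output
    return [x0_i * s + b_i + xl_i for x0_i, b_i, xl_i in zip(x0, b, xl)]

def dense_relu(x, W, b):
    """One fully connected layer with ReLU (assumed deep-tower layer)."""
    return [max(0.0, dot(row, x) + b_i) for row, b_i in zip(W, b)]

def dcn_forward(x0, cross_params, deep_params, w_out, b_out):
    """Run the cross and deep towers on the same input and combine them."""
    xl = x0
    for w, b in cross_params:      # cross tower: explicit feature crossing
        xl = cross_layer(x0, xl, w, b)
    h = x0
    for W, b in deep_params:       # deep tower: implicit interactions
        h = dense_relu(h, W, b)
    combined = xl + h              # list concatenation of both outputs
    logit = dot(w_out, combined) + b_out
    return 1.0 / (1.0 + math.exp(-logit))  # predicted click probability
```

Each call to `cross_layer` multiplies the running output by `x0` once more, which is why stacking cross layers raises the polynomial degree of the interactions.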

Performance Highlights:
- Outperforms traditional DNN models with 60% less memory usage.
- Achieved 0.4419 logloss on the Criteo Display Ads dataset.
- Consistently performs better than state-of-the-art models like Deep Crossing and Factorization Machines.
- Exceptional performance on non-CTR tasks like Forest Covertype (97.40% accuracy).
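
For readers unfamiliar with the metric, logloss is the mean binary cross-entropy between click labels and predicted probabilities. The sketch below assumes that standard definition; the `log_loss` name and the `eps` clipping value are choices made for this illustration, not details taken from the paper.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy; lower is better."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)
```

A model that always predicts 0.5 scores ln 2 (about 0.693) under this definition, which gives a rough reference point for reading the Criteo figure.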

Under the Hood:
- Uses embedding vectors of dimension 6 × (category cardinality)^(1/4).
- Implements batch normalization and the Adam optimizer.
- The cross network depth determines the highest polynomial degree of feature interactions.
- An efficient projection mechanism reduces cubic computational cost to linear.
- Parameter sharing enables better generalization to unseen feature interactions.
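
Two of the points above lend themselves to a short sketch. The first function applies the 6 × (category cardinality)^(1/4) sizing rule; rounding to the nearest integer is an assumption here, since the post does not say how fractional sizes are handled. The other two functions illustrate, in simplified form, why the cross term is cheap: assuming the term is x_0 (x_l · w), computing the scalar x_l · w first avoids ever building the d × d matrix of pairwise products. All function names are hypothetical.

```python
def embedding_dim(cardinality):
    """Embedding size from the 6 * cardinality^(1/4) rule (rounding assumed)."""
    return max(1, round(6 * cardinality ** 0.25))

def cross_term_naive(x0, xl, w):
    """Build the d x d outer product x0 * xl^T, then multiply by w."""
    outer = [[a * b for b in xl] for a in x0]  # O(d^2) memory and time
    return [sum(m * w_j for m, w_j in zip(row, w)) for row in outer]

def cross_term_fast(x0, xl, w):
    """Same result, computing the scalar xl . w first: O(d) time and memory."""
    s = sum(a * b for a, b in zip(xl, w))
    return [a * s for a in x0]
```

Both cross-term functions return the same vector; only the order of operations differs, and with it the cost.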

Key Advantages:
1. No manual feature engineering required.
2. Explicit feature crossing at each layer.
3. Highly memory-efficient.
4. Scalable to web-scale data.
5. Robust performance across different domains.

Thoughts on how this could transform digital advertising?
liked a Space about 1 month ago