Shilong Li's Blog

A Numerically Stable Method for Computing Entropy

def entropy_from_logits(logits: torch.Tensor):
    """Calculate entropy from logits."""
    pd = torch.nn.functional.softmax(logits, dim=-1)
    entropy = torch.logsume…
2025-05-28
Work
#LLM
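
The entropy excerpt above is cut off, so the following is only a dependency-free sketch of the identity it appears to rely on, not the post's actual code. It assumes the standard relation H = logsumexp(z) - sum(softmax(z) * z), and it subtracts the maximum logit first so that exp() cannot overflow. It works on a plain list of floats rather than a torch.Tensor, which is a simplification made here for illustration.

```python
import math


def entropy_from_logits(logits):
    """Entropy (in nats) of softmax(logits) for a 1-D list of floats.

    Illustrative sketch: uses H = logsumexp(z) - sum(softmax(z) * z).
    """
    # Shift by the max logit; entropy is invariant to this shift,
    # and it keeps every exp() argument <= 0, so nothing overflows.
    m = max(logits)
    shifted = [z - m for z in logits]

    # logsumexp of the shifted logits.
    log_z = math.log(sum(math.exp(s) for s in shifted))

    # Softmax probabilities computed in log space.
    probs = [math.exp(s - log_z) for s in shifted]

    # H = -sum(p * log p) = log_z - sum(p * shifted)
    return log_z - sum(p * s for p, s in zip(probs, shifted))
```

With logits such as [1000.0, 0.0], a naive math.exp(1000.0) would raise OverflowError, whereas the shifted form evaluates without error.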
A unified perspective of RLHF

Currently popular RLHF methods: To this day, the post-training paradigm for LLMs is still CPT, SFT, and RLHF, and there are no signs that this paradigm will change soon. Focusing on RLHF, I will attempt t…
2024-10-29
Work
#LLM #RLHF
